2025
Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection.
CoRR, February, 2025
Position-Guided Point Cloud Panoptic Segmentation Transformer.
Int. J. Comput. Vis., January, 2025
2024
3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting.
ACM Trans. Graph., December, 2024
Transformer-Based Visual Segmentation: A Survey.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024
Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation.
CoRR, 2024
Learning Humanoid Locomotion with Perceptive Internal Model.
CoRR, 2024
VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding.
CoRR, 2024
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness.
CoRR, 2024
GRUtopia: Dream General Robots in a City at Scale.
CoRR, 2024
OVExp: Open Vocabulary Exploration for Object-Oriented Navigation.
CoRR, 2024
Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights.
CoRR, 2024
Grounded 3D-LLM with Referent Tokens.
CoRR, 2024
Learning H-Infinity Locomotion Control.
CoRR, 2024
RoboDuet: A Framework Affording Mobile-Manipulation and Cross-Embodiment.
CoRR, 2024
Mixed Gaussian Flow for Diverse Trajectory Prediction.
CoRR, 2024
MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
MGF: Mixed Gaussian Flow for Diverse Trajectory Prediction.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
X-neuron: Interpreting, Locating and Editing of Neurons in Reinforcement Learning Policy.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024
RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024
Multi-Object Tracking by Hierarchical Visual Representations.
Proceedings of the IEEE International Conference on Robotics and Automation, 2024
Unified Human-Scene Interaction via Prompted Chain-of-Contacts.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
PointLLM: Empowering Large Language Models to Understand Point Clouds.
Proceedings of the Computer Vision - ECCV 2024, 2024
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023
Self-Adversarial Disentangling for Specific Domain Adaptation.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023
Context-Aware Mixup for Domain Adaptive Semantic Segmentation.
IEEE Trans. Circuits Syst. Video Technol., February, 2023
Understanding Masked Autoencoders From a Local Contrastive Perspective.
CoRR, 2023
Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation.
CoRR, 2023
OV-PARTS: Towards Open-Vocabulary Part Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Dense Distinct Query for End-to-End Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking.
Proceedings of the Conference on Robot Learning, 2023
2022
What Are Expected Queries in End-to-End Object Detection?
CoRR, 2022
Dense Siamese Network for Dense Unsupervised Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022
Monocular 3D Object Detection with Depth from Motion.
Proceedings of the Computer Vision - ECCV 2022, 2022
Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Towards Balanced Learning for Instance Recognition.
Int. J. Comput. Vis., 2021
Self-Adversarial Disentangling for Specific Domain Adaptation.
CoRR, 2021
K-Net: Towards Unified Image Segmentation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021
Seesaw Loss for Long-Tailed Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Quasi-Dense Similarity Learning for Multiple Object Tracking.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Probabilistic and Geometric Depth: Detecting Objects in Perspective.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021
2020
Quasi-Dense Instance Similarity Learning.
CoRR, 2020
Side-Aware Boundary Localization for More Precise Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020
2019
$\mathcal{R}^2$-CNN: Fast Tiny Object Detection in Large-Scale Remote Sensing Images.
IEEE Trans. Geosci. Remote. Sens., 2019
MMDetection: Open MMLab Detection Toolbox and Benchmark.
CoRR, 2019
$\mathcal{R}^2$-CNN: Fast Tiny Object Detection in Large-scale Remote Sensing Images.
CoRR, 2019
Adapting Object Detectors via Selective Cross-Domain Alignment.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Libra R-CNN: Towards Balanced Learning for Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Hybrid Task Cascade for Instance Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018