2025

Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection.

[DOI]

Boyu Mi

Hanqing Wang

Tai Wang

Yilun Chen

Jiangmiao Pang

CoRR, February, 2025

Position-Guided Point Cloud Panoptic Segmentation Transformer.

[DOI]

Int. J. Comput. Vis., January, 2025

2024

3DGSR: Implicit Surface Reconstruction with 3D Gaussian Splatting.

[DOI]

ACM Trans. Graph., December, 2024

Transformer-Based Visual Segmentation: A Survey.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation.

[DOI]

CoRR, 2024

Learning Humanoid Locomotion with Perceptive Internal Model.

[DOI]

CoRR, 2024

VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding.

[DOI]

CoRR, 2024

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness.

[DOI]

CoRR, 2024

GRUtopia: Dream General Robots in a City at Scale.

[DOI]

CoRR, 2024

OVExp: Open Vocabulary Exploration for Object-Oriented Navigation.

[DOI]

CoRR, 2024

Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights.

[DOI]

CoRR, 2024

Grounded 3D-LLM with Referent Tokens.

[DOI]

CoRR, 2024

Learning H-Infinity Locomotion Control.

[DOI]

CoRR, 2024

RoboDuet: A Framework Affording Mobile-Manipulation and Cross-Embodiment.

[DOI]

CoRR, 2024

Mixed Gaussian Flow for Diverse Trajectory Prediction.

[DOI]

CoRR, 2024

MMScan: A Multi-Modal 3D Scene Dataset with Hierarchical Grounded Language Annotations.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

MGF: Mixed Gaussian Flow for Diverse Trajectory Prediction.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

What Makes CLIP More Robust to Long-Tailed Pre-Training Data? A Controlled Study for Transferable Insights.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

X-neuron: Interpreting, Locating and Editing of Neurons in Reinforcement Learning Policy.

[DOI]

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024

RoboKeyGen: Robot Pose and Joint Angles Estimation via Diffusion-based 3D Keypoint Generation.

[DOI]

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Multi-Object Tracking by Hierarchical Visual Representations.

[DOI]

Jinkun Cao

Jiangmiao Pang

Kris Kitani

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Unified Human-Scene Interaction via Prompted Chain-of-Contacts.

[DOI]

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Hybrid Internal Model: Learning Agile Legged Locomotion with Simulated Robot Response.

[DOI]

Proceedings of the Twelfth International Conference on Learning Representations, 2024

PointLLM: Empowering Large Language Models to Understand Point Clouds.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

QDTrack: Quasi-Dense Similarity Learning for Appearance-Only Multiple Object Tracking.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Self-Adversarial Disentangling for Specific Domain Adaptation.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Context-Aware Mixup for Domain Adaptive Semantic Segmentation.

[DOI]

IEEE Trans. Circuits Syst. Video Technol., February, 2023

Understanding Masked Autoencoders From a Local Contrastive Perspective.

[DOI]

CoRR, 2023

Tube-Link: A Flexible Cross Tube Baseline for Universal Video Segmentation.

[DOI]

CoRR, 2023

OV-PARTS: Towards Open-Vocabulary Part Segmentation.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Dense Distinct Query for End-to-End Object Detection.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

DORT: Modeling Dynamic Objects in Recurrent for Multi-Camera 3D Object Detection and Tracking.

[DOI]

Proceedings of the Conference on Robot Learning, 2023

2022

What Are Expected Queries in End-to-End Object Detection?

[DOI]

CoRR, 2022

Dense Siamese Network.

[DOI]

CoRR, 2022

Dense Siamese Network for Dense Unsupervised Learning.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

Monocular 3D Object Detection with Depth from Motion.

[DOI]

Tai Wang

Jiangmiao Pang

Dahua Lin

Proceedings of the Computer Vision - ECCV 2022, 2022

Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Towards Balanced Learning for Instance Recognition.

[DOI]

Int. J. Comput. Vis., 2021

Self-Adversarial Disentangling for Specific Domain Adaptation.

[DOI]

CoRR, 2021

K-Net: Towards Unified Image Segmentation.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops, 2021

Seesaw Loss for Long-Tailed Instance Segmentation.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Quasi-Dense Similarity Learning for Multiple Object Tracking.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Probabilistic and Geometric Depth: Detecting Objects in Perspective.

[DOI]

Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

2020

Quasi-Dense Instance Similarity Learning.

[DOI]

CoRR, 2020

Side-Aware Boundary Localization for More Precise Object Detection.

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

2019

$\mathcal{R}^2$-CNN: Fast Tiny Object Detection in Large-Scale Remote Sensing Images.

[DOI]

IEEE Trans. Geosci. Remote. Sens., 2019

MMDetection: Open MMLab Detection Toolbox and Benchmark.

[DOI]

CoRR, 2019

$\mathcal{R}^2$-CNN: Fast Tiny Object Detection in Large-scale Remote Sensing Images.

[DOI]

CoRR, 2019

Adapting Object Detectors via Selective Cross-Domain Alignment.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Libra R-CNN: Towards Balanced Learning for Object Detection.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Hybrid Task Cascade for Instance Segmentation.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

FishNet: A Versatile Backbone for Image, Region, and Pixel Level Prediction.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018