2025

ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos.

Peiran Wu

Yunze Liu

Chonghan Liu

Miao Liu

Junxiao Shen

CoRR, March, 2025

VideoMAP: Toward Scalable Mamba-based Video Autoregressive Pretraining.

CoRR, March, 2025

MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data.

CoRR, January, 2025

2024

MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining.

Yunze Liu

Li Yi

CoRR, 2024

PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation.

CoRR, 2024

PhysReaction: Physically Plausible Real-Time Humanoid Reaction Synthesis via Forward Dynamics Guided 4D Imitation.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

CrossVideo: Self-supervised Cross-modal Contrastive Learning for Point Cloud Video Understanding.

Proceedings of the IEEE International Conference on Robotics and Automation, 2024

Physics-Aware Hand-Object Interaction Denoising.

Haowen Luo

Yunze Liu

Li Yi

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

2023

Empowering biologists to decode omics data: the Genekitr R package and web server.

Yunze Liu

Gang Li

BMC Bioinform., December, 2023

Interactive Humanoid: Online Full-Body Motion Reaction Synthesis with Social Affordance Canonicalization and Forecasting.

Yunze Liu

Changxi Chen

Li Yi

CoRR, 2023

NSM4D: Neural Scene Model Based Online 4D Point Cloud Sequence Understanding.

CoRR, 2023

LeaF: Learning Frames for 4D Point Cloud Sequence Understanding.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Complete-to-Partial 4D Distillation for Self-Supervised Point Cloud Sequence Representation Learning.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

2022

Point Primitive Transformer for Long-Term 4D Point Cloud Video Understanding.

Proceedings of Computer Vision - ECCV 2022, 2022

HOI4D: A 4D Egocentric Dataset for Category-Level Human-Object Interaction.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021

Contrastive Multimodal Fusion with TupleInfoNCE.

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020

P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding.

CoRR, 2020