Sipeng Zheng

Orcid: 0000-0001-5331-6314

According to our database, Sipeng Zheng authored at least 20 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Quo Vadis, Motion Generation? From Large Language Models to Large Motion Models.
CoRR, 2024

From Pixels to Tokens: Byte-Pair Encoding on Quantized Visual Modalities.
CoRR, 2024

QuadrupedGPT: Towards a Versatile Quadruped Agent in Open-ended Worlds.
CoRR, 2024

EgoNCE++: Do Egocentric Video-Language Models Really Understand Hand-Object Interactions?
CoRR, 2024

SPAFormer: Sequential 3D Part Assembly with Transformers.
CoRR, 2024

LLaMA-Rider: Spurring Large Language Models to Explore the Open World.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

UniCode: Learning a Unified Codebook for Multimodal Large Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
No-frills Temporal Video Grounding: Multi-Scale Neighboring Attention and Zoom-in Boundary Detection.
CoRR, 2023

POV: Prompt-Oriented View-Agnostic Learning for Egocentric Hand-Object Interaction in the Multi-view World.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

Anchor-Based Detection for Natural Language Localization in Ego-Centric Videos.
Proceedings of the IEEE International Conference on Consumer Electronics, 2023

Open-Category Human-Object Interaction Pre-training via Language Modeling Framework.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Accommodating Audio Modality in CLIP for Multimodal Processing.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Exploring Anchor-based Detection for Ego4D Natural Language Query.
CoRR, 2022

Few-Shot Action Recognition with Hierarchical Matching and Contrastive Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

VRDFormer: End-to-End Video Visual Relation Detection with Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
MR imaging for the quantitative assessment of brain iron in aceruloplasminemia: A postmortem validation study.
NeuroImage, 2021

2020
Skeleton-Based Interactive Graph Network For Human Object Interaction Detection.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2020

2019
Visual Relation Detection with Multi-Level Attention.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

Relation Understanding in Videos.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

