Jianjian Sun

ORCID: 0000-0002-1216-9626

According to our database, Jianjian Sun authored at least 19 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Exploring Recurrent Long-Term Temporal Fusion for Multi-View 3D Perception.
IEEE Robotics Autom. Lett., July, 2024

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model.
CoRR, 2024

DistTrain: Addressing Model and Data Heterogeneity with Disaggregated Training for Multimodal Large Language Models.
CoRR, 2024

Focus Anywhere for Fine-grained Multi-page Document Understanding.
CoRR, 2024

Small Language Model Meets with Reinforced Vision Vocabulary.
CoRR, 2024

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

ChatSpot: Bootstrapping Multimodal LLMs via Precise Referring Instruction Tuning.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

DreamLLM: Synergistic Multimodal Comprehension and Creation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Vary: Scaling up the Vision Vocabulary for Large Vision-Language Model.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models.
CoRR, 2023

The 1st-place Solution for CVPR 2023 OpenLane Topology in Autonomous Driving Challenge.
CoRR, 2023

BEVStereo++: Accurate Depth Estimation in Multi-view 3D Object Detection via Dynamic Temporal Stereo.
CoRR, 2023

Cross Modal Transformer via Coordinates Encoding for 3D Object Detection.
CoRR, 2023

Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Reversible Column Networks.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Cross Modal Transformer: Towards Fast and Robust 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

BEVDepth: Acquisition of Reliable Depth for Multi-View 3D Object Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

BEVStereo: Enhancing Depth Estimation in Multi-View 3D Object Detection with Temporal Stereo.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
BEVStereo: Enhancing Depth Estimation in Multi-view 3D Object Detection with Dynamic Temporal Stereo.
CoRR, 2022
