Deyao Zhu

Orcid: 0000-0001-8014-7309

According to our database, Deyao Zhu authored at least 18 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
ChatGPT Asks, BLIP-2 Answers: Automatic Questioning Towards Enriched Visual Descriptions.
Trans. Mach. Learn. Res., 2024

How Well Can Vision Language Models See Image Details?
CoRR, 2024

MiniGPT-Med: Large Language Model as a General Interface for Radiology Diagnosis.
CoRR, 2024

MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens.
CoRR, 2024

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Goldfish: Vision-Language Understanding of Arbitrarily Long Videos.
Proceedings of the Computer Vision - ECCV 2024, 2024

2023
MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning.
CoRR, 2023

Exploring Open-Vocabulary Semantic Segmentation without Human Labels.
CoRR, 2023

Video ChatCaptioner: Towards Enriched Spatiotemporal Descriptions.
CoRR, 2023

Guiding Online Reinforcement Learning with Action-Free Offline Pretraining.
CoRR, 2023

Value Memory Graph: A Graph-Structured World Model for Offline Reinforcement Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Exploring Open-Vocabulary Semantic Segmentation from CLIP Vision Encoder Distillation Only.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Social-Implicit: Rethinking Trajectory Prediction Evaluation and The Effectiveness of Implicit Maximum Likelihood Estimation.
Proceedings of the Computer Vision - ECCV 2022, 2022

RelTransformer: A Transformer-Based Long-Tail Visual Relationship Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
RelTransformer: Balancing the Visual Relationship Detection from Local Context, Scene and Memory.
CoRR, 2021

HalentNet: Multimodal Trajectory Forecasting with Hallucinative Intents.
Proceedings of the 9th International Conference on Learning Representations, 2021

Motion Forecasting with Unlikelihood Training in Continuous Space.
Proceedings of the Conference on Robot Learning, 8-11 November 2021, London, UK, 2021

2019
Learning to Disentangle Latent Physical Factors for Video Prediction.
Proceedings of the Pattern Recognition, 2019

