Le Zhuo

Orcid: 0000-0001-7895-091X

According to our database, Le Zhuo authored at least 20 papers between 2022 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models.
CoRR, January, 2025

2024
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation.
CoRR, 2024

VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection.
CoRR, 2024

Customize Your Visual Autoregressive Recipe with Set Autoregressive Modeling.
CoRR, 2024

I-Max: Maximize the Resolution Potential of Pre-trained Rectified Flow Transformers with Projected Flow.
CoRR, 2024

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions.
CoRR, 2024

LLaVA-MoD: Making LLaVA Tiny via MoE Knowledge Distillation.
CoRR, 2024

Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining.
CoRR, 2024

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT.
CoRR, 2024

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers.
CoRR, 2024

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training.
CoRR, 2024

Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions.
CoRR, 2023

GraphText: Graph Reasoning in Text Space.
CoRR, 2023

MARBLE: Music Audio Representation Benchmark for Universal Evaluation.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DiffDance: Cascaded Human Motion Diffusion Model for Dance Generation.
Proceedings of the 31st ACM International Conference on Multimedia, 2023

LyricWhiz: Robust Multilingual Zero-Shot Lyrics Transcription by Whispering to ChatGPT.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023

Video Background Music Generation: Dataset, Method and Evaluation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
Video Background Music Generation: Dataset, Method and Evaluation.
CoRR, 2022
