Mu Cai

Orcid: 0009-0008-7967-9752

According to our database, Mu Cai authored at least 22 papers between 2020 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Vinoground: Scrutinizing LMMs over Dense Temporal Reasoning with Short Videos.
CoRR, 2024

Interpolating Video-LLMs: Toward Longer-sequence LMMs in a Training-free Manner.
CoRR, 2024

Cross-Modal Self-Supervised Learning with Effective Contrastive Units for LiDAR Point Clouds.
CoRR, 2024

VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation.
CoRR, 2024

LLaRA: Supercharging Robot Learning Data for Vision-Language Policy.
CoRR, 2024

Yo'LLaVA: Your Personalized Language and Vision Assistant.
CoRR, 2024

Matryoshka Multimodal Models.
CoRR, 2024

LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models.
CoRR, 2024

VGBench: A Comprehensive Benchmark of Vector Graphics Understanding and Generation for Large Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Removing Distributional Discrepancies in Captions Improves Image-Text Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024

ViP-LLaVA: Making Large Multimodal Models Understand Arbitrary Visual Prompts.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

CounterCurate: Enhancing Physical and Semantic Visio-Linguistic Compositional Reasoning via Counterfactual Examples.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Making Large Multimodal Models Understand Arbitrary Visual Prompts.
CoRR, 2023

Investigating the Catastrophic Forgetting in Multimodal Large Language Models.
CoRR, 2023

Leveraging Large Language Models for Scalable Vector Graphics-Driven Image Understanding.
CoRR, 2023

Out-of-distribution Detection via Frequency-regularized Generative Models.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2022
VOS: Learning What You Don't Know by Virtual Outlier Synthesis.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Masked Discrimination for Self-supervised Learning on Point Clouds.
Proceedings of the Computer Vision - ECCV 2022, 2022

2021
Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

2020
Frequency Domain Image Translation: More Photo-realistic, Better Identity-preserving.
CoRR, 2020

A Game-Theoretic Strategy-Aware Interaction Algorithm with Validation on Real Traffic Data.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

