LiveVQA: Live Visual Knowledge Seeking.
CoRR, April, 2025
GMValuator: Similarity-based Data Valuation for Generative Models.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment.
CoRR, 2024
Coarse Correspondence Elicit 3D Spacetime Understanding in Multimodal Language Model.
CoRR, 2024
Efficient Inference of Vision Instruction-Following Models with Elastic Cache.
Proceedings of the Computer Vision - ECCV 2024, 2024
Matching-based Data Valuation for Generative Model.
CoRR, 2023
Unleashing Text-to-Image Diffusion Models for Visual Perception.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Robust Object Detection via Instance-Level Temporal Cycle Confusion.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Multi-Proxy Wasserstein Classifier for Image Classification.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation.
Proceedings of the Computer Vision - ECCV 2020, 2020