2025
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models.
CoRR, March, 2025

2024
ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis.
CoRR, 2024

Multimodal Generalized Category Discovery.
CoRR, 2024

AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation.
CoRR, 2024

Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis.
CoRR, 2024

ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis.
Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation.
Computer Vision - ECCV 2024, 2024

Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024