2025
DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation.
CoRR, April, 2025

CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation.
CoRR, March, 2025

RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification.
CoRR, March, 2025

MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation.
CoRR, March, 2025

Synth-CLIP: Synthetic data make CLIP generalize better in data-limited scenarios.
Neural Networks, 2025

Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025

2024
Tolerant Self-Distillation for image classification.
Neural Networks, 2024

Similar norm more transferable: Rethinking feature norms discrepancy in adversarial domain adaptation.
Knowl. Based Syst., 2024

RestorerID: Towards Tuning-Free Face Restoration with ID Preservation.
CoRR, 2024

Hybrid Mask Generation for Infrared Small Target Detection with Single-Point Supervision.
CoRR, 2024

Fully Fine-tuned CLIP Models are Efficient Few-Shot Learners.
CoRR, 2024

LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation.
CoRR, 2024

CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation.
CoRR, 2024

HOGDA: Boosting Semi-supervised Graph Domain Adaptation via High-Order Structure-Guided Adaptive Feature Alignment.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024

Improving Zero-Shot Generalization for CLIP with Variational Adapter.
Proceedings of Computer Vision - ECCV 2024

OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning.
Proceedings of ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain

2023
Lightweight MIMO-WNet for single image deblurring.
Neurocomputing, 2023

SYNC-CLIP: Synthetic Data Make CLIP Generalize Better in Data-Limited Scenarios.
CoRR, 2023