DyST-XL: Dynamic Layout Planning and Content Control for Compositional Text-to-Video Generation.
CoRR, April, 2025
CMMCoT: Enhancing Complex Multi-Image Comprehension via Multi-Modal Chain-of-Thought and Memory Augmentation.
CoRR, March, 2025
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification.
CoRR, March, 2025
MINT: Multi-modal Chain of Thought in Unified Generative Models for Enhanced Image Generation.
CoRR, March, 2025
Synth-CLIP: Synthetic data make CLIP generalize better in data-limited scenarios.
Neural Networks, 2025
Envisioning Class Entity Reasoning by Large Language Models for Few-shot Learning.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
Tolerant Self-Distillation for image classification.
Neural Networks, 2024
Similar norm more transferable: Rethinking feature norms discrepancy in adversarial domain adaptation.
Knowl. Based Syst., 2024
RestorerID: Towards Tuning-Free Face Restoration with ID Preservation.
CoRR, 2024
Hybrid Mask Generation for Infrared Small Target Detection with Single-Point Supervision.
CoRR, 2024
Fully Fine-tuned CLIP Models are Efficient Few-Shot Learners.
CoRR, 2024
LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation.
CoRR, 2024
CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation.
CoRR, 2024
HOGDA: Boosting Semi-supervised Graph Domain Adaptation via High-Order Structure-Guided Adaptive Feature Alignment.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Improving Zero-Shot Generalization for CLIP with Variational Adapter.
Proceedings of the Computer Vision - ECCV 2024, 2024
OmniCLIP: Adapting CLIP for Video Recognition with Spatial-Temporal Omni-Scale Feature Learning.
Proceedings of the ECAI 2024 - 27th European Conference on Artificial Intelligence, 19-24 October 2024, Santiago de Compostela, Spain, 2024
Lightweight MIMO-WNet for single image deblurring.
Neurocomputing, 2023
SYNC-CLIP: Synthetic Data Make CLIP Generalize Better in Data-Limited Scenarios.
CoRR, 2023