2025
Continuous Knowledge-Preserving Decomposition for Few-Shot Continual Learning.
CoRR, January, 2025
LipGen: Viseme-Guided Lip Video Generation for Enhancing Visual Speech Recognition.
CoRR, January, 2025
2024
Detecting and Grounding Multi-Modal Media Manipulation and Beyond.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2024
Self-Training Boosted Multi-Factor Matching Network for Composed Image Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2024
SNP-S³: Shared Network Pre-Training and Significant Semantic Strengthening for Various Video-Text Tasks.
IEEE Trans. Circuits Syst. Video Technol., April, 2024
TryonCM2: Try-on-Enhanced Fashion Compatibility Modeling Framework.
IEEE Trans. Neural Networks Learn. Syst., January, 2024
The co-evolution mechanism of policy mixes and innovation ecosystems: a case study of the new energy vehicle industry in China.
Int. J. Technol. Manag., 2024
ReTaKe: Reducing Temporal and Knowledge Redundancy for Long Video Understanding.
CoRR, 2024
Efficient Dataset Distillation via Diffusion-Driven Patch Selection for Improved Generalization.
CoRR, 2024
RA-BLIP: Multimodal Adaptive Retrieval-Augmented Bootstrapping Language-Image Pre-training.
CoRR, 2024
Preview-based Category Contrastive Learning for Knowledge Distillation.
CoRR, 2024
Video DataFlywheel: Resolving the Impossible Data Trinity in Video-Language Understanding.
CoRR, 2024
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture.
CoRR, 2024
Mamba-FSCIL: Dynamic Adaptation with Selective State Space Model for Few-Shot Class-Incremental Learning.
CoRR, 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models.
CoRR, 2024
WKVQuant: Quantizing Weight and Key/Value Cache for Large Language Models Gains More.
CoRR, 2024
SNP-S3: Shared Network Pre-training and Significant Semantic Strengthening for Various Video-Text Tasks.
CoRR, 2024
CorDA: Context-Oriented Decomposition Adaptation of Large Language Models for Task-Aware Parameter-Efficient Fine-tuning.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Differential-Perceptive and Retrieval-Augmented MLLM for Change Captioning.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
GenView: Enhancing View Quality with Pretrained Generative Model for Self-Supervised Learning.
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
Multi-Granularity Interaction and Integration Network for Video Question Answering.
IEEE Trans. Circuits Syst. Video Technol., December, 2023
Guest Editorial Introduction to the Special Issue on Video Transformers.
IEEE Trans. Circuits Syst. Video Technol., September, 2023
DualGNN: Dual Graph Neural Network for Multimedia Recommendation.
IEEE Trans. Multim., 2023
Micro-Influencer Recommendation by Multi-Perspective Account Representation Learning.
IEEE Trans. Multim., 2023
Self-Supervised Correlation Learning for Cross-Modal Retrieval.
IEEE Trans. Multim., 2023
Neighbor-Guided Consistent and Contrastive Learning for Semi-Supervised Action Recognition.
IEEE Trans. Image Process., 2023
Semantic-Aware Modular Capsule Routing for Visual Question Answering.
IEEE Trans. Image Process., 2023
Neural Collapse Terminus: A Unified Solution for Class Incremental Learning and Its Variants.
CoRR, 2023
Self-Training Boosted Multi-Faceted Matching Network for Composed Image Retrieval.
CoRR, 2023
OFAR: A Multimodal Evidence Retrieval Framework for Illegal Live-streaming Identification.
Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023
Fine-grained Key-Value Memory Enhanced Predictor for Video Representation Learning.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Mask Again: Masked Knowledge Distillation for Masked Video Modeling.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Temporal Sentence Grounding in Streaming Videos.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
CHMATCH: Contrastive Hierarchical Matching and Robust Adaptive Threshold Boosted Semi-Supervised Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Self-adaptive Context and Modal-interaction Modeling For Multimodal Emotion Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Discover Micro-Influencers for Brands via Better Understanding.
IEEE Trans. Multim., 2022
Special issue on cross-modal retrieval and analysis.
Int. J. Multim. Inf. Retr., 2022
Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem.
CoRR, 2022
Distributed Multi-Attention Generative Adversarial Network for Surrounding Vehicles Trajectories Prediction Based On Comprehensive Social Repulsion.
IEEE Access, 2022
Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Ingredient-enriched Recipe Generation from Cooking Videos.
Proceedings of the ICMR '22: International Conference on Multimedia Retrieval, Newark, NJ, USA, June 27, 2022
HEAD: HEtero-Assists Distillation for Heterogeneous Object Detectors.
Proceedings of the Computer Vision - ECCV 2022, 2022
High Quality Segmentation for Ultra High-resolution Images.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
Reconstruction regularized low-rank subspace learning for cross-modal retrieval.
Pattern Recognit., 2021
Dynamic Modality Interaction Modeling for Image-Text Retrieval.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021
Graph Contrastive Clustering.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
2020
Fashion Compatibility Modeling through a Multi-modal Try-on-guided Scheme.
Proceedings of the 43rd International ACM SIGIR conference on research and development in Information Retrieval, 2020
Agree to Disagree: Adaptive Ensemble Knowledge Distillation in Gradient Space.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Multi-modal Cooking Workflow Construction for Food Recipes.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Maximum-and-Concatenation Networks.
Proceedings of the 37th International Conference on Machine Learning, 2020
Local Correlation Consistency for Knowledge Distillation.
Proceedings of the Computer Vision - ECCV 2020, 2020
Dynamical System Inspired Adaptive Time Stepping Controller for Residual Network Families.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
SOGNet: Scene Overlap Graph Network for Panoptic Segmentation.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
Unified Graph and Low-Rank Tensor Learning for Multi-View Clustering.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
Essential Tensor Learning for Multi-View Spectral Clustering.
IEEE Trans. Image Process., 2019
Matrix recovery with implicitly low-rank data.
Neurocomputing, 2019
R²-Net: Recurrent and Recursive Network for Sparse-View CT Artifacts Removal.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019
Differentiable Linearized ADMM.
Proceedings of the 36th International Conference on Machine Learning, 2019
Deep Comprehensive Correlation Mining for Image Clustering.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Expectation-Maximization Attention Networks for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
2018
Semi-Markov Based Maintenance Decision for Production System.
Proceedings of the 3rd International Conference on System Reliability and Safety, 2018
Recurrent Squeeze-and-Excitation Context Aggregation Net for Single Image Deraining.
Proceedings of the Computer Vision - ECCV 2018, 2018
Joint Dictionary Learning and Semantic Constrained Latent Subspace Projection for Cross-Modal Retrieval.
Proceedings of the 27th ACM International Conference on Information and Knowledge Management, 2018
2017
Locality-constrained linear coding based bi-layer model for multi-view facial expression recognition.
Neurocomputing, 2017
Joint Latent Subspace Learning and Regression for Cross-Modal Retrieval.
Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017
2016
Multi-view common space learning for emotion recognition in the wild.
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016
2015
Multiple Models Fusion for Emotion Recognition in the Wild.
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, Seattle, WA, USA, November 09, 2015