2025
VideoQA in the Era of LLMs: An Empirical Study.
Int. J. Comput. Vis., July, 2025
ChronoTailor: Harnessing Attention Guidance for Fine-Grained Video Virtual Try-On.
CoRR, June, 2025
EgoBlind: Towards Egocentric Visual Assistance for the Blind People.
CoRR, March, 2025
Integrate Temporal Graph Learning into LLM-based Temporal Knowledge Graph Model.
CoRR, January, 2025
Multimodal understanding of human values in videos: A benchmark dataset and PLM-based method.
Neurocomputing, 2025
Fine-tuning Multimodal Large Language Models for Product Bundling.
Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, V.1, 2025
2024
BiSTNet: Semantic Image Prior Guided Bidirectional Temporal Feature Fusion for Deep Exemplar-Based Video Colorization.
IEEE Trans. Pattern Anal. Mach. Intell., August, 2024
Hierarchical RNNs with graph policy and attention for drone swarm.
J. Comput. Des. Eng., March, 2024
Attention-enhanced joint learning network for micro-video venue classification.
Multim. Tools Appl., February, 2024
Dance-Conditioned Artistic Music Generation by Creative-GAN.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2024
Harnessing Large Language Models for Multimodal Product Bundling.
CoRR, 2024
A Comprehensive Evaluation of Large Language Models on Temporal Event Forecasting.
CoRR, 2024
ToDA: Target-oriented Diffusion Attacker against Recommendation System.
CoRR, 2024
D2MNet for music generation joint driven by facial expressions and dance movements.
Array, 2024
Leveraging Multimodal Features and Item-level User Feedback for Bundle Construction.
Proceedings of the 17th ACM International Conference on Web Search and Data Mining, 2024
2023
Graph MADDPG with RNN for multiagent cooperative environment.
Frontiers Neurorobotics, June, 2023
Self-Supervised Learning for Multimedia Recommendation.
IEEE Trans. Multim., 2023
UA-FedRec: Untargeted Attack on Federated News Recommendation.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023
Flow-Guided Transformer for Video Colorization.
Proceedings of the IEEE International Conference on Image Processing, 2023
INO at Factify 2: Structure Coherence based Multi-Modal Fact Verification.
Proceedings of De-Factify 2: 2nd Workshop on Multimodal Fact Checking and Hate Speech Detection, 2023
2022
Hybrid-attention and frame difference enhanced network for micro-video venue recognition.
J. Intell. Fuzzy Syst., 2022
A Rating Prediction Recommendation Model Combined with the Optimizing Allocation for Information Granularity of Attributes.
Inf., 2022
Attention-enhanced and trusted multimodal learning for micro-video venue recognition.
Comput. Electr. Eng., 2022
EliMRec: Eliminating Single-modal Bias in Multimedia Recommendation.
Proceedings of MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
2021
Salient Region Guided Blind Image Sharpness Assessment.
Sensors, 2021
Hierarchical RNNs-Based Transformers MADDPG for Mixed Cooperative-Competitive Environments.
CoRR, 2021
2020
MGAT: Multimodal Graph Attention Network for Recommendation.
Inf. Process. Manag., 2020
HoAFM: A High-order Attentive Factorization Machine for CTR Prediction.
Inf. Process. Manag., 2020
Hybrid Attention-Based Prototypical Network for Unfamiliar Restaurant Food Image Few-Shot Recognition.
IEEE Access, 2020
2019
Better Word Representations with Word Weight.
Proceedings of the 21st IEEE International Workshop on Multimedia Signal Processing, 2019
Spatial Feature Collaborative Network for Trademark Image Retrieval.
Proceedings of the 6th IEEE International Conference on Cloud Computing and Intelligence Systems, 2019
2016
An improved coupled Multi-Index for accurate image retrieval.
Proceedings of the 2nd International Conference on Frontiers of Signal Processing, 2016