2025
Global Semantic-Guided Sub-image Feature Weight Allocation in High-Resolution Large Vision-Language Models.
CoRR, January, 2025

Instruction-Guided Fusion of Multi-Layer Visual Features in Large Vision-Language Models.
CoRR, January, 2025

2024
Object-Centric Cross-Modal Knowledge Reasoning for Future Event Prediction in Videos.
IEEE Trans. Circuits Syst. Video Technol., December, 2024

Leveraging Smooth Deformation Augmentation for LiDAR Point Cloud Semantic Segmentation.
IEEE Trans. Intell. Veh., February, 2024

DADL: Double Asymmetric Distribution Learning for head pose estimation in wisdom museum.
J. King Saud Univ. Comput. Inf. Sci., January, 2024

Weakly Supervised Gaussian Contrastive Grounding with Large Multimodal Models for Video Question Answering.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024

2023
GCANet: Geometry cues-aware facial expression recognition based on graph convolutional networks.
J. King Saud Univ. Comput. Inf. Sci., July, 2023

Weakly Supervised Learning of Semantic Correspondence through Cascaded Online Correspondence Refinement.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

2020
Infrared facial expression recognition via Gaussian-based label distribution learning in the dark illumination environment for human emotion detection.
Neurocomputing, 2020