2025
VLSG-net: Vision-Language Scene Graphs network for Paragraph Video Captioning.
Neurocomputing, 2025

2024
Cross-modal learning with multi-modal model for video action recognition based on adaptive weight training.
Connection Science, December, 2024

Development of a Time Projection Chamber Readout with Hybrid Pixel Sensors for Beam Monitoring.
Sensors, April, 2024

Contrastive masked auto-encoders based self-supervised hashing for 2D image and 3D point cloud cross-modal retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

2023
EHSS: An Efficient Hybrid-supervised Symmetric Stereo Matching Network.
Proceedings of the 26th IEEE International Conference on Intelligent Transportation Systems, 2023