VLSG-net: Vision-Language Scene Graphs network for Paragraph Video Captioning.
Neurocomputing, 2025
Cross-modal learning with multi-modal model for video action recognition based on adaptive weight training.
Connect. Sci., December, 2024
Development of a Time Projection Chamber Readout with Hybrid Pixel Sensors for Beam Monitoring.
Sensors, April, 2024
Contrastive masked auto-encoders based self-supervised hashing for 2D image and 3D point cloud cross-modal retrieval.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
EHSS: An Efficient Hybrid-supervised Symmetric Stereo Matching Network.
Proceedings of the 26th IEEE International Conference on Intelligent Transportation Systems, 2023