Learning Spatial Similarity Distribution for Few-shot Object Counting.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Memory-Augmented Transformer for Efficient End-to-End Video Grounding.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
CAMG: Context-Aware Moment Graph Network for Multimodal Temporal Activity Localization via Language.
Proceedings of Natural Language Processing and Chinese Computing, 2023
SPTNET: Span-based Prompt Tuning for Video Grounding.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
Conditional Video-Text Reconstruction Network with Cauchy Mask for Weakly Supervised Temporal Sentence Grounding.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
STDNet: Spatio-Temporal Decomposed Network for Video Grounding.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2022