2024
Planar Reconstruction of Indoor Scenes from Sparse Views and Relative Camera Poses.
Remote Sens., May 2024

Sparse Multi-Relational Graph Convolutional Network for Multi-type Object Trajectory Prediction.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Radiance Field Learners As UAV First-Person Viewers.
Proceedings of the Computer Vision - ECCV 2024, 2024

LAtt-Yolov8-seg: Video Real-time Instance Segmentation for Urban Street Scenes Based on Focused Linear Attention Mechanism.
Proceedings of the International Conference on Computer Vision and Deep Learning, 2024

Sewer-MoE: A tuned Mixture of Experts Model for Sewer Defect Classification.
Proceedings of the International Conference on Computer Vision and Deep Learning, 2024

DroneGPT: Zero-shot Video Question Answering For Drones.
Proceedings of the International Conference on Computer Vision and Deep Learning, 2024

2023
Solve the Puzzle of Instance Segmentation in Videos: A Weakly Supervised Framework With Spatio-Temporal Collaboration.
IEEE Trans. Circuits Syst. Video Technol., 2023

Prompt Learns Prompt: Exploring Knowledge-Aware Generative Prompt Collaboration For Video Captioning.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

2022
Video Captioning Using Global-Local Representation.
IEEE Trans. Circuits Syst. Video Technol., 2022

GL-RG: Global-Local Representation Granularity for Video Captioning.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

2021
TF-Blender: Temporal Feature Blender for Video Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Hierarchical Attention Fusion for Geo-Localization.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

DenserNet: Weakly Supervised Visual Localization Using Multi-Scale Feature Aggregation.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Multimodal Aggregation Approach for Memory Vision-Voice Indoor Navigation with Meta-Learning.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

2019
Crowd Video Captioning.
CoRR, 2019