DAVE: Diverse Atomic Visual Elements Dataset with High Representation of Vulnerable Road Users in Complex and Unpredictable Environments.
CoRR, 2024
SOAR: Self-supervision Optimized UAV Action Recognition with Efficient Object-Aware Pretraining.
CoRR, 2024
Deep Stochastic Kinematic Models for Probabilistic Motion Forecasting in Traffic.
CoRR, 2024
MITFAS: Mutual Information based Temporal Feature Alignment and Sampling for Aerial Video Action Recognition.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024
PMI Sampler: Patch Similarity Guided Frame Selection For Aerial Action Recognition.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024
Deep Stochastic Kinematic Models for Probabilistic Motion Forecasting in Traffic.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024
AGL-Net: Aerial-Ground Cross-Modal Global Localization with Varying Scales.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024
SCP: Soft Conditional Prompt Learning for Aerial Video Action Recognition.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2024
AutoHallusion: Automatic Generation of Hallucination Benchmarks for Vision-Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
ViLA: Efficient Video-Language Alignment for Video Question Answering.
Proceedings of the Computer Vision - ECCV 2024, 2024
HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
ICAR: Image-Based Complementary Auto Reasoning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
VLAP: Efficient Video-Language Alignment via Frame Prompting and Distilling for Video Question Answering.
CoRR, 2023
Triplet Knowledge Distillation.
CoRR, 2023
Prompt Learning for Action Recognition.
CoRR, 2023
AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023
Small-shot Multi-modal Distillation for Vision-based Autonomous Steering.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023
METEOR: A Dense, Heterogeneous, and Unstructured Traffic Dataset with Rare Behaviors.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023
Auxiliary Modality Learning with Generalized Curriculum Distillation.
Proceedings of the International Conference on Machine Learning, 2023
SCSC: Spatial Cross-scale Convolution Module to Strengthen both CNNs and Transformers.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
CrossLoc3D: Aerial-Ground Cross-Source 3D Place Recognition.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Fourier Disentangled Space-Time Attention for Aerial Video Recognition.
CoRR, 2022
FAR: Fourier Aerial Video Recognition.
Proceedings of the Computer Vision - ECCV 2022, 2022
Dynamic Region-Aware Convolution.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Fully Learnable Group Convolution for Acceleration of Deep Neural Networks.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019