2025
Attention-disentangled Uniform Orthogonal Feature Space Optimization for Few-shot Object Detection.
CoRR, June, 2025
Cognition Transferring and Decoupling for Text-Supervised Egocentric Semantic Segmentation.
IEEE Trans. Circuits Syst. Video Technol., May, 2025
Unsupervised Ego- and Exo-centric Dense Procedural Activity Captioning via Gaze Consensus Adaptation.
CoRR, April, 2025
Challenges and Trends in Egocentric Vision: A Survey.
CoRR, March, 2025
Class Incremental Learning With Less Forgetting Direction and Equilibrium Point.
IEEE Trans. Circuits Syst. Video Technol., February, 2025
MCCE-REC: MLLM-Driven Cross-Modal Contrastive Entropy Model for Zero-Shot Referring Expression Comprehension.
IEEE Trans. Circuits Syst. Video Technol., January, 2025
EgoMe: Follow Me via Egocentric View in Real World.
CoRR, January, 2025
Distribution-Level Memory Recall for Continual Learning: Preserving Knowledge and Avoiding Confusion.
IEEE Trans. Multim., 2025
Geodesic-Aligned Gradient Projection for Continual Task Learning.
IEEE Trans. Image Process., 2025
Adaptively forget with crossmodal and textual distillation for class-incremental video captioning.
Neurocomputing, 2025
2024
Continual Cross-Domain Image Compression via Entropy Prior Guided Knowledge Distillation and Scalable Decoding.
IEEE Trans. Circuits Syst. Video Technol., September, 2024
Robust Unpaired Image Dehazing via Adversarial Deformation Constraint.
IEEE Trans. Circuits Syst. Video Technol., September, 2024
Learning Offset Probability Distribution for Accurate Object Detection.
ACM Trans. Multim. Comput. Commun. Appl., May, 2024
TridentCap: Image-Fact-Style Trident Semantic Framework for Stylized Image Captioning.
IEEE Trans. Circuits Syst. Video Technol., May, 2024
CrowdCaption++: Collective-Guided Crowd Scenes Captioning.
IEEE Trans. Multim., 2024
Visual and Textual Prior Guided Mask Assemble for Few-Shot Segmentation and Beyond.
IEEE Trans. Multim., 2024
Oriented-DINO: Angle Decoupling Prediction and Consistency Optimizing for Oriented Detection Transformer.
IEEE Trans. Geosci. Remote. Sens., 2024
VLM-guided Explicit-Implicit Complementary novel class semantic learning for few-shot object detection.
Expert Syst. Appl., 2024
ARIC: An Activity Recognition Dataset in Classroom Surveillance Images.
CoRR, 2024
Region Prompt Tuning: Fine-grained Scene Text Detection Utilizing Region Text Prompt.
CoRR, 2024
Slightly Shift New Classes to Remember Old Classes for Video Class-Incremental Learning.
CoRR, 2024
MCF-VC: Mitigate Catastrophic Forgetting in Class-Incremental Learning for Multimodal Video Captioning.
CoRR, 2024
Proposal-level Correction Guided by CLIP for Few-shot Object Detection.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2024
IoU-CLIP: IoU-Aware Language-Image Model Tuning for Open Vocabulary Object Detection.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2024
DP-RSCAP: Dual Prompt-Based Scene and Entity Network for Remote Sensing Image Captioning.
Proceedings of the IGARSS 2024, 2024
Attribute-Prompting Multi-Modal Object Reasoning Transformer for Remote Sensing Visual Grounding.
Proceedings of the IGARSS 2024, 2024
Video Class-Incremental Learning With Clip Based Transformer.
Proceedings of the IEEE International Conference on Image Processing, 2024
A Text Detector Based on the Specific Text Prompt.
Proceedings of the IEEE International Conference on Image Processing, 2024
Class Incremental Learning with Multi-Teacher Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Prompt-Driven Referring Image Segmentation with Instance Contrasting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
HumanFormer: Human-centric Prompting Multi-modal Perception Transformer for Referring Crowd Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Must Unsupervised Continual Learning Relies on Previous Information?
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
Disturbed Augmentation Invariance for Unsupervised Visual Representation Learning.
IEEE Trans. Circuits Syst. Video Technol., November, 2023
Cross-Modal Recurrent Semantic Comprehension for Referring Image Segmentation.
IEEE Trans. Circuits Syst. Video Technol., July, 2023
CrossDet++: Growing Crossline Representation for Object Detection.
IEEE Trans. Circuits Syst. Video Technol., March, 2023
Bias-Correction Feature Learner for Semi-Supervised Instance Segmentation.
IEEE Trans. Multim., 2023
What Happens in Crowd Scenes: A New Dataset About Crowd Scenes for Image Captioning.
IEEE Trans. Multim., 2023
Unsupervised Visual Representation Learning via Multi-Dimensional Relationship Alignment.
IEEE Trans. Image Process., 2023
DRDet: Dual-Angle Rotated Line Representation for Oriented Object Detection.
IEEE Trans. Geosci. Remote. Sens., 2023
GRSDet: Learning to Generate Local Reverse Samples for Few-shot Object Detection.
CoRR, 2023
CFS: Character Feature Summarization Model for Real-time End-to-end Text Spotting.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2023
Novel-Registrable Weights and Region-Level Contrastive Learning for Incremental Few-shot Object Detection.
Proceedings of the Neural Information Processing - 30th International Conference, 2023
PTCP: Alleviate Layer Collapse in Pruning at Initialization via Parameter Threshold Compensation and Preservation.
Proceedings of the Neural Information Processing - 30th International Conference, 2023
Optimizing Mode Connectivity for Class Incremental Learning.
Proceedings of the International Conference on Machine Learning, 2023
Confusion Mixup Regularized Multimodal Fusion Network for Continual Egocentric Activity Recognition.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Contrastive Continuity on Augmentation Stability Rehearsal for Continual Self-Supervised Learning.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Incrementer: Transformer for Class-Incremental Semantic Segmentation with Knowledge Distillation Focusing on Old Class.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
CafeBoost: Causal Feature Boost to Eliminate Task-Induced Bias for Class Incremental Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
2022
Bal-R$^2$CNN: High Quality Recurrent Object Detection With Balance Optimization.
IEEE Trans. Multim., 2022
POS-Trends Dynamic-Aware Model for Video Caption.
IEEE Trans. Circuits Syst. Video Technol., 2022
Real-time panoptic segmentation with relationship between adjacent pixels and boundary prediction.
Neurocomputing, 2022
Instance-level Context Attention Network for instance segmentation.
Neurocomputing, 2022
Mining Regional Relation from Pixel-wise Annotation for Scene Parsing.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2022
DE-CrossDet: Divisible and Extensible Crossline Representation for Object Detection.
Proceedings of the IEEE International Conference on Visual Communications and Image Processing, 2022
Cross-Domain Object Detection with Missing Classes in Target Domain.
Proceedings of the 24th IEEE International Workshop on Multimedia Signal Processing, 2022
RefCrowd: Grounding the Target in Crowd with Referring Expressions.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Pedestrian Attribute Recognition Based on Association Rules.
Proceedings of the 8th IEEE International Conference on Cloud Computing and Intelligent Systems, 2022
2021
CrossDet: Crossline Representation for Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
2020
Hierarchical Context Features Embedding for Object Detection.
IEEE Trans. Multim., 2020
A multi-scale language embedding network for proposal-free referring expression comprehension.
Proceedings of the MMAsia 2020: ACM Multimedia Asia, 2020
Multi-stage Tag Guidance Network in Video Caption.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Language-Aware Fine-Grained Object Representation for Referring Expression Comprehension.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
VisDrone-DET2020: The Vision Meets Drone Object Detection in Image Challenge Results.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020
Offset Bin Classification Network for Accurate Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
A$^2$RMNet: Adaptively Aspect Ratio Multi-Scale Network for Object Detection in Remote Sensing Images.
Remote. Sens., 2019
VisDrone-DET2019: The Vision Meets Drone Object Detection in Image Challenge Results.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019
2018
VisDrone-DET2018: The Vision Meets Drone Object Detection in Image Challenge Results.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018