2025
LiDAR-guided Geometric Pretraining for Vision-Centric 3D Object Detection.
Int. J. Comput. Vis., July, 2025
CLIP-Driven Transformer for Weakly Supervised Object Localization.
IEEE Trans. Pattern Anal. Mach. Intell., June, 2025
Pseudo-Label Quality Decoupling and Correction for Semi-Supervised Instance Segmentation.
CoRR, May, 2025
Adaptive Zone Learning for Weakly Supervised Object Localization.
IEEE Trans. Neural Networks Learn. Syst., April, 2025
Purifying, Labeling, and Utilizing: A High-Quality Pipeline for Small Object Detection.
CoRR, April, 2025
More Clear, More Flexible, More Precise: A Comprehensive Oriented Object Detection benchmark for UAV.
CoRR, April, 2025
SynergyAmodal: Deocclude Anything with Text Control.
CoRR, April, 2025
Training-Free Hierarchical Scene Understanding for Gaussian Splatting with Superpoint Graphs.
CoRR, April, 2025
Zooming In on Fakes: A Novel Dataset for Localized AI-Generated Image Detection with Forgery Amplification Approach.
CoRR, April, 2025
S<sup>2</sup>Teacher: Step-by-step Teacher for Sparsely Annotated Oriented Object Detection.
CoRR, April, 2025
An Efficient and Mixed Heterogeneous Model for Image Restoration.
CoRR, April, 2025
WildSeg3D: Segment Any 3D Objects in the Wild from 2D Images.
CoRR, March, 2025
AnomalyPainter: Vision-Language-Diffusion Synergy for Zero-Shot Realistic and Diverse Industrial Anomaly Synthesis.
CoRR, March, 2025
LightMotion: A Light and Tuning-free Method for Simulating Camera Motion in Video Generation.
CoRR, March, 2025
Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization.
CoRR, March, 2025
Unsupervised Domain Adaptation on Person Reidentification Via Dual-Level Asymmetric Mutual Learning.
IEEE Trans. Neural Networks Learn. Syst., January, 2025
Drag Your Gaussian: Effective Drag-Based Editing with Score Distillation for 3D Gaussian Splatting.
CoRR, January, 2025
TextRefiner: Internal Visual Feature as Efficient Refiner for Vision-Language Models Prompt Tuning.
Proceedings of the AAAI-25, Sponsored by the Association for the Advancement of Artificial Intelligence, February 25, 2025
2024
Bilateral Knowledge Interaction Network for Referring Image Segmentation.
IEEE Trans. Multim., 2024
ISTR: Mask-Embedding-Based Instance Segmentation Transformer.
IEEE Trans. Image Process., 2024
Knowing Where to Focus: Attention-Guided Alignment for Text-based Person Search.
CoRR, 2024
Breaking the Bias: Recalibrating the Attention of Industrial Anomaly Detection.
CoRR, 2024
UniVST: A Unified Framework for Training-free Localized Video Style Transfer.
CoRR, 2024
Boosting CLIP Adaptation for Image Quality Assessment via Meta-Prompt Learning and Gradient Regularization.
CoRR, 2024
PartFormer: Awakening Latent Diverse Representation from Vision Transformer for Object Re-Identification.
CoRR, 2024
HRSAM: Efficiently Segment Anything in High-Resolution Images.
CoRR, 2024
HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection.
CoRR, 2024
Local Manifold Learning for No-Reference Image Quality Assessment.
CoRR, 2024
Depth-Guided Semi-Supervised Instance Segmentation.
CoRR, 2024
Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion.
CoRR, 2024
CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method.
CoRR, 2024
Multi-Modal Prompt Learning on Blind Image Quality Assessment.
CoRR, 2024
NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation.
CoRR, 2024
DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery Analysis.
CoRR, 2024
DMAD: Dual Memory Bank for Real-World Anomaly Detection.
CoRR, 2024
Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Adaptive Selection based Referring Image Segmentation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Advancing Multimodal Large Language Models with Quantization-Aware Scale Learning for Efficient Adaptation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
GOI: Find 3D Gaussians of Interest with an Optimizable Open-vocabulary Semantic-space Hyperplane.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Prompting to Adapt Foundational Segmentation Models.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Cantor: Inspiring Multimodal Chain-of-Thought of MLLM.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
CamoTeacher: Dual-Rotation Consistency Learning for Semi-supervised Camouflaged Object Detection.
Proceedings of the Computer Vision - ECCV 2024, 2024
Enhancing Tampered Text Detection Through Frequency Feature Fusion and Decomposition.
Proceedings of the Computer Vision - ECCV 2024, 2024
UniPTS: A Unified Framework for Proficient Post-Training Sparsity.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
FocSAM: Delving Deeply into Focused Objects in Segmenting Anything.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
RepAn: Enhanced Annealing through Re-parameterization.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Attention Disturbance and Dual-Path Constraint Network for Occluded Person Re-identification.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Occluded Person Re-identification via Saliency-Guided Patch Transfer.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Weakly Supervised Open-Vocabulary Object Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
CAM R-CNN: End-to-End Object Detection with Class Activation Maps.
Neural Process. Lett., December, 2023
Super Vision Transformer.
Int. J. Comput. Vis., December, 2023
Pruning Networks With Cross-Layer Ranking & k-Reciprocal Nearest Filters.
IEEE Trans. Neural Networks Learn. Syst., November, 2023
Prioritized Subnet Sampling for Resource-Adaptive Supernet Training.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2023
Multi-Branch Distance-Sensitive Self-Attention Network for Image Captioning.
IEEE Trans. Multim., 2023
A Unified Framework for 3D Point Cloud Visual Grounding.
CoRR, 2023
Geometric-aware Pretraining for Vision-centric 3D Object Detection.
CoRR, 2023
Unsupervised Domain Adaptation on Person Re-Identification via Dual-level Asymmetric Mutual Learning.
CoRR, 2023
Few-Shot Object Detection via Classify-Free RPN.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023
Hierarchical Focused Feature Pyramid Network for Small Object Detection.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023
Global Selection and Local Attention Network for Referring Image Segmentation.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023
Beyond the Label Distribution Prior for Long-Tailed Recognition.
Proceedings of the Advanced Intelligent Computing Technology and Applications, 2023
InterFormer: Real-time Interactive Image Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Pseudo-label Alignment for Semi-supervised Instance Segmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Category-aware Allocation Transformer for Weakly Supervised Object Localization.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
CANDY: Category-Kernelized Dynamic Convolution for Instance Segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Self-Paced Partial Domain-Aware Learning for Face Anti-Spoofing.
Proceedings of the IEEE International Conference on Acoustics, 2023
DistilPose: Tokenized Pose Regression with Heatmap Distillation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
You Only Segment Once: Towards Real-Time Panoptic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Attack Can Benefit: An Adversarial Approach to Recognizing Facial Expressions under Noisy Annotations.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
Practical Cross-System Shilling Attacks with Limited Access to Data.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
Filter Sketch for Network Pruning.
IEEE Trans. Neural Networks Learn. Syst., 2022
Knowledge-Driven Generative Adversarial Network for Text-to-Image Synthesis.
IEEE Trans. Multim., 2022
Towards Lightweight Transformer Via Group-Wise Transformation for Vision-and-Language Tasks.
IEEE Trans. Image Process., 2022
Towards Robust Adversarial Training via Dual-label Supervised and Geometry Constraint.
Int. J. Softw. Informatics, 2022
Deepwalk-aware graph convolutional networks.
Sci. China Inf. Sci., 2022
Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
SeqTR: A Simple Yet Universal Network for Visual Grounding.
Proceedings of the Computer Vision - ECCV 2022, 2022
Knowledge Condensation Distillation.
Proceedings of the Computer Vision - ECCV 2022, 2022
Privacy-Preserving Face Recognition with Learnable Privacy Budgets in Frequency Domain.
Proceedings of the Computer Vision - ECCV 2022, 2022
ARM: Any-Time Super-Resolution Method.
Proceedings of the Computer Vision - ECCV 2022, 2022
Active Teacher for Semi-Supervised Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
GuidedMix-Net: Semi-supervised Semantic Segmentation by Using Labeled Images as Reference.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
LCTR: On Awakening the Local Continuity of Transformer for Weakly Supervised Object Localization.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
Prioritized Subnet Sampling for Resource-Adaptive Supernet Training.
CoRR, 2021
ISTR: End-to-End Instance Segmentation with Transformers.
CoRR, 2021
DeeperForensics Challenge 2020 on Real-World Face Forgery Detection: Methods and Results.
CoRR, 2021
E2Net: Excitative-Expansile Learning for Weakly Supervised Object Localization.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
EC-DARTS: Inducing Equalized and Consistent Optimization into DARTS.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Parallel Detection-and-Segmentation Learning for Weakly Supervised Instance Segmentation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Architecture Disentanglement for Deep Neural Networks.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Toward Joint Thing-and-Stuff Mining for Weakly Supervised Panoptic Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
SDD-FIQA: Unsupervised Face Image Quality Assessment With Similarity Distribution Distance.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Image-to-Image Translation via Hierarchical Style Disentanglement.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Dual-level Collaborative Transformer for Image Captioning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Link-aware semi-supervised hypergraph.
Inf. Sci., 2020
Exploring Language Prior for Mode-Sensitive Visual Attention Modeling.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
API-Net: Robust Generative Classifier via a Single Discriminator.
Proceedings of the Computer Vision - ECCV 2020, 2020
Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
Cross-Modality Microblog Sentiment Prediction via Bi-Layer Multimodal Hypergraph Learning.
IEEE Trans. Multim., 2019
Hadamard Codebook Based Deep Hashing.
CoRR, 2019
Semantic-aware Image Deblurring.
CoRR, 2019
Supervised Online Hashing via Similarity Distribution Learning.
CoRR, 2019
Many-to-One Gesture-to-Command Flexible Mapping Approach for Smart Teaching Interface Interaction.
IEEE Access, 2019
Hypergraph Induced Convolutional Manifold Networks.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Generalized Zero-Shot Vehicle Detection in Remote Sensing Imagery via Coarse-to-Fine Framework.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Towards Cross-modality Topic Modelling via Deep Topical Correlation Analysis.
Proceedings of the IEEE International Conference on Acoustics, 2019
Learning Similarity-specific Dictionary for Zero-shot Fine-grained Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019
Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Towards Optimal Structured CNN Pruning via Generative Adversarial Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
Weakly Supervised Vehicle Detection in Satellite Images via Multiple Instance Ranking.
Proceedings of the 24th International Conference on Pattern Recognition, 2018
Joint Denoising and Super-Resolution via Generative Adversarial Training.
Proceedings of the 24th International Conference on Pattern Recognition, 2018
2017
Toward Optimal Manifold Hashing via Discrete Locally Linear Embedding.
IEEE Trans. Image Process., 2017
Weakly supervised vehicle detection in satellite images via multi-instance discriminative learning.
Pattern Recognit., 2017
Hypergraph regularized sparse feature learning.
Neurocomputing, 2017
Multimodal media data understanding and analysis.
Neurocomputing, 2017
2016
Vehicle Detection in High-Resolution Aerial Images Based on Fast Sparse Representation Classification and Multiorder Feature.
IEEE Trans. Intell. Transp. Syst., 2016
Road Network Extraction via Aperiodic Directional Structure Measurement.
IEEE Trans. Geosci. Remote. Sens., 2016
Vehicle Detection in High-Resolution Aerial Images via Sparse Representation and Superpixels.
IEEE Trans. Geosci. Remote. Sens., 2016
Human behavior recognition based on 3D features and hidden markov models.
Signal Image Video Process., 2016
Joint Depth and Semantic Inference from a Single Image via Elastic Conditional Random Field.
Pattern Recognit., 2016
Question microblog identification and answer recommendation.
Multim. Syst., 2016
Vehicle detection from highway satellite images via transfer learning.
Inf. Sci., 2016
A novel features ranking metric with application to scalable visual and bioinformatics data classification.
Neurocomputing, 2016
Person re-identification based on multi-instance multi-label learning.
Neurocomputing, 2016
Multimodal learning for view-based 3D object classification.
Neurocomputing, 2016
Robust vehicle detection by combining deep features with exemplar classification.
Neurocomputing, 2016
Superpixel-based coastline extraction in SAR images with speckle noise removal.
Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium, 2016
Towards Domain Adaptive Vehicle Detection in Satellite Image by Supervised Super-Resolution Transfer.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016
3D Object Retrieval with Multimodal Views.
Proceedings of the 9th Eurographics Workshop on 3D Object Retrieval, 2016
2015
High-capacity reversible watermarking scheme of 2D-vector data.
Signal Image Video Process., 2015
Robust depth-based object tracking from a moving binocular camera.
Signal Process., 2015
Localizing web videos using social images.
Inf. Sci., 2015
Estimation of human body shape and cloth field in front of a kinect.
Neurocomputing, 2015
Shape completion for depth image via repeated objects registration.
Neurocomputing, 2015
Robust latent semantic exploration for image retrieval in social media.
Neurocomputing, 2015
Interactive on-device Mobile Landmark Recognition with compact binary codes.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
2014
Single/cross-camera multiple-person tracking by graph matching.
Neurocomputing, 2014
News videos anchor person detection by shot clustering.
Neurocomputing, 2014
Oil spill detection based on a superpixel segmentation method for SAR image.
Proceedings of the 2014 IEEE Geoscience and Remote Sensing Symposium, 2014
Vehicle Detection from Remote Sensing Image Based on Superpixel Segmentation and Image Enhancement.
Proceedings of the International Conference on Internet Multimedia Computing and Service, 2014
2013
Nonlinear scrambling-based reversible watermarking for 2D-vector maps.
Vis. Comput., 2013
Weakly supervised codebook learning by iterative label propagation with graph quantization.
Signal Process., 2013
Mining spatiotemporal video patterns towards robust action retrieval.
Neurocomputing, 2013
A recursive embedding algorithm towards lossless 2D vector map watermarking.
Digit. Signal Process., 2013
Quality Assessment on User Generated Image for Mobile Search Application.
Proceedings of the Advances in Multimedia Modeling, 19th International Conference, 2013
2012
Visual Vocabulary Learning and Its Application to 3D and Mobile Visual Search.
CoRR, 2012
Weakly supervised topic grouping of YouTube search results.
Proceedings of the 19th IEEE International Conference on Image Processing, 2012
Weakly supervised sparse coding with geometric consistency pooling.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012
2010
Perception-based reversible watermarking for 2D vector maps.
Proceedings of the Visual Communications and Image Processing 2010, 2010
Perception-driven watermarking with evolutionary block mapping.
Proceedings of the Visual Communications and Image Processing 2010, 2010
Iterative embedding-based reversible watermarking for 2D-vector maps.
Proceedings of the International Conference on Image Processing, 2010