2025
Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models.
CoRR, January, 2025
2024
Multi-Modal 3D Object Detection by Box Matching.
IEEE Trans. Intell. Transp. Syst., December, 2024
CSDG-FAS: Closed-Space Domain Generalization for Face Anti-spoofing.
Int. J. Comput. Vis., November, 2024
MaskOCR: Scene Text Recognition with Masked Vision-Language Pre-training.
Trans. Mach. Learn. Res., 2024
MAFormer: A transformer network with multi-scale attention fusion for visual recognition.
Neurocomputing, 2024
Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization.
CoRR, 2024
ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts.
CoRR, 2024
Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images.
CoRR, 2024
TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting.
CoRR, 2024
TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior.
CoRR, 2024
R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models.
CoRR, 2024
MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction.
CoRR, 2024
Uni²Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection.
CoRR, 2024
MonoFormer: One Transformer for Both Diffusion and Autoregression.
CoRR, 2024
FullAnno: A Data Engine for Enhancing Image Comprehension of MLLMs.
CoRR, 2024
Add-SD: Rational Generation without Manual Reference.
CoRR, 2024
OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer.
CoRR, 2024
XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis.
CoRR, 2024
VDG: Vision-Only Dynamic Gaussian for Driving Simulation.
CoRR, 2024
Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting.
CoRR, 2024
LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection.
CoRR, 2024
StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond.
CoRR, 2024
TexRO: Generating Delicate Textures of 3D Models by Recursive Optimization.
CoRR, 2024
GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos.
CoRR, 2024
HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024
TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model.
Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024
Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Uni4DAL: A Unified Baseline for Multi-dataset 4D Auto-Labeling.
Proceedings of the Pattern Recognition - 27th International Conference, 2024
Towards Unified Multi-granularity Text Detection with Interactive Attention.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Interactive 3D Object Detection with Prompts.
Proceedings of the Computer Vision - ECCV 2024, 2024
GGRt: Towards Pose-Free Generalizable 3D Gaussian Splatting in Real-Time.
Proceedings of the Computer Vision - ECCV 2024, 2024
OPEN: Object-Wise Position Embedding for Multi-view 3D Object Detection.
Proceedings of the Computer Vision - ECCV 2024, 2024
ReSyncer: Rewiring Style-Based Generator for Unified Audio-Visually Synced Facial Performer.
Proceedings of the Computer Vision - ECCV 2024, 2024
LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction.
Proceedings of the Computer Vision - ECCV 2024, 2024
MS-DETR: Efficient DETR Training with Mixed Supervision.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Decoupled Pseudo-Labeling for Semi-Supervised Monocular 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
VRP-SAM: SAM with Visual Reference Prompt.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
TexOct: Generating Textures of 3D Models with Octree-based Diffusion.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Multi-Domain Incremental Learning for Face Presentation Attack Detection.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
Dual-Affinity Style Embedding Network for Semantic-Aligned Image Style Transfer.
IEEE Trans. Neural Networks Learn. Syst., October, 2023
Adversarial Dual-Student With Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation.
IEEE Trans. Circuits Syst. Video Technol., February, 2023
CAE v2: Context Autoencoder with CLIP Latent Alignment.
Trans. Mach. Learn. Res., 2023
GIR: 3D Gaussian Inverse Rendering for Relightable Scene Factorization.
CoRR, 2023
Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis.
CoRR, 2023
Accelerating Vision Transformers Based on Heterogeneous Attention Patterns.
CoRR, 2023
VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation.
CoRR, 2023
Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation.
CoRR, 2023
Building an Invisible Shield for Your Portrait against Deepfakes.
CoRR, 2023
ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box.
CoRR, 2023
LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution.
CoRR, 2023
Temporal Segment Transformer for Action Segmentation.
CoRR, 2023
Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation.
CoRR, 2023
Effective Invertible Arbitrary Image Rescaling.
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023
Efficient Video Portrait Reenactment via Grid-based Codebook.
Proceedings of the ACM SIGGRAPH 2023 Conference Proceedings, 2023
HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
MSAbox: A spatially stable face detector.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
Graph Contrastive Learning for Skeleton-based Action Recognition.
Proceedings of the Eleventh International Conference on Learning Representations, 2023
ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023
LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Forward Flow for Novel View Synthesis of Dynamic Scenes.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition.
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023
Semi-DETR: Semi-Supervised Object Detection with Detection Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
CAPE: Camera View Position Embedding for Multi-View 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
PSVT: End-to-End Multi-Person 3D Pose and Shape Estimation with Progressive Video Transformers.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Cyclically Disentangled Feature Translation for Face Anti-spoofing.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
Robust Video Portrait Reenactment via Personalized Representation Quantization.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
StereoDistill: Pick the Cream from LiDAR for Distilling Stereo-Based 3D Object Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
AGO-Net: Association-Guided 3D Point Cloud Object Detection Network.
IEEE Trans. Pattern Anal. Mach. Intell., 2022
CAE v2: Context Autoencoder with CLIP Target.
CoRR, 2022
Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling.
CoRR, 2022
Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining.
CoRR, 2022
U-HRNet: Delving into Improving Semantic Representation of High Resolution Network for Dense Prediction.
CoRR, 2022
MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition.
CoRR, 2022
Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption.
CoRR, 2022
MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining.
CoRR, 2022
Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task.
CoRR, 2022
Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers.
Proceedings of the SIGGRAPH Asia 2022 Conference Papers, 2022
Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Delving into Sequential Patches for Deepfake Detection.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Boosting Video-Text Retrieval with Explicit High-Level Semantics.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Repainting and Imitating Learning for Lane Detection.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022
Self-Guided Hard Negative Generation for Unsupervised Person Re-Identification.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022
StyleSwap: Style-Based Generator Empowers Robust Face Swapping.
Proceedings of the Computer Vision - ECCV 2022, 2022
UFO: Unified Feature Optimization.
Proceedings of the Computer Vision - ECCV 2022, 2022
Neural Color Operators for Sequential Image Retouching.
Proceedings of the Computer Vision - ECCV 2022, 2022
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval.
Proceedings of the Computer Vision - ECCV 2022, 2022
Diverse Learner: Exploring Diverse Supervision for Semi-supervised Object Detection.
Proceedings of the Computer Vision - ECCV 2022, 2022
GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022
Action Quality Assessment with Temporal Parsing Transformer.
Proceedings of the Computer Vision - ECCV 2022, 2022
Human-Object Interaction Detection via Disentangled Transformer.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Implicit Sample Extension for Unsupervised Person Re-Identification.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
A Multi-granularity Retrieval System for Natural Language-based Vehicle Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022
Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Box-Grained Reranking Matching for Multi-Camera Multi-Target Tracking.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022
Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Few-Shot Font Generation by Learning Fine-Grained Local Styles.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Few-Shot Head Swapping in the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Expressive Talking Head Generation with Granular Audio-Visual Control.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
MixFormer: Mixing Features across Windows and Dimensions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
Neural Deformable Voxel Grid for Fast Optimization of Dynamic View Synthesis.
Proceedings of the Computer Vision - ACCV 2022, 2022
MobileFaceSwap: A Lightweight Framework for Video Face Swapping.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
Image Inpainting by End-to-End Cascaded Refinement With Mask Awareness.
IEEE Trans. Image Process., 2021
SGM3D: Stereo Guided Monocular 3D Object Detection.
CoRR, 2021
An Information Theory-inspired Strategy for Automatic Network Pruning.
CoRR, 2021
StrucTexT: Structured Text Understanding with Multi-Modal Transformers.
CoRR, 2021
Oriented Object Detection with Transformer.
CoRR, 2021
Dual-stream Network for Visual Recognition.
CoRR, 2021
Probabilistic Ranking-Aware Ensembles for Enhanced Object Detections.
CoRR, 2021
PAFNet: An Efficient Anchor-Free Object Detector Guidance.
CoRR, 2021
Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones.
CoRR, 2021
Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection.
CoRR, 2021
Dual-stream Network for Visual Recognition.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021
CDP: Towards Optimal Filter Pruning via Class-wise Discriminative Power.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
StrucTexT: Structured Text Understanding with Multi-Modal Transformers.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Lifting the Veil of Frequency in Joint Segmentation and Depth Estimation.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
AggNet for Self-supervised Monocular Depth Estimation: Go An Aggressive Step Further.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021
DANet: Dimension Apart Network for Radar Object Detection.
Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021
Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021
The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
EC-DARTS: Inducing Equalized and Consistent Optimization into DARTS.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Paint Transformer: Feed Forward Neural Painting with Stroke Prediction.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Revealing the Reciprocal Relations between Self-Supervised Stereo and Monocular Depth Estimation.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Robust and Online Vehicle Counting at Crowded Intersections.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021
Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Robust Vehicle Re-Identification via Rigid Structure Prior.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021
Unsupervised Multi-Source Domain Adaptation for Person Re-Identification.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Dynamic Class Queue for Large Scale Face Recognition in the Wild.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Student-Teacher Feature Pyramid Matching for Anomaly Detection.
Proceedings of the 32nd British Machine Vision Conference 2021, 2021
FaceController: Controllable Attribute Editing for Face in the Wild.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
MVFNet: Multi-View Fusion Network for Efficient Video Recognition.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
TPM: Multiple object tracking with tracklet-plane matching.
Pattern Recognit., 2020
Understanding Image Retrieval Re-Ranking: A Graph Neural Network Perspective.
CoRR, 2020
Coherent Loss: A Generic Framework for Stable Video Segmentation.
CoRR, 2020
LID 2020: The Learning from Imperfect Data Challenge Results.
CoRR, 2020
HS-ResNet: Hierarchical-Split Block on Convolutional Neural Network.
CoRR, 2020
Real Image Super Resolution Via Heterogeneous Model using GP-NAS.
CoRR, 2020
PP-YOLO: An Effective and Efficient Implementation of Object Detector.
CoRR, 2020
PointTrack++ for Effective Online Multi-Object Tracking and Segmentation.
CoRR, 2020
NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results.
CoRR, 2020
Learning Generalized Spoof Cues for Face Anti-spoofing.
CoRR, 2020
Towards Accurate Scene Text Recognition with Semantic Reasoning Networks.
CoRR, 2020
Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020
Modularized Framework with Category-Sensitive Abnormal Filter for City Anomaly Detection.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Learning Global Structure Consistency for Robust Object Tracking.
Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020
Monocular 3D Object Detection via Feature Domain Adaptation.
Proceedings of the Computer Vision - ECCV 2020, 2020
Segment as Points for Efficient Online Multi-Object Tracking and Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020
AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020
Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement.
Proceedings of the Computer Vision - ECCV 2020, 2020
Real Image Super Resolution via Heterogeneous Model Ensemble Using GP-NAS.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020
AIM 2020 Challenge on Image Extreme Inpainting.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020
Leaping from 2D Detection to Efficient 6DoF Object Pose Estimation.
Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020
Going Beyond Real Data: A Robust Visual Representation for Vehicle Re-identification.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Towards Accurate Scene Text Recognition With Semantic Reasoning Networks.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Robust Movement-Specific Vehicle Counting at Crowded Intersections.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Multi-Granularity Tracking with Modularlized Components for Unsupervised Vehicles Anomaly Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
Dynamic Instance Normalization for Arbitrary Style Transfer.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
2019
HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces.
CoRR, 2019
Detecting Text in the Wild with Deep Character Embedding Network.
CoRR, 2019
Editing Text in the Wild.
Proceedings of the 27th ACM International Conference on Multimedia, 2019
A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning.
Proceedings of the 27th ACM International Conference on Multimedia, 2019
An End-to-End Video Text Detector with Online Tracking.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019
ICDAR 2019 Competition on Large-Scale Street View Text with Partial Labeling - RRC-LSVT.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019
EATEN: Entity-Aware Attention for Single Shot Visual Text Extraction.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019
ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text - RRC-ArT.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019
ACFNet: Attentional Class Feature Network for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Perspective-Guided Convolution Networks for Crowd Counting.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Image Inpainting With Learnable Bidirectional Attention Maps.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
BMN: Boundary-Matching Network for Temporal Action Proposal Generation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019
Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
Multi-camera vehicle tracking and re-identification based on visual and spatial-temporal features.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019
Attentive Feedback Network for Boundary-Aware Salient Object Detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019
2018
Compact Generalized Non-local Network.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Group Re-Identification: Leveraging and Integrating Multi-Grain Information.
Proceedings of the 2018 ACM Multimedia Conference, 2018
Fine-Grained Video Categorization with Redundancy Reduction Attention.
Proceedings of the Computer Vision - ECCV 2018, 2018
3D Pose Estimation for Fine-Grained Object Categories.
Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018
Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition.
Proceedings of the Computer Vision - ECCV 2018, 2018
TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network.
Proceedings of the Computer Vision - ACCV 2018, 2018
Detecting Text in the Wild with Deep Character Embedding Network.
Proceedings of the Computer Vision - ACCV 2018, 2018
2017
WordSup: Exploiting Word Annotations for Character Based Text Detection.
Proceedings of the IEEE International Conference on Computer Vision, 2017
Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017
2016
Context-aware mathematical expression recognition: An end-to-end framework and a benchmark.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016