2025

Interpretable Face Anti-Spoofing: Enhancing Generalization with Multimodal Large Language Models.

[DOI]

Guosheng Zhang

Keyao Wang

CoRR, January, 2025

2024

Multi-Modal 3D Object Detection by Box Matching.

[DOI]

IEEE Trans. Intell. Transp. Syst., December, 2024

CSDG-FAS: Closed-Space Domain Generalization for Face Anti-spoofing.

[DOI]

Int. J. Comput. Vis., November, 2024

MaskOCR: Scene Text Recognition with Masked Vision-Language Pre-training.

[DOI]

Trans. Mach. Learn. Res., 2024

MAFormer: A transformer network with multi-scale attention fusion for visual recognition.

[DOI]

Neurocomputing, 2024

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization.

[DOI]

CoRR, 2024

ALoRE: Efficient Visual Adaptation via Aggregating Low Rank Experts.

[DOI]

CoRR, 2024

Splatter-360: Generalizable 360° Gaussian Splatting for Wide-baseline Panoramic Images.

[DOI]

CoRR, 2024

TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting.

[DOI]

CoRR, 2024

TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior.

[DOI]

CoRR, 2024

R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models.

[DOI]

CoRR, 2024

MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction.

[DOI]

CoRR, 2024

Uni²Det: Unified and Universal Framework for Prompt-Guided Multi-dataset 3D Detection.

[DOI]

CoRR, 2024

MonoFormer: One Transformer for Both Diffusion and Autoregression.

[DOI]

CoRR, 2024

FullAnno: A Data Engine for Enhancing Image Comprehension of MLLMs.

[DOI]

CoRR, 2024

Add-SD: Rational Generation without Manual Reference.

[DOI]

CoRR, 2024

OVLW-DETR: Open-Vocabulary Light-Weighted Detection Transformer.

[DOI]

CoRR, 2024

XLD: A Cross-Lane Dataset for Benchmarking Novel Driving View Synthesis.

[DOI]

CoRR, 2024

VDG: Vision-Only Dynamic Gaussian for Driving Simulation.

[DOI]

CoRR, 2024

Skim then Focus: Integrating Contextual and Fine-grained Views for Repetitive Action Counting.

[DOI]

CoRR, 2024

LW-DETR: A Transformer Replacement to YOLO for Real-Time Detection.

[DOI]

CoRR, 2024

StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond.

[DOI]

CoRR, 2024

TexRO: Generating Delicate Textures of 3D Models by Recursive Optimization.

[DOI]

CoRR, 2024

GVA: Reconstructing Vivid 3D Gaussian Avatars from Monocular Videos.

[DOI]

CoRR, 2024

HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation.

[DOI]

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model.

[DOI]

Proceedings of the SIGGRAPH Asia 2024 Conference Papers, 2024

Octopus: A Multi-modal LLM with Parallel Recognition and Sequential Understanding.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

ShowMaker: Creating High-Fidelity 2D Human Video via Fine-Grained Diffusion Modeling.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Uni4DAL: A Unified Baseline for Multi-dataset 4D Auto-Labeling.

[DOI]

Proceedings of the Pattern Recognition - 27th International Conference, 2024

Towards Unified Multi-granularity Text Detection with Interactive Attention.

[DOI]

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Interactive 3D Object Detection with Prompts.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

GGRt: Towards Pose-Free Generalizable 3D Gaussian Splatting in Real-Time.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

OPEN: Object-Wise Position Embedding for Multi-view 3D Object Detection.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

ReSyncer: Rewiring Style-Based Generator for Unified Audio-Visually Synced Facial Performer.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

LaMI-DETR: Open-Vocabulary Detection with Language Model Instruction.

[DOI]

Proceedings of the Computer Vision - ECCV 2024, 2024

MS-DETR: Efficient DETR Training with Mixed Supervision.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Decoupled Pseudo-Labeling for Semi-Supervised Monocular 3D Object Detection.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

VRP-SAM: SAM with Visual Reference Prompt.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

TexOct: Generating Textures of 3D Models with Octree-based Diffusion.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Multi-Domain Incremental Learning for Face Presentation Attack Detection.

[DOI]

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Dual-Affinity Style Embedding Network for Semantic-Aligned Image Style Transfer.

[DOI]

IEEE Trans. Neural Networks Learn. Syst., October, 2023

Adversarial Dual-Student With Differentiable Spatial Warping for Semi-Supervised Semantic Segmentation.

[DOI]

IEEE Trans. Circuits Syst. Video Technol., February, 2023

CAE v2: Context Autoencoder with CLIP Latent Alignment.

[DOI]

Trans. Mach. Learn. Res., 2023

GIR: 3D Gaussian Inverse Rendering for Relightable Scene Factorization.

[DOI]

CoRR, 2023

Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis.

[DOI]

CoRR, 2023

Accelerating Vision Transformers Based on Heterogeneous Attention Patterns.

[DOI]

CoRR, 2023

VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation.

[DOI]

CoRR, 2023

Learning Structure-Guided Diffusion Model for 2D Human Pose Estimation.

[DOI]

CoRR, 2023

Building an Invisible Shield for Your Portrait against Deepfakes.

[DOI]

CoRR, 2023

ByteTrackV2: 2D and 3D Multi-Object Tracking by Associating Every Detection Box.

[DOI]

CoRR, 2023

LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution.

[DOI]

CoRR, 2023

Temporal Segment Transformer for Action Segmentation.

[DOI]

CoRR, 2023

Make Your Brief Stroke Real and Stereoscopic: 3D-Aware Simplified Sketch to Portrait Generation.

[DOI]

CoRR, 2023

Effective Invertible Arbitrary Image Rescaling.

[DOI]

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023

Efficient Video Portrait Reenactment via Grid-based Codebook.

[DOI]

Proceedings of the ACM SIGGRAPH 2023 Conference Proceedings, 2023

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MSAbox: A spatially stable face detector.

[DOI]

Proceedings of the IEEE International Conference on Multimedia and Expo, 2023

StrucTexTv2: Masked Visual-Textual Prediction for Document Image Pre-training.

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

Graph Contrastive Learning for Skeleton-based Action Recognition.

[DOI]

Proceedings of the Eleventh International Conference on Learning Representations, 2023

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images.

[DOI]

Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

LMR: A Large-Scale Multi-Reference Dataset for Reference-based Super-Resolution.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Group Pose: A Simple Baseline for End-to-End Multi-person Pose Estimation.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

CFCG: Semi-Supervised Semantic Segmentation via Cross-Fusion and Contour Guidance Supervision.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Forward Flow for Novel View Synthesis of Dynamic Scenes.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Group DETR: Fast DETR Training with Group-Wise One-to-Many Assignment.

[DOI]

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Part-aware Prototypical Graph Network for One-shot Skeleton-based Action Recognition.

[DOI]

Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

Semi-DETR: Semi-Supervised Object Detection with Detection Transformers.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CAPE: Camera View Position Embedding for Multi-View 3D Object Detection.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

PSVT: End-to-End Multi-Person 3D Pose and Shape Estimation with Progressive Video Transformers.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-Based Generator.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Cyclically Disentangled Feature Translation for Face Anti-spoofing.

[DOI]

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Robust Video Portrait Reenactment via Personalized Representation Quantization.

[DOI]

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

StereoDistill: Pick the Cream from LiDAR for Distilling Stereo-Based 3D Object Detection.

[DOI]

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

AGO-Net: Association-Guided 3D Point Cloud Object Detection Network.

[DOI]

IEEE Trans. Pattern Anal. Mach. Intell., 2022

CAE v2: Context Autoencoder with CLIP Target.

[DOI]

CoRR, 2022

Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling.

[DOI]

CoRR, 2022

Group DETR v2: Strong Object Detector with Encoder-Decoder Pretraining.

[DOI]

CoRR, 2022

U-HRNet: Delving into Improving Semantic Representation of High Resolution Network for Dense Prediction.

[DOI]

CoRR, 2022

MAFormer: A Transformer Network with Multi-scale Attention Fusion for Visual Recognition.

[DOI]

CoRR, 2022

Detecting Deepfake by Creating Spatio-Temporal Regularity Disruption.

[DOI]

CoRR, 2022

MaskOCR: Text Recognition with Masked Encoder-Decoder Pretraining.

[DOI]

CoRR, 2022

Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task.

[DOI]

CoRR, 2022

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers.

[DOI]

Proceedings of the SIGGRAPH Asia 2022 Conference Papers, 2022

Singular Value Fine-tuning: Few-shot Segmentation requires Few-parameters Fine-tuning.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Delving into Sequential Patches for Deepfake Detection.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

RTFormer: Efficient Design for Real-Time Semantic Segmentation with Transformer.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Boosting Video-Text Retrieval with Explicit High-Level Semantics.

[DOI]

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network.

[DOI]

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Repainting and Imitating Learning for Lane Detection.

[DOI]

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Self-Guided Hard Negative Generation for Unsupervised Person Re-Identification.

[DOI]

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

StyleSwap: Style-Based Generator Empowers Robust Face Swapping.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

UFO: Unified Feature Optimization.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

Neural Color Operators for Sequential Image Retouching.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

Diverse Learner: Exploring Diverse Supervision for Semi-supervised Object Detection.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

Action Quality Assessment with Temporal Parsing Transformer.

[DOI]

Proceedings of the Computer Vision - ECCV 2022, 2022

Human-Object Interaction Detection via Disentangled Transformer.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Implicit Sample Extension for Unsupervised Person Re-Identification.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

A Multi-granularity Retrieval System for Natural Language-based Vehicle Retrieval.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Rope3D: The Roadside Perception Dataset for Autonomous Driving and Monocular 3D Object Detection Task.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Box-Grained Reranking Matching for Multi-Camera Multi-Target Tracking.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2022

Predict, Prevent, and Evaluate: Disentangled Text-Driven Image Manipulation Empowered by Pre-Trained Vision-Language Model.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Few-Shot Font Generation by Learning Fine-Grained Local Styles.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Few-Shot Head Swapping in the Wild.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Expressive Talking Head Generation with Granular Audio-Visual Control.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

MixFormer: Mixing Features across Windows and Dimensions.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence.

[DOI]

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Neural Deformable Voxel Grid for Fast Optimization of Dynamic View Synthesis.

[DOI]

Proceedings of the Computer Vision - ACCV 2022, 2022

MobileFaceSwap: A Lightweight Framework for Video Face Swapping.

[DOI]

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021

Image Inpainting by End-to-End Cascaded Refinement With Mask Awareness.

[DOI]

IEEE Trans. Image Process., 2021

SGM3D: Stereo Guided Monocular 3D Object Detection.

[DOI]

CoRR, 2021

An Information Theory-inspired Strategy for Automatic Network Pruning.

[DOI]

CoRR, 2021

StrucTexT: Structured Text Understanding with Multi-Modal Transformers.

[DOI]

CoRR, 2021

Oriented Object Detection with Transformer.

[DOI]

CoRR, 2021

Dual-stream Network for Visual Recognition.

[DOI]

CoRR, 2021

Probabilistic Ranking-Aware Ensembles for Enhanced Object Detections.

[DOI]

CoRR, 2021

PAFNet: An Efficient Anchor-Free Object Detector Guidance.

[DOI]

CoRR, 2021

Beyond Self-Supervision: A Simple Yet Effective Network Distillation Alternative to Improve Backbones.

[DOI]

CoRR, 2021

Student-Teacher Feature Pyramid Matching for Unsupervised Anomaly Detection.

[DOI]

CoRR, 2021

Dual-stream Network for Visual Recognition.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

CDP: Towards Optimal Filter Pruning via Class-wise Discriminative Power.

[DOI]

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

StrucTexT: Structured Text Understanding with Multi-Modal Transformers.

[DOI]

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Lifting the Veil of Frequency in Joint Segmentation and Depth Estimation.

[DOI]

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition.

[DOI]

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

AggNet for Self-supervised Monocular Depth Estimation: Go An Aggressive Step Further.

[DOI]

Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

DANet: Dimension Apart Network for Radar Object Detection.

[DOI]

Proceedings of the ICMR '21: International Conference on Multimedia Retrieval, 2021

Weakly-Supervised Spatio-Temporal Anomaly Detection in Surveillance Video.

[DOI]

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

The Devil is in the Task: Exploiting Reciprocal Appearance-Localization Features for Monocular 3D Object Detection.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

EC-DARTS: Inducing Equalized and Consistent Optimization into DARTS.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

DOLG: Single-Stage Image Retrieval with Deep Orthogonal Fusion of Local and Global Features.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Paint Transformer: Feed Forward Neural Painting with Stroke Prediction.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

ASCNet: Self-supervised Video Representation Learning with Appearance-Speed Consistency.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Revealing the Reciprocal Relations between Self-Supervised Stereo and Monocular Depth Estimation.

[DOI]

Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

Robust and Online Vehicle Counting at Crowded Intersections.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Robust Vehicle Re-Identification via Rigid Structure Prior.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2021

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Dynamic Class Queue for Large Scale Face Recognition in the Wild.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Student-Teacher Feature Pyramid Matching for Anomaly Detection.

[DOI]

Proceedings of the 32nd British Machine Vision Conference 2021, 2021

FaceController: Controllable Attribute Editing for Face in the Wild.

[DOI]

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

MVFNet: Multi-View Fusion Network for Efficient Video Recognition.

[DOI]

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network.

[DOI]

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

TPM: Multiple object tracking with tracklet-plane matching.

[DOI]

Pattern Recognit., 2020

Understanding Image Retrieval Re-Ranking: A Graph Neural Network Perspective.

[DOI]

CoRR, 2020

Coherent Loss: A Generic Framework for Stable Video Segmentation.

[DOI]

CoRR, 2020

LID 2020: The Learning from Imperfect Data Challenge Results.

[DOI]

CoRR, 2020

HS-ResNet: Hierarchical-Split Block on Convolutional Neural Network.

[DOI]

CoRR, 2020

Real Image Super Resolution Via Heterogeneous Model using GP-NAS.

[DOI]

CoRR, 2020

PP-YOLO: An Effective and Efficient Implementation of Object Detector.

[DOI]

CoRR, 2020

PointTrack++ for Effective Online Multi-Object Tracking and Segmentation.

[DOI]

CoRR, 2020

NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results.

[DOI]

Abdelrahman Abdelhamed

Krzysztof Trojanowski

Yanhong Wu

Pablo Navarrete Michelini

CoRR, 2020

Learning Generalized Spoof Cues for Face Anti-spoofing.

[DOI]

CoRR, 2020

Towards Accurate Scene Text Recognition with Semantic Reasoning Networks.

[DOI]

CoRR, 2020

Discriminative Sounding Objects Localization via Self-supervised Audiovisual Matching.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Modularized Framework with Category-Sensitive Abnormal Filter for City Anomaly Detection.

[DOI]

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Learning Global Structure Consistency for Robust Object Tracking.

[DOI]

Proceedings of the MM '20: The 28th ACM International Conference on Multimedia, 2020

Monocular 3D Object Detection via Feature Domain Adaptation.

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

Segment as Points for Efficient Online Multi-Object Tracking and Segmentation.

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

AIM 2020 Challenge on Real Image Super-Resolution: Methods and Results.

[DOI]

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Graph-PCNN: Two Stage Human Pose Estimation with Graph Pose Refinement.

[DOI]

Proceedings of the Computer Vision - ECCV 2020, 2020

Real Image Super Resolution via Heterogeneous Model Ensemble Using GP-NAS.

[DOI]

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

AIM 2020 Challenge on Image Extreme Inpainting.

[DOI]

Pranjal Singh Chauhan

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Leaping from 2D Detection to Efficient 6DoF Object Pose Estimation.

[DOI]

Proceedings of the Computer Vision - ECCV 2020 Workshops, 2020

Going Beyond Real Data: A Robust Visual Representation for Vehicle Re-identification.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Towards Accurate Scene Text Recognition With Semantic Reasoning Networks.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Robust Movement-Specific Vehicle Counting at Crowded Intersections.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Multi-Granularity Tracking with Modularlized Components for Unsupervised Vehicles Anomaly Detection.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection.

[DOI]

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

NTIRE 2020 Challenge on Real Image Denoising: Dataset, Methods and Results.

[DOI]

Abdelrahman Abdelhamed

Krzysztof Trojanowski

Yanhong Wu

Pablo Navarrete Michelini

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection.

[DOI]

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Dynamic Instance Normalization for Arbitrary Style Transfer.

[DOI]

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces.

[DOI]

CoRR, 2019

Detecting Text in the Wild with Deep Character Embedding Network.

[DOI]

CoRR, 2019

Editing Text in the Wild.

[DOI]

Proceedings of the 27th ACM International Conference on Multimedia, 2019

A Single-Shot Arbitrarily-Shaped Text Detector based on Context Attended Multi-Task Learning.

[DOI]

Proceedings of the 27th ACM International Conference on Multimedia, 2019

An End-to-End Video Text Detector with Online Tracking.

[DOI]

Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

ICDAR 2019 Competition on Large-Scale Street View Text with Partial Labeling - RRC-LSVT.

[DOI]

Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

EATEN: Entity-Aware Attention for Single Shot Visual Text Extraction.

[DOI]

Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text - RRC-ArT.

[DOI]

Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

ACFNet: Attentional Class Feature Network for Semantic Segmentation.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Perspective-Guided Convolution Networks for Crowd Counting.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Image Inpainting With Learnable Bidirectional Attention Maps.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Chinese Street View Text: Large-Scale Chinese Text Reading With Partially Supervised Learning.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

BMN: Boundary-Matching Network for Temporal Action Proposal Generation.

[DOI]

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Multi-camera vehicle tracking and re-identification based on visual and spatial-temporal features.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

Attentive Feedback Network for Boundary-Aware Salient Object Detection.

[DOI]

Mengyang Feng

Huchuan Lu

Errui Ding

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing.

[DOI]

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018

Compact Generalized Non-local Network.

[DOI]

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Group Re-Identification: Leveraging and Integrating Multi-Grain Information.

[DOI]

Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Fine-Grained Video Categorization with Redundancy Reduction Attention.

[DOI]

Proceedings of the Computer Vision - ECCV 2018, 2018

3D Pose Estimation for Fine-Grained Object Categories.

[DOI]

Proceedings of the Computer Vision - ECCV 2018 Workshops, 2018

Multi-Attention Multi-Class Constraint for Fine-grained Image Recognition.

[DOI]

Proceedings of the Computer Vision - ECCV 2018, 2018

TextNet: Irregular Text Reading from Images with an End-to-End Trainable Network.

[DOI]

Proceedings of the Computer Vision - ACCV 2018, 2018

Detecting Text in the Wild with Deep Character Embedding Network.

[DOI]

Proceedings of the Computer Vision - ACCV 2018, 2018

2017

WordSup: Exploiting Word Annotations for Character Based Text Detection.

[DOI]

Proceedings of the IEEE International Conference on Computer Vision, 2017

Localizing by Describing: Attribute-Guided Attention Localization for Fine-Grained Recognition.

[DOI]

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Context-aware mathematical expression recognition: An end-to-end framework and a benchmark.

[DOI]

Proceedings of the 23rd International Conference on Pattern Recognition, 2016