Xiang Bai

Orcid: 0000-0002-3449-5940

According to our database, Xiang Bai authored at least 381 papers between 2006 and 2025.

Bibliography

2025
A large cross-modal video retrieval dataset with reading comprehension.
Pattern Recognit., 2025

Toward real text manipulation detection: New dataset and new solution.
Pattern Recognit., 2025

Enhancing scene text detectors with realistic text image synthesis using diffusion models.
Comput. Vis. Image Underst., 2025

2024
Dual-Grained Lightweight Strategy.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2024

Turning a CLIP Model Into a Scene Text Spotter.
IEEE Trans. Pattern Anal. Mach. Intell., September, 2024

A Discrepancy Aware Framework for Robust Anomaly Detection.
IEEE Trans. Ind. Informatics, March, 2024

A truncated test scheme design method for success-failure in-orbit tests.
Reliab. Eng. Syst. Saf., March, 2024

Sequential visual and semantic consistency for semi-supervised text recognition.
Pattern Recognit. Lett., 2024

Class-Aware Mask-guided feature refinement for scene text recognition.
Pattern Recognit., 2024

DSText V2: A comprehensive video text spotting dataset for dense and small text.
Pattern Recognit., 2024

RSL-SQL: Robust Schema Linking in Text-to-SQL Generation.
CoRR, 2024

R-CoT: Reverse Chain-of-Thought Problem Generation for Geometric Reasoning in Large Multimodal Models.
CoRR, 2024

LLaVA-KD: A Framework of Distilling Multimodal Large Language Models.
CoRR, 2024

MCTBench: Multimodal Cognition towards Text-Rich Visual Scenes Benchmark.
CoRR, 2024

Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning.
CoRR, 2024

VIRT: Vision Instructed Transformer for Robotic Manipulation.
CoRR, 2024

PDF-WuKong: A Large Multimodal Model for Efficient Long PDF Reading with End-to-End Sparse Sampling.
CoRR, 2024

Stochastic Real-Time Economic Dispatch for Integrated Electric and Gas Systems Considering Uncertainty Propagation and Pipeline Leakage.
CoRR, 2024

Attention-Guided Perturbation for Unsupervised Image Anomaly Detection.
CoRR, 2024

Mini-Monkey: Multi-Scale Adaptive Cropping for Multimodal Large Language Models.
CoRR, 2024

LION: Linear Group RNN for 3D Object Detection in Point Clouds.
CoRR, 2024

A Unified Framework for 3D Scene Understanding.
CoRR, 2024

SOOD++: Leveraging Unlabeled Data to Boost Oriented Object Detection.
CoRR, 2024

MoE Jetpack: From Dense Checkpoints to Adaptive Mixture of Experts for Vision Tasks.
CoRR, 2024

MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering.
CoRR, 2024

VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-domain Generalization.
CoRR, 2024

TextSquare: Scaling up Text-Centric Visual Instruction Tuning.
CoRR, 2024

Anomaly Detection by Adapting a pre-trained Vision Language Model.
CoRR, 2024

TextMonkey: An OCR-Free Large Multimodal Model for Understanding Document.
CoRR, 2024

PointMamba: A Simple State Space Model for Point Cloud Analysis.
CoRR, 2024

CauESC: A Causal Aware Model for Emotional Support Conversation.
CoRR, 2024

An open dataset for oracle bone script recognition and decipherment.
CoRR, 2024

An open dataset for the evolution of oracle bone characters: EVOBC.
CoRR, 2024

SwinTextSpotter v2: Towards Better Synergy for Scene Text Spotting.
CoRR, 2024

SAM3D: zero-shot 3D object detection via the segment anything model.
Sci. China Inf. Sci., 2024

Research on the spacecraft ground equivalence test assessment problem: A comprehensive assessment method combining interval-type evaluation and prospect-two-dimensional cloud.
Appl. Soft Comput., 2024

Exploring the Capabilities of Large Multimodal Models on Dense Text.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

Puzzle Pieces Picker: Deciphering Ancient Chinese Characters with Radical Reconstruction.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

Dataset and Benchmark for Urdu Natural Scenes Text Detection, Recognition and Visual Question Answering.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

Knowledge Mining of Scene Text for Referring Expression Comprehension.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

The First Swahili Language Scene Text Detection and Recognition Dataset.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

Progressive Evolution from Single-Point to Polygon for Scene Text.
Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

Maskstr: Guide Scene Text Recognition Models with Masking.
Proceedings of the IEEE International Conference on Acoustics, 2024

PSALM: Pixelwise SegmentAtion with Large Multi-modal Model.
Proceedings of the Computer Vision - ECCV 2024, 2024

Make Your ViT-Based Multi-view 3D Detectors Faster via Token Compression.
Proceedings of the Computer Vision - ECCV 2024, 2024

WAS: Dataset and Methods for Artistic Text Segmentation.
Proceedings of the Computer Vision - ECCV 2024, 2024

SC4D: Sparse-Controlled Video-to-4D Generation and Motion Transfer.
Proceedings of the Computer Vision - ECCV 2024, 2024

SEED: A Simple and Effective 3D DETR in Point Clouds.
Proceedings of the Computer Vision - ECCV 2024, 2024

PartGLEE: A Foundation Model for Recognizing and Parsing Any Objects.
Proceedings of the Computer Vision - ECCV 2024, 2024

OPEN: Object-Wise Position Embedding for Multi-view 3D Object Detection.
Proceedings of the Computer Vision - ECCV 2024, 2024

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

General Object Foundation Model for Images and Videos at Scale.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

OMNIPARSER: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Monkey: Image Resolution and Text Label are Important Things for Large Multi-Modal Models.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Bridging the Gap Between End-to-End and Two-Step Text Spotting.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Deciphering Oracle Bone Language with Diffusion Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
SAN: Side Adapter Network for Open-Vocabulary Semantic Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

SPTS v2: Single-Point Scene Text Spotting.
IEEE Trans. Pattern Anal. Mach. Intell., December, 2023

Correction to: YOLOP: You Only Look Once for Panoptic Driving Perception.
Mach. Intell. Res., December, 2023

You Only Look Bottom-Up for Monocular 3D Object Detection.
IEEE Robotics Autom. Lett., November, 2023

CycMuNet+: Cycle-Projected Mutual Learning for Spatial-Temporal Video Super-Resolution.
IEEE Trans. Pattern Anal. Mach. Intell., November, 2023

Confidence-weighted mutual supervision on dual networks for unsupervised cross-modality image segmentation.
Sci. China Inf. Sci., November, 2023

Stochastic programming based multi-arm bandit offloading strategy for internet of things.
Digit. Commun. Networks, October, 2023

Affinity Feature Strengthening for Accurate, Complete and Robust Vessel Segmentation.
IEEE J. Biomed. Health Informatics, August, 2023

EPNet++: Cascade Bi-Directional Fusion for Multi-Modal 3D Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., July, 2023

Searching a High Performance Feature Extractor for Text Recognition Network.
IEEE Trans. Pattern Anal. Mach. Intell., May, 2023

Content-Adaptive Auto-Occlusion Network for Occluded Person Re-Identification.
IEEE Trans. Image Process., 2023

Real-Time Scene Text Detection With Differentiable Binarization and Adaptive Scale Fusion.
IEEE Trans. Pattern Anal. Mach. Intell., 2023

K-ESConv: Knowledge Injection for Emotional Support Dialogue Systems via Prompt Learning.
CoRR, 2023

DISC-FinLLM: A Chinese Financial Large Language Model based on Multiple Experts Fine-tuning.
CoRR, 2023

SingleInsert: Inserting New Concepts from a Single Image into Text-to-Image Models for Flexible Editing.
CoRR, 2023

Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition.
CoRR, 2023

SparseTrack: Multi-Object Tracking by Performing Scene Decomposition based on Pseudo-Depth.
CoRR, 2023

Looking and Listening: Audio Guided Text Recognition.
CoRR, 2023

On the Hidden Mystery of OCR in Large Multimodal Models.
CoRR, 2023

Multi-Modal 3D Object Detection by Box Matching.
CoRR, 2023

Visual Information Extraction in the Wild: Practical Dataset and End-to-end Solution.
CoRR, 2023

ICDAR 2023 Video Text Reading Competition for Dense and Small Text.
CoRR, 2023

Diffusion-Based 3D Object Detection with Random Boxes.
Proceedings of the Pattern Recognition and Computer Vision - 6th Chinese Conference, 2023

Query-based Temporal Fusion with Explicit Motion for 3D Object Detection.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ICDAR 2023 Competition on Reading the Seal Title.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ICDAR 2023 Competition on Born Digital Video Text Question Answering.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ICDAR 2023 Competition on Video Text Reading for Dense and Small Text.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ICDAR 2023 Competition on Detecting Tampered Text in Images.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Semantic Graph Representation Learning for Handwritten Mathematical Expression Recognition.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

Visual Information Extraction in the Wild: Practical Dataset and End-to-End Solution.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

TextREC: A Dataset for Referring Expression Comprehension with Reading Comprehension.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

ICDAR 2023 Competition on Recognition of Multi-line Handwritten Mathematical Expressions.
Proceedings of the Document Analysis and Recognition - ICDAR 2023, 2023

A Simple Vision Transformer for Weakly Semi-supervised 3D Object Detection.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Turning a CLIP Model into a Scene Text Detector.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Modeling Entities as Semantic Points for Visual Information Extraction in the Wild.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Side Adapter Network for Open-Vocabulary Semantic Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CAPE: Camera View Position Embedding for Multi-View 3D Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

InstMove: Instance Motion for Object-centric Video Segmentation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

CrowdCLIP: Unsupervised Crowd Counting via Vision-Language Model.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

SOOD: Towards Semi-Supervised Oriented Object Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

StereoDistill: Pick the Cream from LiDAR for Distilling Stereo-Based 3D Object Detection.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Cell Localization and Counting Using Direction Field Map.
IEEE J. Biomed. Health Informatics, 2022

Boundary TextSpotter: Toward Arbitrary-Shaped Scene Text Spotting.
IEEE Trans. Image Process., 2022

End-to-End Temporal Action Detection With Transformer.
IEEE Trans. Image Process., 2022

Conditional Feature Learning Based Transformer for Text-Based Person Search.
IEEE Trans. Image Process., 2022

Progressive and Aligned Pose Attention Transfer for Person Image Generation.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges.
IEEE Trans. Pattern Anal. Mach. Intell., 2022

Author Correction: Advancing COVID-19 diagnosis with privacy-preserving collaboration in artificial intelligence.
Nat. Mach. Intell., 2022

AutoScale: Learning to Scale for Crowd Counting.
Int. J. Comput. Vis., 2022

Occluded Video Instance Segmentation: A Benchmark.
Int. J. Comput. Vis., 2022

YOLOP: You Only Look Once for Panoptic Driving Perception.
Int. J. Autom. Comput., 2022

The Runner-up Solution for YouTube-VIS Long Video Challenge 2022.
CoRR, 2022

TransCrowd: weakly-supervised crowd counting with transformers.
Sci. China Inf. Sci., 2022

Comprehensive benchmark datasets for Amharic scene text detection and recognition.
Sci. China Inf. Sci., 2022

Smart Electronic Nose Enabled by an All-Feature Olfactory Algorithm.
Adv. Intell. Syst., 2022

Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

SPTS: Single-Point Text Spotting.
Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Origin-Destination Traffic Prediction based on Hybrid Spatio-Temporal Network.
Proceedings of the IEEE International Conference on Data Mining, 2022

A Simple Baseline for Open-Vocabulary Semantic Segmentation with Pre-trained Vision-Language Model.
Proceedings of the Computer Vision - ECCV 2022, 2022

Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition.
Proceedings of the Computer Vision - ECCV 2022, 2022

CCPL: Contrastive Coherence Preserving Loss for Versatile Style Transfer.
Proceedings of the Computer Vision - ECCV 2022, 2022

In Defense of Online Models for Video Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

SeqFormer: Sequential Transformer for Video Instance Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Optimal Boxes: Boosting End-to-End Scene Text Recognition by Adjusting Annotated Bounding Boxes via Reinforcement Learning.
Proceedings of the Computer Vision - ECCV 2022, 2022

An End-to-End Transformer Model for Crowd Localization.
Proceedings of the Computer Vision - ECCV 2022, 2022

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition.
Proceedings of the Computer Vision - ECCV 2022, 2022

GitNet: Geometric Prior-Based Transformation for Birds-Eye-View Segmentation.
Proceedings of the Computer Vision - ECCV 2022, 2022

Syntax-Aware Network for Handwritten Mathematical Expression Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Knowledge Mining with Scene Text for Fine-Grained Recognition.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Few Could Be Better Than All: Feature Sampling and Grouping for Scene Text Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

Vision-Language Pre-Training for Boosting Scene Text Detectors.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

An Empirical Study of End-to-End Temporal Action Detection.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

2021
Affinity Space Adaptation for Semantic Segmentation Across Domains.
IEEE Trans. Image Process., 2021

Video Text Tracking With a Spatio-Temporal Complementary Model.
IEEE Trans. Image Process., 2021

PRA-Net: Point Relation-Aware Network for 3D Point Cloud Analysis.
IEEE Trans. Image Process., 2021

MASTER: Multi-aspect non-local network for scene text recognition.
Pattern Recognit., 2021

Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes.
IEEE Trans. Pattern Anal. Mach. Intell., 2021

Advancing COVID-19 diagnosis with privacy-preserving collaboration in artificial intelligence.
Nat. Mach. Intell., 2021

Deep learning for predicting COVID-19 malignant progression.
Medical Image Anal., 2021

DeepFlux for Skeleton Detection in the Wild.
Int. J. Comput. Vis., 2021

A Simple Baseline for Zero-shot Semantic Segmentation with Pre-trained Vision-language Model.
CoRR, 2021

Anomaly Discovery in Semantic Segmentation via Distillation Comparison Networks.
CoRR, 2021

SeqFormer: a Frustratingly Simple Model for Video Instance Segmentation.
CoRR, 2021

Advancing COVID-19 Diagnosis with Privacy-Preserving Collaboration in Artificial Intelligence.
CoRR, 2021

CAP-Net: Correspondence-Aware Point-view Fusion Network for 3D Shape Analysis.
CoRR, 2021

End-to-end Temporal Action Detection with Transformer.
CoRR, 2021

TransCrowd: Weakly-Supervised Crowd Counting with Transformer.
CoRR, 2021

InsertGNN: Can Graph Neural Networks Outperform Humans in TOEFL Sentence Insertion Problem?
CoRR, 2021

Occluded Video Instance Segmentation.
CoRR, 2021

WDNet: Watermark-Decomposition Network for Visible Watermark Removal.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2021

Bootstrap Your Object Detector via Mixed Training.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Occluded Video Instance Segmentation: Dataset and ICCV 2021 Challenge.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Deep Interactive Video Inpainting: An Invisibility Cloak for Harry Potter.
Proceedings of the MM '21: ACM Multimedia Conference, Virtual Event, China, October 20, 2021

Scene Text Detection with Scribble Line.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

End-to-End Semi-Supervised Object Detection with Soft Teacher.
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021

DAG-FL: Direct Acyclic Graph-based Blockchain Empowers On-Device Federated Learning.
Proceedings of the ICC 2021, 2021

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Scene Text Retrieval via Joint Text Detection and Similarity Learning.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Multi-Shot Temporal Event Localization: A Benchmark.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

MOST: A Multi-Oriented Scene Text Detector With Localization Refinement.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

FaceController: Controllable Attribute Editing for Face in the Wild.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Few-Shot Text Style Transfer via Deep Feature Similarity.
IEEE Trans. Image Process., 2020

Learning Sparse and Identity-Preserved Hidden Attributes for Person Re-Identification.
IEEE Trans. Image Process., 2020

An Improved Multi-View Convolutional Neural Network for 3D Object Retrieval.
IEEE Trans. Image Process., 2020

Progressive Object Transfer Detection.
IEEE Trans. Image Process., 2020

Deep-Person: Learning discriminative deep features for person Re-Identification.
Pattern Recognit., 2020

PCL: Proposal Cluster Learning for Weakly Supervised Object Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Blockchain Enabled Federated Slicing for 5G Networks with AI Accelerated Optimization.
IEEE Netw., 2020

A comparison of methods for 3D scene shape retrieval.
Comput. Vis. Image Underst., 2020

Scene Text Detection with Scribble Lines.
CoRR, 2020

FedOCR: Communication-Efficient Federated Learning for Scene Text Recognition.
CoRR, 2020

Efficient Backbone Search for Scene Text Recognition.
CoRR, 2020

SynthText3D: synthesizing scene text images from 3D virtual worlds.
Sci. China Inf. Sci., 2020

Cost-Effective Adversarial Attacks against Scene Text Recognition.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

AutoSTR: Efficient Backbone Search for Scene Text Recognition.
Proceedings of the Computer Vision - ECCV 2020, 2020

Intra-class Feature Variation Distillation for Semantic Segmentation.
Proceedings of the Computer Vision - ECCV 2020, 2020

Scene Text Image Super-Resolution in the Wild.
Proceedings of the Computer Vision - ECCV 2020, 2020

Mask TextSpotter v3: Segmentation Proposal Network for Robust Scene Text Spotting.
Proceedings of the Computer Vision - ECCV 2020, 2020

EPNet: Enhancing Point Features with Image Semantics for 3D Object Detection.
Proceedings of the Computer Vision - ECCV 2020, 2020

Maximum Entropy Regularization and Chinese Text Recognition.
Proceedings of the Document Analysis Systems - 14th IAPR International Workshop, 2020

Semantically Multi-Modal Image Synthesis.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Super-BPD: Super Boundary-to-Pixel Direction for Fast Image Segmentation.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

All You Need Is Boundary: Toward Arbitrary-Shaped Text Spotting.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

TextScanner: Reading Characters in Order for Robust Scene Text Recognition.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

TANet: Robust 3D Object Detection from Point Clouds with Triple Attention.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Real-Time Scene Text Detection with Differentiable Binarization.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Deep FisherNet for Image Classification.
IEEE Trans. Neural Networks Learn. Syst., 2019

TextField: Learning a Deep Direction Field for Irregular Scene Text Detection.
IEEE Trans. Image Process., 2019

Automatic Ensemble Diffusion for 3D Shape and Image Retrieval.
IEEE Trans. Image Process., 2019

Image Caption Generation with Part of Speech Guidance.
Pattern Recognit. Lett., 2019

SegLink++: Detecting Dense and Arbitrary-shaped Scene Text by Instance-aware Component Grouping.
Pattern Recognit., 2019

ASTER: An Attentional Scene Text Recognizer with Flexible Rectification.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Richer Convolutional Features for Edge Detection.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Regularized Diffusion Process on Bidirectional Context for Object Retrieval.
IEEE Trans. Pattern Anal. Mach. Intell., 2019

Action recognition for depth video using multi-view dynamic images.
Inf. Sci., 2019

VD-SAN: Visual-Densely Semantic Attention Network for Image Caption Generation.
Neurocomputing, 2019

ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard.
CoRR, 2019

Asymmetric Non-local Neural Networks for Semantic Segmentation.
CoRR, 2019

Learn to Scale: Generating Multipolar Normalized Density Map for Crowd Counting.
CoRR, 2019

2D-CTC for Scene Text Recognition.
CoRR, 2019

Special focus on deep learning for computer vision.
Sci. China Inf. Sci., 2019

Feature context learning for human parsing.
Sci. China Inf. Sci., 2019

Editing Text in the Wild.
Proceedings of the 27th ACM International Conference on Multimedia, 2019

ICDAR 2019 Robust Reading Challenge on Reading Chinese Text on Signboard.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Multiple Comparative Attention Network for Offline Handwritten Chinese Character Recognition.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

Patch Aggregator for Scene Text Script Identification.
Proceedings of the 2019 International Conference on Document Analysis and Recognition, 2019

The Seventh Visual Object Tracking VOT2019 Challenge Results.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Asymmetric Non-Local Neural Networks for Semantic Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Symmetry-Constrained Rectification Network for Scene Text Recognition.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Learn to Scale: Generating Multipolar Normalized Density Maps for Crowd Counting.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

View N-Gram Network for 3D Object Retrieval.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Progressive Pose Attention Transfer for Person Image Generation.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

iSAID: A Large-scale Dataset for Instance Segmentation in Aerial Images.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019

DeepFlux for Skeletons in the Wild.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

Scene Text Recognition from Two-Dimensional Perspective.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Human-Like Delicate Region Erasing Strategy for Weakly Supervised Detection.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Non-stationary texture synthesis by adversarial expansion.
ACM Trans. Graph., 2018

Face Alignment With Deep Regression.
IEEE Trans. Neural Networks Learn. Syst., 2018

Anisotropic-Scale Junction Detection and Matching for Indoor Images.
IEEE Trans. Image Process., 2018

TextBoxes++: A Single-Shot Oriented Scene Text Detector.
IEEE Trans. Image Process., 2018

Image stitching by line-guided local warping with global similarity constraint.
Pattern Recognit., 2018

Revisiting multiple instance neural networks.
Pattern Recognit., 2018

Improving context-sensitive similarity via smooth neighborhood for object retrieval.
Pattern Recognit., 2018

Information processing for unmanned aerial vehicles (UAVs) in surveying, mapping, and navigation.
Geo spatial Inf. Sci., 2018

Action Recognition for Depth Video using Multi-view Dynamic Images.
CoRR, 2018

A Deep End-to-End Model for Transient Stability Assessment With PMU Data.
IEEE Access, 2018

Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification.
IEEE Access, 2018

Incremental Deep Hidden Attribute Learning.
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018

Cascaded SR-GAN for Scale-Adaptive Low Resolution Person Re-identification.
Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, 2018

Learning Training Samples for Occlusion Edge Detection and Its Application in Depth Ordering Inference.
Proceedings of the 24th International Conference on Pattern Recognition, 2018

ICPR2018 Contest on Object Detection in Aerial Images (ODAI-18).
Proceedings of the 24th International Conference on Pattern Recognition, 2018

Hard-Aware Point-to-Set Deep Metric for Person Re-identification.
Proceedings of the Computer Vision - ECCV 2018, 2018

Adaptively Transforming Graph Matching.
Proceedings of the Computer Vision - ECCV 2018, 2018

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes.
Proceedings of the Computer Vision - ECCV 2018, 2018

Feature Fusion for Scene Text Detection.
Proceedings of the 13th IAPR International Workshop on Document Analysis Systems, 2018

DOTA: A Large-Scale Dataset for Object Detection in Aerial Images.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Rotation-Sensitive Regression for Oriented Scene Text Detection.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

Triplet-Center Loss for Multi-View 3D Object Retrieval.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition, 2018

2017
GIFT: Towards Scalable 3D Shape Retrieval.
IEEE Trans. Multim., 2017

Texture Characterization Using Shape Co-Occurrence Patterns.
IEEE Trans. Image Process., 2017

DeepSkeleton: Learning Multi-Task Scale-Associated Deep Side Outputs for Object Skeleton Extraction in Natural Images.
IEEE Trans. Image Process., 2017

Mixed Noise Removal via Laplacian Scale Mixture Modeling and Nonlocal Low-Rank Approximation.
IEEE Trans. Image Process., 2017

AID: A Benchmark Data Set for Performance Evaluation of Aerial Scene Classification.
IEEE Trans. Geosci. Remote. Sens., 2017

Editorial of the Special Issue on Multi-instance Learning in Pattern Recognition and Vision.
Pattern Recognit., 2017

Deep patch learning for weakly supervised object classification and discovery.
Pattern Recognit., 2017

Text/non-text image classification in the wild with convolutional neural networks.
Pattern Recognit., 2017

An End-to-End Trainable Neural Network for Image-Based Sequence Recognition and Its Application to Scene Text Recognition.
IEEE Trans. Pattern Anal. Mach. Intell., 2017

Preface.
J. Comput. Sci. Technol., 2017

Directional Edge Boxes: Exploiting Inner Normal Direction Cues for Effective Object Proposal Generation.
J. Comput. Sci. Technol., 2017

Texture Characterization by Using Shape Co-occurrence Patterns.
CoRR, 2017

Integrating Scene Text and Visual Appearance for Fine-Grained Image Classification with Convolutional Neural Networks.
CoRR, 2017

DeepCADx: Automated Prostate Cancer Detection and Diagnosis in mp-MRI based on Multimodal Convolutional Neural Networks.
Proceedings of the 2017 ACM on Multimedia Conference, 2017

Dynamic Multi-Task Learning with Convolutional Neural Network.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Joint Classification Loss and Histogram Loss for Sketch-Based Image Retrieval.
Proceedings of the Image and Graphics - 9th International Conference, 2017

Max-Pooling Based Scene Text Proposal for Scene Text Detection.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17).
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Auto-Encoder Guided GAN for Chinese Calligraphy Synthesis.
Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Fusing Image and Segmentation Cues for Skeleton Extraction in the Wild.
Proceedings of the 2017 IEEE International Conference on Computer Vision Workshops, 2017

Ensemble Diffusion for Retrieval.
Proceedings of the IEEE International Conference on Computer Vision, 2017

Multiple Instance Detection Network with Online Instance Classifier Refinement.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Detecting Oriented Text in Natural Images by Linking Segments.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Richer Convolutional Features for Edge Detection.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Scalable Person Re-identification on Supervised Smoothed Manifold.
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017

Divide and Fuse: A Re-ranking Approach for Person Re-identification.
Proceedings of the British Machine Vision Conference 2017, 2017

TextBoxes: A Fast Text Detector with a Single Deep Neural Network.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Regularized Diffusion Process for Visual Retrieval.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Multidimensional Scaling on Multiple Input Distance Matrices.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017


2016
Multiple Stage Residual Model for Image Classification and Vector Compression.
IEEE Trans. Multim., 2016

Strokelets: A Learned Multi-Scale Mid-Level Representation for Scene Text Recognition.
IEEE Trans. Image Process., 2016

Sparse Contextual Activation for Efficient Visual Re-Ranking.
IEEE Trans. Image Process., 2016

Co-spectral for robust shape clustering.
Pattern Recognit. Lett., 2016

Efficient shape representation, matching, ranking, and its applications.
Pattern Recognit. Lett., 2016

Script identification in the wild via discriminative convolutional neural network.
Pattern Recognit., 2016

Multiple instance subspace learning via partial random projection tree for local reflection symmetry in natural images.
Pattern Recognit., 2016

Traffic sign detection and recognition using fully convolutional network guided proposals.
Neurocomputing, 2016

Deep Learning Representation using Autoencoder for 3D Shape Retrieval.
Neurocomputing, 2016

Deep sketch feature for cross-domain image retrieval.
Neurocomputing, 2016

Similarity Fusion for Visual Tracking.
Int. J. Comput. Vis., 2016

Scene text detection and recognition: recent advances and future trends.
Frontiers Comput. Sci., 2016

Scene Text Detection via Holistic, Multi-Channel Prediction.
CoRR, 2016

AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification.
CoRR, 2016

Deep FisherNet for Object Classification.
CoRR, 2016

Symmetry-based object proposal for text detection.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Scene text script identification with Convolutional Recurrent Neural Networks.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Distinguishing text/non-text natural images with Multi-Dimensional Recurrent Neural Networks.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Smooth Neighborhood Structure Mining on Multiple Affinity Graphs with Applications to Context-Sensitive Similarity.
Proceedings of the Computer Vision - ECCV 2016, 2016

Multi-oriented Text Detection with Fully Convolutional Networks.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Robust Scene Text Recognition with Automatic Rectification.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

Object Skeleton Extraction in Natural Images by Fusing Scale-Associated Deep Side Outputs.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016

GIFT: A Real-Time and Scalable 3D Shape Search Engine.
Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition, 2016


2015
Learning Discriminative Pattern for Real-Time Car Brand Recognition.
IEEE Trans. Intell. Transp. Syst., 2015

Vehicle Color Recognition With Spatial Pyramid Deep Learning.
IEEE Trans. Intell. Transp. Syst., 2015

DeepPano: Deep Panoramic Representation for 3-D Shape Recognition.
IEEE Signal Process. Lett., 2015

Neural shape codes for 3D model retrieval.
Pattern Recognit. Lett., 2015

3D Shape Matching via Two Layer Coding.
IEEE Trans. Pattern Anal. Mach. Intell., 2015

Beyond diffusion process: Neighbor set similarity for fast re-ranking.
Inf. Sci., 2015

Automatic discrimination of text and non-text natural images.
Proceedings of the 13th International Conference on Document Analysis and Recognition, 2015

Automatic script identification in the wild.
Proceedings of the 13th International Conference on Document Analysis and Recognition, 2015

Relaxed Multiple-Instance SVM with Application to Object Discovery.
Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015

Symmetry-based text line detection in natural scenes.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

DeepContour: A deep convolutional feature learned by positive-sharing loss for contour detection.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015

2014
Vehicle Color Recognition on Urban Road by Feature Context.
IEEE Trans. Intell. Transp. Syst., 2014

A Unified Framework for Multioriented Text Detection and Recognition.
IEEE Trans. Image Process., 2014

Shape Vocabulary: A Robust and Efficient Shape Representation for Shape Matching.
IEEE Trans. Image Process., 2014

Exemplar-Based Human Action Pose Correction.
IEEE Trans. Cybern., 2014

Bag of contour fragments for robust shape classification.
Pattern Recognit., 2014

Robust Subspace Discovery via Relaxed Rank Minimization.
Neural Comput., 2014

Online Multiple targets Detection and Tracking from Mobile robot in Cluttered indoor Environments with Depth Camera.
Int. J. Pattern Recognit. Artif. Intell., 2014

Deep Regression for Face Alignment.
CoRR, 2014

Scale-Space SIFT flow.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2014

Real-time object tracking via optimal feature subspace.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Aggregating contour fragments for shape classification.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Human Detection Using Learned Part Alphabet and Pose Dictionary.
Proceedings of the Computer Vision - ECCV 2014, 2014

Strokelets: A Learned Multi-scale Representation for Scene Text Recognition.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Shape Recognition by Combining Contour and Skeleton into a Mid-Level Representation.
Proceedings of the Pattern Recognition - 6th Chinese Conference, 2014

Multiple Stage Residual Model for Accurate Image Classification.
Proceedings of the Computer Vision - ACCV 2014, 2014

2013
Distance Transform-Based Skeleton Extraction and Its Applications in Sensor Networks.
IEEE Trans. Parallel Distributed Syst., 2013

Shape clustering: Common structure discovery.
Pattern Recognit., 2013

Regularized vector field learning with sparse approximation for mismatch removal.
Pattern Recognit., 2013

Densifying Distance Spaces for Shape and Image Retrieval.
J. Math. Imaging Vis., 2013

Face identification using reference-based features with message passing model.
Neurocomputing, 2013

Skeleton pruning as trade-off between skeleton simplicity and reconstruction error.
Sci. China Inf. Sci., 2013

Max-Margin Multiple-Instance Dictionary Learning.
Proceedings of the 30th International Conference on Machine Learning, 2013

Traffic sign classification using two-layer image representation.
Proceedings of the IEEE International Conference on Image Processing, 2013

2012
Co-Transduction for Shape Retrieval.
IEEE Trans. Image Process., 2012

Shape matching and classification using height functions.
Pattern Recognit. Lett., 2012

Fusion with Diffusion for Robust Visual Tracking.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Adjacent coding for image classification.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Online Random Ferns for robust visual tracking.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Skeleton Extraction from Incomplete Boundaries in Sensor Networks Based on Distance Transform.
Proceedings of the 2012 IEEE 32nd International Conference on Distributed Computing Systems, 2012

Color image segmentation using mean shift and improved spectral clustering.
Proceedings of the 12th International Conference on Control Automation Robotics & Vision, 2012

Detecting texts of arbitrary orientations in natural images.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Fan Shape Model for object detection.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

Exemplar-based human action pose correction and tagging.
Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012

One-Class Multiple Instance Learning via Robust PCA for Common Object Discovery.
Proceedings of the Computer Vision - ACCV 2012, 2012

2011
Learning context-sensitive similarity by shortest path propagation.
Pattern Recognit., 2011

Skeleton growing and pruning with bending potential ratio.
Pattern Recognit., 2011

Shape Matching and Recognition Using Group-Wised Points.
Proceedings of the Advances in Image and Video Technology - 5th Pacific Rim Symposium, 2011

Maximal Cliques that Satisfy Hard Constraints with Application to Deformable Object Model Learning.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Multiple Feature Fusion for Object Tracking.
Proceedings of the Intelligent Science and Intelligent Data Engineering, 2011

Image labeling by multiple segmentation.
Proceedings of the 18th IEEE International Conference on Image Processing, 2011

Shape Matching Using Points Co-occurrence Pattern.
Proceedings of the Sixth International Conference on Image and Graphics, 2011

Feature context for image classification and object detection.
Proceedings of the 24th IEEE Conference on Computer Vision and Pattern Recognition, 2011

Class-specific object contour detection by iteratively combining context information.
Proceedings of the 8th International Conference on Information, 2011

2010
Connectivity-Based Skeleton Extraction in Wireless Sensor Networks.
IEEE Trans. Parallel Distributed Syst., 2010

Auto-Context and Its Application to High-Level Vision Tasks and 3D Brain Image Segmentation.
IEEE Trans. Pattern Anal. Mach. Intell., 2010

Learning Context-Sensitive Shape Similarity by Graph Transduction.
IEEE Trans. Pattern Anal. Mach. Intell., 2010

Skeletonization with Particle Filters.
Int. J. Pattern Recognit. Artif. Intell., 2010

Shape Classification Using Tree-Unions.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Object Recognition Using Junctions.
Proceedings of the Computer Vision - ECCV 2010, 2010

Co-transduction for Shape Retrieval.
Proceedings of the Computer Vision, 2010

2009
A Simple Adaptive Optimization Scheme for IEEE 802.11 with Differentiated Channel Access.
IEEE Commun. Lett., 2009

Contour Grouping with Partial Shape Similarity.
Proceedings of the Advances in Image and Video Technology, Third Pacific Rim Symposium, 2009

CASE: Connectivity-Based Skeleton Extraction in Wireless Sensor Networks.
Proceedings of the INFOCOM 2009. 28th IEEE International Conference on Computer Communications, 2009

Integrating contour and skeleton for shape classification.
Proceedings of the 12th IEEE International Conference on Computer Vision Workshops, 2009

Active skeleton for non-rigid object detection.
Proceedings of the IEEE 12th International Conference on Computer Vision, ICCV 2009, Kyoto, Japan, September 27, 2009

Shape band: A deformable object detection approach.
Proceedings of the 2009 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2009), 2009

Skeleton Graph Matching Based on Critical Points Using Path Similarity.
Proceedings of the Computer Vision, 2009

2008
Detection and recognition of contour parts based on shape similarity.
Pattern Recognit., 2008

Path Similarity Skeleton Graph Matching.
IEEE Trans. Pattern Anal. Mach. Intell., 2008

Skeleton-Based Shape Classification Using Path Similarity.
Int. J. Pattern Recognit. Artif. Intell., 2008

Computing Stable Skeletons with Particle Filters.
Proceedings of the PRICAI 2008: Trends in Artificial Intelligence, 2008

Multiscale Random Fields with Application to Contour Grouping.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Symmetry of Shapes Via Self-similarity.
Proceedings of the Advances in Visual Computing, 4th International Symposium, 2008

Skeletonization of gray-scale image from incomplete boundaries.
Proceedings of the International Conference on Image Processing, 2008

Improving Shape Retrieval by Learning Graph Transduction.
Proceedings of the Computer Vision, 2008

2007
Skeleton Pruning by Contour Partitioning with Discrete Curve Evolution.
IEEE Trans. Pattern Anal. Mach. Intell., 2007

Skeletonization using SSM of the Distance Transform.
Proceedings of the International Conference on Image Processing, 2007

Contour Grouping Based on Local Symmetry.
Proceedings of the IEEE 11th International Conference on Computer Vision, 2007

Shape Classification Based on Skeleton Path Similarity.
Proceedings of the Energy Minimization Methods in Computer Vision and Pattern Recognition, 2007

Discrete Skeleton Evolution.
Proceedings of the Energy Minimization Methods in Computer Vision and Pattern Recognition, 2007

Visual Curvature.
Proceedings of the 2007 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2007), 2007

2006
Skeleton Pruning by Contour Partitioning.
Proceedings of the Discrete Geometry for Computer Imagery, 13th International Conference, 2006

