Cong Liu

Orcid: 0009-0003-0328-423X

Affiliations:

IFLYTEK Research, Hefei, China
Microsoft Research Asia, Beijing, China (former)
University of Science and Technology of China, Hefei, China (former)

According to our database, Cong Liu authored at least 64 papers between 2006 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

Hi-SAM: Marrying Segment Anything Model for Hierarchical Text Segmentation.

IEEE Trans. Pattern Anal. Mach. Intell., March, 2025

Imprints: Mitigating Watermark Removal Attacks With Defensive Watermarks.

IEEE Trans. Inf. Forensics Secur., 2025

VGTS: Visually Guided Text Spotting for novel categories in historical manuscripts.

Expert Syst. Appl., 2025

2024

Multi-Dimensional Medical Image Fusion With Complex Sparse Representation.

IEEE Trans. Biomed. Eng., September, 2024

Weakly supervised scene text generation for low-resource languages.

Yangchen Xie

Xinyuan Chen

Hongjian Zhan

Palaiahnakote Shivakumara

Bing Yin

Cong Liu

Yue Lu

Expert Syst. Appl., March, 2024

Dynamic facial expression recognition with pseudo-label guided multi-modal pre-training.

IET Comput. Vis., February, 2024

Syntax-Augmented Hierarchical Interactive Encoder for Zero-Shot Cross-Lingual Information Extraction.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

SEMv2: Table separation line detection based on instance segmentation.

Pattern Recognit., 2024

NDOrder: Exploring a novel decoding order for scene text recognition.

Palaiahnakote Shivakumara

Umapada Pal

Yue Lu

Expert Syst. Appl., 2024

EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion.

CoRR, 2024

Cross-modulated Attention Transformer for RGBT Tracking.

CoRR, 2024

1DFormer: A Transformer Architecture Learning 1D Landmark Representations for Facial Landmark Tracking.

Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

ICDAR 2024 Competition on Recognition of Chemical Structures.

Proceedings of the Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Athens, Greece, August 30, 2024

NAMER: Non-autoregressive Modeling for Handwritten Mathematical Expression Recognition.

Proceedings of the Computer Vision - ECCV 2024, 2024

Image as a Language: Revisiting Scene Text Recognition via Balanced, Unified and Synchronized Vision-Language Reasoning Network.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Generative Input: Towards Next-Generation Input Methods Paradigm.

CoRR, 2023

1DFormer: Learning 1D Landmark Representations via Transformer for Facial Landmark Tracking.

CoRR, 2023

Untying the Reversal Curse via Bidirectional Language Model Editing.

CoRR, 2023

Exploring Part-Informed Visual-Language Learning for Person Re-Identification.

CoRR, 2023

MADNet: Maximizing Addressee Deduction Expectation for Multi-Party Conversation Generation.

CoRR, 2023

SHINE: Syntax-augmented Hierarchical Interactive Encoder for Zero-shot Cross-lingual Information Extraction.

CoRR, 2023

OTS: A One-shot Learning Approach for Text Spotting in Historical Manuscripts.

CoRR, 2023

SEMv2: Table Separation Line Detection Based on Conditional Convolution.

CoRR, 2023

X-Adv: Physical Adversarial Object Attacks against X-ray Prohibited Item Detection.

Proceedings of the 32nd USENIX Security Symposium, 2023

Handwritten Chemical Structure Image to Structure-Specific Markup Using Random Conditional Guided Decoder.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

JiuZhang 2.0: A Unified Chinese Pre-trained Language Model for Multi-task Mathematical Problem Solving.

Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

End-to-End Multilingual Text Recognition Based on Byte Modeling.

Proceedings of the Image and Graphics - 12th International Conference, 2023

A Multimodal Text Block Segmentation Framework for Photo Translation.

Proceedings of the Image and Graphics - 12th International Conference, 2023

Speech4Mesh: Speech-Assisted Monocular 3D Facial Reconstruction for Speech-Driven 3D Facial Animation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Self-Supervised Audio-Visual Speech Representations Learning by Multimodal Self-Distillation.

Proceedings of the IEEE International Conference on Acoustics, 2023

The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization And Recognition.

Sabato Marco Siniscalchi

Proceedings of the IEEE International Conference on Acoustics, 2023

Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge.

Sabato Marco Siniscalchi

Proceedings of the IEEE International Conference on Acoustics, 2023

MADNet: Maximizing Addressee Deduction Expectation for Multi-Party Conversation Generation.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

Bi-LRFusion: Bi-Directional LiDAR-Radar Fusion for 3D Dynamic Object Detection.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

GIFT: Graph-Induced Fine-Tuning for Multi-Party Conversation Understanding.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

HRDoc: Dataset and Baseline Method toward Hierarchical Reconstruction of Document Structures.

Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022

RCIT: An RSVP-Based Concealed Information Test Framework Using EEG Signals.

IEEE Trans. Cogn. Dev. Syst., 2022

AFA: adversarial frequency alignment for domain generalized lung nodule detection.

Neural Comput. Appl., 2022

InterHT: Knowledge Graph Embeddings by Interaction between Head and Tail Entities.

CoRR, 2022

Few-shot X-ray Prohibited Item Detection: A Benchmark and Weak-feature Enhancement Network.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

JiuZhang: A Chinese Pre-trained Language Model for Mathematical Problem Understanding.

Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

The First Multimodal Information Based Speech Processing (MISP) Challenge: Data, Tasks, Baselines And Results.

Sabato Marco Siniscalchi

Proceedings of the IEEE International Conference on Acoustics, 2022

Wider & Closer: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition.

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021

Generative domain adaptation for chest X-ray image analysis.

IET Image Process., 2021

A Deep Analysis of Speech Separation Guided Diarization Under Realistic Conditions.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2018

Fast and Robust Detection of Anatomical Landmarks Using Cascaded 3D Convolutional Networks Guided by Linear Square Regression.

Proceedings of the Biometric Recognition - 13th Chinese Conference, 2018

2017

Nonrecurrent Neural Structure for Long-Term Dependence.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

2016

Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition.

EURASIP J. Adv. Signal Process., 2016

2015

Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency.

CoRR, 2015

Multi-task deep neural network acoustic models with model adaptation using discriminative speaker identity for whisper recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments.

Proceedings of the Latent Variable Analysis and Signal Separation, 2015

A unified speaker-dependent speech separation and enhancement system based on deep neural networks.

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

2013

A cluster-based multiple deep neural networks method for large vocabulary continuous speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Investigation of deep neural networks (DNN) for large vocabulary continuous speech recognition: Why DNN surpasses GMMs in acoustic modeling.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

2011

Trust Region-Based Optimization for Maximum Mutual Information Estimation of HMMs in Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2011

2010

Phonetic clustering based confidence measure for embedded speech recognition.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

A bounded trust region optimization for discriminative training of HMMs in speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

A trust region based optimization for maximum mutual information estimation of HMMs in speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

A Constrained Line Search Optimization Method for Discriminative Training of HMMs.

IEEE Trans. Speech Audio Process., 2008

Exploiting Non-Target Region Information for Confidence Measure Based on Bayesian Information Criterion.

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

2007

A Constrained Line Search Optimization for Discriminative Training in Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2007

A constrained line search approach to general discriminative HMM training.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

A Comparative Study on Confidence Measure in Mandarin Command Word Recognition.

Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
