Xie Chen
ORCID: 0000-0001-7423-617X
Affiliations:
- Shanghai Jiao Tong University, China
- Microsoft, Redmond, WA, USA (former)
- University of Cambridge, UK (former)
According to our database, Xie Chen authored at least 110 papers between 2011 and 2024.
Bibliography
2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Why Do Speech Language Models Fail to Generate Semantically Coherent Outputs? A Modality Evolving Perspective.
CoRR, 2024
CoRR, 2024
SLAM-AAC: Enhancing Audio Captioning with Paraphrasing Augmentation and CLAP-Refine through LLMs.
CoRR, 2024
DRCap: Decoding CLAP Latents with Retrieval-augmented Generation for Zero-shot Audio Captioning.
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
CoRR, 2024
Progressive Residual Extraction based Pre-training for Speech Representation Learning.
CoRR, 2024
TacoLM: GaTed Attention Equipped Codec Language Model are Efficient Zero-Shot Text to Speech Synthesizers.
CoRR, 2024
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement.
CoRR, 2024
CoRR, 2024
CoRR, 2024
GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting.
CoRR, 2024
The X-LANCE Technical Report for Interspeech 2024 Speech Processing Using Discrete Speech Unit Challenge.
CoRR, 2024
VALL-T: Decoder-Only Generative Transducer for Robust and Decoding-Controllable Text-to-Speech.
CoRR, 2024
ELLA-V: Stable Neural Codec Language Modeling with Alignment-guided Sequence Reordering.
CoRR, 2024
1st Place Solution to Odyssey Emotion Recognition Challenge Task1: Tackling Class Imbalance Problem.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024
MER 2024: Semi-Supervised Learning, Noise Robustness, and Open-Vocabulary Multimodal Emotion Recognition.
Proceedings of the 2nd International Workshop on Multimodal and Responsible Affective Computing, 2024
AniTalker: Animate Vivid and Diverse Talking Faces through Identity-Decoupled Facial Motion Encoding.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Improving Acoustic Scene Classification via Self-Supervised and Semi-Supervised Learning with Efficient Audio Transformer.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
UniCATS: A Unified Context-Aware Text-to-Speech Framework with Contextual VQ-Diffusion and Vocoding.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
CoRR, 2023
CoRR, 2023
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer.
CoRR, 2023
Improving Code-Switching and Named Entity Recognition in ASR with Speech Editing based Data Augmentation.
CoRR, 2023
DAE-Talker: High Fidelity Speech-Driven Talking Face Generation with Diffusion Autoencoder.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Unsupervised Active Learning: Optimizing Labeling Cost-Effectiveness for Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MT4SSL: Boosting Self-Supervised Speech Representation Learning by Integrating Multiple Targets.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Improving Code-Switching and Name Entity Recognition in ASR with Speech Editing based Data Augmentation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Fast-HuBERT: An Efficient Training Framework for Self-Supervised Speech Representation Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
Exploring Effective Fusion Algorithms for Speech Based Self-Supervised Learning Models.
CoRR, 2022
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
VQTTS: High-Fidelity Text-to-Speech Synthesis with Self-Supervised VQ Acoustic Feature.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 38th International Conference on Machine Learning, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Developing Real-Time Streaming Transformer Transducer for Speech Recognition on Large-Scale Dataset.
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
LSTM-LM with Long-Term History for First-Pass Decoding in Conversational Speech Recognition.
CoRR, 2020
Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Gaussian Process LSTM Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications@NAACL-HLT 2018, 2018
2017
Investigating Bidirectional Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Exploiting the Tibetan Radicals in Recurrent Neural Network for Low-Resource Language Models.
Proceedings of the Neural Information Processing - 24th International Conference, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
2016
Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
CUED-RNNLM - An open-source toolkit for efficient training and evaluation of recurrent neural network language models.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
Recurrent neural network language model adaptation for multi-genre broadcast speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Recurrent neural network language model training with noise contrastive estimation for speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Improving the training and evaluation efficiency of recurrent neural network language models.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Investigation of back-off based interpolation between recurrent neural network and n-gram language models.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
2014
Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Impact of single-microphone dereverberation on DNN-based meeting transcription systems.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
2011
Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011