Xunying Liu

Orcid: 0000-0001-6725-1160

According to our database1, Xunying Liu authored at least 200 papers between 2003 and 2024.

Collaborative distances:
  • Dijkstra number2 of four.
  • Erdős number3 of four.

Timeline

Legend:

Book 
In proceedings 
Article 
PhD thesis 
Dataset
Other 

Links

Online presence:

On csauthors.net:

Bibliography

2024
Personalized Adversarial Data Augmentation for Dysarthric and Elderly Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Self-Supervised ASR Models and Features for Dysarthric and Elderly Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Disentangling Speakers in Multi-Talker Speech Recognition with Speaker-Aware CTC.
CoRR, 2024

Exploring SSL Discrete Tokens for Multilingual ASR.
CoRR, 2024

Exploring SSL Discrete Speech Features for Zipformer-based Contextual ASR.
CoRR, 2024

Large Language Model Can Transcribe Speech in Multi-Talker Scenarios with Versatile Instructions.
CoRR, 2024

Empowering Whisper as a Joint Multi-Talker and Target-Talker Speech Recognition System.
CoRR, 2024

Homogeneous Speaker Features for On-the-Fly Dysarthric and Elderly Speaker Adaptation.
CoRR, 2024

GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement.
CoRR, 2024

One-pass Multiple Conformer and Foundation Speech Systems Compression and Quantization Using An All-in-one Neural Model.
CoRR, 2024

Joint Speaker Features Learning for Audio-visual Multichannel Speech Separation and Recognition.
CoRR, 2024

Towards Effective and Efficient Non-autoregressive Decoding Using Block-based Attention Mask.
CoRR, 2024

Perceiver-Prompt: Flexible Speaker Adaptation in Whisper for Chinese Disordered Speech Recognition.
CoRR, 2024

WavLLM: Towards Robust and Adaptive Speech Large Language Model.
CoRR, 2024

Enhancing Pre-Trained ASR System Fine-Tuning for Dysarthric Speech Recognition Using Adversarial Data Augmentation.
Proceedings of the IEEE International Conference on Acoustics, 2024

Towards Automatic Data Augmentation for Disordered Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

Towards High-Performance and Low-Latency Feature-Based Speaker Adaptation of Conformer Speech Recognition Systems.
Proceedings of the IEEE International Conference on Acoustics, 2024

Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction.
Proceedings of the IEEE International Conference on Acoustics, 2024

Cross-Speaker Encoding Network for Multi-Talker Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

WavLLM: Towards Robust and Adaptive Speech Large Language Model.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Hiformer: Sequence Modeling Networks With Hierarchical Attention Mechanisms.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Audio-Visual End-to-End Multi-Channel Speech Separation, Dereverberation and Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Confidence Score Based Speaker Adaptation of Conformer Speech Recognition Systems.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Hyper-parameter Adaptation of Conformer ASR Systems for Elderly and Dysarthric Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Integrated and Enhanced Pipeline System to Support Spoken Language Analytics for Screening Neurocognitive Disorders.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Lossless 4-bit Quantization of Architecture Compressed Conformer ASR Systems on the 300-hr Switchboard Corpus.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Exploiting Cross-Domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

On-the-Fly Feature Based Rapid Speaker Adaptation for Dysarthric and Elderly Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Use of Speech Impairment Severity for Dysarthric Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Factorised Speaker-environment Adaptive Training of Conformer Speech Recognition Systems.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Towards Effective and Compact Contextual Representation for Conformer Transducer Speech Recognition Systems.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Unsupervised Model-Based Speaker Adaptation of End-To-End Lattice-Free MMI Model for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Exploiting Prompt Learning with Pre-Trained Language Models for Alzheimer's Disease Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023

A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Pretrained Representations With Task-Related Keywords for Alzheimer's Disease Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023

Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

Exploring Self-Supervised Pre-Trained ASR Models for Dysarthric and Elderly Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Bayesian Neural Network Language Modeling for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Speaker Adaptation Using Spectro-Temporal Deep Features for Dysarthric and Elderly Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

On the similarities of representations in artificial and brain neural networks for speech recognition.
Frontiers Comput. Neurosci., 2022

Disentangled Speech Representation Learning for One-Shot Cross-lingual Voice Conversion Using β-VAE.
CoRR, 2022

Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Switchboard Corpus.
CoRR, 2022

Exploiting Cross-domain And Cross-Lingual Ultrasound Tongue Imaging Features For Elderly And Dysarthric Speech Recognition.
CoRR, 2022

On-the-fly Feature Based Speaker Adaptation for Dysarthric and Elderly Speech Recognition.
CoRR, 2022

Investigation of Deep Neural Network Acoustic Modelling Approaches for Low Resource Accented Mandarin Speech Recognition.
CoRR, 2022

Disentangled Speech Representation Learning for One-Shot Cross-Lingual Voice Conversion Using ß-VAE.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Towards Green ASR: Lossless 4-bit Quantization of a Hybrid TDNN System on the 300-hr Swithboard Corpus.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Exploring linguistic feature and model combination for speech recognition based automatic AD detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Conformer Based Elderly Speech Recognition System for Alzheimer's Disease Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Context-aware Multimodal Fusion for Emotion Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Confidence Score Based Conformer Speaker Adaptation for Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Two-pass Decoding and Cross-adaptation Based System Combination of End-to-end Conformer and Hybrid TDNN ASR Systems.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multi-Channel Speaker Diarization Using Spatial Features for Meetings.
Proceedings of the IEEE International Conference on Acoustics, 2022

Mixed Precision DNN Quantization for Overlapped Speech Separation and Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Neural Architecture Search for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2022

Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2022

A Multitask Learning Framework for Speaker Change Detection with Content Information from Unsupervised Speech Decomposition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Audio-Visual Multi-Channel Speech Separation, Dereverberation and Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Audio-Visual Multi-Channel Integration and Recognition of Overlapped Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Bayesian Learning for Deep Neural Network Adaptation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Speech Emotion Recognition Using Sequential Capsule Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Exemplar-Based Emotive Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Recent Progress in the CUHK Dysarthric Speech Recognition System.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Any-to-Many Voice Conversion With Location-Relative Sequence-to-Sequence Modeling.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Bayesian Learning of LF-MMI Trained Time Delay Neural Networks for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Mixed Precision DNN Qunatization for Overlapped Speech Separation and Recognition.
CoRR, 2021

Improved End-to-End Dysarthric Speech Recognition via Meta-learning Based Model Re-initialization.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Exploring Cross-lingual Singing Voice Synthesis Using Speech Data.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Variational Auto-Encoder Based Variability Encoding for Dysarthric Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Learning Explicit Prosody Models and Deep Speaker Embeddings for Atypical Voice Conversion.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Unsupervised Domain Adaptation for Dysarthric Speech Detection via Domain Adversarial Training and Mutual Information Minimization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

VQMIVC: Vector Quantization and Mutual Information-Based Unsupervised Speech Representation Disentanglement for One-Shot Voice Conversion.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Channel-Wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Adversarial Data Augmentation for Disordered Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Spectro-Temporal Deep Features for Disordered Speech Assessment and Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Bayesian Parametric and Architectural Domain Adaptation of LF-MMI Trained TDNNs for Elderly and Dysarthric Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Joint Training Framework of Multi-Look Separator and Speaker Embedding Extractor for Overlapped Speech.
Proceedings of the IEEE International Conference on Acoustics, 2021

Development of the Cuhk Elderly Speech Recognition System for Neurocognitive Disorder Detection Using the Dementiabank Corpus.
Proceedings of the IEEE International Conference on Acoustics, 2021

Bayesian Transformer Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Mixed Precision Quantization of Transformer Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Fcl-Taco2: Towards Fast, Controllable and Lightweight Text-to-Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2021

A Comparative Study of Acoustic and Linguistic Features Classification for Alzheimer's Disease Detection.
Proceedings of the IEEE International Conference on Acoustics, 2021

Replay and Synthetic Speech Detection with Res2Net Architecture.
Proceedings of the IEEE International Conference on Acoustics, 2021

Neural Architecture Search for LF-MMI Trained Time Delay Neural Networks.
Proceedings of the IEEE International Conference on Acoustics, 2021

Understanding the wiring evolution in differentiable neural architecture search.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020
Cross-Domain Deep Visual Feature Generation for Mandarin Audio-Visual Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Neural Architecture Search for Speech Recognition.
CoRR, 2020

Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech.
CoRR, 2020

Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Speaker-Aware Linear Discriminant Analysis in Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audio-Visual Multi-Channel Recognition of Overlapped Speech.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Exploiting Cross-Domain Visual Feature Generation for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transferring Source Style in Non-Parallel Voice Conversion.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Investigation of Data Augmentation Techniques for Disordered Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Audio-Visual Recognition of Overlapped Speech for the LRS2 Dataset.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

End-To-End Voice Conversion Via Cross-Modal Knowledge Distillation for Dysarthric Speech Reconstruction.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

End-To-End Accent Conversion Without Using Native Utterances.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Adversarial Attacks on GMM I-Vector Based Speaker Verification Systems.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Code-Switched Speech Synthesis Using Bilingual Phonetic Posteriorgram with Only Monolingual Corpora.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

DSNAS: Direct Neural Architecture Search Without Parameter Retraining.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

2019
Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Fast DNN Acoustic Model Speaker Adaptation by Learning Hidden Unit Contribution Features.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Unsupervised Methods for Audio Classification from Lecture Discussion Recordings.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Exploiting Visual Features Using Bayesian Gated Neural Networks for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On the Use of Pitch Features for Disordered Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Jointly Trained Conversion Model and WaveNet Vocoder for Non-Parallel Voice Conversion Using Mel-Spectrograms and Phonetic Posteriorgrams.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Extract, Adapt and Recognize: An End-to-End Neural Network for Corrupted Monaural Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

LF-MMI Training of Bayesian and Gaussian Process Time Delay Neural Networks for Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The CUHK Dysarthric Speech Recognition Systems for English and Cantonese.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Recurrent Neural Network Language Model Training Using Natural Gradient.
Proceedings of the IEEE International Conference on Acoustics, 2019

BLHUC: Bayesian Learning of Hidden Unit Contributions for Deep Neural Network Speaker Adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speech Emotion Recognition Using Capsule Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019

CNN-RNN-CTC Based End-to-end Mispronunciation Detection and Diagnosis.
Proceedings of the IEEE International Conference on Acoustics, 2019

Gaussian Process Lstm Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Bayesian and Gaussian Process Neural Networks for Large Vocabulary Continuous Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-end Code-switched TTS with Mix of Monolingual Recordings.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
The HCCL-CUHK System for the Voice Conversion Challenge 2018.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Development of the CUHK Dysarthric Speech Recognition System for the UA Speech Corpus.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Rapid Style Adaptation Using Residual Error Embedding for Expressive Speech Synthesis.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Semi-supervised Cross-domain Visual Feature Learning for Audio-Visual Broadcast Speech Transcription.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Unsupervised Discovery of Non-native Phonetic Patterns in L2 English Speech for Mispronunciation Detection and Diagnosis.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Gaussian Process Neural Networks for Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Feature Based Adaptation for Speaking Style Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Unsupervised Discovery of an Extended Phoneme Set in L2 English Speech for Mispronunciation Detection and Diagnosis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Limited-Memory BFGS Optimization of Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Drawing-Based Automatic Dementia Screening Using Gaussian Process Markov Chains.
Proceedings of the 51st Hawaii International Conference on System Sciences, 2018

2017
Relating dynamic brain states to dynamic machine states: Human and machine solutions to the speech recognition problem.
PLoS Comput. Biol., 2017

Future Word Contexts in Neural Network Language Models.
CoRR, 2017

RNN-LDA Clustering for Feature Based DNN Adaptation.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Investigating Bidirectional Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Multi-task learning of structured output layer bidirectional LSTMS for speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Recurrent neural network language models for keyword search.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Multimodal learning using 3D audio-visual data for audio-visual speech recognition.
Proceedings of the 2017 International Conference on Asian Language Processing, 2017

2016
Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Deep Neural Network Based Acoustic-to-Articulatory Inversion Using Phone Sequence Information.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Convolutional neural network bottleneck features for bi-directional generalized variable parameter HMMs.
Proceedings of the IEEE International Conference on Information and Automation, 2016

Improved DNN-based segmentation for multi-genre broadcast audio.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

CUED-RNNLM - An open-source toolkit for efficient training and evaluation of recurrent neural network language models.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Automatic Complexity Control of Generalized Variable Parameter HMMs for Noise Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Generalized variable parameter HMMs based acoustic-to-articulatory inversion.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Efficient use of DNN bottleneck features in generalized variable parameter HMMs for noise robust speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Recurrent neural network language model adaptation for multi-genre broadcast speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Investigations of low resource multi-accent mandarin speech recognition.
Proceedings of the IEEE International Conference on Information and Automation, 2015

Paraphrastic recurrent neural network language models.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Recurrent neural network language model training with noise contrastive estimation for speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving the training and evaluation efficiency of recurrent neural network language models.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Cambridge university transcription systems for the multi-genre broadcast challenge.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The development of the cambridge university alignment systems for the multi-genre broadcast challenge.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Speaker diarisation and longitudinal linking in multi-genre broadcast data.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Investigation of back-off based interpolation between recurrent neural network and n-gram language models.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The MGB challenge: Evaluating multi-genre broadcast media recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Paraphrastic language models.
Comput. Speech Lang., 2014

Deep neural network bottleneck features for generalized variable parameter HMMs.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Efficient lattice rescoring using recurrent neural network language models.
Proceedings of the IEEE International Conference on Acoustics, 2014

Paraphrastic neural network language models.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Language model cross adaptation for LVCSR system combination.
Comput. Speech Lang., 2013

Use of contexts in language model interpolation and adaptation.
Comput. Speech Lang., 2013

Improving lightly supervised training for broadcast transcription.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Cross-domain paraphrasing for improving language modelling using out-of-domain data.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Feature space generalized variable parameter HMMs for noise robust recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Automatic Transcription of Multi-genre Media Archives.
Proceedings of the First Workshop on Speech, 2013

Paraphrastic language models and combination with neural network language models.
Proceedings of the IEEE International Conference on Acoustics, 2013

Automatic model complexity control for generalized variable parameter HMMs.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Transcription of multi-genre media archives using out-of-domain data.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Structured modeling based on generalized variable parameter HMMs and speaker adaptation.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

2011
A flexible framework for HMM based noise robust speech recognition using generalized parametric space polynomial regression.
Sci. China Inf. Sci., 2011

Improving LVCSR System Combination Using Neural Network Language Model Cross Adaptation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Word Boundary Modelling and Full Covariance Gaussians for Arabic Speech-to-Text Systems.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Generalized Variable Parameter HMMs for Noise Robust Speech Recognition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Investigation of acoustic units for LVCSR systems.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Improved neural network based language modelling and adaptation.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Language model combination and adaptation usingweighted finite state transducers.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Exploiting Chinese character models to improve speech recognition performance.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008
Context dependent language model adaptation.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2007
Automatic Model Complexity Control Using Marginalized Discriminative Growth Functions.
IEEE Trans. Speech Audio Process., 2007

Improving Speech Transcription for Mandarin-English Translation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Speech Recognition System Combination for Machine Translation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Discriminative language model adaptation for Mandarin broadcast speech transcription and translation.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Corrections to "Automatic Transcription of Conversational Telephone Speech".
IEEE Trans. Speech Audio Process., 2006

The Cu-Htk Mandarin Broadcast News Transcription System.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Automatic transcription of conversational telephone speech.
IEEE Trans. Speech Audio Process., 2005

Investigation of Acoustic Modeling Techniques for LVCSR Systems.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Development of the CUHTK 2004 Mandarin Conversational Telephone Speech Transcription System.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Model complexity control and compression using discriminative growth functions.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Development of the 2003 CU-HTK conversational telephone speech transcription system.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Automatic complexity control for HLDA systems.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003


  Loading...