Mark J. F. Gales

Orcid: 0000-0002-5311-8219

Affiliations:
  • University of Cambridge, UK


According to our database, Mark J. F. Gales authored at least 406 papers between 1992 and 2024.

Awards

IEEE Fellow 2011, "For contributions to acoustic modeling for speech recognition".

Bibliography

2024
Multi-modal video search by examples - A video quality impact analysis.
IET Comput. Vis., October, 2024

SkillAggregation: Reference-free LLM-Dependent Aggregation.
CoRR, 2024

Finetuning LLMs for Comparative Assessment Tasks.
CoRR, 2024

ASR Error Correction using Large Language Models.
CoRR, 2024

Grammatical Error Feedback: An Implicit Evaluation Approach.
CoRR, 2024

Learn and Don't Forget: Adding a New Language to ASR Foundation Models.
CoRR, 2024

Controlling Whisper: Universal Acoustic Adversarial Attacks to Control Speech Foundation Models.
CoRR, 2024

Cross-Lingual Transfer Learning for Speech Translation.
CoRR, 2024

CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models.
CoRR, 2024

Question-Based Retrieval using Atomic Units for Enterprise RAG.
CoRR, 2024

Question Difficulty Ranking for Multiple-Choice Reading Comprehension.
CoRR, 2024

WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Investigating the Emergent Audio Classification Ability of ASR Foundation Models.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Efficient Sample-Specific Encoder Perturbations.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Short Papers, 2024

MVRMLM 2024: Multimodal Video Retrieval and Multimodal Language Modelling.
Proceedings of the 2024 International Conference on Multimedia Retrieval, 2024

Towards End-to-End Spoken Grammatical Error Correction.
Proceedings of the IEEE International Conference on Acoustics, 2024

Muting Whisper: A Universal Acoustic Adversarial Attack on Speech Foundation Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Efficient LLM Comparative Assessment: A Product of Experts Framework for Pairwise Comparisons.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LLM Task Interference: An Initial Study on the Impact of Task-Switch in Conversational History.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

LLM Comparative Assessment: Zero-shot NLG Evaluation through Pairwise Comparisons using Large Language Models.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Who Needs Decoders? Efficient Estimation of Sequence-Level Attributes with Proxies.
Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, 2024

Is It Possible to Modify Text to a Target Readability Level? An Initial Investigation Using Zero-Shot Large Language Models.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

Can GPT-4 do L2 analytic assessment?
Proceedings of the 19th Workshop on Innovative Use of NLP for Building Educational Applications, 2024

An Information-Theoretic Approach to Analyze NLP Classification Tasks.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Teacher-Student Training for Debiasing: General Permutation Debiasing for Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2024

2023
Structural-Based Uncertainty in Deep Learning Across Anatomical Scales: Analysis in White Matter Lesion Segmentation.
CoRR, 2023

Zero-shot Audio Topic Reranking using Large Language Models.
CoRR, 2023

Zero-shot NLG evaluation through Pairwise Comparisons with LLMs.
CoRR, 2023

Can Generative Large Language Models Perform ASR Error Correction?
CoRR, 2023

CamChoice: A Corpus of Multiple Choice Questions and Candidate Response Distributions.
CoRR, 2023

Sample Attackability in Natural Language Adversarial Attacks.
CoRR, 2023

Who Needs Decoders? Efficient Estimation of Sequence-level Attributes.
CoRR, 2023

Sentiment Perception Adversarial Attacks on Neural Machine Translation Systems.
CoRR, 2023

Identifying Adversarially Attackable and Robust Samples.
CoRR, 2023

Logit-based ensemble distribution distillation for robust autoregressive sequence uncertainties.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Analyzing Multiple-Choice Reading and Listening Comprehension Tests.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Automatic Assessment of Conversational Speaking Tests.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Towards Acoustic-to-Articulatory Inversion for Pronunciation Training.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Adapting an ASR Foundation Model for Spoken Language Assessment.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Annotation of L2 English Speech for Developing and Evaluating End-to-End Spoken Grammatical Error Correction.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Assessment of L2 Oral Proficiency Using Self-Supervised Speech Representation Learning.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Tackling Bias in the Dice Similarity Coefficient: Introducing NDSC for White Matter Lesion Segmentation.
Proceedings of the 20th IEEE International Symposium on Biomedical Imaging, 2023

Novel Structural-Scale Uncertainty Measures and Error Retention Curves: Application to Multiple Sclerosis.
Proceedings of the 20th IEEE International Symposium on Biomedical Imaging, 2023

Speak & Improve: L2 English Speaking Practice Tool.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adapting an Unadaptable ASR System.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

N-best T5: Robust ASR Error Correction using Multiple Input Hypotheses and Constrained Decoding Space.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-Head State Space Model for Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Minimum Bayes' Risk Decoding for System Combination of Grammatical Error Correction Systems.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

MQAG: Multiple-choice Question Answering and Generation for Assessing Information Consistency in Summarization.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023

Mitigating Word Bias in Zero-shot Prompt-based Classifiers.
Proceedings of the Findings of the Association for Computational Linguistics: IJCNLP-AACL 2023, 2023

Ensemble Prosody Prediction For Expressive Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2023

Unsupervised Multi-Hashing for Image Retrieval in Non-stationary Environments.
Proceedings of the 15th International Conference on Advanced Computational Intelligence, 2023

Assessing Distractors in Multiple-Choice Tests.
Proceedings of the 4th Workshop on Evaluation and Comparison of NLP Systems, 2023

SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

CUED at ProbSum 2023: Hierarchical Ensemble of Summarization Models.
Proceedings of the 22nd Workshop on Biomedical Natural Language Processing and BioNLP Shared Tasks, 2023

2022
Increasing Context for Estimating Confidence Scores in Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

L2 proficiency assessment using self-supervised speech representations.
CoRR, 2022

"World Knowledge" in Multiple Choice Reading Comprehension.
CoRR, 2022

Parallel Attention Forcing for Machine Translation.
CoRR, 2022

Deliberation Networks and How to Train Them.
CoRR, 2022

Multiple-Choice Question Generation: Towards an Automated Assessment Framework.
CoRR, 2022

Podcast Summary Assessment: A Resource for Evaluating Summary Assessment Methods.
CoRR, 2022

Gender Bias and Universal Substitution Adversarial Attacks on Grammatical Error Correction Systems for Automated Assessment.
CoRR, 2022

Shifts 2.0: Extending The Dataset of Real Distributional Shifts.
CoRR, 2022

Self-distribution distillation: efficient uncertainty estimation.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

University of Cambridge at TREC Cast 2022.
Proceedings of the Thirty-First Text REtrieval Conference, 2022

Residue-Based Natural Language Adversarial Attack Detection.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

View-Specific Assessment of L2 Spoken English.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Grammatical Error Correction Systems for Automated Assessment: Are They Susceptible to Universal Adversarial Attacks?
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

Analyzing Biases to Spurious Correlations in Text Classification Tasks.
Proceedings of the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, 2022

Detection of Heart Murmurs in Phonocardiograms with Parallel Hidden Semi-Markov Models.
Proceedings of the Computing in Cardiology, 2022

Answer Uncertainty and Unanswerability in Multiple-Choice Machine Reading Comprehension.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks.
CoRR, 2021

An Initial Investigation of Non-Native Spoken Question-Answering.
CoRR, 2021

Long-Span Dependencies in Transformer-based Summarization Systems.
CoRR, 2021

Attention Forcing for Machine Translation.
CoRR, 2021

Should Ensemble Members Be Calibrated?
CoRR, 2021

Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Shifts: A Dataset of Real Distributional Shift Across Multiple Large-Scale Tasks.
Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Deliberation-Based Multi-Pass Speech Synthesis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Uncertainty Estimation in Autoregressive Structured Prediction.
Proceedings of the 9th International Conference on Learning Representations, 2021

Analysing Bias in Spoken Language Assessment Using Concept Activation Vectors.
Proceedings of the IEEE International Conference on Acoustics, 2021

Efficient Use of End-to-End Data in Spoken Language Processing.
Proceedings of the IEEE International Conference on Acoustics, 2021

Ensemble Distillation Approaches for Grammatical Error Correction.
Proceedings of the IEEE International Conference on Acoustics, 2021

Sparsity and Sentence Structure in Encoder-Decoder Attention of Summarization Systems.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021

Long-Span Summarization via Local Attention and Content Selection.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Regression Prior Networks.
CoRR, 2020

Uncertainty in Structured Prediction.
CoRR, 2020

CUED_SPEECH at TREC 2020 Podcast Summarisation Track.
Proceedings of the Twenty-Ninth Text REtrieval Conference, 2020

Ensemble Approaches for Uncertainty in Spoken Language Assessment.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Universal Adversarial Attacks on Spoken Language Assessment Systems.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Abstractive Spoken Document Summarization Using Hierarchical Model with Multi-Stage Attention Diversity Optimization.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Spoken Language 'Grammatical Error Correction'.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Automatic Detection of Accent and Lexical Pronunciation Errors in Spontaneous Non-Native English Speech.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Non-Native Children's Automatic Speech Recognition: The INTERSPEECH 2020 Shared Task ALTA Systems.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Attention Forcing for Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Ensemble Distribution Distillation.
Proceedings of the 8th International Conference on Learning Representations, 2020

Confidence Estimation for Black Box Automatic Speech Recognition Systems Using Lattice Recurrent Neural Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Complementary Systems for Off-Topic Spoken Response Detection.
Proceedings of the Fifteenth Workshop on Innovative Use of NLP for Building Educational Applications, 2020

2019
General Sequence Teacher-Student Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Exploiting Future Word Contexts in Neural Network Language Models for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Non-native Speaker Verification for Spoken Language Assessment.
CoRR, 2019

Attention Forcing for Sequence-to-sequence Model Training.
CoRR, 2019

Disfluency Detection for Spoken Learner English.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Reverse KL-Divergence Training of Prior Networks: Improved Uncertainty and Adversarial Robustness.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Impact of ASR Performance on Spoken Grammatical Error Detection.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Deep Learning Approach to Automatic Characterisation of Rhythm in Non-Native English Speech.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Bi-directional Lattice Recurrent Neural Networks for Confidence Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Automatic Grammatical Error Detection of Non-native Spoken Learner English.
Proceedings of the IEEE International Conference on Acoustics, 2019

Surprise Languages: Rapid-Response Cross-Language IR.
Proceedings of the 9th International Workshop on Evaluating Information Access co-located with the 14th NTCIR Conference on the Evaluation of Information Access Technologies (NTCIR 2019), 2019

Learning Between Different Teacher and Student Models in ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Improving Interpretability and Regularization in Deep Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

A Log Domain Pulse Model for Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Towards automatic assessment of spontaneous spoken English.
Speech Commun., 2018

Prior Networks for Detection of Adversarial Attacks.
CoRR, 2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks.
CoRR, 2018

Sequence Teacher-Student Training of Acoustic Models for Automatic Free Speaking Language Assessment.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Improved Auto-Marking Confidence for Spoken Language Assessment.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Confidence Estimation and Deletion Prediction Using Bidirectional Recurrent Neural Networks.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Hierarchical RNNs for Waveform-Level Speech Synthesis.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

A Spectrally Weighted Mixture of Least Square Error and Wasserstein Discriminator Loss for Generative SPSS.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Predictive Uncertainty Estimation via Prior Networks.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Speaker Adaptation and Adaptive Training for Jointly Optimised Tandem Systems.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Waveform-Based Speaker Representations for Speech Synthesis.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Automatic Speech Recognition System Development in the "Wild".
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Deep Learning Approach to Assessing Non-native Pronunciation of English Using Phone Distances.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Impact of ASR Performance on Free Speaking Language Assessment.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Active Memory Networks for Language Modeling.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Phonetic and Graphemic Systems for Multi-Genre Broadcast Transcription.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Future Word Contexts in Neural Network Language Models.
CoRR, 2017

Low-Resource Speech Recognition and Keyword-Spotting.
Proceedings of the Speech and Computer - 19th International Conference, 2017

An attention based model for off-topic spontaneous spoken response detection: An Initial Study.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Automatic Characterisation of the Pronunciation of Non-native English Speakers using Phone Distance Features.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Deep Activation Mixture Model for Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Student-Teacher Training with Diverse Decision Tree Ensembles.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Use of Graphemic Lexicons for Spoken Language Assessment.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Investigating Bidirectional Recurrent Neural Network Language Models for Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Stimulated training for automatic speech recognition and keyword search in limited resource conditions.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Morph-to-word transduction for accurate and efficient automatic speech recognition and keyword search.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Recurrent neural network language models for keyword search.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Light Supervised Data Selection, Voice Quality Normalized Training and Log Domain Pulse Synthesis.
Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017

Multi-task ensembles with teacher-student training.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Integrated speaker-adaptive speech synthesis.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

A hierarchical attention based model for off-topic spontaneous spoken response detection.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Future word contexts in neural network language models.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Incorporating Uncertainty into Deep Learning for Spoken Language Assessment.
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016
Two Efficient Lattice Rescoring Methods Using Recurrent Neural Network Language Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Efficient Training and Evaluation of Recurrent Neural Network Language Models for Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

A Pulse Model in Log-domain for a Uniform Synthesizer.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Towards Using Conversations with Spoken Dialogue Systems in the Automated Assessment of Non-Native Speakers of English.
Proceedings of the SIGDIAL 2016 Conference, 2016

Log-Linear System Combination Using Structured Support Vector Machines.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Stimulated Deep Neural Network for Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Sequence Student-Teacher Training of Deep Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Multi-Language Neural Network Language Models.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Incorporating a Generative Front-End Layer to Deep Neural Network for Noise Robust Automatic Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

System combination with log-linear models.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Combining i-vector representation and structured neural networks for rapid adaptation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Improved DNN-based segmentation for multi-genre broadcast audio.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

CUED-RNNLM - An open-source toolkit for efficient training and evaluation of recurrent neural network language models.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Off-topic Response Detection for Spontaneous Spoken English Assessment.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

2015
Speaker and Expression Factorization for Audiobook Data: Expressiveness and Transplantation.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Environmentally robust ASR front-end for deep neural network acoustic models.
Comput. Speech Lang., 2015

Automatically grading learners' English using a Gaussian process.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015

Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Improving speech recognition and keyword search for low resource languages using web data.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Reconstructing voices within the multiple-average-voice-model framework.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

I-vector estimation using informative priors for adaptation of deep neural networks.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Annotating large lattices with the exact word error.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Recurrent neural network language model adaptation for multi-genre broadcast speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multi-basis adaptive neural network for rapid adaptation in speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

A language space representation for speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Paraphrastic recurrent neural network language models.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Unicode-based graphemic systems for limited resource languages.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Robust excitation-based features for Automatic Speech Recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving multiple-crowd-sourced transcriptions using a speech recogniser.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Recurrent neural network language model training with noise contrastive estimation for speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving the training and evaluation efficiency of recurrent neural network language models.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Cambridge university transcription systems for the multi-genre broadcast challenge.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Improving the interpretability of deep neural networks with stimulated learning.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The development of the cambridge university alignment systems for the multi-genre broadcast challenge.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Speaker diarisation and longitudinal linking in multi-genre broadcast data.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Structured discriminative models using deep neural-network features.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Multilingual representations for low resource speech recognition and keyword search.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Investigation of back-off based interpolation between recurrent neural network and n-gram language models.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The MGB challenge: Evaluating multi-genre broadcast media recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Building HMM-TTS Voices on Diverse Data.
IEEE J. Sel. Top. Signal Process., 2014

Integrated Expression Prediction and Speech Synthesis From Text.
IEEE J. Sel. Top. Signal Process., 2014

Paraphrastic language models.
Comput. Speech Lang., 2014

Speech recognition and keyword spotting for low-resource languages: Babel project research at CUED.
Proceedings of the 4th Workshop on Spoken Language Technologies for Under-resourced Languages, 2014

Noise-robust TTS speaker adaptation with statistics smoothing.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Combining tandem and hybrid systems for improved speech recognition and keyword spotting on low resource languages.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Data augmentation for low resource languages.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Speech intonation for TTS: study on evaluation methodology.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Generating multiple-accent pronunciations for TTS using joint sequence model interpolation.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Language independent and unsupervised acoustic models for speech recognition and keyword spotting.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Adaptation of deep neural network acoustic models using factorised i-vectors.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

An initial investigation of long-term adaptation for meeting transcription.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Investigation of unsupervised adaptation of DNN acoustic models with filter bank input.
Proceedings of the IEEE International Conference on Acoustics, 2014

Impact of single-microphone dereverberation on DNN-based meeting transcription systems.
Proceedings of the IEEE International Conference on Acoustics, 2014

Infinite structured support vector machines for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Cluster adaptive training of average voice models.
Proceedings of the IEEE International Conference on Acoustics, 2014

Efficient lattice rescoring using recurrent neural network language models.
Proceedings of the IEEE International Conference on Acoustics, 2014

Paraphrastic neural network language models.
Proceedings of the IEEE International Conference on Acoustics, 2014

Multiple-average-voice-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014

Speaker dependent expression predictor from text: Expressiveness and transplantation.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Structured SVMs for Automatic Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

Complex cepstrum for statistical parametric speech synthesis.
Speech Commun., 2013

Language model cross adaptation for LVCSR system combination.
Comput. Speech Lang., 2013

Use of contexts in language model interpolation and adaptation.
Comput. Speech Lang., 2013

Importance sampling to compute likelihoods of noise-corrupted speech.
Comput. Speech Lang., 2013

Noise robustness in HMM-TTS speaker adaptation.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Infinite support vector machines in speech recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

An explicit independence constraint for factorised adaptation in speech recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Photo-realistic expressive text to talking head synthesis.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Minimum mean squared error based warped complex cepstrum analysis for statistical parametric speech synthesis.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Improving lightly supervised training for broadcast transcription.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Cross-domain paraphrasing for improving language modelling using out-of-domain data.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Automatic Transcription of Multi-genre Media Archives.
Proceedings of the First Workshop on Speech, 2013

Kernelized log linear models for continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Tandem system adaptation using multiple linear feature transforms.
Proceedings of the IEEE International Conference on Acoustics, 2013

A confidence-based approach for improving keyword hypothesis scores.
Proceedings of the IEEE International Conference on Acoustics, 2013

System combination and score normalization for spoken term detection.
Proceedings of the IEEE International Conference on Acoustics, 2013

Complex cepstrum analysis based on the minimum mean squared error.
Proceedings of the IEEE International Conference on Acoustics, 2013

Paraphrastic language models and combination with neural network language models.
Proceedings of the IEEE International Conference on Acoustics, 2013

Training a supra-segmental parametric F0 model without interpolating F0.
Proceedings of the IEEE International Conference on Acoustics, 2013

A high-performance Cantonese keyword search system.
Proceedings of the IEEE International Conference on Acoustics, 2013

Efficient decoding with generative score-spaces using the expectation semiring.
Proceedings of the IEEE International Conference on Acoustics, 2013

Integrated automatic expression prediction and speech synthesis from text.
Proceedings of the IEEE International Conference on Acoustics, 2013

Investigation of multilingual deep neural networks for spoken term detection.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Product of Experts for Statistical Parametric Speech Synthesis.
IEEE Trans. Speech Audio Process., 2012

Statistical Parametric Speech Synthesis Based on Speaker and Language Factorization.
IEEE Trans. Speech Audio Process., 2012

Speaker and Noise Factorization for Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2012

Morphological decomposition in Arabic ASR systems.
Comput. Speech Lang., 2012

Transcription of multi-genre media archives using out-of-domain data.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Structured discriminative models for speech recognition.
Proceedings of the 2012 Symposium on Machine Learning in Speech and Language Processing, 2012

Model-based approaches to adaptive training in reverberant environments.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Combining multiple high quality corpora for improving HMM-TTS.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Rapid Nonlinear Speaker Adaptation for Large-Vocabulary Continuous Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Speech factorization for HMM-TTS based on cluster adaptive training.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Model-Based Approaches for Degraded Channel Modelling in Robust ASR.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Exploring Rich Expressive Information from Audiobook Data Using Cluster Adaptive Training.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Inference algorithms for generative score-spaces.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Complex cepstrum as phase information in statistical parametric speech synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Factor analysis based VTS discriminative adaptive training.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Unsupervised clustering of emotion and voice styles for expressive TTS.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Joint Uncertainty Decoding With Predictive Methods for Noise Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2011

Noisy Constrained Maximum-Likelihood Linear Regression for Noise-Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2011

Extended VTS for Noise-Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2011

Kernel Eigenvoices (Revisited) for Large-Vocabulary Speech Recognition.
IEEE Signal Process. Lett., 2011

The efficient incorporation of MLP features into automatic speech recognition systems.
Comput. Speech Lang., 2011

Structured Support Vector Machines for Noise Robust Continuous Speech Recognition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Gaussian Process Experts for Voice Conversion.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Multipulse Sequences for Residual Signal Modeling.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Improving LVCSR System Combination Using Neural Network Language Model Cross Adaptation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Graphone Model Interpolation and Arabic Pronunciation Generation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Word Boundary Modelling and Full Covariance Gaussians for Arabic Speech-to-Text Systems.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Integrated Online Speaker Clustering and Adaptation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Decision tree-based context clustering based on cross validation and hierarchical priors.
Proceedings of the IEEE International Conference on Acoustics, 2011

Speaker and noise factorisation on the AURORA4 task.
Proceedings of the IEEE International Conference on Acoustics, 2011

Structured discriminative models for noise robust continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Investigation of acoustic units for LVCSR systems.
Proceedings of the IEEE International Conference on Acoustics, 2011

Continuous F0 in the source-excitation generation for HMM-based TTS: Do we need voiced/unvoiced classification?
Proceedings of the IEEE International Conference on Acoustics, 2011

Factor analysis based VTS and JUD noise estimation and compensation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Rapid joint speaker and noise compensation for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Constrained discriminative mapping transforms for unsupervised speaker adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Extending noise robust structured support vector machines to larger vocabulary tasks.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Improving reverberant VTS for hands-free robust speech recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Derivative kernels for noise robust ASR.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

A variational perspective on noise-robust speech recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Model-Based Approaches to Handling Uncertainty.
Proceedings of the Robust Speech Recognition of Uncertain or Missing Data, 2011

2010
Structured Log Linear Models for Noise Robust Speech Recognition.
IEEE Signal Process. Lett., 2010

Unsupervised training and directed manual transcription for LVCSR.
Speech Commun., 2010

Discriminative classifiers with adaptive kernels for noise robust speech recognition.
Comput. Speech Lang., 2010

Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Improved neural network based language modelling and adaptation.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Training a parametric-based logF0 model with the minimum generation error criterion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Canonical state models for automatic speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Asymptotically exact noise-corrupted speech likelihoods.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Prior information for rapid speaker adaptation.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Lightly supervised recognition for automatic alignment of large coherent speech recordings.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Statistical parametric speech synthesis based on product of experts.
Proceedings of the IEEE International Conference on Acoustics, 2010

Recent improvements to the Cambridge Arabic Speech-to-Text systems.
Proceedings of the IEEE International Conference on Acoustics, 2010

Language model combination and adaptation using weighted finite state transducers.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Unsupervised Adaptation With Discriminative Mapping Transforms.
IEEE Trans. Speech Audio Process., 2009

Combining Derivative and Parametric Kernels for Speaker Verification.
IEEE Trans. Speech Audio Process., 2009

Directed decision trees for generating complementary systems.
Speech Commun., 2009

Efficient generation and use of MLP features for Arabic speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Variational dynamic kernels for speaker verification.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Adaptive training with noisy constrained maximum likelihood linear regression for noise robust speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Exploiting Chinese character models to improve speech recognition performance.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Incremental adaptation with VTS and joint adaptively trained systems.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Morphological analysis and decomposition for Arabic speech-to-text systems.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Transforming features to compensate speech recogniser models for noise.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Bayesian discriminative adaptation for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Training and adapting MLP features for Arabic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Combining VTS model compensation and support vector machines.
Proceedings of the IEEE International Conference on Acoustics, 2009

Incremental predictive and adaptive noise compensation.
Proceedings of the IEEE International Conference on Acoustics, 2009

Improving joint uncertainty decoding performance by predictive methods for noise robust speech recognition.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Support vector machines for noise robust ASR.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Acoustic modelling for speech recognition: Hidden Markov models and beyond?
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

Discriminative adaptive training with VTS and JUD.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Issues with uncertainty decoding for noise robust automatic speech recognition.
Speech Commun., 2008

Adaptive training using discriminative mapping transforms.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A generalised derivative kernel for speaker verification.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Context dependent language model adaptation.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Discriminative classifiers with generative kernels for noise robust ASR.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Covariance modelling for noise-robust speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Unsupervised discriminative adaptation using discriminative mapping transforms.
Proceedings of the IEEE International Conference on Acoustics, 2008

Multiple kernel learning for speaker verification.
Proceedings of the IEEE International Conference on Acoustics, 2008

Phonetic pronunciations for Arabic speech-to-text systems.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Acoustic Modelling Using Continuous Rational Kernels.
J. VLSI Signal Process., 2007

Bayesian Adaptive Inference and Adaptive Training.
IEEE Trans. Speech Audio Process., 2007

Automatic Model Complexity Control Using Marginalized Discriminative Growth Functions.
IEEE Trans. Speech Audio Process., 2007

The Application of Hidden Markov Models in Speech Recognition.
Found. Trends Signal Process., 2007

Discriminative semi-parametric trajectory model for speech recognition.
Comput. Speech Lang., 2007

Unsupervised training with directed manual transcription for recognising Mandarin broadcast audio.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Derivative and parametric kernels for speaker verification.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Building multiple complementary systems using directed decision trees.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Unsupervised Training for Mandarin Broadcast News and Conversation Transcription.
Proceedings of the IEEE International Conference on Acoustics, 2007

Improving Speech Transcription for Mandarin-English Translation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Consensus Network Decoding for Statistical Machine Translation System Combination.
Proceedings of the IEEE International Conference on Acoustics, 2007

Adaptive Training with Joint Uncertainty Decoding for Robust Recognition of Noisy Data.
Proceedings of the IEEE International Conference on Acoustics, 2007

Speech Recognition System Combination for Machine Translation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Complementary System Generation using Directed Decision Trees.
Proceedings of the IEEE International Conference on Acoustics, 2007

Discriminative language model adaptation for Mandarin broadcast speech transcription and translation.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Development of a phonetic system for large vocabulary Arabic speech recognition.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Predictive linear transforms for noise robust speech recognition.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Discriminative cluster adaptive training.
IEEE Trans. Speech Audio Process., 2006

Minimum phone error training of precision matrix models.
IEEE Trans. Speech Audio Process., 2006

Corrections to "Automatic Transcription of Conversational Telephone Speech".
IEEE Trans. Speech Audio Process., 2006

Progress in the CU-HTK broadcast news transcription system.
IEEE Trans. Speech Audio Process., 2006

Training Augmented Models Using SVMs.
IEICE Trans. Inf. Syst., 2006

Product of Gaussians for speech recognition.
Comput. Speech Lang., 2006

Discriminative adaptation for speaker verification.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Issues with uncertainty decoding for noise robust speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Generating complementary systems for speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Incremental Adaptation using Bayesian Inference.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

The CU-HTK Mandarin Broadcast News Transcription System.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Augmented Statistical Models for Speech Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Automatic transcription of conversational telephone speech.
IEEE Trans. Speech Audio Process., 2005

The Cambridge University March 2005 speaker diarisation system.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Temporally varying model parameters for large vocabulary continuous speech recognition.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Joint uncertainty decoding for noise robust speech recognition.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Adaptation of Precision Matrix Models on Large Vocabulary Continuous Speech Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Investigation of Acoustic Modeling Techniques for LVCSR Systems.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Development of the CU-HTK 2004 Broadcast News Transcription Systems.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Development of the CUHTK 2004 Mandarin Conversational Telephone Speech Transcription System.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Training LVCSR Systems on Thousands of Hours of Data.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Factor analysed hidden Markov models for speech recognition.
Comput. Speech Lang., 2004

Using VTLN for broadcast news transcription.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Adaptive training using structured transforms.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Basis superposition precision matrix modelling for large vocabulary continuous speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Rao-Blackwellised Gibbs sampling for switching linear dynamical systems.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Model complexity control and compression using discriminative growth functions.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Development of the 2003 CU-HTK conversational telephone speech transcription system.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
MMI-MAP and MPE-MAP for acoustic model adaptation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Product of Gaussians as a distributed representation for speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Discriminative map for acoustic model adaptation.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Automatic complexity control for HLDA systems.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Porting: SwitchBoard to the VoiceMail task.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Product of Gaussians and multiple stream systems.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Maximum likelihood multiple subspace projections for hidden Markov models.
IEEE Trans. Speech Audio Process., 2002

Automatic transcription of Broadcast News.
Speech Commun., 2002

Transformation streams and the HMM error model.
Comput. Speech Lang., 2002

Combining a Gaussian mixture model front end with MFCC parameters.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Using SVMs and discriminative models for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

Factor analysed hidden Markov models.
Proceedings of the IEEE International Conference on Acoustics, 2002

The HMM error model.
Proceedings of the IEEE International Conference on Acoustics, 2002

Improved cross-task recognition using MMIE training.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Speech Recognition using SVMs.
Proceedings of the Advances in Neural Information Processing Systems 14 [Neural Information Processing Systems: Natural and Synthetic], 2001

A mixture of Gaussians front end for speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Multiple-cluster adaptive training schemes.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Cluster adaptive training of hidden Markov models.
IEEE Trans. Speech Audio Process., 2000

Factored Semi-Tied Covariance Matrices.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Transcription of broadcast news with a time constraint: IBM's 10xRT HUB4 system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Rapid likelihood calculation of subspace clustered Gaussian components.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
State-based Gaussian selection in large vocabulary continuous speech recognition using HMMs.
IEEE Trans. Speech Audio Process., 1999

Semi-tied covariance matrices for hidden Markov models.
IEEE Trans. Speech Audio Process., 1999

Tail distribution modelling using the Richter and power exponential distributions.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Recent improvements to IBM's speech recognition system for automatic transcription of broadcast news.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Predictive model-based compensation schemes for robust speech recognition.
Speech Commun., 1998

Maximum likelihood linear transformations for HMM-based speech recognition.
Comput. Speech Lang., 1998

Cluster adaptive training for speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Semi-tied covariance matrices.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
A comparative study of methods for phonetic decision-tree state clustering.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Transformation smoothing for speaker and environmental adaptation.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Broadcast news transcription using HTK.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Robust continuous speech recognition using parallel model combination.
IEEE Trans. Speech Audio Process., 1996

Mean and variance adaptation within the MLLR framework.
Comput. Speech Lang., 1996

Iterative unsupervised adaptation using maximum likelihood linear regression.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Use of Gaussian selection in large vocabulary continuous speech recognition using HMMs.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Improving environmental robustness in large vocabulary speech recognition.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Robust speech recognition in additive and convolutional noise using parallel model combination.
Comput. Speech Lang., 1995

The application of parallel model combination to a large vocabulary dictation task.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

A fast and flexible implementation of parallel model combination.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Parallel model combination on a noise corrupted resource management task.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

1993
Cepstral parameter compensation for HMM recognition in noise.
Speech Commun., 1993

Segmental hidden Markov models.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

HMM recognition in noise using parallel model combination.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

1992
An improved approach to the hidden Markov model decomposition of speech and noise.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

