Steve Renals

Orcid: 0000-0002-8790-3389

Affiliations:
  • University of Edinburgh, UK


According to our database, Steve Renals authored at least 271 papers between 1987 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Awards

IEEE Fellow 2014, "For contributions to speech recognition technology and its use in spoken language processing".

Bibliography

2024
Phonetic Error Analysis of Raw Waveform Acoustic Models with Parametric and Non-Parametric CNNs.
CoRR, 2024

2023
Multi-Stream Acoustic Modelling Using Raw Real and Imaginary Parts of the Fourier Transform.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Phonetic Error Analysis Beyond Phone Error Rate.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

2022
Towards Robust Waveform-Based Acoustic Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Investigating the contribution of speaker attributes to speaker separability using disentangled speaker representations.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Exploiting ultrasound tongue imaging for the automatic detection of speech articulation errors.
Speech Commun., 2021

Automatic audiovisual synchronisation for ultrasound tongue imaging.
Speech Commun., 2021

On The Usefulness of Self-Attention for Automatic Speech Recognition with Transformers.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

TaL: A Synchronised Multi-Speaker Corpus of Ultrasound Tongue Imaging, Audio, and Lip Videos.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Stochastic Attention Head Removal: A Simple and Effective Method for Improving Transformer Based ASR Models.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Silent versus Modal Multi-Speaker Speech Recognition from Ultrasound and Video.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Leveraging Speaker Attribute Information Using Multi Task Learning for Speaker Verification and Diarization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Speech Acoustic Modelling Using Raw Source and Filter Components.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Train Your Classifier First: Cascade Neural Networks Training from Upper Layers to Lower Layers.
Proceedings of the IEEE International Conference on Acoustics, 2021

Speech Acoustic Modelling from Raw Phase Spectrum.
Proceedings of the IEEE International Conference on Acoustics, 2021

Leveraging Linguistic Knowledge for Accent Robustness of End-to-End Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Stochastic Attention Head Removal: A Simple and Effective Method for Improving Automatic Speech Recognition with Transformers.
CoRR, 2020

Adaptation Algorithms for Speech Recognition: An Overview.
CoRR, 2020

When Can Self-Attention Be Replaced by Feed Forward Layers?
CoRR, 2020

DropClass and DropAdapt: Dropping classes for deep speaker representation learning.
CoRR, 2020

Dropping Classes for Deep Speaker Representation Learning.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

A Deep 2D Convolutional Network for Waveform-Based Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Raw Sign and Magnitude Spectra for Multi-Head Acoustic Modelling.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

On the Robustness and Training Dynamics of Raw Waveform Models.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Deep Scattering Power Spectrum Features for Robust Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Word Error Rate Estimation Without ASR Output: e-WER2.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Learning Noise Invariant Features Through Transfer Learning For Robust End-to-End Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Multi-Scale Octave Convolutions for Robust Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Channel Adversarial Training for Speaker Verification and Diarization.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Cross Lingual Transfer Learning for Zero-Resource Domain Adaptation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Lattice-Based Unsupervised Test-Time Adaptation of Neural Network Acoustic Models.
CoRR, 2019

Dynamic Evaluation of Transformer Language Models.
CoRR, 2019

Trainable Dynamic Subsampling for End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Ultrasound Tongue Imaging for Diarization and Alignment of Child Speech Therapy Sessions.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

On Learning Interpretable CNNs with Parametric Modulated Kernel-Based Filters.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Lattice-Based Lightly-Supervised Acoustic Model Training.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Synchronising Audio and Ultrasound by Learning Cross-Modal Embeddings.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Untranscribed Web Audio for Low Resource Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Windowed Attention Mechanisms for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Speaker-independent Classification of Phonetic Segments from Raw Ultrasound in Child Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019

On the Usefulness of Statistical Normalisation of Bottleneck Features for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Embeddings for DNN Speaker Adaptive Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Speaker Adaptive Training Using Model Agnostic Meta-Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Acoustic Model Adaptation from Raw Waveforms with SincNet.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

The MGB-5 Challenge: Recognition and Dialect Identification of Dialectal Arabic Speech.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Analyzing Deep CNN-Based Utterance Embeddings for Acoustic Model Adaptation.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Dynamic Evaluation of Neural Sequence Models.
Proceedings of the 35th International Conference on Machine Learning, 2018

Word Error Rate Estimation for Speech Recognition: e-WER.
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017
Small-Footprint Highway Deep Neural Networks for Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Multitask Learning of Context-Dependent Targets in Deep Neural Network Acoustic Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

End-to-End Neural Segmental Models for Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2017

Hierarchical Recurrent Neural Network for Story Segmentation.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Factorised Representations for Neural Network Adaptation to Diverse Acoustic Environments.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Multiplicative LSTM for sequence modelling.
Proceedings of the 5th International Conference on Learning Representations, 2017

Knowledge distillation for small-footprint highway networks.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Sequence-to-sequence models for punctuated transcription combining lexical and acoustic features.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Hierarchical recurrent neural network for story segmentation using fusion of lexical and acoustic features.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Simplifying very deep convolutional neural network architectures for robust speech recognition.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Speech recognition challenge in the wild: Arabic MGB-3.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

WERD: Using social text spelling variants for evaluating dialectal speech recognition.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

User Generated Dialogue Systems: uDialogue.
Proceedings of the Human-Harmonized Information Technology, Volume 2, 2017

Distant Speech Recognition Experiments Using the AMI Corpus.
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016
Differentiable Pooling for Unsupervised Acoustic Model Adaptation.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Learning Hidden Unit Contributions for Unsupervised Acoustic Model Adaptation.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Multi-view Dimensionality Reduction for Dialect Identification of Arabic Broadcast Speech.
CoRR, 2016

Punctuated transcription of multi-genre broadcasts using acoustic and lexical approaches.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

The MGB-2 challenge: Arabic multi-dialect broadcast media recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Character-Level Neural Translation for Multilingual Media Monitoring in the SUMMA Project.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Small-Footprint Deep Neural Networks with Highway Connections for Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Segmental Recurrent Neural Networks for End-to-End Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Unsupervised Adaptation of Recurrent Neural Network Language Models.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Improving Children's Speech Recognition Through Out-of-Domain Data Augmentation.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Automatic Dialect Detection in Arabic Broadcast Speech.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

SAT-LHUC: Speaker adaptive training for learning hidden unit contributions.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

On training the recurrent neural network encoder-decoder for large vocabulary end-to-end speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Automatic Dialect Detection in Arabic Broadcast Speech.
CoRR, 2015

Open Challenges in Modelling, Analysis and Synthesis of Human Behaviour in Human-Human and Human-Machine Interactions.
Cogn. Comput., 2015

Multi-Reference Evaluation for Dialectal Speech Recognition System: A Study for Egyptian ASR.
Proceedings of the Second Workshop on Arabic Natural Language Processing, 2015

A study of speaker adaptation for DNN-based speech synthesis.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Structured output layer with auxiliary targets for context-dependent acoustic modelling.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A study of the recurrent neural network encoder-decoder for large vocabulary speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Feature-space speaker adaptation for probabilistic linear discriminant analysis acoustic models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Prosodically-enhanced recurrent neural network language models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Complementary tasks for context-dependent deep neural network acoustic models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Modelling acoustic feature dependencies with artificial neural networks: Trajectory-RNADE.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Differentiable pooling for unsupervised speaker adaptation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Multi-frame factorisation for long-span acoustic modelling.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Regularization of context-dependent deep neural networks with context-independent multi-task training.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

The MGB challenge: Evaluating multi-genre broadcast media recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Multi-reference WER for evaluating ASR for languages with no orthographic rules.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

A system for automatic alignment of broadcast media captions using weighted finite-state transducers.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Cross-Lingual Subspace Gaussian Mixture Models for Low-Resource Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Editorial: Expanding the Technical Reach of our Transactions.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Convolutional Neural Networks for Distant Speech Recognition.
IEEE Signal Process. Lett., 2014

Probabilistic Linear Discriminant Analysis for Acoustic Modeling.
IEEE Signal Process. Lett., 2014

Glottal Spectral Separation for Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014

Tied Probabilistic Linear Discriminant Analysis for Speech Recognition.
CoRR, 2014

Learning hidden unit contributions for unsupervised speaker adaptation of neural network acoustic models.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

The UEDIN ASR systems for the IWSLT 2014 evaluation.
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2014, 2014

Probabilistic linear discriminant analysis with bottleneck features for speech recognition.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Incorporating lexical and prosodic information at different levels for meeting summarization.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Feed forward pre-training for recurrent neural network language models.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Automated production of true-cased punctuated subtitles for weather and news broadcasts.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Cross-lingual adaptation with multi-task adaptive networks.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

ROCKIT: Roadmap for Conversational Interaction Technologies.
Proceedings of the 2014 Workshop on Roadmapping the Future of Multimodal Interaction Research including Business Opportunities and Challenges, 2014

Neural net word representations for phrase-break prediction without a part of speech tagger.
Proceedings of the IEEE International Conference on Acoustics, 2014

Neural networks for distant speech recognition.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

2013
Editorial.
ACM Trans. Speech Lang. Process., 2013

Joint Uncertainty Decoding for Noise Robust Subspace Gaussian Mixture Models.
IEEE Trans. Speech Audio Process., 2013

Recording speech articulation in dialogue: Evaluating a synchronized double electromagnetic articulography setup.
J. Phonetics, 2013

Description of the UEDIN system for German ASR.
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013

The UEDIN English ASR system for the IWSLT 2013 evaluation.
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013

Noise adaptive training for subspace Gaussian mixture models.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Automatic Transcription of Multi-genre Media Archives.
Proceedings of the First Workshop on Speech, 2013

Detecting summarization hot spots in meetings using group level involvement and turn-taking features.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Processing and Linking Audio Events in Large Multimedia Archives: The EU inEvent Project.
Proceedings of the First Workshop on Speech, 2013

A lecture transcription system combining neural network acoustic and language models.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Recognition of overlapping speech using digital MEMS microphone arrays.
Proceedings of the IEEE International Conference on Acoustics, 2013

Revisiting hybrid and GMM-HMM system combination techniques.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multilingual training of deep neural networks.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multi-level adaptive networks in tandem and hybrid ASR systems.
Proceedings of the IEEE International Conference on Acoustics, 2013

Hybrid acoustic models for distant and multichannel large vocabulary speech recognition.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

Acoustic data-driven pronunciation lexicon for large vocabulary speech recognition.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

Lightly supervised automatic subtitling of weather forecasts.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Special issue on searching speech.
ACM Trans. Inf. Syst., 2012

Unsupervised cross-lingual knowledge transfer in DNN-based LVCSR.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Transcription of multi-genre media archives using out-of-domain data.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

The UEDIN systems for the IWSLT 2012 evaluation.
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012

Deep Architectures for Articulatory Inversion.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Ultrax: An Animated Midsagittal Vocal Tract Display for Speech Therapy.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Joint uncertainty decoding with unscented transform for noise robust subspace Gaussian mixture models.
Proceedings of the ISCA Workshop on Statistical And Perceptual Audition, 2012

Noise Compensation for Subspace Gaussian Mixture Models.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Determining the number of speakers in a meeting using microphone array features.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

On the effect of SNR and superdirective beamforming in speaker diarisation in meetings.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Maximum a posteriori adaptation of subspace Gaussian mixture models for cross-lingual speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Regularized Subspace Gaussian Mixture Models for Speech Recognition.
IEEE Signal Process. Lett., 2011

HMM-based speech synthesiser using the LF-model of the glottal source.
Proceedings of the IEEE International Conference on Acoustics, 2011

Regularized subspace Gaussian mixture models for cross-lingual speech recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Hierarchical Bayesian Language Models for Conversational Speech Recognition.
IEEE Trans. Speech Audio Process., 2010

Ageing Voices: The Effect of Changes in Voice Parameters on ASR Performance.
EURASIP J. Audio Speech Music. Process., 2010

Evaluation of a hierarchical reinforcement learning spoken dialogue system.
Comput. Speech Lang., 2010

Evaluating speech synthesis intelligibility using Amazon Mechanical Turk.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

An HMM-based speech synthesiser using glottal post-filtering.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Invited Talk: Recognition and Understanding of Meetings.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, 2010

The ambient spotlight: queryless desktop search from meeting speech.
Proceedings of the 2010 International Workshop on Searching Spontaneous Conversational Speech, 2010

Augmentation of adaptation data.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

The Ambient Spotlight: personal multimodal search without query.
Proceedings of the 12th International Conference on Multimodal Interfaces / 7. International Workshop on Machine Learning for Multimodal Interaction, 2010

A digital microphone array for distant speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

Power law discounting for n-gram language models.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Extrinsic summarization evaluation: A decision audit task.
ACM Trans. Speech Lang. Process., 2009

Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis.
IEEE Trans. Speech Audio Process., 2009

Speech Recognition Using Augmented Conditional Random Fields.
IEEE Trans. Speech Audio Process., 2009

Age recognition for spoken dialogue systems: do we need it?
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A parallel training algorithm for hierarchical Pitman-Yor process language models.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Speech Input from Older Users in Smart Environments: Challenges and Perspectives.
Proceedings of the Universal Access in Human-Computer Interaction. Intelligent and Ubiquitous Interaction Environments, 2009

2008
Combining Spectral Representations for Large-Vocabulary Continuous Speech Recognition.
IEEE Trans. Speech Audio Process., 2008

Recognition of Dialogue Acts in Multiparty Meetings Using a Switching DBN.
IEEE Trans. Speech Audio Process., 2008

A Cascaded Broadcast News Highlighter.
IEEE Trans. Speech Audio Process., 2008

Acoustic-Articulatory Modeling With the Trajectory HMM.
IEEE Signal Process. Lett., 2008

Meta Comments for Summarizing Meeting Speech.
Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008

Detecting Action Items in Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008

Modeling Topic and Role Information in Meetings Using the Hierarchical Dirichlet Process.
Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008

Longitudinal study of ASR performance on ageing voices.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Predicting tongue shapes from a few landmark locations.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Unsupervised language model adaptation based on topic and role information in multiparty meetings.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Pitch adaptive features for LVCSR.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Glottal spectral separation for parametric speech synthesis.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2007
Automatic Meeting Segmentation Using Dynamic Bayesian Networks.
IEEE Trans. Multim., 2007

Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Towards an improved modeling of the glottal source in statistical parametric speech synthesis.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Automatic Segmentation and Summarization of Meeting Speech.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Term-Weighting for Summarization of Multi-party Spoken Dialogues.
Proceedings of the Machine Learning for Multimodal Interaction, 2007

Using Prosodic Features in Language Models for Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 2007

Towards online speech summarization.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Hierarchical dialogue optimization using semi-Markov decision processes.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

DBN Based Joint Dialogue Act Recognition of Multiparty Meetings.
Proceedings of the IEEE International Conference on Acoustics, 2007

Recognition and understanding of meetings: the AMI and AMIDA projects.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Hierarchical Pitman-Yor language models for ASR in meetings.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Reinforcement Learning of Dialogue Strategies with Hierarchical Abstract Machines.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

Incorporating Speaker and Discourse Features into Speech Summarization.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

Multistream Recognition of Dialogue Acts in Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

Phone recognition analysis for trajectory HMM.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Dialogue act compression via pitch contour preservation.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Learning multi-goal dialogue strategies using reinforcement learning with reduced state-action spaces.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Automatic Segmentation of Multiparty Dialogue.
Proceedings of the EACL 2006, 2006

2005
Automatic summarization of voicemail messages using lexical and prosodic features.
ACM Trans. Speech Lang. Process., 2005

Speech and crosstalk detection in multichannel audio.
IEEE Trans. Speech Audio Process., 2005

Speaker verification using sequence discriminant support vector machines.
IEEE Trans. Speech Audio Process., 2005

Content-based access to spoken audio.
IEEE Signal Process. Mag., 2005

Accessing the spoken word.
Int. J. Digit. Libr., 2005

The Development of the AMI System for the Transcription of Speech in Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

The 2005 AMI System for the Transcription of Speech in Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Multimodal Integration for Meeting Group Action Segmentation and Recognition.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Extractive summarization of meeting recordings.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

A hybrid Maxent/HMM based ASR system.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Transcription of conference room meetings: an investigation.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Applying vocal tract length normalization to meeting recordings.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Maximum entropy segmentation of broadcast news.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Evaluating Automatic Summaries of Meeting Recordings.
Proceedings of the Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization@ACL 2005, 2005

2004
Multi-stream segmentation of meetings.
Proceedings of the IEEE 6th Workshop on Multimedia Signal Processing, 2004

Multistream Dynamic Bayesian Network for Meeting Segmentation.
Proceedings of the Machine Learning for Multimodal Interaction, 2004

Dynamic Bayesian networks for meeting structuring.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Acoustic space dimensionality selection and combination using the maximum entropy principle.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

From Text Summarisation to Style-Specific Summarisation for Broadcast News.
Proceedings of the Advances in Information Retrieval, 2004

2003
Feature selection for the classification of crosstalk in multi-channel audio.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Multi-class extractive voicemail summarization.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

SVMSVM: support vector machine speaker verification methodology.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Audio information access from meeting rooms.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Connectionist speech recognition of Broadcast News.
Speech Commun., 2002

Evaluation of kernel methods for speaker verification and identification.
Proceedings of the IEEE International Conference on Acoustics, 2002

ASR system modeling for automatic evaluation and optimization of dialogue systems.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Extractive summarization of voicemail using lexical and prosodic feature subset selection.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

An Advanced Integrated Architecture for Wireless Voicemail Data Retrieval.
Proceedings of the 15th International Conference on Information Networking, 2001

2000
Accessing information in spoken audio.
Speech Commun., 2000

Indexing and retrieval of broadcast news.
Speech Commun., 2000

Practical Identifiability of Finite Mixtures of Multivariate Bernoulli Distributions.
Neural Comput., 2000

Information Extraction from Broadcast News.
CoRR, 2000

The THISL SDR System at TREC-9.
Proceedings of The Ninth Text REtrieval Conference, 2000

Transcription and summarization of voicemail speech.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Variable word rate N-grams.
Proceedings of the IEEE International Conference on Acoustics, 2000

Text- and Speech-Triggered Information Access: Introduction.
Proceedings of the Text- and Speech-Triggered Information Access, 2000

Statistical Language Modelling.
Proceedings of the Text- and Speech-Triggered Information Access, 2000

1999
Start-synchronous search for large vocabulary continuous speech recognition.
IEEE Trans. Speech Audio Process., 1999

Topic-based mixture language modelling.
Nat. Lang. Eng., 1999

Confidence measures from local posterior probability estimates.
Comput. Speech Lang., 1999

The THISL SDR System At TREC-8.
Proceedings of The Eighth Text REtrieval Conference, 1999

The THISL system for indexing and retrieval of broadcast news.
Proceedings of the Third IEEE Workshop on Multimedia Signal Processing, 1999

Recognition, indexing and retrieval of British broadcast news with the THISL system.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Integrated transcription and identification of named entities in broadcast speech.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

The THISL Spoken Document Retrieval Project.
Proceedings of the IEEE International Conference on Multimedia Computing and Systems, 1999

Named entity tagged language models.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Dimensionality reduction of electropalatographic data using latent variable models.
Speech Commun., 1998

Retrieval Of Broadcast News Documents With the THISL System.
Proceedings of The Seventh Text REtrieval Conference, 1998

Confidence measures derived from an acceptor HMM.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Acoustic confidence measures for segmenting broadcast news.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Retrieval of broadcast news documents with the THISL system.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
The THISL Spoken Document Retrieval System.
Proceedings of The Sixth Text REtrieval Conference, 1997

Confidence measures for hybrid HMM/ANN speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Estimation of global posteriors and forward-backward training of hybrid HMM/ANN systems.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Document space models using latent semantic analysis.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1996
Phone deactivation pruning in large vocabulary continuous speech recognition.
IEEE Signal Process. Lett., 1996

The 1995 ABBOT LVCSR system for multiple unknown microphones.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Efficient evaluation of the LVCSR search space using the NOWAY decoder.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Speaker-adaptation for hybrid HMM-ANN continuous speech recognition system.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

WSJCAM0: a British English speech corpus for large vocabulary continuous speech recognition.
Proceedings of the 1995 International Conference on Acoustics, 1995

Efficient search using posterior phone probability estimates.
Proceedings of the 1995 International Conference on Acoustics, 1995

Recent improvements to the ABBOT large vocabulary CSR system.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Connectionist probability estimators in HMM speech recognition.
IEEE Trans. Speech Audio Process., 1994

Using gamma filters to model temporal dependencies in speech.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Large vocabulary continuous speech recognition using a hybrid connectionist-HMM system.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

IPA: improved phone modelling with recurrent neural networks.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
Hybrid Neural Network/Hidden Markov Model Systems for Continuous Speech Recognition.
Int. J. Pattern Recognit. Artif. Intell., 1993

Learning Temporal Dependencies in Connectionist Speech Recognition.
Proceedings of the Advances in Neural Information Processing Systems 6, 1993

A neural network based, speaker independent, large vocabulary, continuous speech recognition system: the WERNICKE project.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

Bayesian regularisation methods in a hybrid MLP-HMM system.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

1992
Neural nets and hidden Markov models: Review and generalizations.
Speech Commun., 1992

Connectionist probability estimation in the DECIPHER speech recognition system.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

CDNN: a context dependent neural network for continuous speech recognition.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Connectionist Optimisation of Tied Mixture Hidden Markov Models.
Proceedings of the Advances in Neural Information Processing Systems 4, 1991

A comparative study of continuous speech recognition using neural networks and hidden Markov models.
Proceedings of the 1991 International Conference on Acoustics, 1991

1990
Speech and neural network dynamics.
PhD thesis, 1990

Chaos in Neural Networks.
Proceedings of the Neural Networks, 1990

1989
Analysis of a neural network model for speech recognition.
Proceedings of the First European Conference on Speech Communication and Technology, 1989

Learning phoneme recognition using neural networks.
Proceedings of the IEEE International Conference on Acoustics, 1989

1988
A connectionist approach to speech recognition using peripheral auditory modelling.
Proceedings of the IEEE International Conference on Acoustics, 1988

Unstable connectionist networks in speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 1988

1987
Automatic speech recognition using peripheral auditory modelling and a PDP approach to classification.
Proceedings of the European Conference on Speech Technology, 1987

