Andreas Stolcke

Orcid: 0000-0002-9925-905X

Affiliations:
  • Microsoft Research, Mountain View, CA, USA


According to our database, Andreas Stolcke authored at least 239 papers between 1989 and 2024.

Awards

IEEE Fellow 2011, "For contributions to statistical language modeling, automatic speech recognition and understanding, and automatic speaker recognition".

Bibliography

2024
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition.
CoRR, 2024

Investigating Training Strategies and Model Robustness of Low-Rank Adaptation for Language Modeling in Speech Recognition.
CoRR, 2024

Turn-Taking and Backchannel Prediction with Acoustic and Large Language Model Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2024

Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue.
Proceedings of the IEEE International Conference on Acoustics, 2024

Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models.
Proceedings of the IEEE International Conference on Acoustics, 2024

Towards ASR Robust Spoken Language Understanding Through in-Context Learning with Word Confusion Networks.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Streaming Speech-to-Confusion Network Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Learning When to Trust Which Teacher for Weakly Supervised ASR.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Cross-Utterance ASR Rescoring with Graph-Based Label Propagation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Procter: Pronunciation-Aware Contextual Adapter For Personalized Speech Recognition In Neural Transducers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Adaptive Endpointing with Deep Contextual Multi-Armed Bandits.
Proceedings of the IEEE International Conference on Acoustics, 2023

Low-Rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Generative Speech Recognition Error Correction With Large Language Models and Task-Activating Prompting.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Stutter-TTS: Controlled Synthesis and Improved Recognition of Stuttered Speech.
CoRR, 2022

An Experimental Study on Private Aggregation of Teacher Ensemble Learning for End-to-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Adversarial Reweighting for Speaker Verification Fairness.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Toward Fairness in Speech Recognition: Discovery and mitigation of performance disparities.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Graph-based Multi-View Fusion and Local Adaptation: Mitigating Within-Household Confusability for Speaker Identification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Contrastive-mixup Learning for Improved Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2022

Mitigating Closed-Model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

RescoreBERT: Discriminative Speech Recognition Rescoring With BERT.
Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Fairness in Speaker Verification via Group-Adapted Fusion Network.
Proceedings of the IEEE International Conference on Acoustics, 2022

OpenFEAT: Improving Speaker Identification by Open-Set Few-Shot Embedding Adaptation with Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2022

ASR-Aware End-to-End Neural Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2022

Self-Supervised Speaker Recognition Training using Human-Machine Dialogues.
Proceedings of the IEEE International Conference on Acoustics, 2022

CUE Vectors: Modular Training of Language Models Conditioned on Diverse Contextual Signals.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
DOVER-Lap: A Method for Combining Overlap-Aware Diarization Outputs.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

wav2vec-C: A Self-Supervised Model for Speech Representation Learning.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Listen with Intent: Improving Speech Recognition with Audio-to-Intent Front-End.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-to-End Neural Diarization: From Transformer to Conformer.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Fusion of Embeddings Networks for Robust Combination of Text Dependent and Independent Speaker Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Graph-Based Label Propagation for Semi-Supervised Speaker Identification.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Do as I Mean, Not as I Say: Sequence Loss Training for Spoken Language Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2021

Joint ASR and Language Identification Using RNN-T: An Efficient Approach to Dynamic Language Switching.
Proceedings of the IEEE International Conference on Acoustics, 2021

Contrastive Unsupervised Learning for Speech Emotion Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

REDAT: Accent-Invariant Representation for End-To-End ASR by Domain Adversarial Training with Relabeling.
Proceedings of the IEEE International Conference on Acoustics, 2021

BW-EDA-EEND: Streaming End-to-End Neural Speaker Diarization for a Variable Number of Speakers.
Proceedings of the IEEE International Conference on Acoustics, 2021

Personalization Strategies for End-to-End Speech Recognition Systems.
Proceedings of the IEEE International Conference on Acoustics, 2021

Improving Speaker Identification for Shared Devices by Adapting Embeddings to Speaker Subsets.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Attention-based Contextual Language Model Adaptation for Speech Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: ACL/IJCNLP 2021, 2021

2020
Improving Diarization Robustness using Diversification, Randomization and the DOVER Algorithm.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Speaker Identification for Household Scenarios with Self-Attention and Adversarial Training.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Efficient Minimum Word Error Rate Training of RNN-Transducer for End-to-End Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Combining Acoustics, Content and Interaction Features to Find Hot Spots in Meetings.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Meeting Transcription Using Virtual Microphone Arrays.
CoRR, 2019

Meeting Transcription Using Asynchronous Distant Microphones.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Acoustic and Lexical Sentiment Analysis for Customer Service Calls.
Proceedings of the IEEE International Conference on Acoustics, 2019

Dover: A Method for Combining Diarization Outputs.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Mispronunciation Detection in Children's Reading of Sentences.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

The Microsoft 2017 Conversational Speech Recognition System.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Session-level Language Modeling for Conversational Speech.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

2017
Toward Human Parity in Conversational Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Automatic evaluation of reading aloud performance in children.
Speech Commun., 2017

Comparing Human and Machine Errors in Conversational Speech Transcription.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Automatic Evaluation of Children Reading Aloud on Sentences and Pseudowords.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Detection of Mispronunciations and Disfluencies in Children Reading Aloud.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Advances in all-neural speech recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

The Microsoft 2016 conversational speech recognition system.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Robust and Efficient Multiple Alignment of Unsynchronized Meeting Recordings.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Achieving Human Parity in Conversational Speech Recognition.
CoRR, 2016

Design and Analysis of a Database to Evaluate Children's Reading Aloud Performance.
Proceedings of the Computational Processing of the Portuguese Language, 2016

Deep Convolutional Neural Networks with Layer-Wise Context Expansion and Attention.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A comparative study of recurrent neural network models for lexical domain classification.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
A Study of Multimodal Addressee Detection in Human-Human-Computer Interaction.
IEEE Trans. Multim., 2015

A comparison of neural network feature transforms for speaker diarization.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Aligning meeting recordings via adaptive fingerprinting.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Recurrent neural network and LSTM models for lexical utterance classification.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Personalization of word-phrase-entity language models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multimodal addressee detection in multiparty dialogue systems.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Token-level interpolation for class-based language models.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

A comparative study of neural network models for lexical intent classification.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Deep bi-directional recurrent networks over spectral windows.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Artificial neural network features for speaker diarization.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Neural network models for lexical addressee detection.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Word-phrase-entity language models: getting more mileage out of n-grams.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

The Relation of Eye Gaze and Face Pose: Potential Impact on Speech Recognition.
Proceedings of the 16th International Conference on Multimodal Interaction, 2014

Highly accurate phonetic segmentation using boundary correction models and system fusion.
Proceedings of the IEEE International Conference on Acoustics, 2014

Gaze-enhanced speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
A Cross-language Study on Automatic Speech Disfluency Detection.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association for Computational Linguistics, 2013

Using Out-of-Domain Data for Lexical Addressee Detection in Human-Human-Computer Dialog.
Proceedings of the Human Language Technologies: Conference of the North American Chapter of the Association for Computational Linguistics, 2013

Automatic phonetic segmentation using boundary models.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Addressee detection for dialog systems using temporal and spectral dimensions of speaking style.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Articulatory trajectories for large-vocabulary speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Using multiple versions of speech input in phone recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Language Modeling of Nonverbal Vocalizations in Spontaneous Speech.
Proceedings of the Text, Speech and Dialogue - 15th International Conference, 2012

Effects of audio and ASR quality on cepstral and high-level speaker verification systems.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Learning When to Listen: Detecting System-Addressed Speech in Human-Human-Computer Dialog.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

ProTK: An Improved Prosody Toolkit.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Speaker recognition with region-constrained MLLR transforms.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Constrained Cepstral Speaker Recognition Using Matched UBM and JFA Training.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Effective Arabic Dialect Classification Using Diverse Phonotactic Models.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Making the most from multiple microphones in meeting recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Language-independent constrained cepstral features for speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

The SRI NIST 2010 speaker recognition evaluation system.
Proceedings of the IEEE International Conference on Acoustics, 2011

Bird species recognition combining acoustic and sequence modeling.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
The CALO Meeting Assistant System.
IEEE Trans. Speech Audio Process., 2010

Unsupervised domain adaptation with multiple acoustic models.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Improving Language Recognition with Multilingual Phone Recognition and Speaker Adaptation Transforms.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

Leveraging speaker diarization for meeting recognition from distant microphones.
Proceedings of the IEEE International Conference on Acoustics, 2010

Acoustic front-end optimization for bird species recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Improving robustness of MLLR adaptation with speaker-clustered regression class trees.
Comput. Speech Lang., 2009

Multifactor adaptation for Mandarin broadcast news and conversation speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Development of the 2008 SRI Mandarin speech-to-text system for broadcast news and conversation.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Feature-based and channel-based analyses of intrinsic variability in speaker verification.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Exploiting user feedback for language model adaptation in meeting recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Data-driven lexicon expansion for Mandarin broadcast news and conversation speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

The SRI NIST 2008 speaker recognition evaluation system.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Efficient data selection for machine translation.
Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Recognizing Arabic speakers with English phones.
Proceedings of the Odyssey 2008: The Speaker and Language Recognition Workshop, 2008

Detecting nonnative speech using speaker recognition approaches.
Proceedings of the Odyssey 2008: The Speaker and Language Recognition Workshop, 2008

Development of the SRI/Nightingale Arabic ASR system.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

The case for automatic higher-level features in forensic speaker recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Nonparametric feature normalization for SVM-based speaker verification.
Proceedings of the IEEE International Conference on Acoustics, 2008

Open-vocabulary spoken term detection using graphone-based hybrid recognition systems.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Morph-based speech recognition and modeling of out-of-vocabulary words across languages.
ACM Trans. Speech Lang. Process., 2007

Web resources for language modeling in conversational speech recognition.
ACM Trans. Speech Lang. Process., 2007

Speaker Recognition With Session Variability Normalization Based on MLLR Adaptation Transforms.
IEEE Trans. Speech Audio Process., 2007

Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2007

fMPE-MAP: improved discriminative adaptation for modeling new domains.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Integrating MAP, marginals, and unsupervised language model adaptation.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

The SRI/OGI 2006 spoken term detection system.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Duration and pronunciation conditioned lexical modeling for speaker verification.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Detecting deception using critical segments.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Combining Discriminative Feature, Transform, and Model Training for Large Vocabulary Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2007

Unsupervised Language Model Adaptation for Meeting Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2007

NAP and WCCN: Comparison of Approaches using MLLR-SVM Speaker Verification System.
Proceedings of the IEEE International Conference on Acoustics, 2007

Noise Robust Speaker Identification for Spontaneous Arabic Speech.
Proceedings of the IEEE International Conference on Acoustics, 2007

The SRI-ICSI Spring 2007 Meeting and Lecture Recognition System.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007

Reranking machine translation hypotheses with structured and web-based language models.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
SmartKom-English: From Robust Recognition to Felicitous Interaction.
Proceedings of the SmartKom: Foundations of Multimodal Dialogue Systems, 2006

Recent innovations in speech-to-text transcription at SRI-ICSI-UW.
IEEE Trans. Speech Audio Process., 2006

Enriching speech recognition with automatic detection of sentence boundaries and disfluencies.
IEEE Trans. Speech Audio Process., 2006

Editorial for computer speech and language.
Comput. Speech Lang., 2006

A study in machine learning from imbalanced data for sentence boundary detection in speech.
Comput. Speech Lang., 2006

Morphology-based language modeling for conversational Arabic speech recognition.
Comput. Speech Lang., 2006

Detecting Categories in News Video Using Acoustic, Speech, and Image Features.
Proceedings of the 2006 TREC Video Retrieval Evaluation, 2006

Improvements in MLLR-Transform-based Speaker Recognition.
Proceedings of the Odyssey 2006: The Speaker and Language Recognition Workshop, 2006

Text Based Dialog Act Classification for Multiparty Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

The ICSI-SRI Spring 2006 Meeting Recognition System.
Proceedings of the Machine Learning for Multimodal Interaction, 2006

Speaker clustered regression-class trees for MLLR adaptation.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Within-class covariance normalization for SVM-based speaker recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Improved speech activity detection using cross-channel features for recognition of multiparty meetings.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Cross-Domain and Cross-Language Portability of Acoustic Features Estimated by Multilayer Perceptrons.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Generalized Linear Kernels for One-Versus-All Classification: Application to Speaker Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Combining Prosodic Lexical and Cepstral Systems for Deceptive Speech Detection.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

The Contribution of Cepstral and Stylistic Features to SRI's 2005 NIST Speaker Recognition Evaluation System.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Pushing the envelope - aside [speech recognition].
IEEE Signal Process. Mag., 2005

Modeling prosodic feature sequences for speaker recognition.
Speech Commun., 2005

Toward Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System.
Proceedings of the Machine Learning for Multimodal Interaction, 2005

Using MLP features in SRI's conversational speech recognition system.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Improved discriminative training using phone lattices.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Development of a conversational telephone speech recognizer for Levantine Arabic.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Does active learning help automatic dialog act tagging in meeting data?
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

MLLR transforms as features in speaker recognition.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Leveraging speaker-dependent variation of adaptation.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Comparing HMM, maximum entropy, and conditional random fields for disfluency detection.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Two experiments comparing reading with listening for human processing of conversational telephone speech.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Distinguishing deceptive from non-deceptive speech.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Human language technology: opportunities and challenges.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Structural metadata research in the EARS program.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

SRI's 2004 NIST Speaker Recognition Evaluation System.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Improved Phonetic Speaker Recognition Using Lattice Decoding.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Using Conditional Random Fields for Sentence Boundary Detection in Speech.
Proceedings of the ACL 2005, 2005

2004
Modeling NERFs for speaker recognition.
Proceedings of the Odyssey 2004: The Speaker and Language Recognition Workshop, Toledo, Spain, May 31, 2004

Improving Automatic Sentence Boundary Detection with Confusion Networks.
Proceedings of HLT-NAACL 2004: Short Papers, Boston, Massachusetts, USA, May 2-7, 2004

Tandem Connectionist Feature Extraction for Conversational Speech Recognition.
Proceedings of the Machine Learning for Multimodal Interaction, 2004

The 2004 ICSI-SRI-UW Meeting Recognition System.
Proceedings of the Machine Learning for Multimodal Interaction, 2004

Progress on Mandarin conversational telephone speech recognition.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

On using MLP features in LVCSR.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Effective acoustic modeling for rate-of-speech variation in large vocabulary conversational speech recognition.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Morphology-based language modeling for Arabic speech recognition.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

An efficient repair procedure for quick transcriptions.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

From switchboard to meetings: development of the 2004 ICSI-SRI-UW meeting recognition system.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

The ICSI-SRI-UW metadata extraction system.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Using machine learning to cope with imbalanced classes in natural speech: evidence from sentence boundary and disfluency detection.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

The use of a linguistically motivated language model in conversational speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Trapping conversational speech: extending TRAP/tandem approaches to conversational telephone speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Voicing feature integration in SRI's Decipher LVCSR system.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech.
Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, 2004

2003
Modeling word-level rate-of-speech variation in large vocabulary conversational speech recognition.
Speech Commun., 2003

Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics, 2003

"TalkPrinting": Improving Speaker Recognition by Modeling Stylistic Features.
Proceedings of the Intelligence and Security Informatics, First NSF/NIJ Symposium, 2003

Automatic disfluency identification in conversational speech using multiple knowledge sources.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Modeling duration patterns for speaker recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

The robustness of an almost-parsing language model given errorful training data.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Prosodic knowledge sources for automatic speech recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Training a prosody-based dialog act tagger from unlabeled data.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Meetings about meetings: research at ICSI on speech in multiparty conversations.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

The ICSI Meeting Corpus.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

A prosody-based approach to end-of-utterance detection that does not require speech recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Iterative Statistical Language Model Generation for Use with an Agent-Oriented Natural Language Interface.
Proceedings of the Human-Computer Interaction: Universal Access in HCI: Inclusive Design in the Information Society, 2003

2002
Improved modeling and efficiency for automatic transcription of Broadcast News.
Speech Commun., 2002

SRILM - an extensible language modeling toolkit.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Building an ASR system for noisy environments: SRI's 2001 SPINE evaluation system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Is the speaker done yet? Faster and more accurate end-of-utterance detection using prosody.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Automatic punctuation and disfluency detection in multi-party meetings using prosodic and lexical cues.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Prosody-based automatic detection of annoyance and frustration in human-computer dialog.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2001
Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation.
Comput. Linguistics, 2001

The Meeting Project at ICSI.
Proceedings of the First International Conference on Human Language Technology Research, 2001

Improved maximum mutual information estimation training of continuous density HMMs.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Observations on overlap: findings and implications for automatic processing of multi-party conversation.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
Prosody-based automatic segmentation of speech into sentences and topics.
Speech Commun., 2000

Finding consensus in speech recognition: word error minimization and other applications of confusion networks.
Comput. Speech Lang., 2000

Entropy-based Pruning of Backoff Language Models.
CoRR, 2000

Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech?
CoRR, 2000

Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech.
CoRR, 2000

Dialog Act Modeling for Automatic Tagging and Recognition of Conversational Speech.
Comput. Linguistics, 2000

1999
Modeling the prosody of hidden events for improved word recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Finding consensus among words: lattice-based word error minimization.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Combining words and prosody for information extraction from speech.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Efficient lattice representation and generation.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Automatic detection of sentence boundaries and disfluencies based on recognized words.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

How far do speakers back up in repairs? a quantitative model.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997
Linguistic Knowledge and Empirical Methods in Speech Recognition.
AI Mag., 1997

A study of multilingual speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Explicit word error minimization in n-best list rescoring.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Modeling linguistic segment and turn boundaries for n-best rescoring of spontaneous speech.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

A prosody only decision-tree model for disfluency detection.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Structure and performance of a dependency language model.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Neural-network based measures of confidence for word recognition.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
L0 - The First Five Years of an Automated Language Acquisition Project.
Artif. Intell. Rev., 1996

Automatic linguistic segmentation of conversational speech.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Word predictability after hesitations: a corpus-based study.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Statistical language modeling for speech disfluencies.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities.
Comput. Linguistics, 1995

Partitioning Grammars and Composing Parsers.
Proceedings of the Fourth International Workshop on Parsing Technologies, 1995

Using a stochastic context-free grammar as a language model for speech recognition.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
Best-first Model Merging for Hidden Markov Model Induction.
CoRR, 1994

Multiple-pronunciation lexical modeling in a speaker independent speech understanding system.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

The Berkeley restaurant project.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Inducing Probabilistic Grammars by Bayesian Model Merging.
Proceedings of the Grammatical Inference and Applications, Second International Colloquium, 1994

Precise N-Gram Probabilities from Stochastic Context-Free Grammars.
Proceedings of the 32nd Annual Meeting of the Association for Computational Linguistics, 1994

1992
Hidden Markov Model Induction by Bayesian Model Merging.
Proceedings of the Advances in Neural Information Processing Systems 5, NIPS Conference, Denver, Colorado, USA, November 30, 1992

1990
Gapping and Frame Semantics: A fresh look from a cognitive perspective.
Proceedings of the 13th International Conference on Computational Linguistics, 1990

1989
Unification as Constraint Satisfaction in Structured Connectionist Networks.
Neural Comput., 1989

