Frank Seide

According to our database, Frank Seide authored at least 91 papers between 1994 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Efficient Streaming LLM for Speech Recognition.
CoRR, 2024

Navigating the Minefield of MT Beam Search in Cascaded Streaming Speech Translation.
CoRR, 2024

Speech ReaLLM - Real-time Streaming Speech Recognition with Multimodal LLMs by Teaching the Flow of Time.
CoRR, 2024

AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024

Effective Internal Language Model Training and Fusion for Factorized Transducer Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Directional Source Separation for Robust Speech Recognition on Smart Glasses.
CoRR, 2023

DISGO: Automatic End-to-End Evaluation for Scene Text OCR.
CoRR, 2023

Directional Speech Recognition for Speaker Disambiguation and Cross-talk Suppression.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers.
Proceedings of the IEEE International Conference on Acoustics, 2023

Joint Federated Learning and Personalization for on-Device ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
An Investigation of Monotonic Transducers for Large-Scale Automatic Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Federated Domain Adaptation for ASR with Full Self-Supervision.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2018
Achieving Human Parity on Automatic Chinese to English News Translation.
CoRR, 2018

Marian: Fast Neural Machine Translation in C++.
Proceedings of ACL 2018, Melbourne, Australia, July 15-20, 2018, System Demonstrations, 2018

2017
Toward Human Parity in Conversational Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

The Microsoft 2016 conversational speech recognition system.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Achieving Human Parity in Conversational Speech Recognition.
CoRR, 2016

CNTK: Microsoft's Open-Source Deep-Learning Toolkit.
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016

2015
Deep bi-directional recurrent networks over spectral windows.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
An introduction to computational networks and the computational network toolkit (invited talk).
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

On parallelizability of stochastic gradient descent for speech DNNs.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
The Deep Tensor Neural Network With Applications to Large Vocabulary Speech Recognition.
IEEE Trans. Speech Audio Process., 2013

Feature Learning in Deep Neural Networks - A Study on Speech Recognition Tasks.
Proceedings of the 1st International Conference on Learning Representations, 2013

MSR-FBK IWSLT 2013 SLT system description.
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013

A new language independent, photo-realistic talking head driven by voice only.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

KL-divergence regularized deep neural network adaptation for improved large vocabulary speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Error back propagation for sequence training of Context-Dependent Deep Networks for conversational speech transcription.
Proceedings of the IEEE International Conference on Acoustics, 2013

Recent advances in deep learning for speech research at Microsoft.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Adaptation of context-dependent deep neural networks for automatic speech recognition.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Context-dependent Deep Neural Networks for audio indexing of real-life data.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Large Vocabulary Speech Recognition Using Deep Tensor Neural Networks.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Voice Activity Detection Using Speech Recognizer Feedback.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

ClippyScript: A Programming Language for Multi-Domain Dialogue Systems.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Pipelined Back-Propagation for Context-Dependent Deep Neural Networks.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Exploiting sparseness in deep neural networks for large vocabulary speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Conversational Speech Transcription Using Context-Dependent Deep Neural Networks.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Leveraging the Web for automatically generating indexable and browsable keywords for speech files.
Proceedings of the IEEE International Conference on Acoustics, 2011

Feature engineering in Context-Dependent Deep Neural Networks for conversational speech transcription.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Subword-based multi-span pronunciation adaptation for recognizing accented speech.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
On using missing-feature theory with cepstral features - approximations to the multivariate integral.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Vocabulary and language model adaptation using just one speech file.
Proceedings of the IEEE International Conference on Acoustics, 2010

Music rhythm characterization with application to workout-mix generation.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Multimedia retrieval through indexing speech: an enterprise perspective.
Proceedings of the third workshop on Searching spontaneous conversational speech, 2009

Unsupervised lattice-based acoustic model adaptation for speaker-dependent conversational telephone speech transcription.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Learning a music similarity measure on automatic annotations with application to playlist generation.
Proceedings of the IEEE International Conference on Acoustics, 2009

Unsupervised speaker adaptation for telephone call transcription.
Proceedings of the IEEE International Conference on Acoustics, 2009

Automatic punctuation generation for speech.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Mobile Search With Multimodal Queries.
Proc. IEEE, 2008

Word-lattice based spoken-document indexing with standard text indexers.
Proceedings of the 2008 IEEE Spoken Language Technology Workshop, 2008

Fragmented context-dependent syllable acoustic models.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

GPU-accelerated Gaussian clustering for fMPE discriminative training.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Towards vocabulary-independent speech indexing for large-scale repositories.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Addressing the out-of-vocabulary problem for large-scale Chinese spoken term detection.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Approximate word-lattice indexing with text indexers: Time-Anchored Lattice Expansion.
Proceedings of the IEEE International Conference on Acoustics, 2008

Fusing multiple systems into a compact lattice index for Chinese spoken term detection.
Proceedings of the IEEE International Conference on Acoustics, 2008

Mobile ringtone search through query by humming.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Learning spoken document similarity and recommendation using supervised probabilistic latent semantic analysis.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Online vocabulary adaptation using limited adaptation data.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

A Hidden-State Maximum Entropy Model For Word Confidence Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Towards spoken-document retrieval for the enterprise: Approximate word-lattice indexing with text indexers.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

A study of lattice-based spoken term detection for Chinese spontaneous speech.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Discriminatively Trained Spoken Document Similarity Models and their Application to Probabilistic Latent Semantic Analysis.
Proceedings of the 2006 IEEE ACL Spoken Language Technology Workshop, 2006

Towards Spoken-Document Retrieval for the Internet: Lattice Indexing For Large-Scale Web-Search Architectures.
Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2006

Maximum Entropy Based Normalization Of Word Posteriors For Phonetic And LVCSR Lattice Search.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Vocabulary-Independent Indexing of Spontaneous Speech.
IEEE Trans. Speech Audio Process., 2005

The use of virtual hypothesis copies in decoding of large-vocabulary continuous speech.
IEEE Trans. Speech Audio Process., 2005

Searching the Audio Notebook: Keyword Search in Recorded Conversation.
Proceedings of the HLT/EMNLP 2005, 2005

Fast Two-Stage Vocabulary-Independent Search In Spontaneous Speech.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
A hybrid word / phoneme-based approach for improved vocabulary-independent search in spontaneous speech.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Vocabulary-independent search in spontaneous speech.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
An improved model-based speaker segmentation system.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - model and training.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Coarticulation modeling by embedding a target-directed hidden trajectory model into HMM - MAP decoding and evaluation.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A system for spoken query information retrieval on mobile devices.
IEEE Trans. Speech Audio Process., 2002

2001
Rapid speaker adaptation using a priori knowledge by eigenspace analysis of MLLR parameters.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
The thoughtful elephant: strategies for spoken dialog systems.
IEEE Trans. Speech Audio Process., 2000

MAT-2000 - design, collection, and validation of a Mandarin 2000-speaker telephone speech database.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Two-stream modeling of Mandarin tones.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Improvements of the Philips 2000 Taiwan Mandarin benchmark system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Pitch tracking and tone features for Mandarin speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Development of the Philips 1999 Taiwan Mandarin benchmark system.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Phonetic Modelling In the Philips Chinese Continuous-Speech Recognition System.
Proceedings of the 1998 International Symposium on Chinese Spoken Language Processing, 1998

1997
PADIS - An automatic telephone switchboard and directory information system.
Speech Commun., 1997

Towards an automated directory information system.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1996
A word graph based n-best search in continuous speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Improving speech understanding by incorporating database constraints and dialogue history.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

A comparison of time conditioned and word conditioned search techniques for large vocabulary speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

1995
The Philips automatic train timetable information system.
Speech Commun., 1995

Fast likelihood computation for continuous-mixture densities using a tree-based nearest neighbor search.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

1994
Non-linear regression based feature extraction for connected-word recognition in noise.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

