Frank K. Soong
ORCID: 0000-0002-9088-3577
Affiliations:
- Microsoft Research Asia, Beijing, China
- Chinese University of Hong Kong (CUHK), Department of Systems Engineering and Engineering Management, Hong Kong
- Bell Labs Research, Murray Hill, NJ, USA
- Stanford University, Department of Electrical Engineering, CA, USA (PhD)
According to our database, Frank K. Soong authored at least 305 papers between 1978 and 2024.
Awards
IEEE Fellow 2010, "For contributions to speech processing".
Bibliography
2024
IEEE Trans. Pattern Anal. Mach. Intell., June, 2024
2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
2022
ParaTTS: Learning Linguistic and Prosodic Cross-Sentence Information in Paragraph-Based TTS.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Ordinal Regression via Binary Preference vs Simple Regression: Statistical and Experimental Perspectives.
CoRR, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings.
Proceedings of the IEEE International Conference on Acoustics, 2022
Improving Fastspeech TTS with Efficient Self-Attention and Compact Feed-Forward Network.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Neural Networks, 2021
Effective and direct control of neural TTS prosody by removing interactions between different attributes.
Neural Networks, 2021
CoRR, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Improving Performance of Seen and Unseen Speech Style Transfer in End-to-End Neural TTS.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Improving Pronunciation Assessment Via Ordinal Regression with Anchored Reference Samples.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
Spoken Language Understanding of Human-Machine Conversations for Language Learning Applications.
J. Signal Process. Syst., 2020
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020
Transfer Learning for Improving Singing-Voice Detection in Polyphonic Instrumental Music.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
An Improved Frame-Unit-Selection Based Voice Conversion System Without Parallel Training Data.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Improving Prosody with Linguistic and Bert Derived Features in Multi-Speaker Based Mandarin Chinese Neural TTS.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Improving LPCNET-Based Text-to-Speech with Linear Prediction-Structured Mixture Density Network.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
Voice conversion with SI-DNN and KL divergence based mapping without parallel training data.
Speech Commun., 2019
CoRR, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Domain Adversarial Training for Improving Keyword Spotting Performance of ESL Speech.
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
Modeling Multi-speaker Latent Space to Improve Neural TTS: Quick Enrolling New Speaker and Enhancing Premium Voice.
CoRR, 2018
Frame Selection in SI-DNN Phonetic Space with WaveNet Vocoder for Voice Conversion without Parallel Training Data.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
From Speech Signals to Semantics - Tagging Performance at Acoustic, Phonetic and Word Levels.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
A Refined Query-by-Example Approach to Spoken-Term-Detection on ESL learners' Speech.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Exploring Sequential Characteristics in Speaker Bottleneck Feature for Text-Dependent Speaker Verification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Effective Spectral and Excitation Modeling Techniques for LSTM-RNN-Based Speech Synthesis Systems.
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proficiency Assessment of ESL Learner's Sentence Prosody with TTS Synthesized Voice as Reference.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Improving Sub-Phone Modeling for Better Native Language Identification with Non-Native English Speech.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Perceptual quality and modeling accuracy of excitation parameters in DLSTM-based speech synthesis systems.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Improving native language (L1) identification with better VAD and TDNN trained separately on native and non-native English corpora.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
2016
A Two-Pass Framework of Mispronunciation Detection and Diagnosis for Computer-Aided Pronunciation Training.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Speech Commun., 2016
Speech Commun., 2016
Multim. Tools Appl., 2016
Learning Distributed Word Representations For Bidirectional LSTM Recurrent Neural Network.
Proceedings of the NAACL HLT 2016, 2016
A KL Divergence and DNN-Based Approach to Voice Conversion without Parallel Training Sentences.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Improved Time-Frequency Trajectory Excitation Vocoder for DNN-Based Speech Synthesis.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
KL-divergence based mispronunciation detection via DNN and decision tree in the phonetic space.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
2015
Improved mispronunciation detection with deep neural network trained acoustic models and transfer learning based logistic regression classifiers.
Speech Commun., 2015
Multim. Tools Appl., 2015
A Unified Tagging Solution: Bidirectional LSTM Recurrent Neural Network with Word Embedding.
CoRR, 2015
Part-of-Speech Tagging with Bidirectional Long Short-Term Memory Recurrent Neural Network.
CoRR, 2015
An improved DNN-based approach to mispronunciation detection and diagnosis of L2 learners' speech.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015
Sequence generation error (SGE) minimization based deep neural networks training for text-to-speech synthesis.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
A spectral space warping approach to cross-lingual voice transformation in HMM-based TTS.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
From text-to-speech (TTS) to talking head - a machine learning approach to A/V speech modeling and rendering.
Proceedings of the Auditory-Visual Speech Processing, 2015
A two-pass framework of mispronunciation detection & diagnosis for computer-aided pronunciation training.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
A new Neural Network based logistic regression classifier for improving mispronunciation detection of L2 language learners.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
A maximum a posteriori-based reconstruction approach to speech bandwidth expansion in noise.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
A DNN-based acoustic modeling of tonal language and its application to Mandarin pronunciation training.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
2013
IEEE Trans. Speech Audio Process., 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Binocular photometric stereo acquisition and reconstruction for 3d talking head applications.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
A source-filter based adaptive harmonic model and its application to speech prosody modification.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
A new DNN-based high quality pronunciation evaluation for computer-aided language learning (CALL).
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
2012
Proceedings of the Mobile HCI '12, 2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Cross validation and Minimum Generation Error for improved model clustering in HMM-based TTS.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
A unified trajectory tiling approach to high quality TTS and cross-lingual voice transformation.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Objective Intelligibility Assessment of Text-to-Speech System using Template Constrained Generalized Posterior Probability.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
The Use of DBN-HMMs for Mispronunciation Detection and Diagnosis in L2 English to Support Computer-Aided Pronunciation Training.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Modeling pitch trajectory by hierarchical HMM with minimum generation error training.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Improved minimum converted trajectory error training for real-time speech-to-lips conversion.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
High quality lips animation with speech and captured facial action unit as A/V input.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
2011
IEEE ACM Trans. Audio Speech Lang. Process., 2011
Improved Prosody Generation by Maximizing Joint Probability of State and Longer Units.
IEEE Trans. Speech Audio Process., 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT).
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
A Sparse and Low-rank approach to efficient face alignment for photo-real talking head synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the IEEE International Conference on Acoustics, 2011
Speaker characterization using spectral subband energy ratio based on Harmonic plus Noise Model.
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Automatic prosody prediction and detection with Conditional Random Field (CRF) models.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Capturing L2 segmental mispronunciations with joint-sequence models in Computer-Aided Pronunciation Training (CAPT).
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Discriminative acoustic model for improving mispronunciation detection and diagnosis in computer-aided pronunciation training (CAPT).
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
An HMM Trajectory Tiling (HTT) Approach to High Quality TTS - Microsoft Entry to Blizzard Challenge 2010.
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010
2009
A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin-English) TTS.
IEEE Trans. Speech Audio Process., 2009
IEEE Trans. Speech Audio Process., 2009
IEEE Signal Process. Lett., 2009
A Multi-Space Distribution (MSD) and two-stream tone modeling approach to Mandarin speech recognition.
Speech Commun., 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
An evidence framework for Bayesian learning of continuous-density hidden Markov models.
Proceedings of the IEEE International Conference on Acoustics, 2009
Improved prosody generation by maximizing joint likelihood of state and longer units.
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the Auditory-Visual Speech Processing, 2009
2008
IEEE Trans. Speech Audio Process., 2008
IEEE Trans. Speech Audio Process., 2008
Tone-enhanced generalized character posterior probability (GCPP) for Cantonese LVCSR.
Comput. Speech Lang., 2008
Modeling and Generating Tone Contour with Phrase Intonation for Mandarin Chinese Speech.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Improving Automatic Evaluation of Mandarin Pronunciation with Speaker Adaptive Training (SAT) and MLLR Speaker Adaption.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Prosody for Mandarin speech recognition: a comparative study of read and spontaneous speech.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Efficient handwriting correction of speech recognition errors with template constrained posterior (TCP).
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
An ellipsoid constrained quadratic programming perspective to discriminative training of HMMs.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
Improving letter-to-sound conversion performance with automatically generated new words.
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
Symbol graph based discriminative training and rescoring for improved math symbol recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
2007
Static and Dynamic Spectral Features: Their Noise Robustness and Optimal Weights for ASR.
IEEE Trans. Speech Audio Process., 2007
A Cohort-Based Speaker Model Synthesis for Mismatched Channels in Speaker Verification.
IEEE Trans. Speech Audio Process., 2007
IEEE Trans. Speech Audio Process., 2007
Int. J. Comput. Linguistics Chin. Lang. Process., 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Context constrained-generalized posterior probability for verifying phone transcriptions.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Minimum Error Discriminative Training for Radical-Based Online Chinese Handwriting Recognition.
Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007
A Unified Framework for Symbol Segmentation and Recognition of Handwritten Mathematical Expressions.
Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007
Proceedings of the 9th International Conference on Document Analysis and Recognition (ICDAR 2007), 2007
Generalized Segment Posterior Probability for Automatic Mandarin Pronunciation Evaluation.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
A Constrained Line Search Optimization for Discriminative Training in Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Enrich Web Applications with Voice Internet Persona Text-to-Speech for Anyone, Anywhere.
Proceedings of the Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments, 2007
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007
2006
A tree-based kernel selection approach to efficient Gaussian mixture model-universal background model based speaker identification.
Speech Commun., 2006
Modeling Cantonese Pronunciation Variations for Large-Vocabulary Continuous Speech Recognition.
Int. J. Comput. Linguistics Chin. Lang. Process., 2006
IEICE Trans. Inf. Syst., 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Integrating Hypotheses of Multiple Recognizers for Improving Mandarin LVCSR Performance.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Generalization of the minimum classification error (MCE) training based on maximizing generalized posterior probability (GPP).
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the 8th International Conference on Multimodal Interfaces, 2006
Improved Chinese Character Input by Merging Speech and Handwriting Recognition Hypotheses.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
A Comparative Study of Discriminative Methods for Reranking LVCSR N-Best Hypotheses in Domain Adaptation and Generalization.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006, 2006
2005
A Dynamic In-Search Data Selection Method With Its Applications to Acoustic Modeling and Utterance Verification.
IEEE Trans. Speech Audio Process., 2005
IEICE Trans. Inf. Syst., 2005
Refining phoneme segmentations using speaker-adaptive context dependent boundary models.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Harmonic filtering for joint estimation of pitch and voiced source with single-microphone input.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Generalized Posterior Probability for Minimum Error Verification of Recognized Sentences.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
Optimal Clustering and Non-Uniform Allocation of Gaussian Kernels in Scalar Dimension for HMM Compression.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
2004
On noise robustness of dynamic and static features for continuous Cantonese digit recognition.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004
Generalized posterior probability for minimizing verification errors at subword, word and sentence levels.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
A Unified Approach in Speech-to-Speech Translation: Integrating Features of Speech recognition and Machine Translation.
Proceedings of the COLING 2004, 2004
2003
On divergence based clustering of normal distributions and its application to HMM adaptation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Optimal clustering of multivariate normal distributions using divergence and its application to HMM adaptation.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
Combining neighboring filter channels to improve quantile based histogram equalization.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
2002
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Proceedings of the IEEE International Conference on Acoustics, 2002
A dynamic in-search discriminative training approach for large vocabulary speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002
2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
A data selection strategy for utterance verification in continuous speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Proceedings of the IEEE International Conference on Acoustics, 2001
2000
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
1999
IEEE Robotics Autom. Mag., 1999
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999
1998
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
1997
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997
1996
Proceedings of the 4th International Conference on Spoken Language Processing, 1996
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996
1995
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995
An orthogonal polynomial representation of speech signals and its probabilistic model for text independent speaker verification.
Proceedings of the 1995 International Conference on Acoustics, 1995
1994
IEEE Trans. Speech Audio Process., 1994
An N-best candidates-based discriminative training for speech recognition applications.
IEEE Trans. Speech Audio Process., 1994
Int. J. Pattern Recognit. Artif. Intell., 1994
The use of tree-trellis search for large-vocabulary Mandarin polysyllabic word speech recognition.
Comput. Speech Lang., 1994
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994
Discriminative training of high performance speech recognizer using N best candidates.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994
1993
1992
Proceedings of the Second International Conference on Spoken Language Processing, 1992
Proceedings of the Second International Conference on Spoken Language Processing, 1992
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992
1990
A Tree-Trellis Based Fast Search for Finding the N Best Sentence Hypotheses in Continuous Speech Recognition.
Proceedings of the Speech and Natural Language: Proceedings of a Workshop Held at Hidden Valley, 1990
A tree-trellis based fast search for finding the n best sentence hypotheses in continuous speech recognition.
Proceedings of the First International Conference on Spoken Language Processing, 1990
Experiments in automatic talker verification using sub-word unit hidden Markov models.
Proceedings of the First International Conference on Spoken Language Processing, 1990
Proceedings of the 1990 International Conference on Acoustics, 1990
Proceedings of the 1990 International Conference on Acoustics, 1990
Proceedings of the 1990 International Conference on Acoustics, 1990
Proceedings of the 1990 International Conference on Acoustics, 1990
Proceedings of the 1990 International Conference on Acoustics, 1990
1989
IEEE Trans. Acoust. Speech Signal Process., 1989
A phonetically labeled acoustic segment (PLAS) approach to speech analysis-synthesis.
Proceedings of the IEEE International Conference on Acoustics, 1989
Proceedings of the IEEE International Conference on Acoustics, 1989
1988
A frequency-weighted Itakura spectral distortion measure and its application to speech recognition in noise.
IEEE Trans. Acoust. Speech Signal Process., 1988
On the use of instantaneous and transitional spectral information in speaker recognition.
IEEE Trans. Acoust. Speech Signal Process., 1988
Proceedings of the IEEE International Conference on Acoustics, 1988
Proceedings of the IEEE International Conference on Acoustics, 1988
1987
Proceedings of the IEEE International Conference on Acoustics, 1987
A training procedure for a segment-based-network approach to isolated word recognition.
Proceedings of the IEEE International Conference on Acoustics, 1987
1986
A high quality subband speech coder with backward adaptive predictor and optimal time-frequency bit assignment.
Proceedings of the IEEE International Conference on Acoustics, 1986
Evaluation of a vector quantization talker recognition system in text independent and text dependent modes.
Proceedings of the IEEE International Conference on Acoustics, 1986
1985
A vector-quantization-based preprocessor for speaker-independent isolated word recognition.
IEEE Trans. Acoust. Speech Signal Process., 1985
Speech Commun., 1985
Single-frame vowel recognition using vector quantization with several distance measures.
AT&T Tech. J., 1985
Incorporation of temporal structure into a vector-quantization-based preprocessor for speaker-independent, isolated-word recognition.
AT&T Tech. J., 1985
Proceedings of the IEEE International Conference on Acoustics, 1985
Proceedings of the IEEE International Conference on Acoustics, 1985
An efficient vector-quantization preprocessor for speaker independent isolated word recognition.
Proceedings of the IEEE International Conference on Acoustics, 1985
1984
On the performance of isolated word speech recognizers using vector quantization and temporal energy contours.
AT&T Bell Lab. Tech. J., 1984
Proceedings of the IEEE International Conference on Acoustics, 1984
Proceedings of the IEEE International Conference on Acoustics, 1984
1982
Proceedings of the IEEE International Conference on Acoustics, 1982
On the high resolution and unbiased frequency estimates of sinusoids in white noise - A new adaptive approach.
Proceedings of the IEEE International Conference on Acoustics, 1982
1981
Proceedings of the IEEE International Conference on Acoustics, 1981
1980
Proceedings of the IEEE International Conference on Acoustics, 1980
1978
Proceedings of the IEEE International Conference on Acoustics, 1978
Proceedings of the IEEE International Conference on Acoustics, 1978