Yong Zhao

Affiliations:

Microsoft Corporation, Redmond, WA, USA
Georgia Institute of Technology, Atlanta, GA, USA (former)

According to our database¹, Yong Zhao authored at least 53 papers between 2002 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Streaming Speaker Change Detection and Gender Classification for Transducer-Based Multi-Talker Speech Translation.

Aswin Shanmugam Subramanian

CoRR, February, 2025

2023

Improving Transformer-Based Networks with Locality for Automatic Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

The Microsoft System for VoxCeleb Speaker Recognition Challenge 2022.

CoRR, 2022

2021

ResNeXt and Res2Net Structures for Speaker Verification.

Tianyan Zhou

Yong Zhao

Jian Wu

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

ResNeXt and Res2Net Structure for Speaker Verification.

Tianyan Zhou

Yong Zhao

Jian Wu

CoRR, 2020

Improving Deep CNN Networks with Long Temporal Context for Text-Independent Speaker Verification.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Adversarial Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2019

Conditional Teacher-Student Learning.

Proceedings of the IEEE International Conference on Acoustics, 2019

CNN with Phonetic Attention for Text-Independent Speaker Verification.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Advances in Online Audio-Visual Meeting Transcription.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Speaker-Invariant Training via Adversarial Learning.

CoRR, 2018

Speaker Adaptation for End-to-End CTC Models.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Domain and Speaker Adaptation for Cortana Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Exploring Sequential Characteristics in Speaker Bottleneck Feature for Text-Dependent Speaker Verification.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

A comparative study of noise estimation algorithms for nonlinear compensation in robust speech recognition.

Yong Zhao

Biing-Hwang Fred Juang

Speech Commun., 2017

Extended low-rank plus diagonal adaptation for deep and recurrent neural networks.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Challenges in and Solutions to Deep Learning Network Acoustic Modeling in Speech Recognition Products at Microsoft.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

End-to-End attention based text-dependent speaker verification.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Low-rank plus diagonal adaptation for deep neural networks.

Yong Zhao

Jinyu Li

Yifan Gong

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Confidence-features and confidence-scores for ASR applications in arbitration and DNN speaker adaptation.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2013

Nonlinear compensation and heterogeneous data modeling for robust speech recognition.

Yong Zhao

PhD thesis, 2013

Modeling heterogeneous data sources for speech recognition using synchronous hidden Markov models.

Yong Zhao

Biing-Hwang Juang

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Nonlinear Compensation Using the Gauss-Newton Method for Noise-Robust Speech Recognition.

Yong Zhao

Biing-Hwang Juang

IEEE Trans. Speech Audio Process., 2012

Automatic Speech Recognition Based on Non-Uniform Error Criteria.

Qiang Fu

Yong Zhao

Biing-Hwang Juang

IEEE Trans. Speech Audio Process., 2012

A general discriminative training algorithm for speech recognition using weighted finite-state transducers.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Stranded Gaussian mixture hidden Markov models for robust speech recognition.

Yong Zhao

Biing-Hwang Juang

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Exploiting sparsity in stranded hidden Markov models for automatic speech recognition.

Yong Zhao

Biing-Hwang Juang

Proceedings of the Conference Record of the Forty Sixth Asilomar Conference on Signals, 2012

2011

Non-linear noise compensation for robust speech recognition using Gauss-Newton method.

Yong Zhao

Biing-Hwang Juang

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

A comparative study of noise estimation algorithms for VTS-based robust speech recognition.

Yong Zhao

Biing-Hwang Juang

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

On noise estimation for robust speech recognition using vector Taylor series.

Yong Zhao

Biing-Hwang Juang

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

A study on recognizing distorted speech over local distributed transducer networks.

Yong Zhao

Sunghwan Shin

Enrique Robledo-Arnuncio

Biing-Hwang Juang

Proceedings of the IEEE International Conference on Acoustics, 2009

2007

Measuring attribute dissimilarity with HMM KL-divergence for speech synthesis.

Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Perceptual annotation of expressive speech.

Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Iterative unit selection with unnatural prosody detection.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Agreement Learning for Automatic Accent Annotation.

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

Modeling stylized invariance and local variability of prosody in text-to-speech synthesis.

Min Chu

Yong Zhao

Eric Chang

Speech Commun., 2006

Context-Dependent Boundary Model for Refining Boundaries Segmentation of TTS Units.

IEICE Trans. Inf. Syst., 2006

The Paradigm for Creating Multi-lingual Text-To-Speech Voice Databases.

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Constructing stylistic synthesis databases from audio books.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Identify language origin of personal names with normalized appearance number of web pages.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Measuring Target Cost in Unit Selection with KL-Divergence Between Context-Dependent HMMs.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Hierarchical Approach to Automatic Stress Detection in English Sentences.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Identifying Language Origin of Person Names With N-Grams of Different Units.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Study on How Human Annotations Benefit the TTS Voice.

Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006, 2006

2005

Refining phoneme segmentations using speaker-adaptive context dependent boundary models.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Phonetic transcription verification with generalized posterior probability.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Customizing base unit set with speech database in TTS systems.

Yining Chen

Yong Zhao

Min Chu

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004

Refining segmental boundaries for TTS database using fine contextual-dependent boundary models.

Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003

Custom-tailoring TTS voice font - keeping the naturalness when reducing database size.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Microsoft Mulan - a bilingual TTS system.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002

Perpetually optimizing the cost function for unit selection in a TTS system with one single run of MOS evaluation.

Hu Peng

Yong Zhao

Min Chu

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
