Yong Zhao

Affiliations:
  • Microsoft Corporation, Redmond, WA, USA
  • Georgia Institute of Technology, Atlanta, GA, USA (former)


According to our database, Yong Zhao authored at least 52 papers between 2002 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2023
Improving Transformer-Based Networks with Locality for Automatic Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
The Microsoft System for VoxCeleb Speaker Recognition Challenge 2022.
CoRR, 2022

2021
ResNeXt and Res2Net Structures for Speaker Verification.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Microsoft Speaker Diarization System for the VoxCeleb Speaker Recognition Challenge 2020.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
ResNeXt and Res2Net Structure for Speaker Verification.
CoRR, 2020

Improving Deep CNN Networks with Long Temporal Context for Text-Independent Speaker Verification.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Adversarial Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2019

Conditional Teacher-student Learning.
Proceedings of the IEEE International Conference on Acoustics, 2019

CNN with Phonetic Attention for Text-Independent Speaker Verification.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019


2018
Speaker-Invariant Training via Adversarial Learning.
CoRR, 2018

Speaker Adaptation for End-to-End CTC Models.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Domain and Speaker Adaptation for Cortana Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Exploring Sequential Characteristics in Speaker Bottleneck Feature for Text-Dependent Speaker Verification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
A comparative study of noise estimation algorithms for nonlinear compensation in robust speech recognition.
Speech Commun., 2017

Extended low-rank plus diagonal adaptation for deep and recurrent neural networks.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Challenges in and Solutions to Deep Learning Network Acoustic Modeling in Speech Recognition Products at Microsoft.
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016
End-to-End attention based text-dependent speaker verification.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Low-rank plus diagonal adaptation for deep neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Confidence-features and confidence-scores for ASR applications in arbitration and DNN speaker adaptation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Investigating online low-footprint speaker adaptation using generalized linear regression and click-through data.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2013
Nonlinear compensation and heterogeneous data modeling for robust speech recognition.
PhD thesis, 2013

Modeling heterogeneous data sources for speech recognition using synchronous hidden Markov models.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Nonlinear Compensation Using the Gauss-Newton Method for Noise-Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2012

Automatic Speech Recognition Based on Non-Uniform Error Criteria.
IEEE Trans. Speech Audio Process., 2012

A general discriminative training algorithm for speech recognition using weighted finite-state transducers.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Stranded Gaussian mixture hidden Markov models for robust speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Exploiting sparsity in stranded hidden Markov models for automatic speech recognition.
Proceedings of the Conference Record of the Forty Sixth Asilomar Conference on Signals, 2012

2011
Non-linear noise compensation for robust speech recognition using Gauss-Newton method.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
A comparative study of noise estimation algorithms for VTS-based robust speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

On noise estimation for robust speech recognition using vector Taylor series.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
A study on recognizing distorted speech over local distributed transducer networks.
Proceedings of the IEEE International Conference on Acoustics, 2009

2007
Measuring attribute dissimilarity with HMM KL-divergence for speech synthesis.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Perceptual annotation of expressive speech.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Iterative unit selection with unnatural prosody detection.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Agreement Learning for Automatic Accent Annotation.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Modeling stylized invariance and local variability of prosody in text-to-speech synthesis.
Speech Commun., 2006

Context-Dependent Boundary Model for Refining Boundaries Segmentation of TTS Units.
IEICE Trans. Inf. Syst., 2006

The Paradigm for Creating Multi-lingual Text-To-Speech Voice Databases.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Constructing stylistic synthesis databases from audio books.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Identify language origin of personal names with normalized appearance number of web pages.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Measuring Target Cost in Unit Selection with KL-Divergence Between Context-Dependent HMMs.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Hierarchical Approach to Automatic Stress Detection in English Sentences.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Identifying Language Origin of Person Names With N-Grams of Different Units.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

A Study on How Human Annotations Benefit the TTS Voice.
Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006, 2006

2005
Refining phoneme segmentations using speaker-adaptive context dependent boundary models.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Phonetic transcription verification with generalized posterior probability.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Customizing base unit set with speech database in TTS systems.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004
Refining segmental boundaries for TTS database using fine contextual-dependent boundary models.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Custom-tailoring TTS voice font - keeping the naturalness when reducing database size.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Microsoft Mulan - a bilingual TTS system.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Perceptually optimizing the cost function for unit selection in a TTS system with one single run of MOS evaluation.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

