Yu Zhang
ORCID: 0000-0002-9505-1833
Affiliations:
- Massachusetts Institute of Technology, Computer Science and Artificial Intelligence Laboratory, Cambridge, MA, USA (PhD 2017)
According to our database, Yu Zhang authored at least 136 papers between 2013 and 2024.
Bibliography
2024
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the International Conference on Machine Learning, 2023
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Improving Multilingual and Code-Switching ASR Using Large Language Model Generated Text.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2022
IEEE J. Sel. Top. Signal Process., 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Improving Generalizability of Distilled Self-Supervised Speech Processing Models Under Distorted Settings.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Leveraging unsupervised and weakly-supervised data to improve direct speech-to-speech translation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Improving Distortion Robustness of Self-supervised Speech Processing Tasks with Domain Adaptation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training.
CoRR, 2021
SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network.
CoRR, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Improving Streaming Automatic Speech Recognition with Non-Streaming Model Distillation on Unsupervised Data.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
w2v-BERT: Combining Contrastive Learning and Masked Language Modeling for Self-Supervised Speech Pre-Training.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
CoRR, 2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling.
CoRR, 2020
A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.
CoRR, 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior.
CoRR, 2020
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Espnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
2019
CoRR, 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
Bytes Are All You Need: End-to-end Multilingual Speech Recognition and Synthesis with Bytes.
Proceedings of the IEEE International Conference on Acoustics, 2019
Disentangling Correlated Speaker and Noise for Speech Synthesis via Data Augmentation and Adversarial Factorization.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Semi-supervised Training for Improving Data Efficiency in End-to-end Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the 3rd Annual Conference on Robot Learning, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Style Tokens: Unsupervised Style Modeling, Control and Transfer in End-to-End Speech Synthesis.
Proceedings of the 35th International Conference on Machine Learning, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017
Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition.
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017
2016
CoRR, 2016
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
SLS at SemEval-2016 Task 3: Neural-based Approaches for Ranking in Community Question Answering.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016
On training bi-directional neural network language model with noise contrastive estimation.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Exploiting Depth and Highway Connections in Convolutional Recurrent Deep Neural Networks for Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Prediction-adaptation-correction recurrent neural networks for low-resource language speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Integrated adaptation with multi-factor joint-learning for far-field speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the COLING 2016, 2016
2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
2014
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014
Graph-based re-ranking using acoustic feature similarity between search results for spoken term detection on low-resource languages.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Extracting deep neural network bottleneck features using low-rank matrix factorization.
Proceedings of the IEEE International Conference on Acoustics, 2014
2013
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 2013