Yanzhang He

According to our database, Yanzhang He authored at least 65 papers between 2012 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Timeline

[Chart: publications per year, 2012 to 2024, by type (book, in proceedings, article, PhD thesis, dataset, other).]

Bibliography

2024

Massive End-to-end Speech Recognition Models with Time Reduction.

…, Rohit Prabhavalkar, …, Dongseong Hwang, …, Chengjian Zheng, …, Tara N. Sainath, Pedro Moreno Mengibar

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models.

Rohit Prabhavalkar, …, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno

Proceedings of the IEEE International Conference on Acoustics, 2024

USM-Lite: Quantization and Sparsity Aware Fine-Tuning for Speech Recognition with Universal Speech Models.

…, Rohit Prabhavalkar, …, Tara N. Sainath, …, Amir Yazdanbakhsh, Shivani Agrawal

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Partial Rewriting for Multi-Stage ASR.

Antoine Bruguier, …

CoRR, 2023

Massive End-to-end Models for Short Search Queries.

…, Rohit Prabhavalkar, Dongseong Hwang, …, Tara N. Sainath, Pedro Moreno Mengibar

CoRR, 2023

RAND: Robustness Aware Norm Decay For Quantized Seq2seq Models.

CoRR, 2023

2-bit Conformer quantization for automatic speech recognition.

…, Phoenix Meadowlark, …

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-Output RNN-T Joint Networks for Multi-Task Learning of ASR and Auxiliary Tasks.

…, Shuo-Yiin Chang, …, Tara N. Sainath, …

Proceedings of the IEEE International Conference on Acoustics, 2023

Conditional Conformer: Improving Speaker Modulation For Single And Multi-User Speech Enhancement.

Proceedings of the IEEE International Conference on Acoustics, 2023

E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model.

…, Shuo-Yiin Chang, Tara N. Sainath, …, Rohit Prabhavalkar, …, Trevor D. Strohman

Proceedings of the IEEE International Conference on Acoustics, 2023

Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models.

Steven M. Hernandez, …, Antoine Bruguier, Rohit Prabhavalkar, Tara N. Sainath, …

Proceedings of the IEEE International Conference on Acoustics, 2023

The Role of Feature Correlation on Quantized Neural Networks.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Efficient Cascaded Streaming ASR System Via Frame Rate Reduction.

…, Dongseong Hwang, …, Antoine Bruguier, Rohit Prabhavalkar, Tara N. Sainath, …

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Context-Aware Neural Confidence Estimation for Rare Word Speech Recognition.

…, Tsendsuren Munkhdalai, …

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Flickering Reduction with Partial Hypothesis Reranking for Streaming ASR.

Antoine Bruguier, …, Trevor Strohman, …

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems.

…, Shuo-Yiin Chang, …, Tara N. Sainath, …

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Closing the Gap Between Single-User and Multi-User VoiceFilter-Lite.

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Improving Rare Word Recognition with LM-aware MWER Training.

…, Tara N. Sainath, …, Rohit Prabhavalkar, …, Bhuvana Ramabhadran, …, Sepand Mavandadi, …, Trevor Strohman, …

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Language Agnostic Multilingual Streaming On-Device ASR System.

…, Tara N. Sainath, …, Shuo-Yiin Chang, …, Trevor Strohman, …, Sameer Bidichandani

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving Deliberation by Text-Only and Semi-Supervised Training.

…, Tara N. Sainath, …, Rohit Prabhavalkar, Trevor Strohman, Sepand Mavandadi, …

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Unified Cascaded Encoder ASR Model for Dynamic Model Sizes.

…, Tara N. Sainath, …, Dongseong Hwang, …, Rohit Prabhavalkar, Trevor Strohman

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Personal VAD 2.0: Optimizing Personal Voice Activity Detection for On-Device Speech Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

4-bit Conformer with Native Quantization Aware Training for Speech Recognition.

…, Phoenix Meadowlark, …, Shivani Agrawal, …

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Turn-Taking Prediction for Natural Conversational Speech.

Shuo-Yiin Chang, …, Tara N. Sainath, …, Trevor Strohman, …

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving The Latency And Quality Of Cascaded Encoders.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition.

…, Philip C. Woodland

Proceedings of the IEEE International Conference on Acoustics, 2022

Large-Scale ASR Domain Adaptation Using Self- and Semi-Supervised Learning.

Dongseong Hwang, …, Nikhil Siddhartha, …, Trevor Strohman, Françoise Beaufays, …

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

An Efficient Streaming Non-Recurrent On-Device End-to-End Model with Improvements to Rare-Word Modeling.

Tara N. Sainath, …, Quoc-Nam Le-The, Shuo-Yiin Chang, …, Chung-Cheng Chiu, Diamantino Caseiro, …

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Personalized Keyphrase Detection Using Speaker and Environment Information.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multi-Task Learning for End-to-End ASR Word and Utterance Confidence with Deletion Prediction.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Tied & Reduced RNN-T Decoder.

…, Tara N. Sainath, …, Emmanuel Guzman, …

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

FastEmit: Low-Latency Streaming ASR with Sequence-Level Emission Regularization.

…, Chung-Cheng Chiu, …, Shuo-Yiin Chang, Tara N. Sainath, …

Proceedings of the IEEE International Conference on Acoustics, 2021

Learning Word-Level Confidence for Subword End-To-End ASR.

…, Rohit Prabhavalkar, …, Tara N. Sainath, …

Proceedings of the IEEE International Conference on Acoustics, 2021

Less is More: Improved RNN-T Decoding Using Limited Label Context and Path Merging.

Rohit Prabhavalkar, …, Trevor Strohman, Tara N. Sainath

Proceedings of the IEEE International Conference on Acoustics, 2021

Confidence Estimation for Attention-Based Sequence-to-Sequence Models for Speech Recognition.

…, Philip C. Woodland, …, Trevor Strohman

Proceedings of the IEEE International Conference on Acoustics, 2021

A Better and Faster end-to-end Model for Streaming ASR.

…, Tara N. Sainath, Chung-Cheng Chiu, …, Shuo-Yiin Chang, …, Trevor Strohman, …

Proceedings of the IEEE International Conference on Acoustics, 2021

Multi-User Voicefilter-Lite via Attentive Speaker Embedding.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Cross-Attention Conformer for Context Modeling in Speech Enhancement for ASR.

…, Chung-Cheng Chiu, …

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency.

CoRR, 2020

VoiceFilter-Lite: Streaming Targeted Voice Separation for On-Device Speech Recognition.

…, Ignacio López-Moreno, …, Kevin W. Wilson, …, Jason Pelecanos, …, Alexander Gruenstein

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Analyzing the Quality and Stability of a Streaming End-to-End On-Device Speech Recognizer.

…, Françoise Beaufays

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Parallel Rescoring with Transformer for Streaming On-Device Speech Recognition.

…, Chung-Cheng Chiu, …

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Low Latency Speech Recognition Using End-to-End Prefetching.

Shuo-Yiin Chang, …, Tara N. Sainath, Trevor Strohman

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Attention-Based Joint Acoustic and Text on-Device End-To-End Model.

Tara N. Sainath, …, Chung-Cheng Chiu, Trevor Strohman

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

A Streaming On-Device End-To-End Model Surpassing Server-Side Conventional Model Quality and Latency.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Towards Fast and Accurate Streaming End-To-End ASR.

…, Shuo-Yiin Chang, Tara N. Sainath, …, Trevor Strohman, …

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling.

…, Tara N. Sainath, …, Chung-Cheng Chiu, …, Stella Laurenzo, …, Wolfgang Macherey, …, Rohit Prabhavalkar, …, Sébastien Jean, …, Kuan-Chieh Wang, Ekaterina Gonina, …, George F. Foster, John Richardson, …, Antoine Bruguier, …, Vijayaditya Peddinti, …, Michiel Bacchiani, Thomas B. Jablin, Robert Suderman, …, Dmitry Lepikhin, …, Shubham Toshniwal, …, Michael Nirschl, …

CoRR, 2019

Two-Pass End-to-End Speech Recognition.

Tara N. Sainath, …, Rohit Prabhavalkar, …, Mirkó Visontai, …, Trevor Strohman, …, Chung-Cheng Chiu

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Streaming End-to-end Speech Recognition for Mobile Devices.

Proceedings of the IEEE International Conference on Acoustics, 2019

Joint Endpointing and Decoding with End-to-end Models.

Shuo-Yiin Chang, Rohit Prabhavalkar, …, Tara N. Sainath, …

Proceedings of the IEEE International Conference on Acoustics, 2019

2017

Streaming small-footprint keyword spotting using sequence-to-sequence models.

…, Rohit Prabhavalkar, …

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Self-adaptive Failure Detector for Peer-to-Peer Distributed System Considering the Link Faults.

Proceedings of the Advanced Parallel Processing Technologies, 2017

2016

Using Pronunciation-Based Morphological Subword Units to Improve OOV Handling in Keyword Search.

…, Brian Hutchinson, …, Eric Fosler-Lussier, Janet B. Pierrehumbert

IEEE ACM Trans. Audio Speech Lang. Process., 2016

2015

Segmental conditional random fields with deep neural networks as acoustic models for first-pass word recognition.

…, Eric Fosler-Lussier

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Deep neural network based spectral feature mapping for robust speech recognition.

…, Eric Fosler-Lussier, …

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Improvements on transducing syllable lattice to word lattice for keyword search.

…, James Hieronymus

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition.

…, Michael I. Mandel, …, Andrew R. Plummer, Eric Fosler-Lussier

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Syllable based keyword search: Transducing syllable lattices to word lattices.

…, James Hieronymus, …, Eric Fosler-Lussier, …

Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Subword-based modeling for handling OOV words in keyword spotting.

…, Brian Hutchinson, …, Eric Fosler-Lussier, Janet B. Pierrehumbert

Proceedings of the IEEE International Conference on Acoustics, 2014

Virtual Machine Scheduling Considering Both Computing and Cooling Energy.

Proceedings of the 2014 IEEE International Conference on High Performance Computing and Communications, 2014

Scalability Analysis and Improvement of Hadoop Virtual Cluster with Cost Consideration.

…, Zhongzhong Chen

Proceedings of the 2014 IEEE 7th International Conference on Cloud Computing, Anchorage, AK, USA, June 27, 2014

2013

Conditional Random Fields in Speech, Audio, and Language Processing.

Eric Fosler-Lussier, …, Rohit Prabhavalkar

Proc. IEEE, 2013

HPACS: A High Privacy and Availability Cloud Storage Platform with Matrix Encryption.

Proceedings of the Advanced Parallel Processing Technologies, 2013

2012

Efficient Segmental Conditional Random Fields for One-Pass Phone Recognition.

…, Eric Fosler-Lussier

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

vHadoop: A Scalable Hadoop Virtual Cluster Platform for MapReduce-Based Parallel Machine Learning with Performance Consideration.

Proceedings of the 2012 IEEE International Conference on Cluster Computing Workshops, 2012
