Zhen Huang

Orcid: 0000-0002-1772-7674

Affiliations:
  • Georgia Institute of Technology, School of Electrical and Computer Engineering, Atlanta, GA, USA


According to our database, Zhen Huang authored at least 34 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Contextualization of ASR with LLM using phonetic retrieval-based augmentation.
CoRR, 2024

Focused Discriminative Training For Streaming CTC-Trained Automatic Speech Recognition Models.
CoRR, 2024

Enhancing CTC-based speech recognition with diverse modeling units.
CoRR, 2024

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, 2024

Personalization of CTC-Based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
Conformer-Based Speech Recognition On Extreme Edge-Computing Devices.
CoRR, 2023

Acoustic Model Fusion For End-to-End Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
A Treatise On FST Lattice Based MMI Training.
CoRR, 2022

2020
SNDCNN: Self-Normalizing Deep CNNs with Scaled Exponential Linear Units for Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019
Exploring Retraining-free Speech Recognition for Intra-sentential Code-switching.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

2018
Improving Deep Neural Network Based Speech Synthesis through Contextual Feature Parametrization and Multi-Task Learning.
J. Signal Process. Syst., 2018

2017
Bayesian adaptation and combination of deep models for automatic speech recognition.
PhD thesis, 2017

Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation.
Pattern Recognit. Lett., 2017

An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition.
IEEE J. Sel. Top. Signal Process., 2017

A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation.
EURASIP J. Adv. Signal Process., 2017

A transfer learning and progressive stacking approach to reducing deep model sizes with an application to speech enhancement.
Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

A unified deep modeling approach to simultaneous speech dereverberation and recognition for the reverb challenge.
Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

2016
A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition.
Neurocomputing, 2016

Learning auxiliary categorical information for speech synthesis based on deep and recurrent neural networks.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Towards a direct Bayesian adaptation framework for deep models.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Maximum a Posteriori Adaptation of Network Parameters in Deep Models.
CoRR, 2015

Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Maximum a posteriori adaptation of network parameters in deep models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Rapid adaptation for deep neural networks through multi-task learning.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014
Beyond cross-entropy: towards better frame-level objective functions for deep neural network training in automatic speech recognition.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Feature space maximum a posteriori linear regression for adaptation of deep neural networks.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A maximal figure-of-merit learning approach to maximizing mean average precision with deep neural network based classifiers.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Deep learning vector quantization for acoustic information retrieval.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

An i-vector based descriptor for alphabetical gesture recognition.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013
A blind segmentation approach to acoustic event detection based on i-vector.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

