Zhen Huang

Orcid: 0000-0002-1772-7674

Affiliations:

Georgia Institute of Technology, School of Electrical and Computer Engineering, Atlanta, GA, USA

According to our database, Zhen Huang authored at least 34 papers between 2012 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Contextualization of ASR with LLM using phonetic retrieval-based augmentation.

Ernest Pusateri, Christophe Van Gysel

CoRR, 2024

Focused Discriminative Training For Streaming CTC-Trained Automatic Speech Recognition Models.

CoRR, 2024

Enhancing CTC-based speech recognition with diverse modeling units.

CoRR, 2024

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices.

Mahesh Krishnamoorthy

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Industry Track, 2024

Personalization of CTC-Based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization.

Ernest Pusateri, Mirko Hannemann

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Conformer-Based Speech Recognition On Extreme Edge-Computing Devices.

Mahesh Krishnamoorthy

CoRR, 2023

Acoustic Model Fusion For End-to-End Speech Recognition.

Ernest Pusateri, Mirko Hannemann

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

A Treatise On FST Lattice Based MMI Training.

Antti-Veikko Rosti

CoRR, 2022

2020

SNDCNN: Self-Normalizing Deep CNNs with Scaled Exponential Linear Units for Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Exploring Retraining-free Speech Recognition for Intra-sentential Code-switching.

Sabato Marco Siniscalchi

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Improving Deep Neural Network Based Speech Synthesis through Contextual Feature Parametrization and Multi-Task Learning.

J. Signal Process. Syst., 2018

2017

Bayesian adaptation and combination of deep models for automatic speech recognition.

PhD thesis, 2017

Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition.

Sabato Marco Siniscalchi

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Hierarchical Bayesian combination of plug-in maximum a posteriori decoders in deep neural networks-based speech recognition and speaker adaptation.

Sabato Marco Siniscalchi

Pattern Recognit. Lett., 2017

An End-to-End Deep Learning Approach to Simultaneous Speech Dereverberation and Acoustic Modeling for Robust Speech Recognition.

Sabato Marco Siniscalchi

IEEE J. Sel. Top. Signal Process., 2017

A reverberation-time-aware DNN approach leveraging spatial information for microphone array dereverberation.

Sabato Marco Siniscalchi

EURASIP J. Adv. Signal Process., 2017

A transfer learning and progressive stacking approach to reducing deep model sizes with an application to speech enhancement.

Sabato Marco Siniscalchi

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A unified deep modeling approach to simultaneous speech dereverberation and recognition for the reverb challenge.

Sabato Marco Siniscalchi

Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

2016

A unified approach to transfer learning of deep neural networks with applications to speaker adaptation in automatic speech recognition.

Sabato Marco Siniscalchi

Neurocomputing, 2016

Learning auxiliary categorical information for speech synthesis based on deep and recurrent neural networks.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Towards a direct Bayesian adaptation framework for deep models.

Sabato Marco Siniscalchi

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

Maximum a Posteriori Adaptation of Network Parameters in Deep Models.

Sabato Marco Siniscalchi

CoRR, 2015

Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

DNN-based speech bandwidth expansion and its application to adding high-frequency missing features for automatic speech recognition of narrowband speech.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Maximum a posteriori adaptation of network parameters in deep models.

Sabato Marco Siniscalchi

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Rapid adaptation for deep neural networks through multi-task learning.

Sabato Marco Siniscalchi

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

Beyond cross-entropy: towards better frame-level objective functions for deep neural network training in automatic speech recognition.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Feature space maximum a posteriori linear regression for adaptation of deep neural networks.

Sabato Marco Siniscalchi

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A maximal figure-of-merit learning approach to maximizing mean average precision with deep neural network based classifiers.

Proceedings of the IEEE International Conference on Acoustics, 2014

Deep learning vector quantization for acoustic information retrieval.

Proceedings of the IEEE International Conference on Acoustics, 2014

An i-vector based descriptor for alphabetical gesture recognition.

Ville Hautamäki

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

TRECVID 2013 GENIE: Multimedia Event Detection and Recounting.

Proceedings of the 2013 TREC Video Retrieval Evaluation, 2013

A blind segmentation approach to acoustic event detection based on i-vector.

Ville Hautamäki

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

TRECVID 2012 GENIE: Multimedia Event Detection and Recounting.

Proceedings of the 2012 TREC Video Retrieval Evaluation, 2012
