Chao Zhang
ORCID: 0000-0002-7730-5131
Affiliations:
- Tsinghua University, Department of Electronic Engineering, Beijing, China
- University of Cambridge, Department of Engineering, UK (PhD 2017)
According to our database, Chao Zhang authored at least 113 papers between 2011 and 2025.
Bibliography
2025
Comput. Speech Lang., 2025
2024
Graph Neural Networks for Contextual ASR With the Tree-Constrained Pointer Generator.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
CoRR, 2024
CoRR, 2024
Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition.
CoRR, 2024
Whisper-PMFA: Partial Multi-Scale Feature Aggregation for Speaker Verification using Whisper Models.
CoRR, 2024
Confidence Estimation for Automatic Detection of Depression and Alzheimer's Disease Based on Clinical Interviews.
CoRR, 2024
CoRR, 2024
Spontaneous Speech-Based Suicide Risk Detection Using Whisper and Large Language Models.
CoRR, 2024
CoRR, 2024
CoRR, 2024
M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset.
CoRR, 2024
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Bridging the Gap: Integrating Pre-Trained Speech Enhancement and Recognition Models for Robust Speech Recognition.
Proceedings of the 32nd European Signal Processing Conference, 2024
Bayesian Example Selection Improves In-Context Learning for Speech, Text and Visual Modalities.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
Combining hybrid DNN-HMM ASR systems with attention-based models using lattice rescoring.
Speech Commun., February, 2023
Prosody Modelling With Pre-Trained Cross-Utterance Representations for Improved Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Minimising Biasing Word Errors for Contextual ASR With the Tree-Constrained Pointer Generator.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Estimating the Uncertainty in Emotion Class Labels With Utterance-Specific Dirichlet Priors.
IEEE Trans. Affect. Comput., 2023
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models.
CoRR, 2023
It HAS to be Subjective: Human Annotator Simulation via Zero-shot Density Estimation.
CoRR, 2023
Knowledge Distillation from Multiple Foundation Models for End-to-End Speech Recognition.
CoRR, 2023
Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Improving Speech Enhancement Using Audio Tagging Knowledge From Pre-Trained Representations and Multi-Task Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Transferring Speech-Generic and Depression-Specific Knowledge for Alzheimer's Disease Detection.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
2022
On the similarities of representations in artificial and brain neural networks for speech recognition.
Frontiers Comput. Neurosci., 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
A Truly Multilingual First Pass and Monolingual Second Pass Streaming on-Device ASR System.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Unified End-to-End Speech Recognition and Endpointing for Fast and Efficient Speech Systems.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Tandem Multitask Training of Speaker Diarisation and Speech Recognition for Meeting Transcription.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Streaming End-to-End Multilingual Speech Recognition with Joint Language Identification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Tree-constrained Pointer Generator with Graph Neural Network Encodings for Contextual Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
A distributed optimisation framework combining natural gradient with Hessian-free for discriminative sequence training.
Neural Networks, 2021
Input Length Matters: An Empirical Study Of RNN-T And MWER Training For Long-form Telephony Speech Recognition.
CoRR, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Improving Prosody Modelling with Cross-Utterance BERT Embeddings for End-to-End Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2021
Emotion Recognition by Fusing Time Synchronous and Time Asynchronous Representations.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Transformer Language Models with LSTM-Based Cross-Utterance Information Representation.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
Multimodal Intelligence: Representation Learning, Information Fusion, and Applications.
IEEE J. Sel. Top. Signal Process., 2020
Introduction to the Special Issue on Deep Learning for Multi-Modal Intelligence Across Speech, Language, Vision, and Heterogeneous Signals.
IEEE J. Sel. Top. Signal Process., 2020
Sound Event Localization and Detection Based on Multiple DOA Beamforming and Multi-Task Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Direct-Path Signal Cross-Correlation Estimation for Sound Source Localization in Reverberation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Integrating Source-Channel and Attention-Based Sequence-to-Sequence Models for Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Joint training methods for tandem and hybrid speech recognition systems using deep neural networks.
PhD thesis, 2017
Relating dynamic brain states to dynamic machine states: Human and machine solutions to the speech recognition problem.
PLoS Comput. Biol., 2017
Joint optimisation of tandem systems using Gaussian mixture density neural network discriminative sequence training.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
2016
Selection of Multi-Genre Broadcast Data for the Training of Automatic Speech Recognition Systems.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
DNN speaker adaptation using parameterised sigmoid and ReLU hidden activation functions.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Parameterised sigmoid and ReLU hidden activation functions for DNN acoustic modelling.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Joint decoding of tandem and hybrid systems for improved keyword spotting on low resource languages.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
The Cambridge University 2014 BOLT conversational telephone Mandarin Chinese LVCSR system for speech translation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
The development of the Cambridge University alignment systems for the multi-genre broadcast challenge.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
2014
Proceedings of the IEEE International Conference on Acoustics, 2014
2013
Reliable Accent-Specific Unit Generation With Discriminative Dynamic Gaussian Mixture Selection for Multi-Accent Chinese Speech Recognition.
IEEE Trans. Speech Audio Process., 2013
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013
2012
Discriminative dynamic Gaussian mixture selection with enhanced robustness and performance for multi-accent speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
Reliable accent specific unit generation with dynamic Gaussian mixture selection for multi-accent speech recognition.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011
Proceedings of the International Conference on Asian Language Processing, 2011
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011