Zhuo Chen
Orcid: 0000-0003-0563-1760Affiliations:
- Microsoft, Redmond, WA, USA
- Columbia University, New York, NY, USA (PhD 2017)
According to our database1,
Zhuo Chen
authored at least 128 papers
between 2015 and 2024.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
Online presence:
-
on orcid.org
On csauthors.net:
Bibliography
2024
VioLA: Conditional Language Models for Speech Recognition, Synthesis, and Translation.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
ICMC-ASR: The ICASSP 2024 In-Car Multi-Channel Automatic Speech Recognition Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the Computer Vision - ECCV 2024, 2024
2023
VioLA: Unified Codec Language Models for Speech Recognition, Synthesis, and Translation.
CoRR, 2023
Speak Foreign Languages with Your Own Voice: Cross-Lingual Neural Codec Language Modeling.
CoRR, 2023
Toward Scalable Image Feature Compression: A Content-Adaptive and Diffusion-Based Approach.
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the International Conference on Machine Learning, 2023
Proceedings of the IEEE International Conference on Multimedia and Expo, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Vararray Meets T-Sot: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Self-Supervised Learning with Bi-Label Masked Speech Prediction for Streaming Multi-Talker Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
On Decoder-Only Architecture For Speech-to-Text and Large Language Model Integration.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE J. Sel. Top. Signal Process., 2022
CoRR, 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 2022 IEEE International Conference on Image Processing, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
One Model to Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Unispeech-Sat: Universal Speech Representation Learning With Speaker Aware Pre-Training.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
CoRR, 2021
Exploring End-to-End Multi-Channel ASR with Bias Information for Meeting Transcription.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
ESPnet-SE: End-To-End Speech Enhancement and Separation Toolkit Designed for ASR Integration.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Target-Speaker Voice Activity Detection with Improved i-Vector Estimation for Unknown Number of Speaker.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Microsoft Speaker Diarization System for the Voxceleb Speaker Recognition Challenge 2020.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Don't Shoot Butterfly with Rifles: Multi-Channel Continuous Speech Separation with Early Exit Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 29th European Signal Processing Conference, 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
IEEE Trans. Image Process., 2020
Continuous Speech Separation Using Speaker Inventory for Long Multi-talker Recording.
CoRR, 2020
Don't shoot butterfly with rifles: Multi-channel Continuous Speech Separation with Early Exit Transformer.
CoRR, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the IEEE International Conference on Image Processing, 2020
Improving Deep CNN Networks with Long Temporal Context for Text-Independent Speaker Verification.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
End-to-end Microphone Permutation and Number Invariant Multi-channel Speech Separation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Proceedings of the 27th ACM International Conference on Multimedia, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 2019 IEEE International Conference on Image Processing, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
IEEE ACM Trans. Audio Speech Lang. Process., 2018
CoRR, 2018
Multi-Channel Overlapped Speech Recognition with Location Guided Speech Extraction Network.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Multi-Microphone Neural Speech Separation for Far-Field Multi-Talker Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Efficient Integration of Fixed Beamformers and Speech Separation Networks for Multi-Channel Far-Field Speech Separation.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Pattern Recognit., 2017
Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend.
Comput. Speech Lang., 2017
Improving Mask Learning Based Speech Enhancement System with Restoration Layers and Residual Connection.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Forecasting of ionospheric vertical total electron content (TEC) using LSTM networks.
Proceedings of the 2017 International Conference on Machine Learning and Cybernetics, 2017
Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017
Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Neural decoding of attentional selection in multi-speaker environments without access to separated sources.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017
Unsupervised adaptation with domain separation networks for robust speech recognition.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Facial action recognition using very deep networks for highly imbalanced class distribution.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning., 2017
2016
Multim. Tools Appl., 2016
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
Proceedings of the Eighth International Conference on Quality of Multimedia Experience, 2016
Adaptation of Neural Networks Constrained by Prior Statistics of Node Co-Activations.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
Proceedings of the 2015 IEEE International Conference on Systems, 2015
Perceptual Quality Improvement for Synthesis Imaging of Chinese Spectral Radioheliograph.
Proceedings of the Advances in Multimedia Information Processing - PCM 2015, 2015
Speech enhancement and recognition using multi-task learning of long short-term memory recurrent neural networks.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the High Performance Computing and Applications, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
The MERL/SRI system for the 3RD CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015