Zhongqiu Wang

ORCID: 0000-0002-4204-9430

Affiliations:
  • Southern University of Science and Technology, Department of Computer Science and Engineering, Shenzhen, China
  • Carnegie Mellon University, Language Technologies Institute, Pittsburgh, PA, USA (2021 - 2024)
  • Google Research, Cambridge, MA, USA
  • Mitsubishi Electric Research Laboratories, Cambridge, MA, USA
  • Ohio State University, Department of Computer Science and Engineering, Columbus, OH, USA (PhD 2020)


According to our database, Zhongqiu Wang authored at least 68 papers between 2015 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
USDnet: Unsupervised Speech Dereverberation via Neural Forward Filtering.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Mixture to Mixture: Leveraging Close-Talk Mixtures as Weak-Supervision for Speech Separation.
IEEE Signal Process. Lett., 2024

Cross-Talk Reduction.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2024

Boosting Unknown-Number Speaker Separation with Transformer Decoder-Based Attractor.
Proceedings of the IEEE International Conference on Acoustics, 2024

Summary on the Multimodal Information-Based Speech Processing (MISP) 2023 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing.
J. Open Source Softw., November, 2023

Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310).
Dataset, October, 2023

STFT-Domain Neural Speech Enhancement With Very Low Algorithmic Latency.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

TF-GridNet: Integrating Full- and Sub-Band Modeling for Speech Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Tackling the Cocktail Fork Problem for Separation and Transcription of Real-World Soundtracks.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction.
CoRR, 2023

The CHiME-7 DASR Challenge: Distant Meeting Transcription with Multiple Devices in Diverse Scenarios.
CoRR, 2023

Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling.
CoRR, 2023

Multi-Channel Target Speaker Extraction with Refinement: The WavLab Submission to the Second Clarity Enhancement Challenge.
CoRR, 2023

Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

UNSSOR: Unsupervised Neural Speech Separation by Leveraging Over-determined Training Mixtures.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling.
Proceedings of the IEEE International Conference on Acoustics, 2023

TF-GridNet: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023

Multi-Channel Speaker Extraction with Adversarial Training: The WavLab Submission to the Clarity ICASSP 2023 Grand Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023

Toward Universal Speech Enhancement For Diverse Input Conditions.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

A Single Speech Enhancement Model Unifying Dereverberation, Denoising, Speaker Counting, Separation, And Extraction.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
Neural Spectrospatial Filtering.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Improving Frame-Online Neural Speech Enhancement With Overlapped-Frame Prediction.
IEEE Signal Process. Lett., 2022

ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Localization based Sequential Grouping for Continuous Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2022

Locate This, Not that: Class-Conditioned Sound Event DOA Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2022

The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks.
Proceedings of the IEEE International Conference on Acoustics, 2022

Conditional Diffusion Probabilistic Model for Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022

Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPnet-SE Submission to the L3DAS22 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

On the Compensation Between Magnitude and Phase in Speech Separation.
IEEE Signal Process. Lett., 2021

Leveraging Low-Distortion Target Estimates for Improved Speech Enhancement.
CoRR, 2021

Anomalous Sound Detection Using Attentive Neural Processes.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Convolutive Prediction for Reverberant Speech Separation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Count And Separate: Incorporating Speaker Counting For Continuous Speaker Separation.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Complex Spectral Mapping for Single- and Multi-Channel Speech Enhancement and Robust ASR.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Deep Learning Based Target Cancellation for Speech Dereverberation.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speaker Separation.
CoRR, 2020

Multi-Microphone Complex Spectral Mapping for Speech Dereverberation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Two-Stage Deep Learning for Noisy-Reverberant Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Robust Speaker Localization Guided by Deep Learning-Based Time-Frequency Masking.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Combining Spectral and Spatial Features for Deep Learning Based Blind Speaker Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Alternating Between Spectral and Spatial Estimation for Speech Separation and Enhancement.
CoRR, 2019

Deep Learning Based Multi-Channel Speaker Recognition in Noisy and Reverberant Environments.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Robust TDOA Estimation Based on Time-Frequency Masking and Deep Neural Networks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

All-Neural Multi-Channel Speech Enhancement.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Integrating Spectral and Spatial Features for Multi-Channel Speaker Separation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

On Spatial Features for Supervised Speech Separation and Its Application to Beamforming and Robust ASR.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Mask Weighted STFT Ratios for Relative Transfer Function Estimation and Its Application to Robust ASR.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Alternative Objective Functions for Deep Clustering.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Speech emotion recognition based on Gaussian Mixture Models and Deep Neural Networks.
Proceedings of the 2017 Information Theory and Applications Workshop, 2017

A two-stage algorithm for noisy and reverberant speech enhancement.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

A speech enhancement algorithm by iterating single- and multi-microphone processing and its application to robust ASR.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Unsupervised speaker adaptation of batch normalized acoustic models for robust ASR.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Recurrent deep stacking networks for supervised speech separation.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Learning utterance-level representations for speech emotion and age/gender recognition using deep neural networks.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
A Joint Training Framework for Robust Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Phoneme-specific speech separation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Robust speech recognition from ratio masks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Joint training of speech separation, filterbank and acoustic model for robust automatic speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Combining spectral feature mapping and multi-channel model-based source separation for noise-robust automatic speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
