2024

Enhanced Acoustic Howling Suppression via Hybrid Kalman Filter and Deep Learning Models.

Hao Zhang

Yixuan Zhang

Meng Yu

Dong Yu

IEEE ACM Trans. Audio Speech Lang. Process., 2024

SSR-Speech: Towards Stable, Safe and Robust Zero-shot Text-based Speech Editing and Synthesis.

CoRR, 2024

2023

Unifying Robustness and Fidelity: A Comprehensive Study of Pretrained Generative Methods for Speech Enhancement in Adverse Conditions.

CoRR, 2023

KalmanNet: A Learnable Kalman Filter for Acoustic Echo Cancellation.

CoRR, 2023

Hybrid AHS: A Hybrid of Kalman Filter and Deep Learning for Acoustic Howling Suppression.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Zoneformer: On-device Neural Beamformer For In-car Multi-zone Speech Separation, Enhancement and Echo Cancellation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Deep Neural Mel-Subband Beamformer for in-Car Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2023

NeuralKalman: A Learnable Kalman Filter for Acoustic Echo Cancellation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

NeuralEcho: Hybrid of Full-Band and Sub-Band Recurrent Neural Network For Acoustic Echo Cancellation and Speech Enhancement.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

Deep learning based multi-source localization with source splitting and its effectiveness in multi-talker speech recognition.

Aswin Shanmugam Subramanian

Comput. Speech Lang., 2022

An investigation of neural uncertainty estimation for target speaker extraction equipped RNN transducer.

Comput. Speech Lang., 2022

NeuralEcho: A Self-Attentive Recurrent Neural Network For Unified Acoustic Echo Suppression And Speech Enhancement.

CoRR, 2022

Enhancing Zero-Shot Many to Many Voice Conversion with Self-Attention VAE.

CoRR, 2022

EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Joint Neural AEC and Beamforming with Double-Talk Detection.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Towards end-to-end Speaker Diarization with Generalized Neural Speaker Clustering.

Proceedings of the IEEE International Conference on Acoustics, 2022

Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization.

Proceedings of the IEEE International Conference on Acoustics, 2022

FAST-RIR: Fast Neural Diffuse Room Impulse Response Generator.

Proceedings of the IEEE International Conference on Acoustics, 2022

Enhancing Zero-Shot Many to Many Voice Conversion via Self-Attention VAE with Structurally Regularized Layers.

Proceedings of the 5th International Conference on Artificial Intelligence for Industries, 2022

2021

Multi-Channel Multi-Frame ADL-MVDR for Target Speech Separation.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Joint AEC and Beamforming with Double-Talk Detection using RNN-Transformer.

CoRR, 2021

Generalized RNN beamformer for target speech separation.

CoRR, 2021

WPD++: An Improved Neural Beamformer for Simultaneous Speech Separation and Dereverberation.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Neural Mask based Multi-channel Convolutional Beamforming for Joint Dereverberation, Echo Cancellation and Denoising.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

MetricNet: Towards Improved Modeling For Non-Intrusive Speech Quality Assessment.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Generalized Spatio-Temporal RNN Beamformer for Target Speech Separation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

MIMO Self-Attentive RNN Beamformer for Multi-Speaker Speech Separation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

A Joint Training Framework of Multi-Look Separator and Speaker Embedding Extractor for Overlapped Speech.

Proceedings of the IEEE International Conference on Acoustics, 2021

Towards Robust Speaker Verification with Target Speaker Enhancement.

Proceedings of the IEEE International Conference on Acoustics, 2021

ADL-MVDR: All Deep Learning MVDR Beamformer for Target Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2021

Self-Supervised Text-Independent Speaker Verification Using Prototypical Momentum Contrastive Learning.

Proceedings of the IEEE International Conference on Acoustics, 2021

Directional ASR: A New Paradigm for E2E Multi-Speaker Speech Recognition with Source Localization.

Aswin Shanmugam Subramanian

Proceedings of the IEEE International Conference on Acoustics, 2021

Improving RNN Transducer with Target Speaker Extraction and Neural Uncertainty Estimation.

Proceedings of the IEEE International Conference on Acoustics, 2021

3D Spatial Features for Multi-Channel Target Speech Separation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Audio-Visual Speech Separation and Dereverberation With a Two-Stage Multimodal Network.

IEEE J. Sel. Top. Signal Process., 2020

Audio-Visual Multi-Channel Recognition of Overlapped Speech.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

DurIAN: Duration Informed Attention Network for Speech Synthesis.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Neural Spatio-Temporal Beamformer for Target Speech Separation.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Multi-Look Keyword Spotting.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Far-Field Location Guided Target Speech Extraction Using End-to-End Speech Recognition Objectives.

Aswin Shanmugam Subramanian

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speaker-Aware Target Speaker Enhancement by Jointly Learning with Speaker Embedding Extraction.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Integration of Multi-Look Beamformers for Multi-Channel Keyword Spotting.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

A Unified Framework for Speech Separation.

Fahimeh Bahmaninezhad

CoRR, 2019

DurIAN: Duration Informed Attention Network For Multimodal Synthesis.

CoRR, 2019

End-to-End Multi-Channel Speech Separation.

CoRR, 2019

Improved Speaker-Dependent Separation for CHiME-5 Challenge.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Direction-Aware Speaker Beam for Multi-Channel Speaker Extraction.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation.

Fahimeh Bahmaninezhad

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Seq2Seq Attentional Siamese Neural Networks for Text-dependent Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2019

Joint Training of Complex Ratio Mask Based Beamformer and Acoustic Model for Noise Robust ASR.

Proceedings of the IEEE International Conference on Acoustics, 2019

Boundary Discriminative Large Margin Cosine Loss for Text-independent Speaker Verification.

Proceedings of the IEEE International Conference on Acoustics, 2019

Multi-band PIT and Model Integration for Improved Multi-channel Speech Separation.

Proceedings of the IEEE International Conference on Acoustics, 2019

Improving Speech Enhancement with Phonetic Embedding Features.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Time Domain Audio Visual Speech Separation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Syllable-Dependent Discriminative Learning for Small Footprint Text-Dependent Speaker Verification.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Text-Dependent Speech Enhancement for Small-Footprint Robust Keyword Detection.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2012

Multi-Channel ℓ1 Regularized Convex Speech Enhancement Model and Fast Computation by the Split Bregman Method.

IEEE Trans. Speech Audio Process., 2012

Exploring Off Time Nature for Speech Enhancement.

Meng Yu

Jack Xin

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Constrained Multichannel Speech Dereverberation.

Meng Yu

Frank K. Soong

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A Triple-Microphone Real-Time Speech Enhancement Algorithm Based on Approximate Array Analytical Solutions.

Meng Yu

Ryan Ritch

Jack Xin

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011

Modeling Category Identification Using Sparse Instance Representation.

Proceedings of the 33rd Annual Meeting of the Cognitive Science Society, 2011

2010

Convexity and fast speech extraction by split Bregman method.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Reducing musical noise in blind source separation by time-domain sparse filters and split Bregman method.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009

A nonlocally weighted soft-constrained natural gradient algorithm for blind separation of reverberant speech.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009