Jie Zhang

Zhenhua Ling

Proceedings of the 14th IEEE International Symposium on Chinese Spoken Language Processing, 2024

An End-to-End EEG Channel Selection Method with Residual Gumbel Softmax for Brain-Assisted Speech Enhancement.

Qing-Tian Xu

Zhen-Hua Ling

Proceedings of the IEEE International Conference on Acoustics, 2024

A Study of Multichannel Spatiotemporal Features and Knowledge Distillation on Robust Target Speaker Extraction.

Proceedings of the IEEE International Conference on Acoustics, 2024

Sifisinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter Model.

Proceedings of the IEEE International Conference on Acoustics, 2024

Adversarial Speech for Voice Privacy Protection from Personalized Speech Generation.

Proceedings of the IEEE International Conference on Acoustics, 2024

Multichannel AV-wav2vec2: A Framework for Learning Multichannel Multi-Modal Speech Representation.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Adaptive Video Streaming With Automatic Quality-of-Experience Optimization.

IEEE Trans. Mob. Comput., August, 2023

DUASVS: A Mobile Data Saving Strategy in Short-Form Video Streaming.

IEEE Trans. Serv. Comput., 2023

A Joint Speech Enhancement and Self-Supervised Representation Learning Framework for Noise-Robust Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

SDW-SWF: Speech Distortion Weighted Single-Channel Wiener Filter for Noise Reduction.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Energy-Efficient Sparsity-Driven Speech Enhancement in Wireless Acoustic Sensor Networks.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

A Dynamic Convolution Framework for Session-Independent Speaker Embedding Learning.

Bin Gu

Wu Guo

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Memory Storable Network Based Feature Aggregation for Speaker Representation Learning.

Bin Gu

Wu Guo

IEEE ACM Trans. Audio Speech Lang. Process., 2023

A Semi-Supervised Complementary Joint Training Approach for Low-Resource Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

Rep2wav: Noise Robust Text-to-Speech Using Self-Supervised Representations.

CoRR, 2023

Speech Enhancement with Multi-granularity Vector Quantization.

Xiao-Ying Zhao

Qiu-Shi Zhu

CoRR, 2023

A Speech Distortion Weighted Single-Channel Wiener Filter Based STFT-Domain Noise Reduction.

Rui Tao

Proceedings of the IEEE Statistical Signal Processing Workshop, 2023

Hierarchical Audio-Visual Information Fusion with Multi-label Joint Decoding for MER 2023.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

The USTC's Dialect Speech Translation System for IWSLT 2023.

Proceedings of the 20th International Conference on Spoken Language Translation, 2023

Real-Time Causal Spectro-Temporal Voice Activity Detection Based on Convolutional Encoding and Residual Decoding.

Jingyuan Wang

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Semantic VAD: Low-Latency Voice Activity Detection for Speech Interaction.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

CASA-ASR: Context-Aware Speaker-Attributed ASR.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Robust Data2VEC: Noise-Robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

The NERCSLIP-USTC System for the L3DAS23 Challenge Task2: 3D Sound Event Localization and Detection (SELD).

Proceedings of the IEEE International Conference on Acoustics, 2023

A Multi-Scale Feature Aggregation Based Lightweight Network for Audio-Visual Speech Enhancement.

Proceedings of the IEEE International Conference on Acoustics, 2023

The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge.

Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023

Speech Enhancement with Multi-granularity Vector Quantization.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

A Comparative Study on Multichannel Speaker-Attributed Automatic Speech Recognition in Multi-party Meetings.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

Learning Semantic Information from Machine Translation to Improve Speech-to-Text Translation.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Frequency-Invariant Sensor Selection for MVDR Beamforming in Wireless Acoustic Sensor Networks.

Guanghui Zhang

IEEE Trans. Wirel. Commun., 2022

A Parametric Unconstrained Beamformer Based Binaural Noise Reduction for Assistive Hearing.

Guanghui Zhang

IEEE ACM Trans. Audio Speech Lang. Process., 2022

Speech Enhancement Using Self-Supervised Pre-Trained Model and Vector Quantization.

Xiao-Ying Zhao

Qiu-Shi Zhu

CoRR, 2022

Joint Training of Speech Enhancement and Self-supervised Model for Noise-robust ASR.

CoRR, 2022

A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition.

CoRR, 2022

External Text Based Data Augmentation for Low-Resource Speech Recognition in the Constrained Condition of OpenASR21 Challenge.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Differential Time-frequency Log-mel Spectrogram Features for Vision Transformer Based Infant Cry Recognition.

Hai-tao Xu

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Complementary Joint Training Approach Using Unpaired Speech and Text.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

An Experimental Comparison between Low-Resource Semi-Supervised and High-Resource Supervised Automatic Speech Recognition Models.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2022

Learning Contextually Fused Audio-Visual Representations For Audio-Visual Speech Recognition.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

A Noise-Robust Self-Supervised Pre-Training Model Based Speech Representation Learning for Automatic Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2022

Supervised and Self-Supervised Pretraining Based Covid-19 Detection Using Acoustic Breathing/Cough/Speech Signals.

Proceedings of the IEEE International Conference on Acoustics, 2022

Reference Microphone Selection and Low-Rank Approximation Based Multichannel Wiener Filter with Application to Speech Recognition.

Xing-Yu Chen

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Power Optimized and Power Constrained Randomized Gossip Approaches for Wireless Sensor Networks.

IEEE Wirel. Commun. Lett., 2021

Quantization-Aware Binaural MWF Based Noise Reduction Incorporating External Wireless Devices.

Changheng Li

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Sensor Selection for Relative Acoustic Transfer Function Steered Linearly-Constrained Beamformers.

Jun Du

IEEE ACM Trans. Audio Speech Lang. Process., 2021

A Study on Reference Microphone Selection for Multi-Microphone Speech Enhancement.

Huawei Chen

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Multi-Granularity Sequence Alignment Mapping for Encoder-Decoder Based End-to-End ASR.

IEEE ACM Trans. Audio Speech Lang. Process., 2021

An Improved Wav2Vec 2.0 Pre-Training Approach Using Enhanced Local Dependency Modeling for Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

Joint Sampling Synchronization and Source Localization for Wireless Acoustic Sensor Networks.

Pingping Wu

IEEE Commun. Lett., 2020

Attentive Fusion Enhanced Audio-Visual Encoding for Transformer Based Robust Speech Recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Relative Acoustic Transfer Function Estimation in Wireless Acoustic Sensor Networks.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Distributed Rate-Constrained LCMV Beamforming.

Andreas I. Koutrouvelis

IEEE Signal Process. Lett., 2019

Sensor Selection and Rate Distribution Based Beamforming in Wireless Acoustic Sensor Networks.

Proceedings of the 27th European Signal Processing Conference, 2019

2018

Rate-Distributed Spatial Filtering Based Noise Reduction in Wireless Acoustic Sensor Networks.

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Microphone Subset Selection for MVDR Beamformer Based Noise Reduction.

Sundeep Prabhakar Chepuri

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Rate-Distributed Binaural LCMV Beamforming for Assistive Hearing in Wireless Acoustic Sensor Networks.

Proceedings of the 10th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2018

2017

Binaural Sound Localization Based on Reverberation Weighting and Generalized Parametric Mapping.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

2016

Bi-Direction Interaural Matching Filter and Decision Weighting Fusion for Sound Source Localization in Noisy Environments.

Mengdi Yue

IEICE Trans. Inf. Syst., 2016

Structured total least squares based internal delay estimation for distributed microphone auto-localization.

Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Probabilistic binaural multiple sources localization based on time-delay compensation estimator and clustering analysis.

Mengdi Yue

Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

2015

Robust Acoustic Localization Via Time-Delay Compensation and Interaural Matching Filter.

IEEE Trans. Signal Process., 2015

Binaural cues estimates based on Interaural Matching Filter for sound source localization.

Proceedings of the 2015 IEEE International Conference on Robotics and Biomimetics, 2015

Direction of arrival estimation based on reverberation weighting and noise error estimator.

Cheng Pang

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Binaural sound source localization based on generalized parametric model and two-layer matching strategy in complex environments.

Cheng Pang

Proceedings of the IEEE International Conference on Robotics and Automation, 2015

2014

A new hierarchical binaural sound source localization method based on Interaural Matching Filter.

Zhuo Fu

Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

A binaural sound source localization model based on time-delay compensation and interaural coherence.
