Rongzhi Gu

Orcid: 0000-0003-1861-9170

According to our database¹, Rongzhi Gu authored at least 42 papers between 2017 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

MuQ: Self-Supervised Music Representation Learning with Mel Residual Vector Quantization.

CoRR, January, 2025

2024

The Sound Demixing Challenge 2023 - Cinematic Demixing Track.

Alexander L. Stempkovskiy

Tatiana Habruseva

Mikhail Sukhovei

Yuki Mitsufuji

Trans. Int. Soc. Music. Inf. Retr., January, 2024

ReZero: Region-Customizable Sound Extraction.

Rongzhi Gu

Yi Luo

IEEE ACM Trans. Audio Speech Lang. Process., 2024

SongEditor: Adapting Zero-Shot Song Generation Language Model as a Multi-Task Editor.

CoRR, 2024

MuCodec: Ultra Low-Bitrate Music Codec.

CoRR, 2024

Gull: A Generative Multifunctional Audio Codec.

CoRR, 2024

Fast Random Approximation of Multi-Channel Room Impulse Response.

Yi Luo

Rongzhi Gu

Proceedings of the IEEE International Conference on Acoustics, 2024

A Unified Geometry-Aware Source Localization and Separation Framework for Ad-Hoc Microphone Array.

Proceedings of the IEEE International Conference on Acoustics, 2024

Improving Music Source Separation with SIMO Stereo Band-Split RNN.

Yi Luo

Rongzhi Gu

Proceedings of the IEEE International Conference on Acoustics, 2024

SECap: Speech Emotion Captioning with Large Language Model.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Towards Unified All-Neural Beamforming for Time and Frequency Domain Speech Separation.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

The Sound Demixing Challenge 2023 - Cinematic Demixing Track.

Alexander L. Stempkovskiy

Tatiana Habruseva

Mikhail Sukhovei

Yuki Mitsufuji

CoRR, 2023

3D Neural Beamforming for Multi-channel Speech Separation Against Location Uncertainty.

Rongzhi Gu

Shi-Xiong Zhang

Dong Yu

CoRR, 2023

High Fidelity Speech Enhancement with Band-split RNN.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Ultra Dual-Path Compression For Joint Echo Cancellation And Noise Suppression.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

TSpeech-AI System Description to the 5th Deep Noise Suppression (DNS) Challenge.

Proceedings of the IEEE International Conference on Acoustics, 2023

Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Target Confusion in End-to-end Speaker Extraction: Analysis and Approaches.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Speaker-Aware Mixture of Mixtures Training for Weakly Supervised Speaker Extraction.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Learnable Sparse Filterbank for Speaker Verification.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving Dual-Microphone Speech Enhancement by Learning Cross-Channel Features with Multi-Head Attention.

Xinmeng Xu

Rongzhi Gu

Yuexian Zou

Proceedings of the IEEE International Conference on Acoustics, 2022

Learning Decoupling Features Through Orthogonality Regularization.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Complex Neural Spatial Filter: Enhancing Multi-Channel Target Speech Separation in Complex Domain.

IEEE Signal Process. Lett., 2021

Layer Reduction: Accelerating Conformer-Based Self-Supervised Model via Layer Consistency.

CoRR, 2021

Text Anchor Based Metric Learning for Small-Footprint Keyword Spotting.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Effective Phase Encoding for End-To-End Speaker Verification.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

3D Spatial Features for Multi-Channel Target Speech Separation.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Multi-Modal Multi-Channel Target Speech Separation.

IEEE J. Sel. Top. Signal Process., 2020

Temporal-Spatial Neural Filter: Direction Informed End-to-End Multi-channel Target Speech Separation.

Rongzhi Gu

Yuexian Zou

CoRR, 2020

Audio-Visual Multi-Channel Recognition of Overlapped Speech.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Deep Speaker Embedding with Long Short Term Centroid Learning for Text-Independent Speaker Verification.

Junyi Peng

Rongzhi Gu

Yuexian Zou

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Enhancing End-to-End Multi-Channel Speech Separation Via Spatial Feature Learning.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Context-adaptive Gaussian Attention for Text-independent Speaker Verification.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

End-to-End Multi-Channel Speech Separation.

CoRR, 2019

Neural Spatial Filter: Target Speaker Speech Separation Assisted with Directional Information.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A Comprehensive Study of Speech Separation: Spectrogram vs Waveform Separation.

Fahimeh Bahmaninezhad

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Logistic Similarity Metric Learning via Affinity Matrix for Text-Independent Speaker Verification.

Junyi Peng

Rongzhi Gu

Yuexian Zou

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Speaker-discriminative Embedding Learning via Affinity Matrix for Short Utterance Speaker Verification.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Alleviate Cross-chunk Permutation through Chunk-level Speaker Embedding for Blind Speech Separation.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2017

Interaction Data Detection System to Upgrade Brick and Mortar Shops: Metrics Allow Offline Shops to Compete with Online Retailers.

IEEE Consumer Electron. Mag., 2017

Learning a robust DOA estimation model with acoustic vector sensor cues.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
