2025
CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset.
CoRR, January, 2025

2024
Enhancing Multi-Modal Perception and Interaction: An Augmented Reality Visualization System for Complex Decision Making.
Syst., 2024

MC-SEMamba: A Simple Multi-channel Extension of SEMamba.
CoRR, 2024

Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement.
CoRR, 2024

EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations.
CoRR, 2024

DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments With Advanced Post-Processing.
Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024

EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models with Subjective and Objective Evaluations.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023
Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement.
CoRR, 2023