CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset.
CoRR, January 2025
Enhancing Multi-Modal Perception and Interaction: An Augmented Reality Visualization System for Complex Decision Making.
Syst., 2024
MC-SEMamba: A Simple Multi-channel Extension of SEMamba.
CoRR, 2024
Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement.
CoRR, 2024
EMO-Codec: An In-Depth Look at Emotion Preservation capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations.
CoRR, 2024
DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset.
Proceedings of the IEEE Spoken Language Technology Workshop, 2024
Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments With Advanced Post-Processing.
Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024
EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models with Subjective and Objective Evaluations.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024
Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement.
CoRR, 2023