2025

CodecFake-Omni: A Large-Scale Codec-based Deepfake Speech Dataset.

Jyh-Shing Roger Jang

CoRR, January 2025

2024

Enhancing Multi-Modal Perception and Interaction: An Augmented Reality Visualization System for Complex Decision Making.

Syst., 2024

MC-SEMamba: A Simple Multi-channel Extension of SEMamba.

CoRR, 2024

Leveraging Joint Spectral and Spatial Learning with MAMBA for Multichannel Speech Enhancement.

CoRR, 2024

EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models With Subjective and Objective Evaluations.

Huang-Cheng Chou

CoRR, 2024

DFADD: The Diffusion and Flow-Matching Based Audio Deepfake Dataset.

Jyh-Shing Roger Jang

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

Robust Audio-Visual Speech Enhancement: Correcting Misassignments in Complex Environments With Advanced Post-Processing.

Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024

EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models with Subjective and Objective Evaluations.

Huang-Cheng Chou

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement.

CoRR, 2023