2024

Optimizing feature fusion for improved zero-shot adaptation in text-to-speech synthesis.

EURASIP Journal on Audio, Speech, and Music Processing, December 2024

Enhancing Open-Set Speaker Identification Through Rapid Tuning With Speaker Reciprocal Points and Negative Sample.

Proceedings of the IEEE Spoken Language Technology Workshop, 2024

StyleFusion TTS: Multimodal Style-Control and Enhanced Feature Fusion for Zero-Shot Text-to-Speech Synthesis.

Proceedings of the Pattern Recognition and Computer Vision - 7th Chinese Conference, 2024

MM-KWS: Multi-modal Prompts for Multilingual User-defined Keyword Spotting.

Proceedings of the 25th Annual Conference of the International Speech Communication Association, 2024