Kun Zhou

ORCID: 0000-0002-7869-4474

Affiliations:
  • Alibaba DAMO Academy, Singapore
  • National University of Singapore, Department of Electrical and Computer Engineering, Singapore (PhD 2023)


According to our database, Kun Zhou authored at least 21 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Phonetic Enhanced Language Modeling for Text-to-Speech Synthesis.
CoRR, 2024

Converting Anyone's Voice: End-to-End Expressive Voice Conversion with A Conditional Diffusion Model.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

Mixed-EVC: Mixed Emotion Synthesis and Control in Voice Conversion.
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024

MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

SPGM: Prioritizing Local Features for Enhanced Speech Separation Performance.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
Speech Synthesis With Mixed Emotions.
IEEE Trans. Affect. Comput., 2023

Emotion Intensity and its Control for Emotional Voice Conversion.
IEEE Trans. Affect. Comput., 2023

2022
Emotional voice conversion: Theory, databases and ESD.
Speech Commun., 2022

Mixed Emotion Modelling for Emotional Voice Conversion.
CoRR, 2022

Disentanglement of Emotional Style and Speaker Identity for Expressive Voice Conversion.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Identity Conversion for Emotional Speakers: A Study for Disentanglement of Emotion Style and Speaker Identity.
CoRR, 2021

VAW-GAN for Disentanglement and Recomposition of Emotional Elements in Speech.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Limited Data Emotional Voice Conversion Leveraging Text-to-Speech: Two-Stage Sequence-to-Sequence Training.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Seen and Unseen Emotional Style Transfer for Voice Conversion with A New Emotional Speech Dataset.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Expressive Voice Conversion: A Joint Framework for Speaker Identity and Emotional Style Transfer.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Transforming Spectrum and Prosody for Emotional Voice Conversion with Non-Parallel Training Data.
Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Converting Anyone's Emotion: Towards Speaker-Independent Emotional Voice Conversion.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

VAW-GAN for Singing Voice Conversion with Non-parallel Training Data.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Spectrum and Prosody Conversion for Cross-lingual Voice Conversion with CycleGAN.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
Large-Scale Speaker Diarization of Radio Broadcast Archives.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

