2025

Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement.

Xueyao Zhang

Xiaohui Zhang

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing.

CoRR, 2024

2023

Zero-Shot Accent Conversion using Pseudo Siamese Disentanglement Network.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2020

WaveFlow: A Compact Flow-based Model for Raw Audio.

Proceedings of the 37th International Conference on Machine Learning, 2020

Non-Autoregressive Neural Text-to-Speech.

Proceedings of the 37th International Conference on Machine Learning, 2020

Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

2019

Multi-Speaker End-to-End Speech Synthesis.

CoRR, 2019

Parallel Neural Text-to-Speech.

CoRR, 2019

ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech.

Wei Ping

Kainan Peng

Jitong Chen

Proceedings of the 7th International Conference on Learning Representations, 2019

2018

Neural Voice Cloning with a Few Samples.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning.

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Deep Voice 3: 2000-Speaker Neural Text-to-Speech.

CoRR, 2017

Deep Voice 2: Multi-Speaker Neural Text-to-Speech.

Andrew Gibiansky

Sercan Ömer Arik

Gregory Frederick Diamos

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017