Vevo: Controllable Zero-Shot Voice Imitation with Self-Supervised Disentanglement.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing.
CoRR, 2024
Zero-Shot Accent Conversion using Pseudo Siamese Disentanglement Network.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
WaveFlow: A Compact Flow-based Model for Raw Audio.
Proceedings of the 37th International Conference on Machine Learning, 2020
Non-Autoregressive Neural Text-to-Speech.
Proceedings of the 37th International Conference on Machine Learning, 2020
Incremental Text-to-Speech Synthesis with Prefix-to-Prefix Framework.
Findings of the Association for Computational Linguistics: EMNLP 2020, 2020
Multi-Speaker End-to-End Speech Synthesis.
CoRR, 2019
Parallel Neural Text-to-Speech.
CoRR, 2019
ClariNet: Parallel Wave Generation in End-to-End Text-to-Speech.
Proceedings of the 7th International Conference on Learning Representations, 2019
Neural Voice Cloning with a Few Samples.
Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018
Deep Voice 3: Scaling Text-to-Speech with Convolutional Sequence Learning.
Proceedings of the 6th International Conference on Learning Representations, 2018
Deep Voice 3: 2000-Speaker Neural Text-to-Speech.
CoRR, 2017
Deep Voice 2: Multi-Speaker Neural Text-to-Speech.
Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017