June Sig Sung

Paris Mastorocostas

Speech Commun., 2023

Investigating Content-Aware Neural Text-to-Speech MOS Prediction Using Prosodic and Linguistic Features.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Predicting phoneme-level prosody latents using AR and flow-based Prior Networks for expressive speech synthesis.

CoRR, 2022

Learning utterance-level representations through token-level acoustic latents prediction for Expressive Speech Synthesis.

CoRR, 2022

Generating Gender-Ambiguous Text-to-Speech Voices.

Spyros Raptis

CoRR, 2022

Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Fine-grained Noise Control for Multispeaker Speech Synthesis.

Spyros Raptis

Gunu Jho

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

SOMOS: The Samsung Open MOS Dataset for the Evaluation of Neural Text-to-Speech Synthesis.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Self supervised learning for robust voice cloning.

Panagiotis Kakoulidis

Spyros Raptis

Gunu Jho

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Karaoker: Alignment-free singing voice synthesis with speech training data.

Panagiotis Kakoulidis

Nikolaos Ellinas

Georgios Vamvoukakis

Gunu Jho

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Rapping-Singing Voice Synthesis based on Phoneme-level Prosody Control.

Georgia Maniati

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Word-Level Style Control for Expressive, Non-attentive Speech Synthesis.

Proceedings of the Speech and Computer - 23rd International Conference, 2021

Improved Prosodic Clustering for Multispeaker and Speaker-Independent Phoneme-Level Prosody Control.

Panos Kakoulidis

Hyoungmin Park

Proceedings of the Speech and Computer - 23rd International Conference, 2021

Cross-Lingual Low Resource Speaker Adaptation Using Phonological Features.

Georgia Maniati

Nikolaos Ellinas

Georgios Vamvoukakis

Hyoungmin Park

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Prosodic Clustering for Phoneme-Level Prosody Control in End-to-End Speech Synthesis.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

The SRCB-SL system for Blizzard Challenge 2021.

Proceedings of the Blizzard Challenge 2021, virtual, October 23, 2021

Vibrato Learning in Multi-Singer Singing Voice Synthesis.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

High Quality Streaming Speech Synthesis with Low, Sentence-Length-Independent Latency.

Nikolaos Ellinas

Georgios Vamvoukakis

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2014

Factored Maximum Penalized Likelihood Kernel Regression for HMM-Based Style-Adaptive Speech Synthesis.

Doo Hwa Hong

Nam Soo Kim

IEEE J. Sel. Top. Signal Process., 2014

2013

Statistical Approaches to Excitation Modeling in HMM-Based Speech Synthesis.

IEICE Trans. Inf. Syst., 2013

Factored maximum likelihood kernelized regression for HMM-based singing voice synthesis.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

2012

Outlier Detection and Removal for HMM-Based Speech Synthesis with an Insufficient Speech Database.

IEICE Trans. Inf. Syst., 2012

Factored MLLR Adaptation Algorithm for HMM-based Expressive TTS.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Artificial stereo data generation for speech feature mapping.

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

2011

Factored MLLR Adaptation.

Nam Soo Kim