Éva Székely

Orcid: 0000-0003-1175-840X

According to our database, Éva Székely authored at least 55 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





On csauthors.net:


Should you use a probabilistic duration model in TTS? Probably! Especially for spontaneous speech.
CoRR, 2024

Voice and Choice: Investigating the Role of Prosodic Variation in Request Compliance and Perceived Politeness Using Conversational TTS.
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2024

Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching.
Proceedings of the IEEE International Conference on Acoustics, 2024

Unified Speech and Gesture Synthesis Using Flow Matching.
Proceedings of the IEEE International Conference on Acoustics, 2024

Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

The Role of Creaky Voice in Turn Taking and the Perception of Speaker Stance: Experiments Using Controllable TTS.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Situating Speech Synthesis: Investigating Contextual Factors in the Evaluation of Conversational TTS.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

The Impact of Pause-Internal Phonetic Particles on Recall in Synthesized Lectures.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Can a gender-ambiguous voice reduce gender stereotypes in human-robot interactions?
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Hi robot, it's not what you say, it's how you say it.
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters.
Proceedings of the 23rd ACM International Conference on Intelligent Virtual Agents, 2023

So-to-Speak: An Exploratory Platform for Investigating the Interplay between Style and Prosody in TTS.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Prosody-controllable Gender-ambiguous Speech Synthesis: A Tool for Investigating Implicit Bias in Speech Perception.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

OverFlow: Putting flows on top of neural transducers for better TTS.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Beyond Style: Synthesizing Speech with Pragmatic Functions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pardon my disfluency: The impact of disfluency effects on the perception of speaker competence and confidence.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Synthesis after a couple PINTs: Investigating the Role of Pause-Internal Phonetic Particles in Speech Synthesis and Perception.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS.
Proceedings of the IEEE International Conference on Acoustics, 2023

Prosody-Controllable Spontaneous TTS with Neural HMMs.
Proceedings of the IEEE International Conference on Acoustics, 2023

Why is my Agent so Slow? Deploying Human-Like Conversational Turn-Taking.
Proceedings of the International Conference on Human-Agent Interaction, 2023

Casual chatter or speaking up? Adjusting articulatory effort in generation of speech and animation for conversational characters.
Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

Evaluating Sampling-based Filler Insertion with Spontaneous TTS.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Where's the uh, hesitation? The interplay between filled pause location, speech rate and fundamental frequency in perception of confidence.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Neural HMMs Are All You Need (For High-Quality Attention-Free TTS).
Proceedings of the IEEE International Conference on Acoustics, 2022

Perception of smiling voice in spontaneous speech synthesis.
Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Personality in the mix - investigating the contribution of fillers and speaking style to the perception of spontaneous speech synthesis.
Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Integrated Speech and Gesture Synthesis.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

Augmented Prompt Selection for Evaluation of Spontaneous Speech Synthesis.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Generating coherent spontaneous speech and gesture from text.
Proceedings of the IVA '20: ACM International Conference on Intelligent Virtual Agents, 2020

Breathing and Speech Planning in Spontaneous Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Speech Synthesis Evaluation - State-of-the-Art Assessment and Suggestion for a Novel Research Program.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

How to train your fillers: uh and um in spontaneous speech synthesis.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Spontaneous Conversational Speech Synthesis from Found Data.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The Greennn Tree - Lengthening Position Influences Uncertainty Perception.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Casting to Corpus: Segmenting and Selecting Spontaneous Dialogue for TTS with a CNN-LSTM Speaker-dependent Breath Detector.
Proceedings of the IEEE International Conference on Acoustics, 2019

Mapping Theoretical and Methodological Perspectives for Understanding Speech Interface Interactions.
Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

Synthesising Uncertainty: The Interplay of Vocal Effort and Hesitation Disfluencies.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Using crowd-sourcing for the design of listening agents: challenges and opportunities.
Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, 2017

They Know as Much as We Do: Knowledge Estimation and Partner Modelling of Artificial Partners.
Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017

The effect of soft, modal and loud voice levels on entrainment in noisy conditions.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Predicting synthetic voice style from facial expressions. An application for augmented conversations.
Speech Commun., 2014

Facial expression-based affective speech translation.
J. Multimodal User Interfaces, 2014

A system for facial expression-based affective speech translation.
Proceedings of the 18th International Conference on Intelligent User Interfaces, 2013

Synthesizing expressive speech from amateur audiobook recordings.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

WinkTalk: a demonstration of a multimodal speech synthesis platform linking facial expressions to expressive synthetic voices.
Proceedings of the Third Workshop on Speech and Language Processing for Assistive Technologies, 2012

Evaluating expressive speech synthesis from audiobook corpora for conversational phrases.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Detecting a targeted voice style in an audiobook using voice quality features.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Clustering Expressive Speech Styles in Audiobooks Using Glottal Source Parameters.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

UCD Blizzard Challenge 2011 Entry.
Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011, 2011
