Éva Székely

Jeff Higginbotham

Francesco Possemato

Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2024

Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching.

Proceedings of the IEEE International Conference on Acoustics, 2024

Unified Speech and Gesture Synthesis Using Flow Matching.

Proceedings of the IEEE International Conference on Acoustics, 2024

Evaluating Text-to-Speech Synthesis from a Large Discrete Token-based Speech Language Model.

Siyang Wang

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

The Role of Creaky Voice in Turn Taking and the Perception of Speaker Stance: Experiments Using Controllable TTS.

Harm Lameris

Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023

On the Use of Self-Supervised Speech Representations in Spontaneous Speech Synthesis.

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Diff-TTSG: Denoising probabilistic integrated speech and gesture synthesis.

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Situating Speech Synthesis: Investigating Contextual Factors in the Evaluation of Conversational TTS.

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Stuck in the MOS pit: A critical analysis of MOS test methodology in TTS evaluation.

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

The Impact of Pause-Internal Phonetic Particles on Recall in Synthesized Lectures.

Mikey Elmers

Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Can a gender-ambiguous voice reduce gender stereotypes in human-robot interactions?

Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Hi robot, it's not what you say, it's how you say it.

Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Generation of speech and facial animation with controllable articulatory effort for amusing conversational characters.

Jonas Beskow

Proceedings of the 23rd ACM International Conference on Intelligent Virtual Agents, 2023

So-to-Speak: An Exploratory Platform for Investigating the Interplay between Style and Prosody in TTS.

Siyang Wang

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Prosody-controllable Gender-ambiguous Speech Synthesis: A Tool for Investigating Implicit Bias in Speech Perception.

Ilaria Torre

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

OverFlow: Putting flows on top of neural transducers for better TTS.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Beyond Style: Synthesizing Speech with Pragmatic Functions.

Harm Lameris

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Pardon my disfluency: The impact of disfluency effects on the perception of speaker competence and confidence.

Ambika Kirkland

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Synthesis after a couple PINTs: Investigating the Role of Pause-Internal Phonetic Particles in Speech Synthesis and Perception.

Mikey Elmers

Johannah O'Mahony

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

A Comparative Study of Self-Supervised Speech Representations in Read and Spontaneous TTS.

Proceedings of the IEEE International Conference on Acoustics, 2023

Prosody-Controllable Spontaneous TTS with Neural HMMs.

Proceedings of the IEEE International Conference on Acoustics, 2023

Why is my Agent so Slow? Deploying Human-Like Conversational Turn-Taking.

Proceedings of the International Conference on Human-Agent Interaction, 2023

Casual chatter or speaking up? Adjusting articulatory effort in generation of speech and animation for conversational characters.

Proceedings of the 17th IEEE International Conference on Automatic Face and Gesture Recognition, 2023

2022

Evaluating Sampling-based Filler Insertion with Spontaneous TTS.

Siyang Wang

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Where's the uh, hesitation? The interplay between filled pause location, speech rate and fundamental frequency in perception of confidence.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Neural HMMs Are All You Need (For High-Quality Attention-Free TTS).

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Perception of smiling voice in spontaneous speech synthesis.

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Personality in the mix - investigating the contribution of fillers and speaking style to the perception of spontaneous speech synthesis.

Jonas Beskow

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Integrated Speech and Gesture Synthesis.

Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

2020

Augmented Prompt Selection for Evaluation of Spontaneous Speech Synthesis.

Jens Edlund

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Generating coherent spontaneous speech and gesture from text.

Proceedings of the IVA '20: ACM International Conference on Intelligent Virtual Agents, 2020

Breathing and Speech Planning in Spontaneous Speech Synthesis.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Speech Synthesis Evaluation - State-of-the-Art Assessment and Suggestion for a Novel Research Program.

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

How to train your fillers: uh and um in spontaneous speech synthesis.

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Spontaneous Conversational Speech Synthesis from Found Data.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

The Greennn Tree - Lengthening Position Influences Uncertainty Perception.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Casting to Corpus: Segmenting and Selecting Spontaneous Dialogue for TTS with a CNN-LSTM Speaker-dependent Breath Detector.

Gustav Eje Henter

Proceedings of the IEEE International Conference on Acoustics, 2019

Mapping Theoretical and Methodological Perspectives for Understanding Speech Interface Interactions.

Proceedings of the Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

2017

Synthesising Uncertainty: The Interplay of Vocal Effort and Hesitation Disfluencies.

Joseph Mendelson

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Using crowd-sourcing for the design of listening agents: challenges and opportunities.

Proceedings of the 1st ACM SIGCHI International Workshop on Investigating Social Interactions with Artificial Agents, 2017

They Know as Much as We Do: Knowledge Estimation and Partner Modelling of Artificial Partners.

Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017

2015

The effect of soft, modal and loud voice levels on entrainment in noisy conditions.

Mark T. Keane

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

Predicting synthetic voice style from facial expressions. An application for augmented conversations.

Speech Commun., 2014

Facial expression-based affective speech translation.

Ingmar Steiner

Zeeshan Ahmed

J. Multimodal User Interfaces, 2014

2013

A system for facial expression-based affective speech translation.

Zeeshan Ahmed

Ingmar Steiner

Proceedings of the 18th International Conference on Intelligent User Interfaces, 2013

2012

Synthesizing expressive speech from amateur audiobook recordings.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

WinkTalk: a demonstration of a multimodal speech synthesis platform linking facial expressions to expressive synthetic voices.

Zeeshan Ahmed

João P. Cabral

Proceedings of the Third Workshop on Speech and Language Processing for Assistive Technologies, 2012

Evaluating expressive speech synthesis from audiobook corpora for conversational phrases.

Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Rapidly Testing the Interaction Model of a Pronunciation Training System via Wizard-of-Oz.

Stephan Schlögl

Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Detecting a targeted voice style in an audiobook using voice quality features.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Clustering Expressive Speech Styles in Audiobooks Using Glottal Source Parameters.

João P. Cabral

Peter Cahill

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

UCD Blizzard Challenge 2011 Entry.
