Gustav Eje Henter

Simon Alexanderson

Jonas Beskow

ACM Trans. Graph., 2020

Robust model training and generalisation with Studentising flows.

Simon Alexanderson

CoRR, 2020

Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows.

Comput. Graph. Forum, 2020

Robust Classification Using Hidden Markov Models and Mixtures of Normalizing Flows.

Proceedings of the 30th IEEE International Workshop on Machine Learning for Signal Processing, 2020

Let's Face It: Probabilistic Multi-modal Interlocutor-aware Generation of Facial Gestures in Dyadic Settings.

Proceedings of the IVA '20: ACM International Conference on Intelligent Virtual Agents, 2020

Generating coherent spontaneous speech and gesture from text.

Proceedings of the IVA '20: ACM International Conference on Intelligent Virtual Agents, 2020

Gesticulator: A framework for semantically-aware speech-driven gesture generation.

Proceedings of the ICMI '20: International Conference on Multimodal Interaction, 2020

Breathing and Speech Planning in Spontaneous Speech Synthesis.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.

CoRR, 2019

Where do the improvements come from in sequence-to-sequence neural TTS?

Oliver Watts

Jason Fong

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Speech Synthesis Evaluation - State-of-the-Art Assessment and Suggestion for a Novel Research Program.

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

How to train your fillers: uh and um in spontaneous speech synthesis.

Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Analyzing Input and Output Representations for Speech-Driven Gesture Generation.

Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents, 2019

Spontaneous Conversational Speech Synthesis from Found Data.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Off the Cuff: Exploring Extemporaneous Speech Delivery with TTS.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Casting to Corpus: Segmenting and Selecting Spontaneous Dialogue for TTS with a CNN-LSTM Speaker-Dependent Breath Detector.

Éva Székely

Joakim Gustafson

Proceedings of the IEEE International Conference on Acoustics, 2019

On the Importance of Representations for Speech-Driven Gesture Generation.

Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

2018

Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis.

Speech Commun., 2018

Deep Encoder-Decoder Models for Unsupervised Learning of Controllable Speech Synthesis.

Xin Wang

Junichi Yamagishi

CoRR, 2018

Kernel Density Estimation-Based Markov Models with Hidden State.

Arne Leijon

CoRR, 2018

Analysing Shortcomings of Statistical Parametric Speech Synthesis.

CoRR, 2018

Cyborg Speech: Deep Multilingual Speech Synthesis for Generating Segmental Foreign Accent with Natural Prosody.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Consensus-based Sequence Training for Video Captioning.

CoRR, 2017

Misperceptions of the Emotional Content of Natural and Vocoded Speech in a Car.

Jaime Lorenzo-Trueba

Junichi Yamagishi

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Principles for Learning Controllable TTS from Annotated and Latent Variation.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Adapting and controlling DNN-based speech synthesis using input codes.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

Bayesian Analysis of Phoneme Confusion Matrices.

Arne Leijon

Martin Dahlquist

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Minimum Entropy Rate Simplification of Stochastic Processes.

IEEE Trans. Pattern Anal. Mach. Intell., 2016

Median-based generation of synthetic speech durations using a non-parametric approach.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A Template-Based Approach for Speech Synthesis Intonation Generation Using LSTMs.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

From HMMs to DNNs: Where do the improvements come from?

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Robust TTS duration modelling using DNNs.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Testing the consistency assumption: Pronunciation variant forced alignment in read and spontaneous speech synthesis.

Rasmus Dall

Sandrine Brognaux

Korin Richmond

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015

Are we using enough listeners? No! - An empirically-supported critique of Interspeech 2014 TTS evaluations.

Mirjam Wester

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014

Measuring the perceptual effects of modelling assumptions in speech synthesis using stimuli constructed from repeated natural speech.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A flexible front-end for HTS.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

2013

Probabilistic Sequence Models with Speech and Language Applications.

PhD thesis, 2013

Maximizing Phoneme Recognition Accuracy for Enhanced Speech Intelligibility in Noise.

Petko Nikolov Petkov

IEEE Trans. Speech Audio Process., 2013

Picking up the pieces: Causal states in noisy data, and how to recover them.

Pattern Recognit. Lett., 2013

2012

Enhancing Subjective Speech Intelligibility Using a Statistical Model of Speech.

Petko Nikolov Petkov

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Gaussian process dynamical models for nonparametric speech representation and synthesis.

Marcus R. Frean

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Intermediate-State HMMs to Capture Continuously-Changing Signal Features.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010

Simplified probability models for generative tasks: A rate-distortion approach.
