Lauri Juvela

Sebastian J. Schlecht

IEEE Signal Process. Lett., 2024

Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models.

Alec Wright

Alistair Carson

CoRR, 2024

HiFi-Glot: Neural Formant Synthesis with Differentiable Resonant Filters.

CoRR, 2024

Audio Codec Augmentation for Robust Collaborative Watermarking of Speech Synthesis.

Xin Wang

CoRR, 2024

End-to-End Amp Modeling: From Data to Controllable Guitar Amplifier Models.

Stylianos I. Mimilakis

Athanasios Gotsopoulos

CoRR, 2024

Collaborative Watermarking for Adversarial Speech Synthesis.

Xin Wang

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

DDSP-based Neural Waveform Synthesis of Polyphonic Guitar Performance from String-wise MIDI Input.

CoRR, 2023

Speaker-independent neural formant synthesis.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adversarial Guitar Amplifier Modelling with Unpaired Data.

Alec Wright

Vesa Välimäki

Proceedings of the IEEE International Conference on Acoustics, 2023

End-to-End Amp Modeling: from Data to Controllable Guitar Amplifier Models.

Stylianos I. Mimilakis

Kimmo Rauhanen

Athanasios Gotsopoulos

Proceedings of the IEEE International Conference on Acoustics, 2023

2021

Exposure Bias and State Matching in Recurrent Neural Network Virtual Analog Models.

Aleksi Peussa

Eero-Pekka Damskägg

Thomas Sherson

Stylianos I. Mimilakis

Athanasios Gotsopoulos

Vesa Välimäki

Proceedings of the 24th International Conference on Digital Audio Effects, 2021

2020

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech.

Comput. Speech Lang., 2020

Conditional Spoken Digit Generation with StyleGAN.

Kasperi Palkama

Alexander Ilin

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transferring Neural Speech Waveform Synthesizers to Musical Instrument Sounds Generation.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

GlotNet - A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Normal-to-Lombard adaptation of speech synthesis using long short-term memory recurrent neural networks.

Cassia Valentini-Botinhao

Manu Airaksinen

Speech Commun., 2019

The ASVspoof 2019 database.

CoRR, 2019

Vocal Effort Based Speaking Style Conversion Using Vocoder Features and Parallel Learning.

IEEE Access, 2019

Augmented CycleGANs for Continuous Scale Normal-to-Lombard Speaking Style Conversion.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-Spectrogram.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Lombard Speech Synthesis Using Transfer Learning in a Tacotron Text-to-Speech System.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cycle-consistent Adversarial Networks for Non-parallel Vocal Effort Based Speaking Style Conversion.

Proceedings of the IEEE International Conference on Acoustics, 2019

Waveform Generation for Text-to-speech Synthesis Using Pitch-synchronous Multi-scale Generative Adversarial Networks.

Proceedings of the IEEE International Conference on Acoustics, 2019

Deep Learning for Tube Amplifier Emulation.

Proceedings of the IEEE International Conference on Acoustics, 2019

Data Augmentation Strategies for Neural Network F0 Estimation.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis.

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention.

CoRR, 2018

Speaker-independent Raw Waveform Model for Glottal Excitation.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Time-regularized Linear Prediction for Noise-robust Extraction of the Spectral Envelope of Speech.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speech Waveform Synthesis from MFCC Sequences with Generative Adversarial Networks.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017

Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Reducing Mismatch in Training of DNN-Based Glottal Excitation Models in a Statistical Parametric Text-to-Speech System.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Generative Adversarial Network-Based Glottal Waveform Model for Statistical Parametric Speech Synthesis.
