2024

Attention-based End-to-End Models in Language Technology; Attentiopohjaiset kokonaismallit kieliteknologiassa.

Aku Rouhe

PhD thesis, 2024

Principled Comparisons for End-to-End Speech Recognition: Attention vs Hybrid at the 1000-Hour Scale.

Aku Rouhe

Tamás Grósz

Mikko Kurimo

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Open-Source Conversational AI with SpeechBrain 1.0.

CoRR, 2024

2023

Finnish parliament ASR corpus.

Lang. Resour. Evaluation, December, 2023

Lahjoita puhetta: a large-scale corpus of spoken Finnish with some benchmarks.

Lang. Resour. Evaluation, September, 2023

A Pronunciation Scoring System Embedded into Children's Foreign Language Learning Games with Experimental Verification of Learning Benefits.

Reima Karhila

Sari Ylinen

Anna-Riikka Smolander

Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022

Finnish Parliament ASR corpus - Analysis, benchmarks and statistics.

CoRR, 2022

Lahjoita puhetta - a large-scale corpus of spoken Finnish with some benchmarks.

CoRR, 2022

Low Resource Comparison of Attention-based and Hybrid ASR Exploiting wav2vec 2.0.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

SpeechBrain: A General-Purpose Speech Toolkit.

CoRR, 2021

An Equal Data Setting for Attention-Based Encoder-Decoder and HMM/DNN Models: A Case Study in Finnish ASR.

Proceedings of the Speech and Computer - 23rd International Conference, 2021

Speaker Verification Experiments for Adults and Children Using Shared Embedding Spaces.

Tuomas Kaseva

Hemant Kumar Kathania

Aku Rouhe

Mikko Kurimo

Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

Self-Supervised End-to-End ASR for Low Resource L2 Swedish.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

Multimodal machine translation through visuals and speech.

Mach. Transl., 2020

Finnish Language Modeling with Deep Transformer Models.

CoRR, 2020

Finnish ASR with Deep Transformer Models.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speaker-Aware Training of Attention-Based End-to-End Speech Recognition Using Neural Speaker Embeddings.

Aku Rouhe

Tuomas Kaseva

Mikko Kurimo

Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020

2019

Spherediar: An Effective Speaker Diarization System for Meeting Data.

Tuomas Kaseva

Aku Rouhe

Mikko Kurimo

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

The MeMAD Submission to the IWSLT 2018 Speech Translation Task.

Proceedings of the 15th International Conference on Spoken Language Translation, 2018

Captaina: Integrated Pronunciation Practice and Data Collection Portal.

Anna-Riikka Smolander

Mikko Kurimo

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017

A pipeline for automatic assessment of foreign language pronunciation.

Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Reading Validation for Pronunciation Evaluation in the Digitala Project.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Digitala: An Augmented Test and Review Process Prototype for High-Stakes Spoken Foreign Language Examination.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016