Attention-based End-to-End Models in Language Technology; Attentiopohjaiset kokonaismallit kieliteknologiassa.
PhD thesis, 2024
Principled Comparisons for End-to-End Speech Recognition: Attention vs Hybrid at the 1000-Hour Scale.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Finnish parliament ASR corpus.
Lang. Resour. Evaluation, December, 2023
Lahjoita puhetta: a large-scale corpus of spoken Finnish with some benchmarks.
Lang. Resour. Evaluation, September, 2023
A Pronunciation Scoring System Embedded into Children's Foreign Language Learning Games with Experimental Verification of Learning Benefits.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023
Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Finnish Parliament ASR corpus - Analysis, benchmarks and statistics.
CoRR, 2022
Lahjoita puhetta - a large-scale corpus of spoken Finnish with some benchmarks.
CoRR, 2022
Low Resource Comparison of Attention-based and Hybrid ASR Exploiting wav2vec 2.0.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
An Equal Data Setting for Attention-Based Encoder-Decoder and HMM/DNN Models: A Case Study in Finnish ASR.
Proceedings of the Speech and Computer - 23rd International Conference, 2021
Speaker Verification Experiments for Adults and Children Using Shared Embedding Spaces.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021
Self-Supervised End-to-End ASR for Low Resource L2 Swedish.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Multimodal machine translation through visuals and speech.
Mach. Transl., 2020
Finnish Language Modeling with Deep Transformer Models.
CoRR, 2020
Finnish ASR with Deep Transformer Models.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Speaker-Aware Training of Attention-Based End-to-End Speech Recognition Using Neural Speaker Embeddings.
Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, 2020
SphereDiar: An Effective Speaker Diarization System for Meeting Data.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
The MeMAD Submission to the IWSLT 2018 Speech Translation Task.
Proceedings of the 15th International Conference on Spoken Language Translation, 2018
Captaina: Integrated Pronunciation Practice and Data Collection Portal.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
A pipeline for automatic assessment of foreign language pronunciation.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017
Reading Validation for Pronunciation Evaluation in the Digitala Project.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Digitala: An Augmented Test and Review Process Prototype for High-Stakes Spoken Foreign Language Examination.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016