Tamás Grósz

Anja Virkkunen

Dejan Porjazovski

Proceedings of the 4th on Multimodal Sentiment Analysis Challenge and Workshop: Mimicked Emotions, 2023

Advancing Audio Emotion and Intent Recognition with Large Pre-Trained Models and Bayesian Inference.

Proceedings of the 31st ACM International Conference on Multimedia, 2023

Investigating wav2vec2 context representations and the effects of fine-tuning, a case-study of a Finnish model.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Topic Identification for Spontaneous Speech: Enriching Audio Features with Embedded Linguistic Information.

Dejan Porjazovski

Proceedings of the 31st European Signal Processing Conference, 2023

2022

End-to-end Ensemble-based Feature Selection for Paralinguistics Tasks.

Mittul Singh

Sudarsana Reddy Kadiri

Hemant Kumar Kathania

CoRR, 2022

Lahjoita puhetta - a large-scale corpus of spoken Finnish with some benchmarks.

CoRR, 2022

Wav2vec2-based Paralinguistic Systems to Recognise Vocalised Emotions and Stuttering.

Dejan Porjazovski

Yaroslav Getman

Sudarsana Reddy Kadiri

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Comparison and Analysis of New Curriculum Criteria for End-to-End ASR.

Georgios Karakasidis

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

wav2vec2-based Speech Rating System for Children with Speech Sound Disorder.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Tracing Signs of Urbanity in the Finnish Fiction Film of the 1950s: Toward a Multimodal Analysis of Audiovisual Data.

Proceedings of the 6th Digital Humanities in the Nordic and Baltic Countries Conference (DHNB 2022), 2022

2021

LSTM-XL: Attention Enhanced Long-Term Memory for LSTM Cells.

Proceedings of the Text, Speech, and Dialogue - 24th International Conference, 2021

2020

Social Signal Detection by Probabilistic Sampling DNN Training.

IEEE Trans. Affect. Comput., 2020

Aalto's End-to-End DNN systems for the INTERSPEECH 2020 Computational Paralinguistics Challenge.

Mittul Singh

Sudarsana Reddy Kadiri

Hemant Kumar Kathania

CoRR, 2020

Deep learning in static, metric-based bug prediction.

Array, 2020

Visual Interpretation of DNN-based Acoustic Models using Deep Autoencoders.

Proceedings of the 3rd Workshop on Machine Learning Methods in Visualisation for Big Data, 2020

Data Augmentation Using Prosody and False Starts to Recognize Non-Native Children's Speech.

Hemant Kumar Kathania

Mittul Singh

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019

Automatic segmentation of hyperreflective foci in OCT images.

Comput. Methods Programs Biomed., 2019

Ultrasound-Based Silent Speech Interface Built on a Continuous Vocoder.

Tamás Gábor Csapó

Mohammed Salah Al-Radhi

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Autoencoder-Based Articulatory-to-Acoustic Mapping for Ultrasound Silent Speech Interfaces.

Proceedings of the International Joint Conference on Neural Networks, 2019

A Reconstruction-Free Projection Selection Procedure for Binary Tomography Using Convolutional Neural Networks.

Gergely Pap

Gábor Lékó

Proceedings of the Image Analysis and Recognition - 16th International Conference, 2019

Using Deep Rectifier Neural Nets and Probabilistic Sampling for Topical Unit Classification.

György Kovács

Tamás Váradi

Proceedings of the Cognitive Infocommunications, Theory and Applications, 2019

2018

Training Methods for Deep Neural Network-Based Acoustic Models in Speech Recognition.

PhD thesis, 2018

Efficient visual code localization with neural networks.

Pattern Anal. Appl., 2018

Multi-Task Learning of Speech Recognition and Speech Synthesis Parameters for Ultrasound-based Silent Speech Interfaces.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

General Utterance-Level Feature Extraction for Classifying Crying Sounds, Atypical & Self-Assessed Affect and Heart Beats.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Automatic Detection and Characterization of Biomarkers in OCT Images.

Proceedings of the Image Analysis and Recognition - 15th International Conference, 2018

F0 Estimation for DNN-Based Ultrasound Silent Speech Interfaces.

Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

2017

A Comparative Evaluation of GMM-Free State Tying Methods for ASR.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Training Context-Dependent DNN Acoustic Models Using Probabilistic Sampling.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

DNN-Based Feature Extraction and Classifier Combination for Child-Directed Speech, Cold and Snoring Identification.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

DNN-Based Ultrasound-to-Speech Conversion for a Silent Speech Interface.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016

Detecting Mild Cognitive Impairment from Spontaneous Speech by Correlation-Based Phonetic Feature Selection.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

GMM-Free Flat Start Sequence-Discriminative DNN Training.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Estimating the Sincerity of Apologies in Speech by DNN Rank Learning and Prosodic Analysis.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Determining Native Language and Deception Using Phonetic Features and Classifier Combination.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Topical unit classification using deep neural nets and probabilistic sampling.

György Kovács

Tamás Váradi

Proceedings of the 7th IEEE International Conference on Cognitive Infocommunications, 2016

2015

Assessing the degree of nativeness and Parkinson's condition using Gaussian processes and deep rectifier neural networks.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Building context-dependent DNN acoustic models using Kullback-Leibler divergence-based state tying.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

2014

Document Classification with Deep Rectifier Neural Networks and Probabilistic Sampling.

István Nagy T.

Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

Robust Multi-Band ASR Using Deep Neural Nets and Spectro-temporal Features.

György Kovács

Proceedings of the Speech and Computer - 16th International Conference, 2014

A Sequence Training Method for Deep Rectifier Neural Networks in Speech Recognition.

Proceedings of the Speech and Computer - 16th International Conference, 2014

QR code localization using deep neural networks.

Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2014

Detecting the intensity of cognitive and physical load using AdaBoost and deep rectifier neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Localization of Visual Codes in the DCT Domain Using Deep Rectifier Neural Networks.

Proceedings of the International Workshop on Artificial Neural Networks and Intelligent Information Processing, 2014

2013

A Comparison of Deep Neural Network Training Methods for Large Vocabulary Speech Recognition.
