Josef V. Psutka
ORCID: 0000-0003-4761-1645
Affiliations:
- University of West Bohemia, NIIS, Pilsen, Czech Republic
According to our database, Josef V. Psutka authored at least 59 papers between 2001 and 2024.
Bibliography
2024
A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives.
CoRR, 2024
2023
Transfer Learning of Transformer-Based Speech Recognition Models from Czech to Slovak.
Proceedings of the Text, Speech, and Dialogue - 26th International Conference, 2023
Transformer-based Speech Recognition Models for Oral History Archives in English, German, and Czech.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
2022
Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech.
CoRR, 2022
Transformer-Based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project.
Proceedings of the Text, Speech, and Dialogue - 25th International Conference, 2022
2021
CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task.
Proceedings of the Text, Speech, and Dialogue - 24th International Conference, 2021
Recognition of Heavily Accented and Emotional Speech of English and Czech Holocaust Survivors Using Various DNN Architectures.
Proceedings of the Speech and Computer - 23rd International Conference, 2021
Various DNN-HMM Architectures Used in Acoustic Modeling with Single-Speaker and Single-Channel.
Proceedings of the Statistical Language and Speech Processing, 2021
Spoken Term Detection and Relevance Score Estimation Using Dot-Product of Pronunciation Embeddings.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
2020
Multim. Tools Appl., 2020
Proceedings of the Text, Speech, and Dialogue, 2020
Proceedings of the Speech and Computer - 22nd International Conference, 2020
Increasing the Accuracy of the ASR System by Prolonging Voiceless Phonemes in the Speech of Patients Using the Electrolarynx.
Proceedings of the Speech and Computer - 22nd International Conference, 2020
2019
Sample size for maximum-likelihood estimates of Gaussian model depending on dimensionality of pattern space.
Pattern Recognit., 2019
Proceedings of the Speech and Computer - 21st International Conference, 2019
2018
Proceedings of the Speech and Computer - 20th International Conference, 2018
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Proceedings of the Text, Speech, and Dialogue - 20th International Conference, 2017
Proceedings of the Speech and Computer - 19th International Conference, 2017
A Relevance Score Estimation for Spoken Term Detection Based on RNN-Generated Pronunciation Embeddings.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
2015
Proceedings of the Computer Analysis of Images and Patterns, 2015
Gaussian Mixture Model Selection Using Multiple Random Subsampling with Initialization.
Proceedings of the Computer Analysis of Images and Patterns, 2015
2014
Captioning of Live TV Commentaries from the Olympic Games in Sochi: Some Interesting Insights.
Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014
2013
Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013
Covariance Matrix Enhancement Approach to Train Robust Gaussian Mixture Models of Speech Data.
Proceedings of the Speech and Computer - 15th International Conference, 2013
Proceedings of the SIGMAP and WINSYS 2013, 2013
2012
Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors.
IEEE Trans. Speech Audio Process., 2012
Proceedings of the Text, Speech and Dialogue - 15th International Conference, 2012
Influence of Different Phoneme Mappings on the Recognition Accuracy of Electrolaryngeal Speech.
Proceedings of the SIGMAP and WINSYS 2012, 2012
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2012
Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
2011
System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive.
EURASIP J. Audio Speech Music. Process., 2011
Speaker-Clustered Acoustic Models Evaluated on GPU for On-line Subtitling of Parliament Meetings.
Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
2010
Gender-Dependent Acoustic Models Fusion Developed for Automatic Subtitling of Parliament Meetings Broadcasted by the Czech TV.
Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010
Fast Phonetic/Lexical Searching in the Archives of the Czech Holocaust Testimonies: Advancing Towards the MALACH Project Visions.
Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010
2009
IEEE Trans. Speech Audio Process., 2009
Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009
Training of Speaker-clustered Acoustic Models for use in Real-time Recognizers.
Proceedings of the SIGMAP 2009, 2009
Fast Speaker Adaptation in Automatic Online Subtitling.
Proceedings of the SIGMAP 2009, 2009
2007
Benefit of Maximum Likelihood Linear Transform (MLLT) Used at Different Levels of Covariance Matrices Clustering in ASR Systems.
Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007
Feature space reduction and decorrelation in a large number of speech recognition experiments.
Proceedings of the Signal and Image Processing (SIP 2007), 2007
Searching for a Robust MFCC-Based Parameterization for ASR Application.
Proceedings of the SIGMAP 2007, 2007
Live TV Subtitling - Fast 2-pass LVCSR System for Online Subtitling.
Proceedings of the SIGMAP 2007, 2007
What Can and Cannot Be Found in Czech Spontaneous Speech Using Document-Oriented IR Methods - UWB at CLEF 2007 CL-SR Track.
Proceedings of the Advances in Multilingual and Multimodal Information Retrieval, 2007
2006
Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006
Benefit of a Class-based Language Model for Real-time Closed-captioning of TV Ice-hockey Commentaries.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Adaptive language model in automatic online subtitling.
Proceedings of the Second IASTED International Conference on Computational Intelligence, 2006
2005
Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
2004
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004
2003
Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003
Building LVCSR System for Transcription of Spontaneously Pronounced Russian Testimonies in the MALACH Project: Initial Steps and First Results.
Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
2002
Automatic Transcription of Czech Language Oral History in the MALACH Project: Resources and Initial Experiments.
Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002
2001
The Influence of a Filter Shape in Telephone-Based Recognition Module Using PLP Parameterization.
Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001
Comparison of MFCC and PLP parameterizations in the speaker independent continuous speech recognition task.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001