Josef V. Psutka

Orcid: 0000-0003-4761-1645

Affiliations:
  • University of West Bohemia, NIIS, Pilsen, Czech Republic


According to our database, Josef V. Psutka authored at least 59 papers between 2001 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
A Comparative Analysis of Bilingual and Trilingual Wav2Vec Models for Automatic Speech Recognition in Multilingual Oral History Archives.
CoRR, 2024

2023
Transfer Learning of Transformer-Based Speech Recognition Models from Czech to Slovak.
Proceedings of the Text, Speech, and Dialogue - 26th International Conference, 2023

Transformer-based Speech Recognition Models for Oral History Archives in English, German, and Czech.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

2022
Exploring Capabilities of Monolingual Audio Transformers using Large Datasets in Automatic Speech Recognition of Czech.
CoRR, 2022

Transformer-Based Automatic Speech Recognition of Formal and Colloquial Czech in MALACH Project.
Proceedings of the Text, Speech, and Dialogue - 25th International Conference, 2022

2021
CNN-TDNN-Based Architecture for Speech Recognition Using Grapheme Models in Bilingual Czech-Slovak Task.
Proceedings of the Text, Speech, and Dialogue - 24th International Conference, 2021

Recognition of Heavily Accented and Emotional Speech of English and Czech Holocaust Survivors Using Various DNN Architectures.
Proceedings of the Speech and Computer - 23rd International Conference, 2021

Various DNN-HMM Architectures Used in Acoustic Modeling with Single-Speaker and Single-Channel.
Proceedings of the Statistical Language and Speech Processing, 2021

Spoken Term Detection and Relevance Score Estimation Using Dot-Product of Pronunciation Embeddings.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Live TV Subtitling Through Respeaking.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020
Live TV subtitling through respeaking with remote cutting-edge technology.
Multim. Tools Appl., 2020

Complexity of the TDNN Acoustic Model with Respect to the HMM Topology.
Proceedings of the Text, Speech, and Dialogue, 2020

Diarization Based on Identification with X-Vectors.
Proceedings of the Speech and Computer - 22nd International Conference, 2020

Increasing the Accuracy of the ASR System by Prolonging Voiceless Phonemes in the Speech of Patients Using the Electrolarynx.
Proceedings of the Speech and Computer - 22nd International Conference, 2020

2019
Sample size for maximum-likelihood estimates of Gaussian model depending on dimensionality of pattern space.
Pattern Recognit., 2019

Diarization of the Language Consulting Center Telephone Calls.
Proceedings of the Speech and Computer - 21st International Conference, 2019

2018
First Insight into the Processing of the Language Consulting Center Data.
Proceedings of the Speech and Computer - 20th International Conference, 2018

Towards Processing of the Oral History Interviews and Related Printed Documents.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

On the Use of Grapheme Models for Searching in Large Spoken Archives.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Recognition of the Electrolaryngeal Speech: Comparison Between Human and Machine.
Proceedings of the Text, Speech, and Dialogue - 20th International Conference, 2017

An Analysis of the RNN-Based Spoken Term Detection Training.
Proceedings of the Speech and Computer - 19th International Conference, 2017

A Relevance Score Estimation for Spoken Term Detection Based on RNN-Generated Pronunciation Embeddings.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2015
Sample Size for Maximum Likelihood Estimates of Gaussian Model.
Proceedings of the Computer Analysis of Images and Patterns, 2015

Gaussian Mixture Model Selection Using Multiple Random Subsampling with Initialization.
Proceedings of the Computer Analysis of Images and Patterns, 2015

2014
Captioning of Live TV Commentaries from the Olympic Games in Sochi: Some Interesting Insights.
Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

2013
Online Speaker Adaptation of an Acoustic Model Using Face Recognition.
Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Covariance Matrix Enhancement Approach to Train Robust Gaussian Mixture Models of Speech Data.
Proceedings of the Speech and Computer - 15th International Conference, 2013

Towards Live Subtitling of TV Ice-hockey Commentary.
Proceedings of the SIGMAP and WINSYS 2013, 2013

2012
Optimized Acoustic Likelihoods Computation for NVIDIA and ATI/AMD Graphics Processors.
IEEE Trans. Speech Audio Process., 2012

Captioning of Live TV Programs through Speech Recognition and Re-speaking.
Proceedings of the Text, Speech and Dialogue - 15th International Conference, 2012

Influence of Different Phoneme Mappings on the Recognition Accuracy of Electrolaryngeal Speech.
Proceedings of the SIGMAP and WINSYS 2012, 2012

Full covariance Gaussian mixture models evaluation on GPU.
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2012

Novel Approach to Live Captioning Through Re-speaking: Tailoring Speech Recognition to Re-speaker's Needs.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011
System for fast lexical and phonetic spoken term detection in a Czech cultural heritage archive.
EURASIP J. Audio Speech Music. Process., 2011

Speaker-Clustered Acoustic Models Evaluated on GPU for On-line Subtitling of Parliament Meetings.
Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011

Optimization of the Gaussian Mixture Model Evaluation on GPU.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010
Gender-Dependent Acoustic Models Fusion Developed for Automatic Subtitling of Parliament Meetings Broadcasted by the Czech TV.
Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

Fast Phonetic/Lexical Searching in the Archives of the Czech Holocaust Testimonies: Advancing Towards the MALACH Project Visions.
Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

2009
Using Morphological Information for Robust Language Modeling in Czech ASR System.
IEEE Trans. Speech Audio Process., 2009

Discriminative Training of Gender-Dependent Acoustic Models.
Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009

Training of Speaker-clustered Acoustic Models for use in Real-time Recognizers.
Proceedings of the SIGMAP 2009, 2009

Fast Speaker Adaptation in Automatic Online Subtitling.
Proceedings of the SIGMAP 2009, 2009

2007
Benefit of Maximum Likelihood Linear Transform (MLLT) Used at Different Levels of Covariance Matrices Clustering in ASR Systems.
Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Feature space reduction and decorrelation in a large number of speech recognition experiments.
Proceedings of the Signal and Image Processing (SIP 2007), 2007

Searching for a Robust MFCC-Based Parameterization for ASR Application.
Proceedings of the SIGMAP 2007, 2007

Live TV Subtitling - Fast 2-pass LVCSR System for Online Subtitling.
Proceedings of the SIGMAP 2007, 2007

What Can and Cannot Be Found in Czech Spontaneous Speech Using Document-Oriented IR Methods - UWB at CLEF 2007 CL-SR Track.
Proceedings of the Advances in Multilingual and Multimodal Information Retrieval, 2007

2006
Automatic Online Subtitling of the Czech Parliament Meetings.
Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006

Benefit of a Class-based Language Model for Real-time Closed-captioning of TV Ice-hockey Commentaries.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Comparison of keyword spotting methods for searching in speech.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Adaptive language model in automatic online subtitling.
Proceedings of the Second IASTED International Conference on Computational Intelligence, 2006

2005
Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004
Issues in Annotation of the Czech Spontaneous Speech Corpus in the MALACH project.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

2003
Towards Automatic Transcription of Spontaneous Czech Speech in the MALACH Project.
Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Building LVCSR System for Transcription of Spontaneously Pronounced Russian Testimonies in the MALACH Project: Initial Steps and First Results.
Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Large vocabulary ASR for spontaneous Czech in the MALACH project.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Automatic Transcription of Czech Language Oral History in the MALACH Project: Resources and Initial Experiments.
Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

2001
The Influence of a Filter Shape in Telephone-Based Recognition Module Using PLP Parameterization.
Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

Comparison of MFCC and PLP parameterizations in the speaker independent continuous speech recognition task.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

