Zdravko Kacic

According to our database, Zdravko Kacic authored at least 94 papers between 1988 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





Capturing Conversational Gestures for Embodied Conversational Agents Using an Optimized Kaneda-Lucas-Tomasi Tracker and Denavit-Hartenberg-Based Kinematic Model.
Sensors, 2022

Binocular Phase-Coded Visual Stimuli for SSVEP-Based BCI.
IEEE Access, 2019

A speech-based distributed architecture platform for an intelligent ambience.
Comput. Electr. Eng., 2018

Improved Differential Evolution for Large-Scale Black-Box Optimization.
IEEE Access, 2018

Real-time fingerprint image enhancement with a two-stage algorithm and block-local normalization.
J. Real Time Image Process., 2017

Context-dependent factored language models.
EURASIP J. Audio Speech Music. Process., 2017

The TTS-driven affective embodied conversational agent EVA, based on a novel conversational-behavior generation algorithm.
Eng. Appl. Artif. Intell., 2017

Quick and efficient definition of hangbefore and hangover criteria for voice activity detection.
Proceedings of the International Conference on Systems, Signals and Image Processing, 2016

QoS Estimation and Prediction of Input Modality in Degraded IP Networks.
Wirel. Pers. Commun., 2015

Detection of objects on waters' surfaces using CEIEMV method.
Comput. Electr. Eng., 2015

Statistical machine translation of subtitles for highly inflected language pair.
Pattern Recognit. Lett., 2014

Manual sorting of numerals in an inflective language for language modelling.
Int. J. Speech Technol., 2014

Describing and Animating Complex Communicative Verbal and Nonverbal Behavior Using Eva-Framework.
Appl. Artif. Intell., 2014

Acoustic classification and segmentation using modified spectral roll-off and variance-based features.
Digit. Signal Process., 2013

The Use of Several Language Models and Its Impact on Word Insertion Penalty in LVCSR.
Proceedings of the Speech and Computer - 15th International Conference, 2013

Voice activity detection algorithm using nonlinear spectral weights, hangover and hangbefore criteria.
Comput. Electr. Eng., 2012

Gradient-Descent Based Unit-Selection Optimization Algorithm Used for Corpus-Based Text-to-Speech Synthesis.
Appl. Artif. Intell., 2011

Form-Oriented Annotation for Building a Functionally Independent Dictionary of Synthetic Movement.
Proceedings of the Cognitive Behavioural Systems, 2011

Remote-based text-to-speech modules' evaluation framework: the RES framework.
Lang. Resour. Evaluation, 2010

Acquisition and Annotation of Slovenian Lombard Speech Database.
Proceedings of the International Conference on Language Resources and Evaluation, 2010

Statistical Modelling of Highly Inflective Languages.
Proceedings of the Encyclopedia of Artificial Intelligence (3 Volumes), 2009

Noise robust F0 determination and epoch-marking algorithms.
Signal Process., 2009

Using Data-Driven Subword Units in Language Model of Highly Inflective Slovenian Language.
Int. J. Pattern Recognit. Artif. Intell., 2009

Online Speech/Music Segmentation Based on the Variance Mean of Filter Bank Energy.
EURASIP J. Adv. Signal Process., 2009

Evaluation of Modules and Tools for Speech Synthesis: the ECESS Framework.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Evaluation of voice activity and voicing detection.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Two step speaker segmentation method using Bayesian information criterion and adapted Gaussian mixtures models.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Large vocabulary continuous speech recognition of an inflected language using stems and endings.
Speech Commun., 2007

Time and space-efficient architecture for a corpus-based text-to-speech synthesis system.
Speech Commun., 2007

A noise robust feature extraction algorithm using joint wavelet packet subband decomposition and AR modeling of speech signals.
Signal Process., 2007

A framework for efficient development of Slovenian written language resources used in speech processing applications.
Int. J. Speech Technol., 2007

A Comprehensive Noise Robust Speech Parameterization Algorithm Using Wavelet Packet Decomposition-Based Denoising and Speech Feature Representation Techniques.
EURASIP J. Adv. Signal Process., 2007

Statistical machine translation from Slovenian to English.
J. Comput. Inf. Technol., 2007

A Unified Approach to Grapheme-to-Phoneme Conversion for the Plattos Slovenian Text-to-Speech System.
Appl. Artif. Intell., 2007

Embodied Conversational Agents in Wizard-of-Oz and Multimodal Interaction Applications.
Proceedings of the Verbal and Nonverbal Communication Behaviours, 2007

ECESS Platform for Web Based TTS Modules and Systems Evaluation.
Proceedings of the Verbal and Nonverbal Features of Human-Human and Human-Machine Interaction, 2007

SINOD - Slovenian non-native speech database.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Sloparl - slovenian parliamentary speech and text corpus for large vocabulary continuous speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Conversion from phoneme based to grapheme based acoustic models for speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Data-driven generation of phonetic broad classes, based on phoneme confusion matrix similarity.
Speech Commun., 2005

A Computationally Efficient Mel-Filter Bank VAD Algorithm for Distributed Speech Recognition Systems.
EURASIP J. Adv. Signal Process., 2005

The COST278 broadcast news segmentation and speaker clustering evaluation - overview, methodology, systems, results.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

BNSI Slovenian broadcast news database - speech and text corpus.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Training the tilt intonation model using the JEMA methodology.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Modelling highly inflected languages.
Inf. Sci., 2004

Using Finite-State Tranducer Theory for Representation on Very Large Scale Lexicons.
Informatica (Slovenia), 2004

Acquisition and Annotation of Slovenian Broadcast News Database.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

The COST 278 MASPER Initiative - Crosslingual Speech Recognition with Large Telephone Databases.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Creating Slovenian Language Resources for Development of Speech-to-speech Translation Components.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

A Data-driven Adaptation of Prosody in a Multilingual TTS.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

The Development and Integration of the LDA-Toolkit Into COST249 SpeechDat(II) SIG Reference Recognizer.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Clustering of triphones using phoneme similarity estimation for the definition of a multilingual set of triphones.
Speech Commun., 2003

Efficient Development of Lexical Language Resources and their Representation.
Int. J. Speech Technol., 2003

Int. J. Speech Technol., 2003

Context-Independent Multilingual Emotion Recognition from Speech Signals.
Int. J. Speech Technol., 2003

Comparison of Acoustic Adaptation Methods in Multilingual Speech Recognition Environment.
Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

Data driven generation of broad classes for decision tree construction in acoustic modeling.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Noise robust speech parameterization based on joint wavelet packet decomposition and autoregressive modeling.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Improved emotion recognition with large set of statistical features.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Overall risk criterion estimation of hidden Markov model parameters.
Speech Commun., 2002

Uniform Speech Recognition Platform for Evaluation of New Algorithms.
Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

Large Vocabulary Speech Recognition of Slovenian Language Using Data-Driven Morphological Models.
Proceedings of the Text, Speech and Dialogue, 5th International Conference, 2002

Preliminary Evaluation of Slovenian Mobile Database PoliDat.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Design and Implementation of the Slovenian Phonetic and Morphology Lexicons for the Use in Spoken Language Applications.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Interface Databases: Design and Collection of a Multilingual Emotional Speech Database.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Objective analysis of emotional speech for English and Slovenian Interface emotional speech databases.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

A comparison of HTK, ISIP and julius in slovenian large vocabulary continuous speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Efficient additive and convolutional noise reduction procedures.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Large Vocabulary Continuous Speech Recognizer for Slovenian Language.
Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

Crosslingual speech recognition with multilingual acoustic models based on agglomerative and tree-based triphone clustering.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

The use of noisy frame elimination and frequency spectrum magnitude reduction in noise robust speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A multilingual, multimodal, speech training system, SPECO.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Representation of large lexica using finite-state transducers for the multilingual text-to-speech synthesis systems.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Speaker normalization based on test to reference speaker mapping.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Topic detection for language model adaptation of highly-inflected languages by using a fuzzy comparison function.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A computational efficient real time noise robust speech recognition based on improved spectral subtraction method.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A multiconditional robust front-end feature extraction with a noise reduction procedure based on improved spectral subtraction algorithm.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A Multimedia, Multilingual Teaching and Training System for Children with Speech Disorders.
Int. J. Speech Technol., 2000

Topic-Sensitive Language Modelling.
Proceedings of the Text, Speech and Dialogue - Third International Workshop, 2000

Design of Optimal Slovenian Speech Corpus for Use in the Concatenative Speech Synthesis System.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

A Computational Platform for Development of Morphologic and Phonetic Lexica.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Issues in Design and Collection of Large Telephone Speech Corpus for Slovenian Language.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

The COST 249 SpeechDat Multilingual Reference Recogniser.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Normalized time-frequency speech representation in articulation training systems.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Looking for topic similarities of highly inflected languages for language model adaptation.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A Noise Robust Multilingual Reference Recogniser Based on Speechdat(II).
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A novel loss function for the overall risk criterion based discriminative training of HMM models.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Agglomerative vs. tree-based clustering for the definition of multilingual set of triphones.
Proceedings of the IEEE International Conference on Acoustics, 2000

SPECO - a multimedia multilingual teaching and training system for speech handicapped children.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Speaker normalization for audio-visual articulation training.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

A study of harmonic features for the speaker recognition.
Speech Commun., 1997

The use of harmonic features in speaker recognition.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Using isofrequency neural column for harmonic sound scene decomposition.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

A Methodology for Efficiency Estimation of the Speech Signal Feature Extraction Methods.
Proceedings of the Pattern Recognition, 1988
