Paavo Alku

ORCID: 0000-0002-8173-9418

According to our database, Paavo Alku authored at least 256 papers between 1988 and 2024.

Bibliography

2024
Exploring the Impact of Fine-Tuning the Wav2vec2 Model in Database-Independent Detection of Dysarthric Speech.
IEEE J. Biomed. Health Informatics, August, 2024

Automatic classification of the severity level of Parkinson's disease: A comparison of speaking tasks, features, and classifiers.
Comput. Speech Lang., January, 2024

Investigation of self-supervised pre-trained models for classification of voice quality from speech and neck surface accelerometer signals.
Comput. Speech Lang., January, 2024

A comparison of data augmentation methods in voice pathology detection.
Comput. Speech Lang., January, 2024

Automatic classification of neurological voice disorders using wavelet scattering features.
Speech Commun., 2024

Pre-trained models for detection and severity level classification of dysarthria from speech.
Speech Commun., 2024

AVID: A speech database for machine learning studies on vocal intensity.
Speech Commun., 2024

Can a Machine Distinguish High and Low Amount of Social Creak in Speech?
CoRR, 2024

2023
Classification of functional dysphonia using the tunable Q wavelet transform.
Speech Commun., November, 2023

Refining a deep learning-based formant tracker using linear prediction methods.
Comput. Speech Lang., June, 2023

Exemplar-Based Sparse Representations for Detection of Parkinson's Disease From Speech.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Automatic Assessment of Parkinson's Disease Using Speech Representations of Phonation and Articulation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Analysis of Instantaneous Frequency Components of Speech Signals for Epoch Extraction.
Comput. Speech Lang., 2023

Classification of Vocal Intensity Category from Speech using the Wav2vec2 and Whisper Embeddings.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Severity Classification of Parkinson's Disease from Speech using Single Frequency Filtering-based Features.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Utilizing Wav2Vec In Database-Independent Voice Disorder Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023

Automatic Classification of Vocal Intensity Category from Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

Wav2vec-Based Detection and Severity Level Classification of Dysarthria From Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
End-to-End Pathological Speech Detection Using Wavelet Scattering Network.
IEEE Signal Process. Lett., 2022

Glottal flow characteristics in vowels produced by speakers with heart failure.
Speech Commun., 2022

A formant modification method for improved ASR of children's speech.
Speech Commun., 2022

Subjective Evaluation of Basic Emotions from Audio-Visual Data.
Sensors, 2022

Convolutional Neural Networks for Classification of Voice Qualities from Speech and Neck Surface Accelerometer Signals.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Comparing 1-dimensional and 2-dimensional spectral feature representations in voice pathology detection using machine learning and deep learning classifiers.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
The Detection of Parkinson's Disease From Speech Using Voice Source Information.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Extraction and Utilization of Excitation Information of Speech: A Review.
Proc. IEEE, 2021

The automatic detection of heart failure using speech signals.
Comput. Speech Lang., 2021

Automatic assessment of intelligibility in speakers with dysarthria from coded telephone speech using glottal features.
Comput. Speech Lang., 2021

Glottal features for classification of phonation type from speech and neck surface accelerometer signals.
Comput. Speech Lang., 2021

A Comparison of Cepstral Features in the Detection of Pathological Voices by Varying the Input and Filterbank of the Cepstrum Computation.
IEEE Access, 2021

Formant Tracking Using Quasi-Closed Phase Forward-Backward Linear Prediction Analysis and Deep Neural Networks.
IEEE Access, 2021

Spectral modification for recognition of children's speech under mismatched conditions.
Proceedings of the 23rd Nordic Conference on Computational Linguistics, 2021

2020
Time-Varying Quasi-Closed-Phase Analysis for Accurate Formant Tracking in Speech Signals.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Automatic intelligibility assessment of dysarthric speech using glottal parameters.
Speech Commun., 2020

Analysis and classification of phonation types in speech and singing voice.
Speech Commun., 2020

Duration of the rhotic approximant /ɹ/ in spastic dysarthria of different severity levels.
Speech Commun., 2020

Analysis and Detection of Pathological Voice Using Glottal Source Features.
IEEE J. Sel. Top. Signal Process., 2020

Excitation Features of Speech for Emotion Recognition Using Neutral Speech as Reference.
Circuits Syst. Signal Process., 2020

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech.
Comput. Speech Lang., 2020

Detection of Specific Language Impairment in Children Using Glottal Source Features.
IEEE Access, 2020

Glottal Source Information for Pathological Voice Detection.
IEEE Access, 2020

Mel-Weighted Single Frequency Filtering Spectrogram for Dialect Identification.
IEEE Access, 2020

Excitation Features of Speech for Speaker-Specific Emotion Detection.
IEEE Access, 2020

Parkinson's Disease Detection from Speech Using Single Frequency Filtering Cepstral Coefficients.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Study of Formant Modification for Children ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Comparison of Glottal Closure Instants Detection Algorithms for Emotional Speech.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
GlotNet - A Raw Waveform Model for the Glottal Excitation in Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Estimation of the glottal source from coded telephone speech using deep neural networks.
Speech Commun., 2019

Dysarthric speech classification from coded telephone speech using glottal features.
Speech Commun., 2019

Analysis of phonation onsets in vowel production, using information from glottal area and flow estimate.
Speech Commun., 2019

Normal-to-Lombard adaptation of speech synthesis using long short-term memory recurrent neural networks.
Speech Commun., 2019

OPENGLOT - An open environment for the evaluation of glottal inverse filtering.
Speech Commun., 2019

Vocal effort compensation for MFCC feature extraction in a shouted versus normal speaker recognition task.
Comput. Speech Lang., 2019

The ASVspoof 2019 database.
CoRR, 2019

Vocal Effort Based Speaking Style Conversion Using Vocoder Features and Parallel Learning.
IEEE Access, 2019

Augmented CycleGANs for Continuous Scale Normal-to-Lombard Speaking Style Conversion.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Mel-Frequency Cepstral Coefficients of Voice Source Waveforms for Classification of Phonation Types in Speech.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

GELP: GAN-Excited Linear Prediction for Speech Synthesis from Mel-Spectrogram.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Lombard Speech Synthesis Using Transfer Learning in a Tacotron Text-to-Speech System.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cycle-consistent Adversarial Networks for Non-parallel Vocal Effort Based Speaking Style Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2019

Waveform Generation for Text-to-speech Synthesis Using Pitch-synchronous Multi-scale Generative Adversarial Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019

Data Augmentation Strategies for Neural Network F0 Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
A Comparison Between STRAIGHT, Glottal, and Sinusoidal Vocoding in Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Speaker recognition from whispered speech: A tutorial survey and an application of time-varying linear prediction.
Speech Commun., 2018

Parameterization of a computational physical model for glottal flow using inverse filtering and high-speed videoendoscopy.
Speech Commun., 2018

Estimation of the glottal flow from speech pressure signals: Evaluation of three variants of iterative adaptive inverse filtering using computational physical modelling of voice production.
Speech Commun., 2018

Comparison of spectral tilt measures for sentence prominence in speech - Effects of dimensionality and adverse noise conditions.
Speech Commun., 2018

Speaking style adaptation in Text-To-Speech synthesis using Sequence-to-sequence models with attention.
CoRR, 2018

Dysarthric Speech Classification Using Glottal Features Computed from Non-words, Words and Sentences.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speaker-independent Raw Waveform Model for Glottal Excitation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Time-regularized Linear Prediction for Noise-robust Extraction of the Spectral Envelope of Speech.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Speech Waveform Synthesis from MFCC Sequences with Generative Adversarial Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Intelligibility Enhancement of Telephone Speech Using Gaussian Process Regression for Normal-to-Lombard Spectral Tilt Conversion.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

The Linear Predictive Modeling of Speech From Higher-Lag Autocorrelation Coefficients Applied to Noise-Robust Speaker Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Quadratic Programming Approach to Glottal Inverse Filtering by Joint Norm-1 and Norm-2 Optimization.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Glottal Vocoding With Frequency-Warped Time-Weighted Linear Prediction.
IEEE Signal Process. Lett., 2017

Comparison of parametrization methods of electroglottographic and inverse filtered acoustic speech pressure signals in distinguishing between phonation types.
Biomed. Signal Process. Control., 2017

Time-Varying Autoregressions for Speaker Verification in Reverberant Conditions.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Glottal Source Estimation from Coded Telephone Speech Using a Deep Neural Network.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Speaking Style Conversion from Normal to Lombard Speech Using a Glottal Vocoder and Bayesian GMMs.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Evaluation of Spectral Tilt Measures for Sentence Prominence Under Different Noise Conditions.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Reducing Mismatch in Training of DNN-Based Glottal Excitation Models in a Statistical Parametric Text-to-Speech System.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Generative Adversarial Network-Based Glottal Waveform Model for Statistical Parametric Speech Synthesis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Effects of Training Data Variety in Generating Glottal Pulses from Acoustic Features with DNNs.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Normal-to-shouted speech spectral mapping for speaker recognition under vocal effort mismatch.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Non-parallel voice conversion using i-vector PLDA: towards unifying speaker verification and transformation.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Lombard speech synthesis using long short-term memory recurrent neural networks.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Frequency-warped time-weighted linear prediction for glottal vocoding.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Feature Extraction Using Power-Law Adjusted Linear Prediction With Application to Speaker Recognition Under Severe Vocal Effort Mismatch.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Phase perception of the glottal excitation and its relevance in statistical parametric speech synthesis.
Speech Commun., 2016

Phase modification for increasing the intelligibility of telephone speech in near-end noise conditions - evaluation of two methods.
Speech Commun., 2016

Previous exposure to intact speech increases intelligibility of its digitally degraded counterpart as a function of stimulus complexity.
NeuroImage, 2016

Comparing human and automatic speech recognition in a perceptual restoration experiment.
Comput. Speech Lang., 2016

Analysis of Face Mask Effect on Speaker Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Majorisation-Minimisation Based Optimisation of the Composite Autoregressive System with Application to Glottal Inverse Filtering.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

The Use of Read versus Conversational Lombard Speech in Spectral Tilt Modeling for Intelligibility Enhancement in Near-End Noise Conditions.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Intelligibility Enhancement at the Receiving End of the Speech Transmission System - Effects of Far-End Noise Reduction.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Time-Varying Quasi-Closed-Phase Weighted Linear Prediction Analysis of Speech for Accurate Formant Detection and Tracking.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Automatic Glottal Inverse Filtering with Non-Negative Matrix Factorization.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

GlottDNN - A Full-Band Glottal Vocoder for Statistical Parametric Speech Synthesis.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

High-pitched excitation generation for glottal vocoding in statistical parametric speech synthesis using a deep neural network.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Quasi closed phase analysis of speech signals using time varying weighted linear prediction for accurate formant tracking.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A subjective listening test of six different artificial bandwidth extension approaches in English, Chinese, German, and Korean.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Vowel Enhancement in Early Stage Spanish Esophageal Speech Using Natural Glottal Flow Pulse and Vocal Tract Frequency Warping.
Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

Speaker recognition for speech under face cover.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Accounting for uncertainty of i-vectors in speaker recognition using uncertainty propagation and modified imputation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Phase perception of the glottal excitation of vocoded speech.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Speech quality evaluation of artificial bandwidth extension: comparing subjective judgments and instrumental predictions.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Comparison of Gaussian process regression and Gaussian mixture models in spectral tilt modelling for intelligibility enhancement of telephone speech.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

AM-FM based filter bank analysis for estimation of spectro-temporal envelopes and its application for speaker recognition in noisy reverberant environments.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Glottal inverse filtering based on quadratic programming.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Learning of a non-native vowel through instructed production training.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Non-native production training with an acoustic model and orthographic or transcription cues.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Does interest in language learning affect the non-native phoneme production in elderly learners?
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

Noise robust estimation of the voice source using a deep neural network.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Quasi Closed Phase Glottal Inverse Filtering Analysis With Weighted Linear Prediction.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Mixture Linear Prediction in Speaker Verification Under Vocal Effort Mismatch.
IEEE Signal Process. Lett., 2014

Synthesis and perception of breathy, normal, and Lombard speech in the presence of noise.
Comput. Speech Lang., 2014

An adaptive post-filtering method producing an artificial Lombard-like effect for intelligibility enhancement of narrowband telephone speech.
Comput. Speech Lang., 2014

Glottal source processing: From analysis to applications.
Comput. Speech Lang., 2014

Automatic glottal inverse filtering with the Markov chain Monte Carlo method.
Comput. Speech Lang., 2014

The harmonic and noise information of the glottal pulses in speech.
Biomed. Signal Process. Control., 2014

Spectral tilt modelling with extrapolated GMMs for intelligibility enhancement of narrowband telephone speech.
Proceedings of the 14th International Workshop on Acoustic Signal Enhancement, 2014

Deep neural network based trainable voice source model for synthesis of speech with varying vocal effort.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Subjective voice quality evaluation of artificial bandwidth extension: comparing different audio bandwidths and speech codecs.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Filtering and subspace selection for spectral features in detecting speech under physical stress.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Enhancement of speech intelligibility in near-end noise conditions with phase modification.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Spectral tilt modelling with GMMs for intelligibility enhancement of narrowband telephone speech.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Automatic estimation of the lip radiation effect in glottal inverse filtering.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Parameterization of the glottal source with the phase plane plot.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Gaussian mixture linear prediction.
Proceedings of the IEEE International Conference on Acoustics, 2014

Multi-scale modulation filtering in automatic detection of emotions in telephone speech.
Proceedings of the IEEE International Conference on Acoustics, 2014

Comparison of post-processing methods for intelligibility enhancement of narrowband speech in a mobile phone framework.
Proceedings of the IEEE International Conference on Acoustics, 2014

Voice source modelling using deep neural networks for statistical parametric speech synthesis.
Proceedings of the 22nd European Signal Processing Conference, 2014

2013
Wavelets for intonation modeling in HMM speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Robust spectral representation using group delay function and stabilized weighted linear prediction for additive noise degradations.
Proceedings of the 7th Conference on Speech Technology and Human-Computer Dialogue, 2013

Lombard modified text-to-speech synthesis for improved intelligibility: submission for the hurricane challenge 2013.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Using group delay functions from all-pole models for speaker recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Analysis and synthesis of shouted speech.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Extended weighted linear prediction using the autocorrelation snapshot - a robust speech analysis method and its application to recognition of vocal emotions.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Speech quality prediction for artificial bandwidth extension algorithms.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Frequency-adaptive post-filtering for intelligibility enhancement of narrowband telephone speech.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Comparison of spectrum estimators in speaker verification: mismatch conditions induced by vocal effort.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Robust formant detection using group delay function and stabilized weighted linear prediction.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Effect of MPEG audio compression on HMM-based speech synthesis.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Quasi closed phase analysis for glottal inverse filtering.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Comparing glottal-flow-excited statistical parametric speech synthesis methods.
Proceedings of the IEEE International Conference on Acoustics, 2013

Automatic detection of anger in telephone speech with robust autoregressive modulation filtering.
Proceedings of the IEEE International Conference on Acoustics, 2013

Speaker identification from shouted speech: Analysis and compensation.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Bandwidth Extension of Telephone Speech to Low Frequencies Using Sinusoidal Synthesis and a Gaussian Mixture Model.
IEEE Trans. Speech Audio Process., 2012

Conversational Evaluation of Speech Bandwidth Extension Using a Mobile Handset.
IEEE Signal Process. Lett., 2012

Regularized All-Pole Models for Speaker Verification Under Noisy Environments.
IEEE Signal Process. Lett., 2012

Cortical processing of degraded speech sounds: Effects of distortion type and continuity.
NeuroImage, 2012

Regularization of all-pole models for speaker verification under additive noise.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Effect of noise type and level on focus related fundamental frequency changes.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Wideband Parametric Speech Synthesis Using Warped Linear Prediction.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Automatic Detection of High Vocal Effort in Telephone Speech.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Voice source analysis using biomechanical modeling and glottal inverse filtering.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Towards Glottal Source Controllability in Expressive Speech Synthesis.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Utilization of the Lombard effect in post-filtering for intelligibility enhancement of telephone speech.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Utilizing Markov Chain Monte Carlo (MCMC) Method for Improved Glottal Inverse Filtering.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Improved formant frequency estimation from high-pitched vowels by downgrading the contribution of the glottal source with weighted linear prediction.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

On measuring the intelligibility of synthetic speech in noise - Do we need a realistic noise environment?
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Conversational evaluation of artificial bandwidth extension of telephone speech using a mobile handset.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Robust speech analysis by lag-weighted linear prediction.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Comparing spectrum estimators in speaker verification under additive noise degradation.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Comparison of post-filtering methods for intelligibility enhancement of telephone speech.
Proceedings of the 20th European Signal Processing Conference, 2012

The GlottHMM Entry for Blizzard Challenge 2012: Hybrid Approach.
Proceedings of the Blizzard Challenge 2012, Portland, OR, USA, September 14, 2012

2011
HMM-Based Speech Synthesis Utilizing Glottal Inverse Filtering.
IEEE Trans. Speech Audio Process., 2011

Bandwidth Extension of Telephone Speech Using a Neural Network and a Filter Bank Implementation for Highband Mel Spectrum.
IEEE Trans. Speech Audio Process., 2011

Cortical encoding of aperiodic and periodic speech sounds: Evidence for distinct neural populations.
NeuroImage, 2011

Estimation of harmonic and noise components of the glottal excitation.
Proceedings of the 7th International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2011

Analysis of HMM-Based Lombard Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture Model.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Detection of Shouted Speech in the Presence of Ambient Noise.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Noise Robust Feature Extraction Based on Extended Weighted Linear Prediction in LVCSR.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Utilizing glottal source pulse library for generating improved excitation signal for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

Speech bandwidth extension using Gaussian mixture model-based estimation of the highband mel spectrum.
Proceedings of the IEEE International Conference on Acoustics, 2011

Shout detection in noise.
Proceedings of the IEEE International Conference on Acoustics, 2011

Glottal inverse filtering using stabilised weighted linear prediction.
Proceedings of the IEEE International Conference on Acoustics, 2011

The GlottHMM Speech Synthesis Entry for Blizzard Challenge 2011: Utilizing Source Unit Selection in HMM-Based Speech Synthesis for Improved Excitation Generation.
Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011

2010
Temporally Weighted Linear Prediction Features for Tackling Additive Noise in Speaker Verification.
IEEE Signal Process. Lett., 2010

Comparison of formant enhancement methods for HMM-based speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Temporally Weighted Linear Prediction Features for Speaker Verification in Additive Noise.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

Laryngeal voice quality in the expression of focus.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Extended weighted linear prediction (XLP) analysis of speech and its application to speaker verification in adverse conditions.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Bandwidth extension of telephone speech using a filter bank implementation for highband MEL spectrum.
Proceedings of the 18th European Signal Processing Conference, 2010

The GlottHMM Speech Synthesis Entry for Blizzard Challenge 2010.
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010

2009
Development, evaluation and implementation of an artificial bandwidth extension method of telephone speech in mobile terminal.
IEEE Trans. Consumer Electron., 2009

Stabilised weighted linear prediction.
Speech Commun., 2009

A LF-pulse from a simple glottal flow model.
Proceedings of the Sixth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2009

New method for delexicalization and its application to prosodic tagging for text-to-speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Weighted linear prediction for speech analysis in noisy conditions.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

On separating glottal source and vocal tract information in telephony speaker verification.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Evaluation of an Artificial Speech Bandwidth Extension Method in Three Languages.
IEEE Trans. Speech Audio Process., 2008

Simple proofs of root locations of two symmetric linear prediction models.
Signal Process., 2008

HMM-based Finnish text-to-speech system utilizing glottal inverse filtering.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

DC-constrained linear prediction for glottal inverse filtering.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

2007
Neural Network-Based Artificial Bandwidth Expansion of Speech.
IEEE Trans. Speech Audio Process., 2007

Minimum Separation of Line Spectral Frequencies.
IEEE Signal Process. Lett., 2007

Laryngeal voice quality changes in expression of prominence in continuous speech.
Proceedings of the Fifth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2007

The effect of highband harmonic structure in the artificial bandwidth expansion of telephone speech.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Stabilised weighted linear prediction - a robust all-pole method for speech processing.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Comparison of multiple voice source parameters in different phonation types.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

2006
Emotions in Vowel Segments of Continuous Speech: Analysis of the Glottal Flow Using the Normalised Amplitude Quotient.
Phonetica, 2006

Quality improvement of telephone speech by artificial bandwidth expansion - listening tests in three languages.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005
Assessment of glottal inverse filtering by using aeroelastic modelling of phonation and FE modelling of vocal tract.
Proceedings of the Fourth International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, 2005

Subglottal pressure and NAQ variation in voice production of classically trained baritone singers.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Group delay function as a means to assess quality of glottal inverse filtering.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

A toolkit for voice inverse filtering and parametrisation.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Artificial Bandwidth Expansion Method to Improve Intelligibility and Quality of AMR-Coded Narrowband Speech.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Objective Quality Measures for Glottal Inverse Filtering of Speech Pressure Signals.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
A time-domain interpretation for the LSP decomposition.
IEEE Trans. Speech Audio Process., 2004

Linear predictive method for improved spectral modeling of lower frequencies of speech with small prediction orders.
IEEE Trans. Speech Audio Process., 2004

Analysis of the voice source in different phonation types: simultaneous high-speed imaging of the vocal fold vibration and glottal inverse filtering.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Evaluation of an inverse filtering technique using physical modeling of voice production.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Emotions in Short Vowel Segments: Effects of the Glottal Flow as Reflected by the Normalized Amplitude Quotient.
Proceedings of the Affective Dialogue Systems, Tutorial and Research Workshop, 2004

2003
On line spectral frequencies.
IEEE Signal Process. Lett., 2003

All-pole modeling technique based on weighted sum of LSP polynomials.
IEEE Signal Process. Lett., 2003

A constrained linear predictive model with the minimum-phase property.
Signal Process., 2003

Linear predictive method with low-frequency emphasis.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

On the stability of constrained linear predictive models.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

All-pole modeling of wide-band speech with symmetric linear prediction.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Time-domain parameterization of the closing phase of glottal airflow waveform from voices over a large intensity range.
IEEE Trans. Speech Audio Process., 2002

Measuring the effect of fundamental frequency raising as a strategy for increasing vocal intensity in soft, normal and loud phonation.
Speech Commun., 2002

Human Cortical Dynamics Determined by Speech Fundamental Frequency.
NeuroImage, 2002

All-pole modeling of wide-band speech using weighted sum of the LSP polynomials.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A time domain reformulation of linear prediction equivalent to the LSP decomposition.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Memory Traces for Words as Revealed by the Mismatch Negativity.
NeuroImage, 2001

One-delayed-mass model for efficient synthesis of glottal flow.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

The use of fundamental frequency raising as a strategy for increasing vocal intensity in soft, normal, and loud phonation.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
A linear predictive method for highly compressed presentation of speech spectra.
Proceedings of the IEEE International Symposium on Circuits and Systems, 2000

Neuromagnetic study on localization of speech sounds.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

MEG-measurements of brain activity reveal the link between human speech production and perception.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Analysis of voice production in breathy, normal and pressed phonation by comparing inverse filtering and videokymography.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

All-pole spectral modelling of voiced speech with a highly compressed set of parameters.
Proceedings of the 10th European Signal Processing Conference, 2000

1999
On the linearity of the relationship between the sound pressure level and the negative peak amplitude of the differentiated glottal flow in vowel production.
Speech Commun., 1999

A new predictive method for all-pole modelling of speech spectra with a compressed set of parameters.
Proceedings of the 1999 International Symposium on Circuits and Systems, ISCAS 1999, Orlando, Florida, USA, May 30, 1999

1998
Separated Linear Prediction - A new all-pole modelling technique for speech analysis.
Speech Commun., 1998

Estimation of amplitude features of the glottal flow by inverse filtering speech pressure signals.
Speech Commun., 1998

Analyzing the effect of secondary excitations of the vocal tract on vocal intensity in different loudness conditions.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A new linear predictive method for compression of speech signals.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Spectral estimation of voiced speech with regressive linear prediction.
Proceedings of the 9th European Signal Processing Conference, 1998

1997
Parabolic spectral parameter - A new method for quantification of the glottal flow.
Speech Commun., 1997

1996
Amplitude domain quotient for characterization of the glottal volume velocity waveform estimated by inverse filtering.
Speech Commun., 1996

A frequency domain method for parametrization of the voice source.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

1994
Estimation of the glottal pulseform based on discrete all-pole modeling.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

1992
Glottal wave analysis with Pitch Synchronous Iterative Adaptive Inverse Filtering.
Speech Commun., 1992

Inverse filtering of the glottal waveform using the Itakura-Saito distortion measure.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

An automatic method to estimate the time-based parameters of the glottal pulseform.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1990
A comparison of EGG and a new automatic inverse filtering method in phonation change from breathy to normal.
Proceedings of the First International Conference on Spoken Language Processing, 1990

Glottal-LPC based coding of telephone band vowels with simple all-pole excitation.
Proceedings of the First International Conference on Spoken Language Processing, 1990

1989
Speech processing in the object-oriented DSP environment QuickSig.
Proceedings of the First European Conference on Speech Communication and Technology, 1989

A new glottal LPC method of low complexity for speech analysis and coding.
Proceedings of the First European Conference on Speech Communication and Technology, 1989

1988
QuickSig - an object-oriented signal processing environment.
Proceedings of the IEEE International Conference on Acoustics, 1988

