Hynek Hermansky

Orcid: 0000-0001-8032-4811

According to our database, Hynek Hermansky authored at least 244 papers between 1983 and 2023.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Awards

IEEE Fellow

IEEE Fellow 2001, "For invention and development of perceptually-based speech processing methods."

Bibliography

2023
Self-supervised Learning with Speech Modulation Dropout.
CoRR, 2023

Stabilized training of joint energy-based models and their practical applications.
CoRR, 2023

Importance of Different Temporal Modulations of Speech: a Tale of two Perspectives.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Blind Signal Dereverberation for Machine Speech Recognition.
CoRR, 2022

Dealing with Unknowns in Continual Learning for End-to-end Automatic Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Complex Frequency Domain Linear Prediction: A Tool to Compute Modulation Spectrum of Speech.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Two-Stage Augmentation and Adaptive CTC Fusion for Improved Robustness of Multi-Stream end-to-end ASR.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Radically Old Way of Computing Spectra: Applications in End-to-End ASR.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020
Multi-Stream End-to-End Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Continual Learning in Automatic Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

An Alternative to MFCCs for ASR.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A Practical Two-Stage Training Strategy for Multi-Stream End-to-End Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
DNN-based performance measures for predicting error rates in automatic speech recognition and optimizing hearing aid parameters.
Speech Commun., 2019

Coding and decoding of messages in human speech communication: Implications for machine recognition of speech.
Speech Commun., 2019

Exploring Methods for the Automatic Detection of Errors in Manual Transcription.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Modulation Vectors as Robust Feature Representation for ASR in Domain Mismatched Conditions.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Performance Monitoring for End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Towards Automatic Methods to Detect Errors in Transcriptions of Speech Recordings.
Proceedings of the IEEE International Conference on Acoustics, 2019

Stream Attention-based Multi-array End-to-end Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

M-vectors: Sub-band Based Energy Modulation Features for Multi-stream Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Deriving Spectro-temporal Properties of Hearing from Speech Data.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Multi-encoder multi-resolution framework for end-to-end speech recognition.
CoRR, 2018

Stream Attention for Distributed Multi-Microphone Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017
Stream Attention for far-field multi-microphone ASR.
CoRR, 2017

Predicting error rates for unknown data in automatic speech recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Performance monitoring for automatic speech recognition in noisy multi-channel environments.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Assessing Speech Quality in Speech-Aware Hearing Aids Based on Phoneme Posteriorgrams.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A Framework for Practical Multistream ASR.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

A new efficient measure for accuracy prediction and its application to multistream-based unsupervised adaptation.
Proceedings of the 23rd International Conference on Pattern Recognition, 2016

Novel neural network based fusion for multistream ASR.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
DNN derived filters for processing of modulation spectrum of speech.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Autoencoder based multi-stream combination for noise robust speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Towards machines that know when they do not know: Summary of work done at 2014 Frederick Jelinek Memorial Workshop.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Uncertainty estimation of DNN classifiers.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Robust speech recognition in unknown reverberant and noisy conditions.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Robust Feature Extraction Using Modulation Filtering of Autoregressive Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Evaluating speech features with the minimal-pair ABX task (II): resistance to noise.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Principal components of auditory spectro-temporal receptive fields.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A long, deep and wide artificial neural net for robust speech recognition in unknown noise.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Featherweight phonetic keyword search for conversational speech.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Factor Analysis of Auto-Associative Neural Networks With Application in Speaker Verification.
IEEE Trans. Neural Networks Learn. Syst., 2013

Perceptual Properties of Current Speech Recognition Technology.
Proc. IEEE, 2013

Multistream Recognition of Speech: Dealing With Unknown Unknowns.
Proc. IEEE, 2013

Long, Deep and Wide Artificial Neural Nets for Dealing with Unexpected Noise in Machine Recognition of Speech.
Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Multi-stream recognition of noisy speech with performance monitoring.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Evaluating speech features with the minimal-pair ABX task: analysis of the classical MFC/PLP pipeline.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Stream selection and integration in multistream ASR using GMM-based performance monitoring.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Robust speaker recognition using spectro-temporal autoregressive models.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Improvements in language identification on the RATS noisy speech corpus.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Text-to-speech inspired duration modeling for improved whole-word acoustic models.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Deep neural network features and semi-supervised training for low resource speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Developing a speaker identification system for the DARPA RATS project.
Proceedings of the IEEE International Conference on Acoustics, 2013

Filter-bank optimization for Frequency Domain Linear Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2013

Effect of filter bandwidth and spectral sampling rate of analysis filterbank on automatic phoneme recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Weak top-down constraints for unsupervised acoustic model training.
Proceedings of the IEEE International Conference on Acoustics, 2013

Mean temporal distance: Predicting ASR error from temporal properties of speech signal.
Proceedings of the IEEE International Conference on Acoustics, 2013

Frequency offset correction in speech without detecting pitch.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
DIRAC: Detection and Identification of Rare Audio-Visual Events.
Proceedings of the Detection and Identification of Rare Audiovisual Cues, 2012

Sparse Multilayer Perceptron for Phoneme Recognition.
IEEE Trans. Speech Audio Process., 2012

Regularized Auto-Associative Neural Networks for Speaker Verification.
IEEE Signal Process. Lett., 2012

Phase AutoCorrelation (PAC) features for noise robust speech recognition.
Speech Commun., 2012

Beyond Novelty Detection: Incongruent Events, When General and Specific Classifiers Disagree.
IEEE Trans. Pattern Anal. Mach. Intell., 2012

Adaptation transforms of auto-associative neural networks as features for speaker verification.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Factor analysis of mixture of auto-associative neural networks for speaker verification.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Feature extraction using 2-d autoregressive models for speaker recognition.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012

Estimating Classifier Performance in Unknown Noise.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Acoustic and Data-driven Features for Robust Speech Activity Detection.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Data-driven Posterior Features for Low Resource Speech Recognition Applications.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Phone recognition in critical bands using sub-band temporal modulations.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

MAP Estimation of Whole-Word Acoustic Models with Dictionary Priors.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Inverting the Point Process Model for Fast Phonetic Keyword Search.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Intrinsic Spectral Analysis for Zero and High Resource Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Analysis of Temporal Resolution in Frequency Domain Linear Prediction.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Multilingual MLP features for low-resource LVCSR systems.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Comparison of Different Approaches for Speech Recognition in Hands-free Mode.
Proceedings of the 10th ITG Conference on Speech Communication, 2012

2011
Analysis of MLP-Based Hierarchical Phoneme Posterior Probability Estimator.
IEEE Trans. Speech Audio Process., 2011

Multi-layer perceptron based speech activity detection for speaker verification.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2011

Dealing with Unexpected Words in Automatic Recognition of Speech.
Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011

Performance monitoring for robustness in automatic recognition of speech.
Proceedings of the 2011 Symposium on Machine Learning in Speech and Language Processing, 2011

Mixture of Auto-Associative Neural Networks for Speaker Verification.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Adaptive Stream Fusion in Multistream Recognition of Speech.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Modulation Spectrum Analysis for Recognition of Reverberant Speech.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Event Selection from Phone Posteriorgrams Using Matched Filters.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Rapid Evaluation of Speech Representations for Spoken Term Discovery.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 Summer Workshop.
Proceedings of the IEEE International Conference on Acoustics, 2011

MLP based phoneme detectors for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Multilayer perceptron with sparse hidden outputs for phoneme recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Autoregressive Models of Amplitude Modulations in Audio Compression.
IEEE Trans. Speech Audio Process., 2010

Data-Driven and Feedback Based Spectro-Temporal Features for Speech Recognition.
IEEE Signal Process. Lett., 2010

Wide-Band Audio Coding Based on Frequency-Domain Linear Prediction.
EURASIP J. Audio Speech Music. Process., 2010

Recovery of Rare Words in Lecture Speech.
Proceedings of the Text, Speech and Dialogue, 13th International Conference, 2010

The use of spike-based representations for hardware audition systems.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

Fully integrated 500uW speech detection wake-up circuit.
Proceedings of the International Symposium on Circuits and Systems (ISCAS 2010), May 30, 2010

A phoneme recognition framework based on auditory spectro-temporal receptive fields.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Cross-lingual and multi-stream posterior features for low resource LVCSR systems.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Sparse auto-associative neural networks: theory and application to speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A multistream multiresolution framework for phoneme recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Towards spoken term discovery at scale with zero resources.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Sparse coding for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

History of modulation spectrum in ASR.
Proceedings of the IEEE International Conference on Acoustics, 2010

Comparison of modulation features for phoneme recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

Robust spectro-temporal features based on autoregressive models of Hilbert envelopes.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Intelligent Multi-modal Interfaces for Mobile Applications in Hostile Environment (IM-HOST).
Proceedings of the Human Machine Interaction, Research Results of the MMI Program, 2009

Applications of signal analysis using autoregressive models for amplitude modulation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009

Error Resilient Speech Coding Using Sub-band Hilbert Envelopes.
Proceedings of the Text, Speech and Dialogue, 12th International Conference, 2009

Tandem representations of spectral envelope and modulation frequency features for ASR.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Arithmetic coding of sub-band residuals in FDLP speech/audio codec.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Discriminant spectrotemporal features for phoneme recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Posterior-based out of vocabulary word detection in telephone speech.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Static and dynamic modulation spectrum for speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Phoneme recognition using spectral envelope and modulation frequency features.
Proceedings of the IEEE International Conference on Acoustics, 2009

Volterra series for analyzing MLP based phoneme posterior estimator.
Proceedings of the IEEE International Conference on Acoustics, 2009

Reconciliation of human and machine speech recognition performance.
Proceedings of the IEEE International Conference on Acoustics, 2009

Temporal envelope subtraction for robust speech recognition using modulation spectrum.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Recognition of Reverberant Speech Using Frequency Domain Linear Prediction.
IEEE Signal Process. Lett., 2008

Emulating Temporal Receptive Fields of Higher Level Auditory Neurons for ASR.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Reverse Correlation for Analyzing MLP Posterior Features in ASR.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Exploiting Contextual Information for Speech/Non-Speech Detection.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Perceptually Motivated Sub-band Decomposition for FDLP Audio Coding.
Proceedings of the Text, Speech and Dialogue, 11th International Conference, 2008

Beyond Novelty Detection: Incongruent Events, when General and Specific Classifiers Disagree.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Hilbert Envelope Based Features for Far-Field Speech Recognition.
Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008

On the combination of auditory and modulation frequency channels for ASR applications.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Introducing temporal asymmetries in feature extraction for automatic speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Combining evidence from a generative and a discriminative model in phoneme recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Front-end for far-field speech recognition based on frequency domain linear prediction.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Spectral noise shaping: improvements in speech/audio codec based on linear prediction in spectral domain.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

The DIRAC AWEAR audio-visual platform for detection of unexpected and incongruent events.
Proceedings of the 10th International Conference on Multimodal Interfaces, 2008

Confidence estimation, OOV detection and language ID using phone-to-word transduction and phone-level alignments.
Proceedings of the IEEE International Conference on Acoustics, 2008

Hierarchical and parallel processing of modulation spectrum for ASR applications.
Proceedings of the IEEE International Conference on Acoustics, 2008

Exploiting contextual information for improved phoneme recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

Temporal masking for bit-rate reduction in audio codec based on Frequency Domain Linear Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2008

Combination of strongly and weakly constrained recognizers for reliable detection of OOVS.
Proceedings of the IEEE International Conference on Acoustics, 2008

Using comparison of parallel phoneme probability streams for OOV word detection.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

Spectro-temporal features for Automatic Speech Recognition using Linear Prediction in spectral domain.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

Emulating temporal receptive fields of auditory mid-brain neurons for automatic speech recognition.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Non-uniform Speech/Audio Coding Exploiting Predictability of Temporal Evolution of Spectral Envelopes.
Proceedings of the Text, Speech and Dialogue, 10th International Conference, 2007

Frequency Domain Linear Prediction for QMF Sub-bands and Applications to Audio Coding.
Proceedings of the Machine Learning for Multimodal Interaction, 2007

Hierarchical neural networks feature extraction for LVCSR system.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Multi-stream features combination based on Dempster-Shafer rule for LVCSR system.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

MRASTA and PLP in automatic speech recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Exploiting phoneme similarities in hybrid HMM-ANN keyword spotting.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Detection of out-of-vocabulary words in posterior based ASR.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Combination of Acoustic Classifiers Based on Dempster-Shafer Theory of Evidence.
Proceedings of the IEEE International Conference on Acoustics, 2007

Wide-Band Perceptual Audio Coding Based on Frequency-Domain Linear Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Speech Coding Based on Spectral Dynamics.
Proceedings of the Text, Speech and Dialogue, 9th International Conference, 2006

Discriminant linear processing of time-frequency plane.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Towards ASR Based on Hierarchical Posterior-Based Keyword Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Pushing the envelope - aside [speech recognition].
IEEE Signal Process. Mag., 2005

Editorial.
EURASIP J. Adv. Signal Process., 2005

The Role of Speech in Multimodal Human-Computer Interaction.
Proceedings of the Text, Speech and Dialogue, 8th International Conference, 2005

Multi-resolution RASTA filtering for TANDEM-based ASR.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004
Entropy based combination of tandem representations for noise robust ASR.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Stochastic techniques in deriving perceptual knowledge.
Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, 2004

New nonsense syllables database - analyses and preliminary ASR experiments.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

PLP-squared: autoregressive modeling of auditory-like 2-d spectro-temporal patterns.
Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, 2004

LP-TRAP: linear predictive temporal patterns.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

On use of task independent training data in tandem feature extraction.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Spectral entropy based feature for robust ASR.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Phase autocorrelation (PAC) features in entropy based multi-stream for robust speech recognition.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Data-driven spectral basis functions for automatic speech recognition.
Speech Commun., 2003

Phoneme Recognition Using Temporal Patterns.
Proceedings of the Text, Speech and Dialogue, 6th International Conference, 2003

In search of target class definition in tandem feature extraction.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Novel approaches for one- and two-speaker detection.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Beyond a single critical-band in TRAP based ASR.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Band-independent speech-event categories for TRAP based ASR.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Local averaging and differentiating of spectral plane for TRAP-based ASR.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Segmentation of speech for speaker and language recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Generalized tandem feature extraction.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Analysis of Information in Speech Based on MANOVA.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

Bark resolution from speech data.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Distributed speech recognition using noise-robust MFCC and traps-estimated manner features.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Qualcomm-ICSI-OGI features for ASR.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Hierarchical tandem feature extraction.
Proceedings of the IEEE International Conference on Acoustics, 2002

A new speaker change detection method for two-speaker segmentation.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Human Speech Perception: Some Lessons from Automatic Speech Recognition.
Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

Data Driven Design of Filter Bank for Speech Recognition.
Proceedings of the Text, Speech and Dialogue, 4th International Conference, 2001

Speaker verification based on broad phonetic categories.
Proceedings of the 2001: A Speaker Odyssey, 2001

Robust ASR front-end using spectral-based and discriminant features: experiments on the Aurora tasks.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A study of two dimensional linear discriminants for ASR.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Relevance of time-frequency features for phonetic and speaker-channel classification.
Speech Commun., 2000

Data-Driven Temporal Filters and Alternatives to GMM in Speaker Verification.
Digit. Signal Process., 2000

Analysis of Information in Speech and Its Application in Speech Recognition.
Proceedings of the Text, Speech and Dialogue - Third International Workshop, 2000

Discriminative MLPs in HMM-based recognition of speech in cellular telephony.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Optimization of units for continuous-digit recognition task.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Temporal patterns of critical-band spectrum for text-to-speech.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Feature extraction using non-linear transformation for robust speech recognition on the Aurora database.
Proceedings of the IEEE International Conference on Acoustics, 2000

Tandem connectionist feature extraction for conventional HMM systems.
Proceedings of the IEEE International Conference on Acoustics, 2000

1999
Speech enhancement using linear prediction residual.
Speech Commun., 1999

On the relative importance of various components of the modulation spectrum for automatic speech recognition.
Speech Commun., 1999

Data-Driven Analysis of Speech.
Proceedings of the Text, Speech and Dialogue - Second International Workshop, 1999

Search for Information Bearing Components in Speech.
Proceedings of the Advances in Neural Information Processing Systems 12, NIPS Conference, Denver, Colorado, USA, November 29, 1999

The purpose, history, current state, and some evolving trends in feature extraction for speech recognition.
Proceedings of the Fifth International Symposium on Signal Processing and its Applications (ISSPA '99), 1999

Speech variability in the modulation spectral domain - SANOVA technique -.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Analysis of sources of variability in speech.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Down-sampling speech representation in ASR.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Relevancy of time-frequency features for phonetic classification measured by mutual information.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Temporal patterns (TRAPs) in ASR of noisy speech.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Should recognizers have ears?
Speech Commun., 1998

On the importance of components of the modulation spectrum for speaker verification.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

TRAPS - classifiers of temporal patterns.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Spectral basis functions from discriminant analysis.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Enhancement of reverberant speech using LP residual.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

On properties of modulation spectrum for robust automatic speech recognition.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

1997
On the effects of short-term spectrum smoothing in channel normalization.
IEEE Trans. Speech Audio Process., 1997

Processing linear prediction residual for speech enhancement.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Data-driven design of RASTA-like filters.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Multi-band and adaptation approaches to robust speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Towards decomposing the sources of variability in speech.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

On the importance of various modulation frequencies for speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Multiresolution channel normalization for ASR in reverberant environments.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Sub-band based recognition of noisy speech.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Towards increasing speech recognition error rates.
Speech Commun., 1996

Towards ASR on partially corrupted speech.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Data based filter design for RASTA-like channel normalization in ASR.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Study on the dereverberation of speech based on temporal envelope filtering.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Intelligibility of speech with filtered time trajectories of spectral envelopes.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Towards subband-based speech recognition.
Proceedings of the 8th European Signal Processing Conference, 1996

1995
The challenge of spoken language systems: research directions for the nineties.
IEEE Trans. Speech Audio Process., 1995

Beyond NYQUIST: towards the recovery of broad-bandwidth speech from narrow-bandwidth speech.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Stochastic perceptual models of speech.
Proceedings of the 1995 International Conference on Acoustics, 1995

Speech enhancement based on temporal processing.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
RASTA processing of speech.
IEEE Trans. Speech Audio Process., 1994

Stochastic perceptual auditory-event-based models for speech recognition.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Integrating RASTA-PLP into speech recognition.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1993
Evaluation and optimization of perceptually-based ASR front-end.
IEEE Trans. Speech Audio Process., 1993

Recognition of speech in additive and convolutional noise based on RASTA spectral processing.
Proceedings of the IEEE International Conference on Acoustics, 1993

1992
Towards handling the acoustic environment in spoken language processing.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

RASTA-PLP speech analysis technique.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Compensation for the effect of the communication channel in auditory-like analysis of speech (RASTA-PLP).
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

Perceptual linear predictive (PLP) analysis-resynthesis technique.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

Continuous speech recognition using PLP analysis with multilayer perceptrons.
Proceedings of the 1991 International Conference on Acoustics, 1991

1990
Towards feature-based speech metric.
Proceedings of the 1990 International Conference on Acoustics, 1990

1989
The effective second formant F2' and the vocal tract front-cavity.
Proceedings of the IEEE International Conference on Acoustics, 1989

1988
Optimization of perceptually-based ASR front-end [automatic speech recognition].
Proceedings of the IEEE International Conference on Acoustics, 1988

1987
Automatic speech recognition and human auditory perception.
Proceedings of the European Conference on Speech Technology, 1987

An efficient speaker-independent automatic speech recognition by simulation of some properties of human auditory perception.
Proceedings of the IEEE International Conference on Acoustics, 1987

1986
Perceptually based processing in automatic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 1986

1985
Low-dimensional representation of vowels based on all-pole modeling in the psychophysical domain.
Speech Commun., 1985

Perceptually based linear predictive analysis of speech.
Proceedings of the IEEE International Conference on Acoustics, 1985

1984
Spectral envelope sampling and interpolation in linear predictive analysis of speech.
Proceedings of the IEEE International Conference on Acoustics, 1984

1983
Analysis and synthesis of speech based on spectral transform linear predictive method.
Proceedings of the IEEE International Conference on Acoustics, 1983

