Carol Y. Espy-Wilson

Orcid: 0000-0002-1012-183X

Affiliations:
  • University of Maryland, College Park, USA


According to our database, Carol Y. Espy-Wilson authored at least 102 papers between 1982 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
CPT-Boosted Wav2vec2.0: Towards Noise Robust Speech Recognition for Classroom Environments.
CoRR, 2024

Self-supervised Multimodal Speech Representations for the Assessment of Schizophrenia Symptoms.
CoRR, 2024

Continued Pretraining for Domain Adaptation of Wav2vec2.0 in Automatic Speech Recognition for Elementary Math Classroom Settings.
CoRR, 2024

Improving Speech Inversion Through Self-Supervised Embeddings and Enhanced Tract Variables.
Proceedings of the 32nd European Signal Processing Conference, 2024

2023
A multi-modal approach for identifying schizophrenia using cross-modal attention.
CoRR, 2023

Kid-Whisper: Towards Bridging the Performance Gap in Automatic Speech Recognition for Children VS. Adults.
CoRR, 2023

Learning to Compute the Articulatory Representations of Speech with the MIRRORNET.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Speaker-independent Speech Inversion for Estimation of Nasalance.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Acoustic-to-Articulatory Speech Inversion Features for Mispronunciation Detection of /ɹ/ in Child Speech Sound Disorders.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Enhancing Speech Articulation Analysis Using A Geometric Transformation of the X-ray Microbeam Dataset.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

The Secret Source: Incorporating Source Features to Improve Acoustic-To-Articulatory Speech Inversion.
Proceedings of the IEEE International Conference on Acoustics, 2023

Masked Autoencoders are Articulatory Learners.
Proceedings of the IEEE International Conference on Acoustics, 2023

Audio Data Augmentation for Acoustic-to-Articulatory Speech Inversion.
Proceedings of the 31st European Signal Processing Conference, 2023

2022
Modeling Feature Representations for Affective Speech Using Generative Adversarial Networks.
IEEE Trans. Affect. Comput., 2022

Spoken language interaction with robots: Recommendations for future research.
Comput. Speech Lang., 2022

Acoustic-to-articulatory Speech Inversion with Multi-task Learning.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multimodal Depression Severity Score Prediction Using Articulatory Coordination Features and Hierarchical Attention Based Text Embeddings.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Acoustic To Articulatory Speech Inversion Using Multi-Resolution Spectro-Temporal Representations Of Speech Signals.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

An Empirical Analysis on the Vulnerabilities of End-to-End Speech Segregation Models.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multimodal Depression Classification using Articulatory Coordination Features and Hierarchical Attention Based text Embeddings.
Proceedings of the IEEE International Conference on Acoustics, 2022

Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Generalized Dilated CNN Models for Depression Detection Using Inverted Vocal Tract Variables.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Speech Based Depression Severity Level Classification Using a Multi-Stage Dilated CNN-LSTM Model.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Multimodal Approach for Assessing Neuromotor Coordination in Schizophrenia Using Convolutional Neural Networks.
Proceedings of the ICMI '21: International Conference on Multimodal Interaction, 2021

2020
Deep Learning Based Generalized Models for Depression Classification.
CoRR, 2020

Spoken Language Interaction with Robots: Research Issues and Recommendations, Report from the NSF Future Directions Workshop.
CoRR, 2020

Extended Study on the Use of Vocal Tract Variables to Quantify Neuromotor Coordination in Depression.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019
Multi-Corpus Acoustic-to-Articulatory Speech Inversion.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multi-Modal Learning for Speech Emotion Recognition: An Analysis and Comparison of ASR Outputs with Ground Truth Transcription.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Assessing Neuromotor Coordination in Depression Using Inverted Vocal Tract Variables.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018
Noise Robust Acoustic to Articulatory Speech Inversion.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

On Enhancing Speech Emotion Recognition Using Generative Adversarial Networks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Smoothing Model Predictions Using Adversarial Training Procedures for Speech Based Emotion Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Semi-Supervised and Transfer Learning Approaches for Low Resource Sentiment Classification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Hybrid convolutional neural networks for articulatory and acoustic information based speech recognition.
Speech Commun., 2017

SCL-UMD at the Medico Task-MediaEval 2017: Transfer Learning based Classification of Medical Images.
Working Notes Proceedings of the MediaEval 2017 Workshop co-located with the Conference and Labs of the Evaluation Forum (CLEF 2017), 2017

Analysis of Acoustic-to-Articulatory Speech Inversion Across Different Accents and Languages.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Adversarial Auto-Encoders for Speech Based Emotion Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An Affect Prediction Approach Through Depression Severity Parameter Incorporation in Neural Networks.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Joint modeling of articulatory and acoustic spaces for continuous speech recognition tasks.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016
Vocal Tract Length Normalization for Speaker Independent Acoustic-to-Articulatory Speech Inversion.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speech Features for Depression Detection.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015
Analysis of coarticulated speech using estimated articulatory trajectories.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Diversity of tongue shapes for the American English rhotic liquid.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

2014
Articulatory features from deep neural networks and their role in speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
A cine MRI-based study of sibilant fricatives production in post-glossectomy speakers.
Proceedings of the IEEE International Conference on Acoustics, 2013

2012
Automatic intelligibility assessment of pathologic speech in head and neck cancer based on auditory-inspired spectro-temporal modulations.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Multicondition training of Gaussian PLDA models in i-vector space for noise and reverberation robust speaker recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Articulatory Information for Noise Robust Speech Recognition.
IEEE Trans. Speech Audio Process., 2011

A Comparative Acoustic Study on Speech of Glossectomy Patients and Normal Subjects.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Automatic Speech Codec Identification with Applications to Tampering Detection of Speech Recordings.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Analysis of i-vector Length Normalization in Speaker Recognition Systems.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Speech inversion: Benefits of tract variables over pellet trajectories.
Proceedings of the IEEE International Conference on Acoustics, 2011

Gesture-based Dynamic Bayesian Network for noise robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Linear versus mel frequency cepstral coefficients for speaker recognition.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Robust speech recognition using articulatory gestures in a Dynamic Bayesian Network framework.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Retrieving Tract Variables From Acoustics: A Comparison of Different Machine Learning Strategies.
IEEE J. Sel. Top. Signal Process., 2010

Joint Factor Analysis for Speaker Recognition Reinterpreted as Signal Coding Using Overcomplete Dictionaries.
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010

A procedure for estimating gestural scores from natural speech.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Robust word recognition using articulatory trajectories and gestures.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An MRI-based articulatory and acoustic study of lateral sound in American English.
Proceedings of the IEEE International Conference on Acoustics, 2010

Automatic acquisition device identification from speech recordings.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Noise robustness of tract variables and their application to speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A noise-type and level-dependent MPO-based speech enhancement architecture with variable frame analysis for noise-robust speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

An algorithm for speech segregation of co-channel speech.
Proceedings of the IEEE International Conference on Acoustics, 2009

From acoustics to Vocal Tract time functions.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
An algorithm for multi-pitch tracking in co-channel speech.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Language and genre detection in audio content analysis.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Intersession variability in speaker recognition: a behind the scene analysis.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Language detection in audio content analysis.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007

An articulatory and acoustic study of "retroflex" and "bunched" American English rhotic sound based on MRI.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Acoustic parameters for the automatic detection of vowel nasalization.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

A semi-automatic approach for speaker mining of tapped telephone conversations.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Landmark-based approach to speech recognition: an alternative to HMMs.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

2006
Automatic detection of irregular phonation in continuous speech.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

An MRI based study of the acoustic effects of sinus cavities and its application to speaker recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A new set of features for text-independent speaker identification.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Speech enhancement using modified phase opponency model.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Modified phase opponency based solution to the speech separation challenge.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005
Use of Temporal Information: Detection of Periodicity, Aperiodicity, and Pitch in Speech.
IEEE Trans. Speech Audio Process., 2005

Speech enhancement using auditory phase opponency model.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Modeling of the Front Cavity and Sublingual Space in American English Rhotic Sounds.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Acoustic parameters for automatic detection of nasal manner.
Speech Commun., 2004

A novel method for computation of periodicity, aperiodicity and pitch of speech signals.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Acoustic modeling of American English lateral approximants.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A measure of aperiodicity and periodicity in speech.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
An event-based acoustic-phonetic approach for speech segmentation and E-set recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

Acoustic-phonetic speech parameters for speaker-independent speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

2000
A new strategy of formant tracking based on dynamic programming.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Detection of speech landmarks using temporal cues.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
Automatic detection of manner events based on temporal parameters.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Improvement of electrolaryngeal speech by introducing normal excitation information.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1997
Acoustic modelling of American English /r/.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

The design of acoustic parameters for speaker-independent speech recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

1996
Enhancement of alaryngeal speech by adaptive filtering.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Coarticulatory stability in American English /r/.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Knowledge-based parameters for HMM speech recognition.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Adaptive enhancement of Fourier spectra.
IEEE Trans. Speech Audio Process., 1995

Speech parameterization based on phonetic features: application to speech recognition.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

1986
A phonetically based semivowel recognition system.
Proceedings of the IEEE International Conference on Acoustics, 1986

1982
Effects of noise on signal reconstruction from Fourier transform phase.
Proceedings of the IEEE International Conference on Acoustics, 1982

