Mark Hasegawa-Johnson
Orcid: 0000-0002-5631-2893
According to our database, Mark Hasegawa-Johnson authored at least 290 papers between 1996 and 2024.
Bibliography
2024
Preliminary Technical Validation of LittleBeats™: A Multimodal Sensing Platform to Capture Cardiac Physiology, Motion, and Vocalizations.
Sensors, February, 2024
R2I-rPPG: A Robust Region of Interest Selection Method for Remote Photoplethysmography to Extract Heart Rate.
CoRR, 2024
Fine-Tuning Automatic Speech Recognition for People with Parkinson's: An Effective Strategy for Enhancing Speech Technology Accessibility.
CoRR, 2024
Just ASR + LLM? A Study on Speech Large Language Models' Ability to Identify and Understand Speaker in Spoken Dialogue.
CoRR, 2024
CoRR, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Analysis of Self-Supervised Speech Models on Children's Speech and Infant Vocalizations.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Finding Spoken Identifications: Using GPT-4 Annotation for an Efficient and Fast Dataset Creation Pipeline.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
Proceedings of the IEEE/ACM Conference on Connected Health: Applications, 2024
TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback.
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
Automated morphological phenotyping using learned shape descriptors and functional maps: A novel approach to geometric morphometrics.
PLoS Comput. Biol., January, 2023
CoRR, 2023
Enhancing Child Vocalization Classification in Multi-Channel Child-Adult Conversations Through Wav2vec2 Children ASR Features.
CoRR, 2023
Proceedings of the 8th Workshop on Representation Learning for NLP, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Mitigating the Exposure Bias in Sentence-Level Grapheme-to-Phoneme (G2P) Transduction.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Towards Robust Family-Infant Audio Analysis Based on Unsupervised Pretraining of Wav2vec 2.0 on Large-Scale Unlabeled Family Audio.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Classification of Infant Sleep/Wake States: Cross-Attention among Large Scale Pretrained Transformer Networks using Audio, ECG, and IMU Data.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
INTapt: Information-Theoretic Adversarial Prompt Tuning for Enhanced Non-Native Speech Recognition.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
2022
Autosegmental Neural Nets 2.0: An Extensive Study of Training Synchronous and Asynchronous Phones and Tones for Under-Resourced Tonal Languages.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Speech Commun., 2022
Frontiers Artif. Intell., 2022
Comput. Speech Lang., 2022
CoRR, 2022
Visualizations of Complex Sequences of Family-Infant Vocalizations Using Bag-of-Audio-Words Approach Based on Wav2vec 2.0 Features.
CoRR, 2022
SpeechSplit 2.0: Unsupervised speech disentanglement for voice conversion without tuning autoencoder bottlenecks.
CoRR, 2022
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Cross-lingual articulatory feature information transfer for speech recognition using recurrent progressive neural networks.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
WavPrompt: Towards Few-Shot Spoken Language Understanding with Frozen Language Models.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
ContentVec: An Improved Self-Supervised Speech Representation by Disentangling Speakers.
Proceedings of the International Conference on Machine Learning, 2022
Proceedings of the International Conference on Machine Learning, 2022
Detection of Covid-19 from Joint Time and Frequency Analysis of Speech, Breathing and Cough Audio.
Proceedings of the IEEE International Conference on Acoustics, 2022
SpeechSplit2.0: Unsupervised Speech Disentanglement for Voice Conversion without Tuning Autoencoder Bottlenecks.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022
Proceedings of the 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022
Self-supervised Semantic-driven Phoneme Discovery for Zero-resource Speech Recognition.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Analysis of acoustic and voice quality features for the classification of infant and mother vocalizations.
Speech Commun., 2021
CoRR, 2021
Worldly Wise (WoW) - Cross-Lingual Knowledge Fusion for Fact-based Visual Spoken-Question Answering.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
Classification of COVID-19 from Cough Using Autoregressive Predictive Coding Pretraining and Spectral Data Augmentation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 38th International Conference on Machine Learning, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Align or attend? Toward More Efficient and Accurate Spoken Word Discovery Using Speech-to-Image Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Synthesis of New Words for Improved Dysarthric Speech Recognition on an Expanded Vocabulary.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 55th Asilomar Conference on Signals, Systems, and Computers, 2021
2020
Multimodal Word Discovery and Retrieval With Spoken Descriptions and Visual Concepts.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Seeing is Knowing! Fact-based Visual Question Answering using Knowledge Graph Embeddings.
CoRR, 2020
Proceedings of the Statistical Language and Speech Processing, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
That Sounds Familiar: An Analysis of Phonetic Representations Transfer Across Languages.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
A DNN-HMM-DNN Hybrid Model for Discovering Word-Like Units from Spoken Captions and Image Regions.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 37th International Conference on Machine Learning, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
F0-Consistent Many-To-Many Non-Parallel Voice Conversion Via Conditional Autoencoder.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Context-Aware Automatic Text Simplification of Health Materials in Low-Resource Domains.
Proceedings of the 11th International Workshop on Health Text Mining and Information Analysis, 2020
2019
The role of cue enhancement and frequency fine-tuning in hearing impaired phone recognition.
CoRR, 2019
Proceedings of the Statistical Language and Speech Processing, 2019
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Study of the Performance of Automatic Speech Recognition Systems in Speakers with Parkinson's Disease.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 36th International Conference on Machine Learning, 2019
Pre-training of Speaker Embeddings for Low-latency Speaker Change Detection in Broadcast News.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
Multitask Learning for Phone Recognition of Underresourced Languages Using Mismatched Transcription.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Building an ASR System for Mboshi Using A Cross-Language Definition of Acoustic Units Approach.
Proceedings of the 6th Intl. Workshop on Spoken Language Technologies for Under-Resourced Languages, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Improved ASR for Under-resourced Languages through Multi-task Learning with Acoustic Landmarks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Improving DNNs Trained with Non-Native Transcriptions Using Knowledge Distillation and Target Interpolation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Topic and Keyword Identification for Low-resourced Speech Using Cross-Language Transfer Learning.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the AMIA 2018, 2018
2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
A multidisciplinary approach to designing and evaluating Electronic Medical Record portal messages that support patient self-care.
J. Biomed. Informatics, 2017
Acoustic Landmarks Contain More Information About the Phone String than Other Frames.
CoRR, 2017
Proceedings of the 26th International Conference on World Wide Web, 2017
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Using Approximated Auditory Roughness as a Pre-Filtering Feature for Human Screaming and Affective Speech AED.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Multi-Task Learning Using Mismatched Transcription for Under-Resourced Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Mismatched Crowdsourcing from Multiple Annotator Languages for Recognizing Zero-Resourced Languages: A Nullspace Clustering Approach.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 5th International Conference on Learning Representations, 2017
Discovering dimensions of perceived vocal expression in semi-structured, unscripted oral history accounts.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, 2017
Low-resource spoken keyword search strategies in Georgian inspired by distinctive feature theory.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the AMIA 2017, 2017
Proceedings of the AMIA 2017, 2017
Proceedings of the 51st Asilomar Conference on Signals, Systems, and Computers, 2017
2016
Comput. Speech Lang., 2016
Performance Improvements of Probabilistic Transcript-adapted ASR with Recurrent Neural Network and Language-specific Constraints.
CoRR, 2016
Clustering-based Phonetic Projection in Mismatched Crowdsourcing Channels for Low-resourced ASR.
Proceedings of the 6th Workshop on South and Southeast Asian Natural Language Processing, 2016
Use of particle filtering and MCMC for inference in Probabilistic Acoustic Tube model.
Proceedings of the IEEE Statistical Signal Processing Workshop, 2016
Performance Improvement of Probabilistic Transcriptions with Language-specific Constraints.
Proceedings of the SLTU-2016, 2016
Proceedings of the SLTU-2016, 2016
Proceedings of the 2016 IEEE RIVF International Conference on Computing & Communication Technologies, 2016
Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016
Proceedings of the 2016 Information Theory and Applications Workshop, 2016
Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Automatic Speech Recognition Using Probabilistic Transcriptions in Swahili, Amharic, and Dinka.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
An Investigation on Training Deep Neural Networks Using Probabilistic Transcriptions.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Landmark of Mandarin nasal codas and its application in pronunciation error detection.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 International Conference on Asian Language Processing, 2016
2015
Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2015
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015
ClassTranscribe: a new tool with new educational opportunities for student crowdsourced college lecture transcription.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Improved Hindi broadcast ASR by adapting the language model and pronunciation model using a priori syntactic and morphophonemic knowledge.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Cross-lingual transfer learning during supervised training in low resource scenarios.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Multichannel transient acoustic signal classification using task-driven dictionary with joint sparsity and beamforming.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015
2014
Mixed stereo audio classification using a stereo-input mixed-to-panned level feature.
IEEE ACM Trans. Audio Speech Lang. Process., 2014
Automatic detection of auditory salience with optimized linear filters derived from human annotation.
Pattern Recognit. Lett., 2014
Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Singing-Voice Separation from Monaural Recordings using Deep Recurrent Neural Networks.
Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014
Detecting articulatory compensation in acoustic data through linear regression modeling.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
An iterative approach to decision tree training for context dependent speech synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Active Planning, Sensing, and Recognition Using a Resource-Constrained Discriminant POMDP.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2014
Proceedings of the COLING 2014, 2014
2013
Saliency-maximized audio visualization and efficient audio-visual browsing for faster-than-real-time human acoustic event detection.
ACM Trans. Appl. Percept., 2013
Acoustic model adaptation using in-domain background models for dysarthric speech recognition.
Comput. Speech Lang., 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
2012
On Improving Dynamic State Space Approaches to Articulatory Inversion With MAP-Based Parameter Estimation.
IEEE Trans. Speech Audio Process., 2012
Soc. Networks, 2012
IEEE Trans. Pattern Anal. Mach. Intell., 2012
On the Applicability of Speaker Diarization to Audio Indexing of Non-Speech and Mixed Non-Speech/Speech Video Soundtracks.
Int. J. Multim. Data Eng. Manag., 2012
Opportunistic sensing: Unattended acoustic sensor selection using crowdsourcing models.
Proceedings of the IEEE International Workshop on Machine Learning for Signal Processing, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Improving faster-than-real-time human acoustic event detection by saliency-maximized audio visualization.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
How to put it into words - using random forests to extract symbol level descriptions from audio content for concept detection.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Singing-voice separation from monaural recordings using robust principal component analysis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Detection of Acoustic-Phonetic Landmarks in Mismatched Conditions using a Biomimetic Model of Human Auditory Processing.
Proceedings of the COLING 2012, 2012
2011
Proceedings of the Intelligent Video Event Analysis and Understanding, 2011
Estimation of Articulatory Trajectories Based on Gaussian Mixture Model (GMM) With Audio-Visual Information Fusion and Dynamic Kalman Smoothing.
IEEE Trans. Speech Audio Process., 2011
Proceedings of the 2011 Symposium on Machine Learning in Speech and Language Processing, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Improving acoustic event detection using generalizable visual features and multi-modality modeling.
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the 14th International Conference on Information Fusion, 2011
2010
IEEE Signal Process. Lett., 2010
Pattern Recognit. Lett., 2010
State-Transition Interpolation and MAP Adaptation for HMM-based Dysarthric Speech Recognition.
Proceedings of the Workshop on Speech and Language Processing for Assistive Technologies, 2010
A minimum converted trajectory error (MCTE) approach to high quality speech-to-lips conversion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Robust automatic speech recognition with decoder oriented ideal binary mask estimation.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Semi-supervised training of Gaussian mixture models by conditional entropy minimization.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Non-frontal view facial expression recognition based on ergodic hidden Markov model supervectors.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010
Toward robust learning of the Gaussian mixture state emission densities for hidden Markov models.
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009
2008
IEEE Trans. Multim., 2008
Proceedings of the 9th IEEE Workshop on Applications of Computer Vision (WACV 2008), 2008
Proceedings of the 16th International Conference on Multimedia 2008, 2008
The entropy of the articulatory phonological code: recognizing gestures from tract variables.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Maximum mutual information estimation with unlabeled data for phonetic classification.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008
Real-time conversion from a single 2D face image to a 3D text-driven emotive audio-visual avatar.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
Optimal speech estimator considering room response as well as additive noise: Different approaches in low and high frequency range.
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the 2008 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2008), 2008
2007
Prosodic effects on acoustic cues to stop voicing and place of articulation: Evidence from Radio News speech.
J. Phonetics, 2007
Proceedings of the IEEE 9th Workshop on Multimedia Signal Processing, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Robust Analysis and Weighting on MFCC Components for Speech Recognition and Speaker Identification.
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007
Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007
Proceedings of the International Conference on Image Processing, 2007
Articulatory Feature-Based Methods for Acoustic and Audio-Visual Speech Recognition: Summary from the 2006 JHU Summer workshop.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the Multimodal Technologies for Perception of Humans, 2007
Proceedings of the Multimodal Technologies for Perception of Humans, 2007
2006
IEEE Trans. Speech Audio Process., 2006
Speech Commun., 2006
Speech Commun., 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Generalized Optimal Multi-Microphone Speech Enhancement Using Sequential Minimum Variance Distortionless Response (MVDR) Beamforming and Postfiltering.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
HMM-Based and SVM-Based Recognition of the Speech of Talkers With Spastic Dysarthria.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
2005
Simultaneous recognition of words and prosody in the Boston University Radio Speech Corpus.
Speech Commun., 2005
Distinctive feature based SVM discriminant features for improvements to phone recognition on telephone band speech.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
Proceedings of the ISCA Tutorial and Research Workshop (ITRW) on Disfluency in Spontaneous Speech, 2005
2004
Model enforcement: a unified feature transformation framework for classification and recognition.
IEEE Trans. Signal Process., 2004
Automatic recognition of pitch movements using multilayer perceptron and time-delay recursive neural network.
IEEE Signal Process. Lett., 2004
Proceedings of the 9th International Conference on Intelligent User Interfaces, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Intertranscriber reliability of prosodic labeling on telephone conversation using ToBI.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Modeling pronunciation variation using artificial neural networks for English spontaneous speech.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Modeling and recognition of phonetic and prosodic factors for improvements to acoustic speech recognition models.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
A factorial HMM approach to simultaneous recognition of isolated digits spoken by multiple talkers on one audio channel.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
An automatic prosody labeling system using ANN-based syntactic-prosodic model and GMM-based acoustic-prosodic model.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
2003
Approximately independent factors of speech using nonlinear symplectic transformation.
IEEE Trans. Speech Audio Process., 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Prosody dependent speech recognition with explicit duration modelling at intonational phrase boundaries.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
2002
An evaluation of using mutual information for selection of acoustic-features representation of phonemes for speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Maximum mutual information based acoustic-features representation of phonological features for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002
Auditory-modeling inspired methods of feature extraction for robust automatic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002
2001
Proceedings of the IEEE International Conference on Acoustics, 2001
2000
Signal approximation in Hilbert space and its application on articulatory speech synthesis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Time-frequency distribution of partial phonetic information measured using mutual information.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Multivariate-state hidden Markov models for simultaneous transcription of phones and formants.
Proceedings of the IEEE International Conference on Acoustics, 2000
1996
Formant and burst spectral measurements with quantitative error models for speech sound classification.
PhD thesis, 1996