Samuel Thomas
ORCID: 0000-0001-7573-0620
Affiliations:
- IBM Research AI, Thomas J. Watson Research Center, NY, USA
- Johns Hopkins University, USA (former)
According to our database, Samuel Thomas authored at least 108 papers between 2006 and 2024.
Links
Online presence:
- orcid.org
- clsp.jhu.edu
Bibliography
2024
Whisper-Flamingo: Integrating Visual Features into Whisper for Audio-Visual Speech Recognition and Translation.
CoRR, 2024
What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
2023
ConvKT: Conversation-Level Knowledge Transfer for Context Aware End-to-End Spoken Language Understanding.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Multi-Speaker Data Augmentation for Improved end-to-end Automatic Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Fine-Grained Textual Knowledge Transfer to Improve RNN Transducers for Speech Recognition and Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2023
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2023
Effective Training of RNN Transducer Models on Diverse Sources of Speech and Text Data.
Proceedings of the IEEE International Conference on Acoustics, 2023
2022
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Extending RNN-T-based speech recognition systems with emotion and language classification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Towards Reducing the Need for Speech Training Data to Build Spoken Language Understanding Systems.
Proceedings of the IEEE International Conference on Acoustics, 2022
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
A New Data Augmentation Method for Intent Classification Enhancement and its Application on Spoken Conversation Datasets.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022
2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Knowledge Distillation Based Training of Universal ASR Source Models for Cross-Lingual Transfer.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Speak or Chat with Me: End-to-End Spoken Language Understanding System with Flexible Inputs.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision, 2021
End-to-End Spoken Language Understanding Using Transformer Networks and Self-Supervised Pre-Trained Features.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 29th European Signal Processing Conference, 2021
2020
CoRR, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Implicit Transfer of Privileged Acoustic Information in a Generalized Knowledge Distillation Framework.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Transliteration Based Data Augmentation for Training Multilingual ASR Acoustic Models in Low Resource Settings.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Improvements to N-gram Language Model Using Text Generated from Neural Language Model.
Proceedings of the IEEE International Conference on Acoustics, 2019
Pre-training of Speaker Embeddings for Low-latency Speaker Change Detection in Broadcast News.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Semi-Supervised Training and Data Augmentation for Adaptation of Automatic Broadcast News Captioning Systems.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
CoRR, 2018
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018
Inference-Invariant Transformation of Batch Normalization for Domain Adaptation of Acoustic Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Effective joint training of denoising feature space transforms and Neural Network based acoustic models.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Investigating factor analysis features for deep neural networks in noisy speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
2014
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Analyzing convolutional neural networks for speech activity detection in mismatched acoustic conditions.
Proceedings of the IEEE International Conference on Acoustics, 2014
2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Deep neural network features and semi-supervised training for low resource speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
A summary of the 2012 JHU CLSP workshop on zero resource speech technologies and models of early language acquisition.
Proceedings of the IEEE International Conference on Acoustics, 2013
2012
Adaptation transforms of auto-associative neural networks as features for speaker verification.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
Comput. Speech Lang., 2011
Proceedings of the 2011 Symposium on Machine Learning in Speech and Language Processing, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Speech recognition with segmental conditional random fields: A summary of the JHU CLSP 2010 Summer Workshop.
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Robust spectro-temporal features based on autoregressive models of Hilbert envelopes.
Proceedings of the IEEE International Conference on Acoustics, 2010
Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models.
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
Applications of signal analysis using autoregressive models for amplitude modulation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009
Tandem representations of spectral envelope and modulation frequency features for ASR.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Temporal envelope subtraction for robust speech recognition using modulation spectrum.
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009
2008
IEEE Signal Process. Lett., 2008
Proceedings of the Machine Learning for Multimodal Interaction, 5th International Workshop, 2008
Hilbert envelope based spectro-temporal features for phoneme recognition in telephone speech.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Front-end for far-field speech recognition based on frequency domain linear prediction.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Spectro-temporal features for Automatic Speech Recognition using Linear Prediction in spectral domain.
Proceedings of the 2008 16th European Signal Processing Conference, 2008
2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
2006
Proceedings of the 14th European Signal Processing Conference, 2006