Ian Lane

Shinji Watanabe

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Branchformer: Parallel MLP-Attention Architectures to Capture Local and Global Context for Speech Recognition and Understanding.

Proceedings of the International Conference on Machine Learning, 2022

2021

Identifying Actions for Sound Event Classification.

Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Human-Agent Collaboration Strategies for Vision-Grounded Instruction Following.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Machine Translation with Binary Feedback: a Large-Margin Approach.

Avneesh Saluja

Ying Zhang

Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers, 2020

2019

Deep Speaker Embedding for Speaker-Targeted Automatic Speech Recognition.

John Paul Shen

Proceedings of the NLPIR 2019: The 3rd International Conference on Natural Language Processing and Information Retrieval, Tokushima, Japan, June 28, 2019

Learning Question-Guided Video Representation for Multi-Turn Video Question Answering.

Proceedings of the Visually Grounded Interaction and Language (ViGIL), 2019

BERT-DST: Scalable End-to-End Dialogue State Tracking with Bidirectional Encoder Representations from Transformer.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Audio-visual TED corpus: enhancing the TED-LIUM corpus with facial information, contextual text and object recognition.

Proceedings of the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2019 ACM International Symposium on Wearable Computers, 2019

2018

AudioPairBank: towards a large-scale tag-pair-based audio content analysis.

EURASIP J. Audio Speech Music. Process., 2018

The CAPIO 2017 Conversational Speech Recognition System.

Kyu Jeong Han

CoRR, 2018

Adversarial Learning of Task-Oriented Neural Dialog Models.

Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

End-to-End Learning of Task-Oriented Dialogs.

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics, 2018

Online Incremental Learning for Speaker-Adaptive Language Models.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Densely Connected Networks for Conversational Speech Recognition.

Kyu Jeong Han

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Understanding and improving recurrent networks for human activity recognition by continuous attention.

Proceedings of the 2018 ACM International Symposium on Wearable Computers, 2018

Customized Nonlinear Bandits for Online Response Selection in Neural Conversation Models.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Multi-Domain Adversarial Learning for Slot Filling in Spoken Language Understanding.

CoRR, 2017

Speeding up Hyper-parameter Optimization by Extrapolation of Learning Curves Using Previous Builds.

Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

An End-to-End Trainable Neural Network Model with Belief Tracking for Task-Oriented Dialog.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

End-to-End Speech Recognition with Auditory Attention for Multi-Microphone Distance Speech Recognition.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Deep Learning-Based Telephony Speech Recognition in the Wild.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Hierarchical Constrained Bayesian Optimization for Feature, Acoustic Model and Decoder Parameter Optimization.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Dialog context language modeling with recurrent neural networks.

Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing, 2017

An approach for self-training audio event detectors using web data.

Proceedings of the 25th European Signal Processing Conference, 2017

Semi-supervised convolutional neural networks for human activity recognition.

Proceedings of the 2017 IEEE International Conference on Big Data (IEEE BigData 2017), 2017

Iterative policy learning in end-to-end trainable task-oriented neural dialog models.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

3D Face Detection via Reconstruction Over Hierarchical Features for Single Face Situations.

Bo Yu

Fang Chen

Int. J. Pattern Recognit. Artif. Intell., 2016

AudioSentibank: Large-scale Semantic Ontology of Acoustic Concepts for Audio Content Analysis.

CoRR, 2016

Environmental Noise Embeddings for Robust Speech Recognition.

Bhiksha Raj

CoRR, 2016

Automated optimization of decoder hyper-parameters for online LVCSR.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Joint Online Spoken Language Understanding and Language Modeling With Recurrent Neural Networks.

Proceedings of the SIGDIAL 2016 Conference, 2016

Task Load Estimation and Mediation Using Psycho-physiological Measures.

Rahul Rajan

Ted Selker

Proceedings of the 21st International Conference on Intelligent User Interfaces, 2016

Attention-Based Recurrent Neural Network Models for Joint Intent Detection and Slot Filling.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Semi-Supervised Speaker Adaptation for In-Vehicle Speech Recognition with Deep Neural Networks.

Wonkyum Lee

Kyu Jeong Han

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Recurrent Models for Auditory Attention in Multi-Microphone Distant Speech Recognition.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speaker-Targeted Audio-Visual Models for Speech Recognition in Cocktail-Party Environments.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

On Online Attention-Based Speech Recognition and Joint Mandarin Character-Pinyin Training.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Accelerating multi-user large vocabulary continuous speech recognition on heterogeneous CPU-GPU platforms.

Proceedings of the 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, 2016

Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording.

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

City-Identification of Flickr Videos Using Semantic Acoustic Features.

Proceedings of the IEEE Second International Conference on Multimedia Big Data, 2016

Effects of Mediating Notifications Based on Task Load.

Rahul Rajan

Ted Selker

Proceedings of the 8th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, 2016

An Oral Exam for Measuring a Dialog System's Capabilities.

David Cohen

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Situated language understanding for a spoken dialog system within vehicles.

Comput. Speech Lang., 2015

Recurrent Models for Auditory Attention in Multi-Microphone Distance Speech Recognition.

CoRR, 2015

Deep Recurrent Neural Networks for Acoustic Modelling.

CoRR, 2015

Transferring knowledge from a RNN to a DNN.

Nan Rosemary Ke

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Semi-supervised training in low-resource ASR and KWS.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

Deep convolutional neural networks for acoustic modeling in low resource languages.

Proceedings of the 2015 IEEE International Conference on Acoustics, Speech and Signal Processing, 2015

2014

Multi-task deep learning for image understanding.

Bo Yu

Proceedings of the 6th International Conference of Soft Computing and Pattern Recognition, 2014

Situated Language Understanding at 25 Miles per Hour.

Proceedings of the SIGDIAL 2014 Conference, 2014

The HRI-CMU Corpus of Situated In-Car Interactions.

David Cohen

Antoine Raux

Proceedings of the Situated Dialog in Speech-Based Human-Computer Interaction, 2014

Neural network language models for low resource languages.

Ankur Gandhe

Florian Metze

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Distributed asynchronous optimization of convolutional neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Multi-stream combination for LVCSR and keyword search on GPU-accelerated platforms.

Wonkyum Lee

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Accelerating large vocabulary continuous speech recognition on heterogeneous CPU-GPU platforms.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

Optimization of Neural Network Language Models for keyword search.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2014

2013

Modular combination of deep neural networks for acoustic modeling.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Situated multi-modal dialog system in vehicles.

Proceedings of the 6th workshop on Eye gaze in intelligent human machine interaction: gaze in multimodal interaction, 2013

Optimized MFCC feature extraction on GPU.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013

Using web text to improve keyword spotting in speech.

Ankur Gandhe

Long Qin

Florian Metze

Alexander I. Rudnicky

Matthias Eck

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

HRItk: The Human-Robot Interaction ToolKit Rapid Development of Speech-Centric Interactive Systems in ROS.

Antoine Raux

Proceedings of the Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data, 2012

A Simulation-based Framework for Spoken Language Understanding and Action Selection in Situated Interaction.

David Cohen

Proceedings of the Workshop on Future directions and needs in the Spoken Dialog Community: Tools and Data, 2012

Efficient On-The-Fly Hypothesis Rescoring in a Hybrid GPU/CPU-based Large Vocabulary Continuous Speech Recognition Engine.

Jike Chong

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Unsupervised vocabulary selection for real-time speech recognition of lectures.

Paul Maergner

Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, 2012

2011

Context-aware Language Modeling for Conversational Speech Translation.

Avneesh Saluja

Ying Zhang

Proceedings of Machine Translation Summit XIII: Papers, 2011

Unsupervised Vocabulary Selection for Domain-Independent Simultaneous Lecture Translation.

Paul Maergner

Proceedings of Machine Translation Summit XIII: Papers, 2011

Unsupervised vocabulary selection for simultaneous lecture translation.

Proceedings of the 2011 International Workshop on Spoken Language Translation, 2011

Ad-Hoc Meeting Transcription on Clusters of Mobile Devices.

Michele Cossalter

Priya Sundararajan

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Rapid Training of Acoustic Models Using Graphics Processing Unit.

Senaka Buthpitiya

Jike Chong

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010

Jibbigo: Speech-to-speech translation on mobile devices.

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Tools for Collecting Speech Corpora via Mechanical-Turk.

Proceedings of the 2010 Workshop on Creating Speech and Language Data with Amazon's Mechanical Turk, 2010

Real-time spoken language identification and recognition for speech-to-speech translation.

Daniel Chung Yong Lim

Proceedings of the 2010 International Workshop on Spoken Language Translation, 2010

Named-entity projection and data-driven morphological decomposition for field maintainable speech-to-speech translation systems.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

2009

Incremental Adaptation of Speech-to-Speech Translation.

Nguyen Bach

Roger Hsiao

Matthias Eck

Paisarn Charoenpornsawat

Proceedings of Human Language Technologies: Conference of the North American Chapter of the Association of Computational Linguistics, May 31, 2009

Language identification for speech-to-speech translation.

Daniel Chung Yong Lim

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Pronunciation modeling for dialectal Arabic speech recognition.

Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008

Class-based statistical machine translation for field maintainable speech-to-speech translation.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Sentence segmentation and punctuation recovery for spoken language translation.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2008

2007

Out-of-Domain Utterance Detection Using Classification Confidences of Multiple Topics.

IEEE Trans. Speech Audio Process., 2007

Bilingual LSA-based adaptation for statistical machine translation.

Yik-Cheung Tam

Mach. Transl., 2007

A Log-Linear Block Transliteration Model based on Bi-Stream HMMs.

Proceedings of the Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, 2007

Improving spoken language translation by automatic disfluency removal: evidence from conversational speech transcripts.

Sharath Rao

Proceedings of Machine Translation Summit XI: Papers, 2007

The CMU-UKA statistical machine translation systems for IWSLT 2007.

Proceedings of the 2007 International Workshop on Spoken Language Translation, 2007

Optimizing sentence segmentation for spoken language translation.

Sharath Rao

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Handling OOV words in Arabic ASR via flexible morphological constraints.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Bilingual-LSA Based LM Adaptation for Spoken Language Translation.

Yik-Cheung Tam

Proceedings of the ACL 2007, 2007

2006

Flexible spoken language understanding based on topic classification and domain detection.

PhD thesis, 2006

Verification of Speech Recognition Results Incorporating In-domain Confidence and Discourse Coherence Measures.

IEICE Trans. Inf. Syst., 2006

The UKA/CMU statistical machine translation system for IWSLT 2006.

Matthias Eck

Nguyen Bach

Sanjika Hewavitharana

Muntsin Kolss

Bing Zhao

Almut Silja Hildebrand

Stephan Vogel

Proceedings of the 2006 International Workshop on Spoken Language Translation, 2006

2005

Dialogue Speech Recognition by Combining Hierarchical Topic Classification and Language Model Switching.

IEICE Trans. Inf. Syst., 2005

Utterance verification incorporating in-domain confidence and discourse coherence measures.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Incorporating Dialogue Context and Topic Clustering in Out-of-Domain Detection.

Proceedings of the 2005 IEEE International Conference on Acoustics, Speech and Signal Processing, 2005

2004

Example-based training of dialogue planning incorporating user and situation models.

Ian Richard Lane

Shinichi Ueno

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Topic classification and verification modeling for out-of-domain utterance detection.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Out-of-domain detection based on confidence measures from multiple topic classification.

Proceedings of the 2004 IEEE International Conference on Acoustics, Speech and Signal Processing, 2004

2003

Hierarchical topic classification for dialog speech recognition based on language model switching.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Language model switching based on topic detection for dialog speech recognition.
