Takaaki Hori

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning.

Proceedings of the IEEE International Conference on Acoustics, 2022

Sequence Transduction with Graph-Based Supervision.

Proceedings of the IEEE International Conference on Acoustics, 2022

Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy.

Proceedings of the IEEE International Conference on Acoustics, 2022

Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers.

CoRR, 2021

Dual Causal/Non-Causal Self-Attention for Streaming End-to-End Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Optimizing Latency for Online Video Captioning Using Audio-Visual Transformers.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Momentum Pseudo-Labeling for Semi-Supervised Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semi-Supervised Speech Recognition Via Graph-Based Temporal Classification.

Proceedings of the IEEE International Conference on Acoustics, 2021

Capturing Multi-Resolution Context by Dilated Self-Attention.

Aswin Shanmugam Subramanian

Proceedings of the IEEE International Conference on Acoustics, 2021

Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Multi-Stream End-to-End Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans.

Wangyou Zhang

CoRR, 2020

Multi-Pass Transformer for Machine Translation.

CoRR, 2020

All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Transformer-Based Long-Context End-to-End Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Streaming Automatic Speech Recognition with the Transformer Model.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Adversarial training and decoding strategies for end-to-end neural conversation models.

Comput. Speech Lang., 2019

Overview of the sixth dialog system technology challenge: DSTC6.

Julien Perez

Ryuichiro Higashinaka

Comput. Speech Lang., 2019

Self-supervised Sequence-to-sequence ASR using Unpaired Speech and Text.

Lukás Burget

Jan Cernocký

CoRR, 2019

End-to-End Multilingual Multi-Speaker Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Vectorized Beam Search for CTC-Attention-Based Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Analysis of Multilingual Sequence-to-Sequence Speech Recognition Systems.

Martin Karafiát

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Joint Student-Teacher Learning for Audio-Visual Scene-Aware Dialog.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Semi-Supervised Sequence-to-Sequence ASR Using Unpaired Speech and Text.

Lukás Burget

Jan Cernocký

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Stream Attention-based Multi-array End-to-end Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2019

Triggered Attention for End-to-end Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2019

Cycle-consistency Training for End-to-end Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features.

Raphael Gontijo Lopes

Proceedings of the IEEE International Conference on Acoustics, 2019

Language Model Integration Based on Memory Control for Sequence to Sequence Speech Recognition.

Jaejin Cho

Hirofumi Inaguma

Jesús Villalba

Najim Dehak

Proceedings of the IEEE International Conference on Acoustics, 2019

Promising Accurate Prefix Boosting for Sequence-to-sequence ASR.

Proceedings of the IEEE International Conference on Acoustics, 2019

CNN-based Multichannel End-to-End Speech Recognition for Everyday Home Environments.

Proceedings of the 27th European Signal Processing Conference, 2019

Streaming End-to-End Speech Recognition with Joint CTC-Attention Based Models.

Nelson Enrique Yalta Soplin

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

A Comparative Study on Transformer vs RNN in Speech Applications.

Ryuichi Yamamoto

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018

Multi-encoder multi-resolution framework for end-to-end speech recognition.

Ruizhi Li

Xiaofei Wang

Sri Harish Reddy Mallidi

Hynek Hermansky

CoRR, 2018

Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition.

Hiroshi Seki

CoRR, 2018

CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments.

CoRR, 2018

End-to-end Speech Recognition With Word-Based RNN Language Models.

Jaejin Cho

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Back-Translation-Style Data Augmentation for end-to-end ASR.

Kazuya Takeda

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling.

Jaejin Cho

Nelson Enrique Yalta Soplin

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

ESPnet: End-to-End Speech Processing Toolkit.

Jahn Heymann

Matthew Wiesner

Nanxin Chen

Adithya Renduchintala

Tsubasa Ochiai

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

End-to-End Multi-Speaker Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

An End-to-End Language-Tracking Speech Recognizer for Mixed-Language Speech.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speaker Adaptation for Multichannel End-to-End Speech Recognition.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Multimodal Attention for Fusion of Audio and Spatiotemporal Features for Video Description.

Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018

A Purely End-to-End System for Multi-speaker Speech Recognition.

Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018

2017

Duration-Controlled LSTM for Polyphonic Sound Event Detection.

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Error detection and accuracy estimation in automatic speech recognition using deep bidirectional recurrent neural networks.

Speech Commun., 2017

Hybrid CTC/Attention Architecture for End-to-End Speech Recognition.

IEEE J. Sel. Top. Signal Process., 2017

Unified Architecture for Multichannel End-to-End Speech Recognition With Neural Beamforming.

IEEE J. Sel. Top. Signal Process., 2017

Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend.

Comput. Speech Lang., 2017

Attention-Based Multimodal Fusion for Video Description.

CoRR, 2017

End-to-end Conversation Modeling Track in DSTC6.

CoRR, 2017

Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Multichannel End-to-end Speech Recognition.

Proceedings of the 34th International Conference on Machine Learning, 2017

Attention-Based Multimodal Fusion for Video Description.

Proceedings of the IEEE International Conference on Computer Vision, 2017

Student-teacher network learning with enhanced features.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Joint CTC-attention based end-to-end speech recognition using multi-task learning.

Suyoun Kim

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Language independent end-to-end architecture for joint language identification and speech recognition.

John R. Hershey

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Multi-level language modeling and decoding for open vocabulary end-to-end speech recognition.

John R. Hershey

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Early and late integration of audio features for automatic video description.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Joint CTC/attention decoding for end-to-end speech recognition.

John R. Hershey

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

Toolkits for Robust Speech Processing.

Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017

2016

Estimating Speech Recognition Accuracy Based on Error Type Classification.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Automated structure discovery and parameter tuning of neural network language model based on evolution strategy.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Dialog state tracking with attention-based sequence-to-sequence learning.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Context-Sensitive and Role-Dependent Spoken Language Understanding Using Bidirectional and Attention LSTMs.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Driver confusion status detection using recurrent neural networks.

Proceedings of the IEEE International Conference on Multimedia and Expo, 2016

Minimum word error training of long short-term memory recurrent neural network language models for speech recognition.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Bidirectional LSTM-HMM Hybrid System for Polyphonic Sound Event Detection.

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016

2015

Strategies for distant speech recognition in reverberant environments.

EURASIP J. Adv. Signal Process., 2015

Multiscale recurrent neural network based language model.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

WFST-based structural classification integrating DNN acoustic features and RNN language features for speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Context adaptive deep neural networks for fast acoustic model adaptation.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Double-layer neighborhood graph based similarity search for fast query-by-example spoken term detection.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

The MERL/SRI system for the 3rd CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Restructuring output layers of deep neural networks using minimum risk parameter clustering.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Fast segment search for corpus-based speech enhancement based on speech recognition technology.

Proceedings of the IEEE International Conference on Acoustics, 2014

Real-time one-pass decoding with recurrent neural network language model for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

Zero-resource spoken term detection using hierarchical graph-based similarity search.

Proceedings of the IEEE International Conference on Acoustics, 2014

Defeating reverberation: Advanced dereverberation and recognition techniques for hands-free speech recognition.

Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014

2013

Speech Recognition Algorithms Based on Weighted Finite-State Transducers

Synthesis Lectures on Speech and Audio Processing, Morgan & Claypool Publishers, ISBN: 978-3-031-02562-4, 2013

Prior-shared feature and model space speaker adaptation by consistently employing MAP estimation.

Speech Commun., 2013

Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds.

Comput. Speech Lang., 2013

Unsupervised discriminative language modeling using error rate estimator.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Discriminative recognition rate estimation for N-best list and its application to N-best rescoring.

Proceedings of the IEEE International Conference on Acoustics, 2013

Coupling beamforming with spatial and spectral feature based spectral enhancement and its application to meeting recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

Large vocabulary continuous speech recognition based on WFST structured classifiers and deep bottleneck features.

Proceedings of the IEEE International Conference on Acoustics, 2013

Feature space variational Bayesian linear regression and its combination with model space VBLR.

Proceedings of the IEEE International Conference on Acoustics, 2013

Graph index based query-by-example search on a large speech data set.

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Round-Robin Duel Discriminative Language Models.

IEEE Trans. Speech Audio Process., 2012

Structural Classification Methods Based on Weighted Finite-State Transducers for Automatic Speech Recognition.

IEEE Trans. Speech Audio Process., 2012

Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera.

IEEE Trans. Speech Audio Process., 2012

Efficient training of discriminative language models by sample selection.

Speech Commun., 2012

Model Shrinkage for Discriminative Language Models.

IEICE Trans. Inf. Syst., 2012

Efficient prior and incremental beam width control to suppress excessive speech recognition time based on score range estimation.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Recognition rate estimation based on word alignment network and discriminative error type classification.

Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Integrating Deep Neural Networks into Structural Classification Approach based on Weighted Finite-State Transducers.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Efficient Beam Width Control to Suppress Excessive Speech Recognition Computation Time Based on Prior Score Range Normalization.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Speaker Adaptation Using Variational Bayesian Linear Regression in Normalized Feature Space.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Bag of Arcs: New representation of speech segment features based on finite state machines.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Error type classification and word accuracy estimation using alignment features from word confusion network.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Spoken document retrieval by discriminative modeling in a high dimensional feature space.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Handling uncertain observations in unsupervised topic-mixture language model adaptation.

Ekapol Chuangsuwanich

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Topic tracking language model for speech recognition.

Comput. Speech Lang., 2011

Gibbs sampling based Multi-scale Mixture Model for speaker clustering.

Proceedings of the IEEE International Conference on Acoustics, 2011

Round-robin duel discriminative language models in one-pass decoding with on-the-fly error correction.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Improved Sequential Dependency Analysis Integrating Labeling-Based Sentence Boundary Detection.

IEICE Trans. Inf. Syst., 2010

Application of topic tracking model to language model adaptation and meeting analysis.

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Real-time meeting recognition and understanding using distant microphones and omni-directional camera.

Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Round-robin discrimination model for reranking ASR hypotheses.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Improvements of search error risk minimization in viterbi beam search for speech recognition.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A discriminative model for continuous speech recognition based on Weighted Finite State Transducers.

Proceedings of the IEEE International Conference on Acoustics, 2010

A comparative study on methods of weighted language model training for reranking LVCSR N-best hypotheses.

Proceedings of the IEEE International Conference on Acoustics, 2010

Search error risk minimization in Viterbi beam search for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2010

2008

Sequential dependency analysis for online spontaneous speech processing.

Speech Commun., 2008

2007

Efficient WFST-Based One-Pass Decoding With On-The-Fly Hypothesis Rescoring in Extremely Large Vocabulary Continuous Speech Recognition.

IEEE Trans. Speech Audio Process., 2007

An approach to efficient generation of high-accuracy and compact error-corrective models for speech recognition.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Open-Vocabulary Spoken Utterance Retrieval using Confusion Networks.

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

Advanced computational models and learning theories for spoken language processing.

IEEE Comput. Intell. Mag., 2006

Sentence boundary detection using sequential dependency analysis combined with CRF-based chunking.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

An Extremely Large Vocabulary Approach to Named Entity Extraction from Speech.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Experiments with probabilistic principal component analysis in LVCSR.

Mike Schuster

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Generalized fast on-the-fly composition algorithm for WFST-based speech recognition.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Efficient Generation of high-order context-dependent Weighted Finite State Transducers for Speech Recognition.

Mike Schuster

Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004

Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition.

Yasuhiro Minami

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2003

Speech summarization using weighted finite-state transducers.

Yasuhiro Minami

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Evaluation method for automatic speech summarization.

Sadaoki Furui

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Language model adaptation using WFST-based speaking-style translation.

Daniel Willett

Yasuhiro Minami

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Deriving disambiguous queries in a spoken interactive ODQA system.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Spoken Interactive ODQA System: SPIQA.

Proceedings of the ACL 2003, 2003

2001

Improved phoneme-history-dependent search for large-vocabulary continuous-speech recognition.
