Takaaki Hori
ORCID: 0000-0003-4560-8039
According to our database, Takaaki Hori authored at least 143 papers between 2001 and 2024.
Bibliography
2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
2023
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
2022
Momentum Pseudo-Labeling: Semi-Supervised ASR With Continuously Improving Pseudo-Labels.
IEEE J. Sel. Top. Signal Process., 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
CoRR, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Advanced Long-Context End-to-End Speech Recognition Using Context-Expanded Transformers.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training.
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
The 2020 ESPnet update: new features, broadened applications, performance improvements, and future plans.
CoRR, 2020
All-in-One Transformer: Unifying Speech Recognition, Audio Tagging, and Event Detection.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Unsupervised Speaker Adaptation Using Attention-Based Speaker Memory for End-to-End ASR.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Adversarial training and decoding strategies for end-to-end neural conversation models.
Comput. Speech Lang., 2019
Comput. Speech Lang., 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Unidirectional Neural Network Architectures for End-to-End Automatic Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
End-to-end Audio Visual Scene-aware Dialog Using Multimodal Attention-based Video Features.
Proceedings of the IEEE International Conference on Acoustics, 2019
Language Model Integration Based on Memory Control for Sequence to Sequence Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
CNN-based Multichannel End-to-End Speech Recognition for Everyday Home Environments.
Proceedings of the 27th European Signal Processing Conference, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
CoRR, 2018
Vectorization of hypotheses and speech for faster beam search in encoder decoder-based speech recognition.
CoRR, 2018
CoRR, 2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Multilingual Sequence-to-Sequence Speech Recognition: Architecture, Transfer Learning, and Language Modeling.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Multimodal Attention for Fusion of Audio and Spatiotemporal Features for Video Description.
Proceedings of the 2018 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018
2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Error detection and accuracy estimation in automatic speech recognition using deep bidirectional recurrent neural networks.
Speech Commun., 2017
IEEE J. Sel. Top. Signal Process., 2017
Unified Architecture for Multichannel End-to-End Speech Recognition With Neural Beamforming.
IEEE J. Sel. Top. Signal Process., 2017
Multi-microphone speech recognition integrating beamforming, robust feature extraction, and advanced DNN/RNN backend.
Comput. Speech Lang., 2017
Advances in Joint CTC-Attention Based End-to-End Speech Recognition with a Deep CNN Encoder and RNN-LM.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 34th International Conference on Machine Learning, 2017
Proceedings of the IEEE International Conference on Computer Vision, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Language independent end-to-end architecture for joint language identification and speech recognition.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Multi-level language modeling and decoding for open vocabulary end-to-end speech recognition.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017
2016
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Automated structure discovery and parameter tuning of neural network language model based on evolution strategy.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
Context-Sensitive and Role-Dependent Spoken Language Understanding Using Bidirectional and Attention LSTMs.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the IEEE International Conference on Multimedia and Expo, 2016
Minimum word error training of long short-term memory recurrent neural network language models for speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016
2015
EURASIP J. Adv. Signal Process., 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
ASR error detection and recognition rate estimation using deep bidirectional recurrent neural networks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
WFST-based structural classification integrating DNN acoustic features and RNN language features for speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Double-layer neighborhood graph based similarity search for fast query-by-example spoken term detection.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
The MERL/SRI system for the 3rd CHiME challenge using beamforming, robust feature extraction, and advanced speech recognition.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
2014
Restructuring output layers of deep neural networks using minimum risk parameter clustering.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Fast segment search for corpus-based speech enhancement based on speech recognition technology.
Proceedings of the IEEE International Conference on Acoustics, 2014
Real-time one-pass decoding with recurrent neural network language model for speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014
Zero-resource spoken term detection using hierarchical graph-based similarity search.
Proceedings of the IEEE International Conference on Acoustics, 2014
Defeating reverberation: Advanced dereverberation and recognition techniques for hands-free speech recognition.
Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014
2013
Synthesis Lectures on Speech and Audio Processing, Morgan & Claypool Publishers, ISBN: 978-3-031-02562-4, 2013
Prior-shared feature and model space speaker adaptation by consistently employing MAP estimation.
Speech Commun., 2013
Speech recognition in living rooms: Integrated speech enhancement and recognition system based on spatial, spectral and temporal modeling of sounds.
Comput. Speech Lang., 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
A method for structure estimation of weighted finite-state transducers and its application to grapheme-to-phoneme conversion.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Discriminative recognition rate estimation for N-best list and its application to N-best rescoring.
Proceedings of the IEEE International Conference on Acoustics, 2013
Coupling beamforming with spatial and spectral feature based spectral enhancement and its application to meeting recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013
Large vocabulary continuous speech recognition based on WFST structured classifiers and deep bottleneck features.
Proceedings of the IEEE International Conference on Acoustics, 2013
Feature space variational Bayesian linear regression and its combination with model space VBLR.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
2012
IEEE Trans. Speech Audio Process., 2012
Structural Classification Methods Based on Weighted Finite-State Transducers for Automatic Speech Recognition.
IEEE Trans. Speech Audio Process., 2012
Low-Latency Real-Time Meeting Recognition and Understanding Using Distant Microphones and Omni-Directional Camera.
IEEE Trans. Speech Audio Process., 2012
Speech Commun., 2012
Efficient prior and incremental beam width control to suppress excessive speech recognition time based on score range estimation.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012
Recognition rate estimation based on word alignment network and discriminative error type classification.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012
Integrating Deep Neural Networks into Structural Classification Approach based on Weighted Finite-State Transducers.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Efficient Beam Width Control to Suppress Excessive Speech Recognition Computation Time Based on Prior Score Range Normalization.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Speaker Adaptation Using Variational Bayesian Linear Regression in Normalized Feature Space.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Bag of Arcs: New representation of speech segment features based on finite state machines.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Error type classification and word accuracy estimation using alignment features from word confusion network.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Spoken document retrieval by discriminative modeling in a high dimensional feature space.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Handling uncertain observations in unsupervised topic-mixture language model adaptation.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
Proceedings of the IEEE International Conference on Acoustics, 2011
Round-robin duel discriminative language models in one-pass decoding with on-the-fly error correction.
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
Improved Sequential Dependency Analysis Integrating Labeling-Based Sentence Boundary Detection.
IEICE Trans. Inf. Syst., 2010
Application of topic tracking model to language model adaptation and meeting analysis.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010
Real-time meeting recognition and understanding using distant microphones and omni-directional camera.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010
Large vocabulary continuous speech recognition using WFST-based linear classifier for structured data.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Improvements of search error risk minimization in Viterbi beam search for speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
A discriminative model for continuous speech recognition based on Weighted Finite State Transducers.
Proceedings of the IEEE International Conference on Acoustics, 2010
A comparative study on methods of weighted language model training for reranking LVCSR N-best hypotheses.
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
2008
Speech Commun., 2008
2007
Efficient WFST-Based One-Pass Decoding With On-The-Fly Hypothesis Rescoring in Extremely Large Vocabulary Continuous Speech Recognition.
IEEE Trans. Speech Audio Process., 2007
An approach to efficient generation of high-accuracy and compact error-corrective models for speech recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
2006
IEEE Comput. Intell. Mag., 2006
Sentence boundary detection using sequential dependency analysis combined with CRF-based chunking.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Efficient Generation of high-order context-dependent Weighted Finite State Transducers for Speech Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
2004
Fast on-the-fly composition for weighted finite-state transducers in 1.8 million-word vocabulary continuous speech recognition.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
2001
Improved phoneme-history-dependent search for large-vocabulary continuous-speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001