Gakuto Kurata

According to our database, Gakuto Kurata authored at least 61 papers between 2002 and 2024.

Bibliography

2024
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems.
Proceedings of the IEEE International Conference on Acoustics, 2024

Robust ASR Error Correction with Conservative Data Filtering.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

2023
Speech-enriched Memory for Inference-time Adaptation of ASR Models to Word Dictionaries.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

2022
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving ASR Robustness in Noisy Condition Through VAD Integration.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Global RNN Transducer Models For Multi-dialect Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021
Knowledge Distillation Leveraging Alternative Soft Targets from Non-Parallel Qualified Speech Data.
CoRR, 2021

Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Generalized Knowledge Distillation from an Ensemble of Specialized Teachers Leveraging Unsupervised Neural Clustering.
Proceedings of the IEEE International Conference on Acoustics, 2021

RNN Transducer Models for Spoken Language Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Knowledge Distillation from Offline to Streaming RNN Transducer for End-to-End Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Spoken Language Understanding Without Full Transcripts.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

New Advances in Speaker Diarization.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Speaker Embeddings Incorporating Acoustic Conditions for Diarization.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Converting Written Language to Spoken Language with Neural Machine Translation for Language Modeling.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Multi-Task CTC Training with Auxiliary Feature Reconstruction for End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Guiding CTC Posterior Spike Timings for Improved Posterior Fusion and Knowledge Distillation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Direct Neuron-Wise Fusion of Cognate Neural Networks.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

English Broadcast News Speech Recognition by Humans and Machines.
Proceedings of the IEEE International Conference on Acoustics, 2019

Improvements to N-gram Language Model Using Text Generated from Neural Language Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

Data Augmentation Based on Vowel Stretch for Improving Children's Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Improved Knowledge Distillation from Bi-Directional to Uni-Directional LSTM CTC for End-to-End Speech Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Inference-Invariant Transformation of Batch Normalization for Domain Adaptation of Acoustic Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Data Augmentation Improves Recognition of Foreign Accented Speech.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

2017
Symbol Sequence Search from Telephone Conversation.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

English Conversational Telephone Speech Recognition by Humans and Machines.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Empirical Exploration of Novel Architectures and Objectives for Language Models.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Factorial Modeling for Effective Suppression of Directional Noise.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Ensembles of Multi-Scale VGG Acoustic Models.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Efficient Knowledge Distillation from an Ensemble of Teachers.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Harmonic feature fusion for robust neural network-based acoustic modeling.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Effective joint training of denoising feature space transforms and Neural Network based acoustic models.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Language modeling with highway LSTM.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
Leveraging Sentence-level Information with Encoder LSTM for Natural Language Understanding.
CoRR, 2016

Improved Neural Network-based Multi-label Classification with Better Initialization Leveraging Label Co-occurrence.
Proceedings of the NAACL HLT 2016, 2016

Labeled Data Generation with Encoder-Decoder LSTM for Semantic Slot Filling.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speech recognition robust against speech overlapping in monaural recordings of telephone conversations.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Leveraging Sentence-level Information with Encoder LSTM for Semantic Slot Filling.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

2015
Discriminative re-ranking for automatic speech recognition by leveraging invariant structures.
Speech Commun., 2015

Deep neural network training emphasizing central frames.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A metric for evaluating speech recognizer output based on human-perception model.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

2014
Leveraging phonetic context dependent invariant structure for continuous speech recognition.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

2012
Acoustically discriminative language model training with pseudo-hypothesis.
Speech Commun., 2012

Leveraging word confusion networks for named entity modeling and detection from conversational telephone speech.
Speech Commun., 2012

Discriminative Reranking for LVCSR Leveraging Invariant Structure.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

2011
Continuous Digits Recognition Leveraging Invariant Structure.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Acoustic Model Training with Detecting Transcription Errors in the Training Data.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Named entity recognition from Conversational Telephone Speech leveraging Word Confusion Networks for training and recognition.
Proceedings of the IEEE International Conference on Acoustics, 2011

Training of error-corrective model for ASR without using audio data.
Proceedings of the IEEE International Conference on Acoustics, 2011

2009
Acoustically discriminative training for language models.
Proceedings of the IEEE International Conference on Acoustics, 2009

2007
Automatic Prosody Labeling Using Multiple Models for Japanese.
IEICE Trans. Inf. Syst., 2007

Preliminary experiments toward automatic generation of new TTS voices from recorded speech alone.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Unsupervised Lexicon Acquisition from Speech and Text.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Unsupervised Adaptation of a Stochastic Language Model Using a Japanese Raw Corpus.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Phoneme-to-Text Transcription System with an Infinite Vocabulary.
Proceedings of the ACL 2006, 2006

2005
Class-based variable memory length Markov model.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004
GDQA: Graph Driven Question Answering System - NTCIR-4 QAC2 Experiments.
Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

2002
Corpus-based analysis of English spoken by Japanese students in view of the entire phonemic system of English.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Integration of MLLR adaptation with pronunciation proficiency adaptation for non-native speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

