Atsuhiko Kai

Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024

Comparison of Large Pre-trained Models and Adaptation Methods for Japanese Dialects ASR.

Proceedings of the 13th IEEE Global Conference on Consumer Electronics, 2024

2023

Dialect Speech Recognition Modeling using Corpus of Japanese Dialects and Self-Supervised Learning-based Model XLSR.

Shogo Miwa

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Attention-based CNN and Relative Phase Feature Modeling for Improved Imagined Speech Recognition.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Domain Adaptation with Augmented Data by Deep Neural Network Based Method Using Re-Recorded Speech for Automatic Speech Recognition in Real Environment.

Raufun Nahar

Shogo Miwa

Sensors, 2022

2021

Classification of Imagined and Heard Speech Using Amplitude Spectrum and Relative Phase of EEG.

Ryota Sakai

Proceedings of the 3rd IEEE Global Conference on Life Sciences and Technologies, 2021

Robust Query-by-example Spoken Term Detection for Unknown Words Using Speech Retrieval-oriented E2E ASR Modeling.

Takumi Kurokawa

Proceedings of the 10th IEEE Global Conference on Consumer Electronics, 2021

Retrieval-oriented E2E ASR Modeling for Improved Query-by-example Spoken Term Detection.

Takumi Kurokawa

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Effect of Data Augmentation on DNN-Based VAD for Automatic Speech Recognition in Noisy Environment.

Raufun Nahar

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Effects of End-to-end ASR and Score Fusion Model Learning for Improved Query-by-example Spoken Term Detection.

Takumi Kurokawa

Hiroki Kondo

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2018

Multi-Condition Training of Denoising Autoencoder by Augmenting Simulated Reverberant Speech Data.

Raufun Nahar

Takashi Kawai

Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018

2017

Investigation of efficient semi-automatic correction method using STD for automatic captioning.

Yuji Terada

Kenta Tamiya

Proceedings of the IEEE 6th Global Conference on Consumer Electronics, 2017

2016

Single-channel Dereverberation for Distant-Talking Speech Recognition by Combining Denoising Autoencoder and Temporal Structure Normalization.

J. Signal Process. Syst., 2016

Combination of bottleneck feature extraction and dereverberation for distant-talking speech recognition.

Multim. Tools Appl., 2016

Combining State-level and DNN-based Acoustic Matches for Efficient Spoken Term Detection in NTCIR-12 SpokenQuery&Doc-2 Task.

Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

Combining State-Level Spotting and Posterior-Based Acoustic Match for Improved Query-by-Example Spoken Term Detection.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

2015

Environment-dependent denoising autoencoder for distant-talking speech recognition.

EURASIP J. Adv. Signal Process., 2015

Deep neural network-based bottleneck feature and denoising autoencoder-based dereverberation for distant-talking speaker identification.

EURASIP J. Audio Speech Music. Process., 2015

Speech selection and environmental adaptation for asynchronous speech recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014

Distant-talking speaker identification by generalized spectral subtraction-based dereverberation and its efficient computation.

Zhaofeng Zhang

EURASIP J. Audio Speech Music. Process., 2014

Speaker Identification by Combining Various Vocal Tract and Vocal Source Features.

Proceedings of the Text, Speech and Dialogue - 17th International Conference, 2014

Combining Subword and State-level Dissimilarity Measures for Improved Spoken Term Detection in NTCIR-11 SpokenQuery&Doc Task.

Mitsuaki Makino

Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, 2014

Distant-talking speech recognition using multi-channel LMS and multiple-step linear prediction.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Single-sided approach to discriminative PLDA training for text-independent speaker verification without using expanded i-vector.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Utilizing state-level distance vector representation for improved spoken term detection by text and spoken queries.

Mitsuaki Makino

Naoki Yamamoto

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Denoising autoencoder and environment adaptation for distant-talking speech recognition with asynchronous speech recording.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

Spoken Term Detection Using Distance-Vector based Dissimilarity Measures and Its Evaluation on the NTCIR-10 SpokenDoc-2 Task.

Naoki Yamamoto

Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

Improvement of distant-talking speaker identification using bottleneck features of DNN.

Takanori Yamada

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Hands-free speaker identification based on spectral subtraction using a multi-channel least mean square approach.

Zhaofeng Zhang

Proceedings of the IEEE International Conference on Acoustics, 2013

Using acoustic dissimilarity measures based on state-level distance vector representation for improved spoken term detection.

Naoki Yamamoto

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Speech recognition using blind source separation and dereverberation method for mixed sound of speech and music.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012

Dereverberation and denoising based on generalized spectral subtraction by multi-channel LMS algorithm using a small-scale microphone array.

Kyohei Odani

EURASIP J. Adv. Signal Process., 2012

Speech Recognition by Denoising and Dereverberation Based on Spectral Subtraction in a Real Noisy Reverberant Environment.

Kyohei Odani

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Dereverberation based on generalized spectral subtraction for distant-talking speaker recognition.

Zhaofeng Zhang

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Distant-talking speaker identification using a reverberation model with various artificial room impulse responses.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

On the use of phase information-based joint factor analysis for speaker verification under channel mismatch condition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Evaluation of Hands-Free Large Vocabulary Continuous Speech Recognition by Blind Dereverberation Based on Spectral Subtraction by Multi-channel LMS Algorithm.

Kyohei Odani

Proceedings of the Text, Speech and Dialogue - 14th International Conference, 2011

2007

Spoken language understanding method using confidence measure and dialogue history.

Syst. Comput. Jpn., 2007

2004

Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents.

Proceedings of the Life-like characters - tools, affective functions, and applications, 2004

An understanding strategy based on plausibility score in recognition history using CSR confidence measure.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2002

Influence of different dialogue situations on user's behavior in spoken corrections.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Linguistic and acoustic changes of user's utterances caused by different dialogue situations.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2000

Usability of Browser-Based Pen-Touch/Speech User Interfaces for Form-Based Application in Mobile Environment.

Takahiro Nakano

Proceedings of the Advances in Multimodal Interfaces, 2000

1998

Comparison of continuous speech recognition systems with unknown-word processing for speech disfluencies.

Syst. Comput. Jpn., 1998

Dealing with out-of-vocabulary words and speech disfluencies in an n-gram based speech understanding system.

Yoshifumi Hirose

Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1995

Relationship among Recognition Rate, Rejection Rate and False Alarm Rate in a Spoken Word Recognition System.

IEICE Trans. Inf. Syst., 1995

Investigation on unknown word processing and strategies for spontaneous speech understanding.

Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

1994

A context-free grammar-driven, one-pass HMM-based continuous speech recognition method.

Syst. Comput. Jpn., 1994

Evaluation of unknown word processing in a spoken word recognition system.

Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

1992

A frame-synchronous continuous speech recognition algorithm using a top-down parsing of context-free grammar.
