Kazumasa Yamamoto

According to our database, Kazumasa Yamamoto authored at least 76 papers between 1995 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





Elderly Speech Recognition Using Whisper and Speaker Adaptation.
Proceedings of the 13th IEEE Global Conference on Consumer Electronics, 2024

Data Augmentation Methods and Influence of Speech Recognition Performance for TED Talk's English to Japanese Speech Translation.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

A new speech corpus of super-elderly Japanese for acoustic modeling.
Comput. Speech Lang., 2023

A Study of Speech Recognition, Speech Translation, and Speech Summarization of TED English Lectures.
Proceedings of the 12th IEEE Global Conference on Consumer Electronics, 2023

A Corpus-Based Analysis Of Age-Related Changes In The Acoustic Features Of Elderly To Super Elderly Speech.
Proceedings of the 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2022

Elderly Conversational Speech Corpus with Cognitive Impairment Test and Pilot Dementia Detection Experiment Using Acoustic Characteristics of Speech in Japanese Dialects.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Summarization of Spoken Lectures Based on MMR Method and Important/Unimportant Sentence Classification Using BERT.
Proceedings of the 11th IEEE Global Conference on Consumer Electronics, 2022


Improvement of Elderly Speech Recognition Using Gammatone Filterbank Adaptation.
Proceedings of the 10th IEEE Global Conference on Consumer Electronics, 2021

Effectiveness of Fine Linear Frequency Spectral Feature for Acoustic Event Detection.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Discriminative Learning of Filterbank Layer within Deep Neural Network Based Speech Recognition for Speaker Adaptation.
IEICE Trans. Inf. Syst., 2019

Learning Position Evaluation Functions Used in Monte Carlo Softmax Search.
CoRR, 2019

Evaluation of Real Robot Agent Interface for Spoken Dialogue System.
Proceedings of the IEEE 8th Global Conference on Consumer Electronics, 2019

Rapid Speaker Adaptation of Neural Network Based Filterbank Layer for Automatic Speech Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Automatic Explanation Spot Estimation Method Targeted at Text and Figures in Lecture Slides.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A deep neural network integrated with filterbank learning for speech recognition.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Lyric recognition in monophonic singing using pitch-dependent DNN.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Detection of overlapping acoustic events based on NMF with shared basis vectors.
Proceedings of the IEEE 6th Global Conference on Consumer Electronics, 2017

Speech analysis of sung-speech and lyric recognition in monophonic singing.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Investigation of glottal features and annotation procedures for speech emotion recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Domain adaptation of a speech translation system for lectures by utilizing frequently appearing parallel phrases in-domain.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Robust speech recognition using DNN-HMM acoustic model combining noise-aware training with spectral subtraction.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Combination of syllable based N-gram search and word search for spoken term detection through spoken queries and IV/OOV classification.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Deep neural network based acoustic model using speaker-class information for short time utterance.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Speech recognition for mixed speech and music by NMF using various cost functions and noise adaptive training methods.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Architecture and Evaluation of Low Power Many-Core SoC with Two 32-Core Clusters.
IEICE Trans. Electron., 2014

Single-channel dereverberation by feature mapping using cascade neural networks for robust distant speaker identification and speech recognition.
EURASIP J. Audio Speech Music. Process., 2014

Spoken Term Detection Based on a Syllable N-gram Index at the NTCIR-11 SpokenQuery&Doc Task.
Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, 2014

Speech recognition based on Itakura-Saito divergence and dynamics/sparseness constraints from mixed sound of speech and music by non-negative matrix factorization.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A robust/fast spoken term detection method based on a syllable n-gram index with a distance metric.
Speech Commun., 2013

Development and Evaluation of Spoken Dialog Systems with One or Two Agents through Two Domains.
Proceedings of the Text, Speech, and Dialogue - 16th International Conference, 2013

Development and evaluation of spoken dialog systems with one or two agents.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Speaker tracking with spherical microphone arrays.
Proceedings of the IEEE International Conference on Acoustics, 2013

Single channel dereverberation method in log-melspectral domain using limited stereo data for distant speaker identification.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Fast NMF based approach and VQ based approach using MFCC distance measure for speech recognition from mixed sound.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Hidden Conditional Neural Fields for Continuous Phoneme Speech Recognition.
IEICE Trans. Inf. Syst., 2012

Improving the Readability of ASR Results for Lectures Using Multiple Hypotheses and Sentence-Level Knowledge.
IEICE Trans. Inf. Syst., 2012

A low power many-core SoC with two 32-core clusters connected by tree based NoC for multimedia applications.
Proceedings of the Symposium on VLSI Circuits, 2012

Development of large vocabulary continuous speech recognition system for Mongolian language.
Proceedings of the Third Workshop on Spoken Language Technologies for Under-resourced Languages, 2012

Fast NMF based approach and improved VQ based approach for speech recognition from mixed sound.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Microphone array processing for distant speech recognition: Towards real-world deployment.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Soft-clustering technique for training data in age- and gender-independent speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Speech Recognition in Mixed Sound of Speech and Music Based on Vector Quantization and Non-Negative Matrix Factorization.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Hidden Boosted MMI and Hierarchical State Posterior Feature for Automatic Speech Recognition Based on Hidden Conditional Neural Fields.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Efficient out-of-vocabulary term detection by n-gram array indices with distance from a syllable lattice.
Proceedings of the IEEE International Conference on Acoustics, 2011

Automatic speech recognition using Hidden Conditional Neural Fields.
Proceedings of the IEEE International Conference on Acoustics, 2011

Speaker Recognition by Combining MFCC and Phase Information in Noisy Conditions.
IEICE Trans. Inf. Syst., 2010

Distant Speech Recognition Using a Microphone Array Network.
IEICE Trans. Inf. Syst., 2010

Out-of-vocabulary term detection by n-gram array with distance from continuous syllable recognition results.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Evaluation of Privacy Protection Techniques for Speech Signals.
Proceedings of the Information Processing and Management of Uncertainty in Knowledge-Based Systems. Applications, 2010

Speech recognition using long-term phase information.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Improving the readability of class lecture ASR results using a confusion network.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Speaker identification by combining MFCC and phase information in noisy environments.
Proceedings of the IEEE International Conference on Acoustics, 2010

CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition.
Proceedings of the Auditory-Visual Speech Processing, 2010

Estimating the position and orientation of an acoustic source with a microphone array network.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Privacy Protection for Speech Information.
Proceedings of the Fifth International Conference on Information Assurance and Security, 2009

Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments: Newest Part of the CENSREC Series.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speech recognition performance of CJLC: corpus of Japanese lecture contents.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Class lecture summarization taking into account consecutiveness of important sentences.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Mel-Wiener Filter for Mel-LPC Based Speech Recognition.
IEICE Trans. Inf. Syst., 2007

Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

An improved mel-wiener filter for mel-LPC based speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

AURORA-2J: An Evaluation Framework for Japanese Noisy Speech Recognition.
IEICE Trans. Inf. Syst., 2005

CENSREC-3: Data Collection for In-Car Speech Recognition and Its Common Evaluation Framework.
Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

Integration of noise reduction algorithms for Aurora2 task.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech recognition under noisy environments using segmental unit input HMM.
Syst. Comput. Jpn., 2002

Differences of speech rate, interphoneme distance and likelihood caused by speaking style, their relationship, and recognition performance.
Syst. Comput. Jpn., 2002

Evaluation of a generalized dynamic cepstrum in distant speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Relationship among speaking style, inter-phoneme's distance and speech recognition performance.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Forward masking on a generalized logarithmic scale for robust speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

HMM composition of segmental unit input HMM for noisy speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Continuous speech recognition using segmental unit input HMMs with a mixture of probability density functions and context dependency.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Speech recognition using hidden Markov models based on segmental statistics.
Syst. Comput. Jpn., 1997

Evaluation of segmental unit input HMM.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Comparative evaluation of segmental unit input HMM and conditional density HMM.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995
