Tatsuya Kawahara
ORCID: 0000-0002-2686-2296
Affiliations:
- Kyoto University, School of Informatics, Japan
According to our database, Tatsuya Kawahara authored at least 391 papers between 1990 and 2024.
Awards
IEEE Fellow 2017, "For contributions to speech recognition and understanding".
Bibliography
2024
Adv. Robotics, February, 2024
Refining Synthesized Speech Using Speaker Information and Phone Masking for Data Augmentation of Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Waveform-Domain Speech Enhancement Using Spectrogram Encoding for Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Multim. Tools Appl., 2024
Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection.
CoRR, 2024
CoRR, 2024
Analysis and Detection of Differences in Spoken User Behaviors between Autonomous and Wizard-of-Oz Systems.
CoRR, 2024
Should RAG Chatbots Forget Unimportant Conversations? Exploring Importance and Forgetting with Psychological Insights.
CoRR, 2024
Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition.
CoRR, 2024
Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Acknowledgment of Emotional States: Generating Validating Responses for Empathetic Dialogue.
CoRR, 2024
Enhancing Personality Recognition in Dialogue by Data Augmentation and Heterogeneous Conversational Graph Networks.
CoRR, 2024
CoRR, 2024
CoRR, 2024
StyEmp: Stylizing Empathetic Response Generation via Multi-Grained Prefix Encoder and Personality Reinforcement.
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024
2023
Finetuning Pretrained Model with Embedding of Domain and Language Information for ASR of Very Low-Resource Settings.
Int. J. Asian Lang. Process., December, 2023
Effect of attentive listening robot on pleasure and arousal change in psychiatric daycare.
Adv. Robotics, November, 2023
Dual variational generative model and auxiliary retrieval for empathetic response generation by conversational robot.
Adv. Robotics, November, 2023
Character expression for spoken dialogue systems with semi-supervised learning using Variational Auto-Encoder.
Comput. Speech Lang., April, 2023
Alignment Knowledge Distillation for Online Streaming Attention-Based Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Reasoning before Responding: Integrating Commonsense-based Causality Explanation for Empathetic Response Generation.
Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, 2023
Robotic Backchanneling in Online Conversation Facilitation: A Cross-Generational Study.
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023
RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities.
Proceedings of the 37th Pacific Asia Conference on Language, 2023
Embedding Articulatory Constraints for Low-resource Speech Recognition Based on Large Pre-trained Model.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Two-stage Finetuning of Wav2vec 2.0 for Speech Emotion Recognition with ASR and Gender Pretraining.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors.
Proceedings of the International Conference on Multimodal Interaction, 2023
Time-Domain Speech Enhancement Assisted by Multi-Resolution Frequency Encoder and Decoder.
Proceedings of the IEEE International Conference on Acoustics, 2023
Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-Based Speech Recognition of Low-Resource Language.
Proceedings of the IEEE International Conference on Acoustics, 2023
I Know Your Feelings Before You Do: Predicting Future Affective Reactions in Human-Computer Dialogue.
Proceedings of the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 2023
2022
Autoregressive Moving Average Jointly-Diagonalizable Spatial Covariance Analysis for Joint Source Separation and Dereverberation.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Computationally-Efficient Overdetermined Blind Source Separation Based on Iterative Source Steering.
IEEE Signal Process. Lett., 2022
Can a robot laugh with you?: Shared laughter generation for empathetic spoken dialogue.
Frontiers Robotics AI, 2022
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022
Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Selective Multi-Task Learning For Speech Emotion Recognition Using Corpora Of Different Styles.
Proceedings of the IEEE International Conference on Acoustics, 2022
Phone-Informed Refinement of Synthesized Mel Spectrogram for Data Augmentation in Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022
Alzheimer's Dementia Detection through Spontaneous Dialogue with Proactive Robotic Listeners.
Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, 2022
Proceedings of the International Conference on Human-Agent Interaction, 2022
2021
TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies.
Int. J. Asian Lang. Process., 2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring.
CoRR, 2021
Intelligent Conversational Android ERICA Applied to Attentive Listening and Job Interview.
CoRR, 2021
Semi-autonomous avatar enabling unconstrained parallel conversations -seamless hybrid of WOZ and autonomous dialogue systems-.
Adv. Robotics, 2021
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021
A multi-party attentive listening robot which stimulates involvement from side participants.
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021
Khmer Speech Translation Corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC).
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021
Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
Fast Multichannel Nonnegative Matrix Factorization With Directivity-Aware Jointly-Diagonalizable Spatial Covariance Matrices for Blind Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Cross-Lingual Transfer Learning of Non-Native Acoustic Modeling for Pronunciation Error Detection and Diagnosis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Spoken Language Interaction with Virtual Agents and Robots (SLIVAR): Towards Effective and Ethical Interaction (Dagstuhl Seminar 20021).
Dagstuhl Reports, 2020
An Attentive Listening System with Android ERICA: Comparison of Autonomous and WOZ Interactions.
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2020
Proceedings of The 12th Language Resources and Evaluation Conference, 2020
Proceedings of the Conversational Dialogue Systems for the Next Decade, 2020
Proceedings of the Conversational Dialogue Systems for the Next Decade, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Generative Adversarial Training Data Adaptation for Very Low-Resource Automatic Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the Companion Publication of the 2020 International Conference on Multimodal Interaction, 2020
Proceedings of the ICMI '20: International Conference on Multimodal Interaction, 2020
Proceedings of the Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 2020
Semi-supervised Multichannel Speech Separation Based on a Phone- and Speaker-Aware Deep Generative Model of Speech Spectrograms.
Proceedings of the 28th European Signal Processing Conference, 2020
Topic-relevant Response Generation using Optimal Transport for an Open-domain Dialog System.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Integration of Semi-Blind Speech Source Separation and Voice Activity Detection for Flexible Spoken Dialogue.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Computer-Resource-Aware Deep Speech Separation with a Run-Time-Specified Number of BLSTM Layers.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
2019
Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Joint dialog act segmentation and recognition in human conversations using attention to dialog context.
Comput. Speech Lang., 2019
CoRR, 2019
Content Word-based Sentence Decoding and Evaluating for Open-domain Neural Response Generation.
CoRR, 2019
Expressing reactive emotion based on multimodal emotion recognition for natural conversation in human-robot interaction.
Adv. Robotics, 2019
End-to-end Modeling for Selection of Utterance Constructional Units via System Internal States.
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019
Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019
Smooth Turn-taking by a Robot Using an Online Continuous Model to Generate Turn-taking Cues.
Proceedings of the International Conference on Multimodal Interaction, 2019
Multi-speaker Sequence-to-sequence Speech Synthesis for Data Augmentation in Acoustic-to-word Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Typing Tutor: Individualized Tutoring in Text Entry for Older Adults Based on Statistical Input Stumble Detection.
J. Inf. Process., 2018
Exploiting automatic speech recognition errors to enhance partial and synchronized caption for facilitating second language listening.
Comput. Speech Lang., 2018
Leveraging Sequence-to-Sequence Speech Synthesis for Enhancing Acoustic-to-Word Speech Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Improving Very Deep Time-Delay Neural Network With Vertical-Attention For Effectively Training CTC-Based ASR Systems.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
A Unified Neural Architecture for Joint Dialog Act Segmentation and Recognition in Spoken Dialog System.
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018
Generating Fillers Based on Dialog Act Pairs for Smooth Turn-Taking by Humanoid Robot.
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018
Proceedings of the 23rd International Conference on Intelligent User Interfaces, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Engagement Recognition in Spoken Dialogue via Neural Network by Aggregating Different Annotators' Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Prediction of Turn-taking Using Multitask Learning with Prediction of Backchannels and Fillers.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Evaluation of Real-time Deep Learning Turn-taking Models for Multiple Dialogue Scenarios.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018
Acoustic-to-Word Attention-Based Model Complemented with Character-Level CTC-Based Model.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Unsupervised Beamforming Based on Multichannel Nonnegative Matrix Factorization for Noisy Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
An End-to-End Approach to Joint Social Signal Detection and Automatic Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Efficient Learning of Articulatory Models Based on Multi-Label Training and Label Correction for Pronunciation Learning.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 26th European Signal Processing Conference, 2018
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
2017
J. Inf. Process., 2017
Articulatory Modeling for Pronunciation Error Detection without Non-Native Training Data Based on DNN Transfer Learning.
IEICE Trans. Inf. Syst., 2017
CoRR, 2017
Detecting listening difficulty for second language learners using Automatic Speech Recognition errors.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017
Transfer Learning based Non-native Acoustic Modeling for Pronunciation Error Detection.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017
Attentive listening system with backchanneling, response generation and flexible turn-taking.
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, 2017
Semi-Blind speech enhancement based on recurrent neural network for source separation and dereverberation.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017
Proceedings of the Advanced Social Interaction with Agents, 2017
Analysis of the Relationship Between Prosodic Features of Fillers and its Forms or Occurrence Positions.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Joint Learning of Dialog Act Segmentation and Recognition in Spoken Dialog Using Neural Networks.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Bayesian multichannel nonnegative matrix factorization for audio source separation and localization.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Effective articulatory modeling for pronunciation error detection of L2 learner without non-native training data.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 9th International Conference on Agents and Artificial Intelligence, 2017
Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Incremental training and constructing the very deep convolutional residual network acoustic models.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Emotion recognition by combining prosody and sentiment analysis for expressing reactive emotion by humanoid robot.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems' Hypotheses.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Proceedings of the SIGDIAL 2016 Conference, 2016
Proceedings of the 25th IEEE International Symposium on Robot and Human Interactive Communication, 2016
Proceedings of the Intelligent Virtual Agents - 16th International Conference, 2016
Confidence estimation for speech recognition systems using conditional random fields trained with partially annotated data.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Pronunciation error detection using DNN articulatory model based on multi-lingual and multi-task learning.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016
Proceedings of the Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction, 2016
Prediction of ice-breaking between participants using prosodic features in the first meeting dialogue.
Proceedings of the 2nd Workshop on Advancements in Social Signal Processing for Multimodal Interaction, 2016
Data selection from multiple ASR systems' hypotheses for unsupervised acoustic model training.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity, 2016
Proceedings of the Human-Harmonized Information Technology, Volume 1 - Vertical Impact, 2016
2015
Automatic Lecture Transcription Based on Discriminative Data Selection for Lightly Supervised Acoustic Model Training.
IEICE Trans. Inf. Syst., 2015
Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature.
EURASIP J. Adv. Signal Process., 2015
Conversational system for information navigation based on POMDP with user focus tracking.
Comput. Speech Lang., 2015
ASR technology to empower partial and synchronized caption for L2 listening development.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015
Proceedings of the Computational Linguistics, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Discriminative data selection for lightly supervised training of acoustic model using closed caption texts.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Deep autoencoders augmented with phone-class feature for reverberant speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Language model adaptation for academic lectures using character recognition result of presentation slides.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Synchrony in prosodic and linguistic features between backchannels and preceding utterances in attentive listening.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Automatic classification of usability of ASR result for real-time captioning of lectures.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Proceedings of the Natural Language Dialog Systems and Intelligent Assistants, 2015
2014
IEEE Trans. Hum. Mach. Syst., 2014
Lexicon optimization based on discriminative learning for automatic speech recognition of agglutinative language.
Speech Commun., 2014
Proceedings of the SIGDIAL 2014 Conference, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Exploring deep neural networks and deep autoencoders in reverberant speech recognition.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Japanese-to-English patent translation system based on domain-adapted word segmentation and post-ordering.
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, 2014
2013
Evaluation Framework Design of Spoken Term Detection Study at the NTCIR-9 IR for Spoken Documents Task.
Inf. Media Technol., 2013
IEICE Trans. Inf. Syst., 2013
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013
Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013
Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013
Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013
Incorporating semantic information to selection of web texts for language model of spoken dialogue system.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the Human-Computer Interaction. Interaction Modalities and Techniques, 2013
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013
2012
J. Inf. Process., 2012
IEICE Trans. Inf. Syst., 2012
A monotonic statistical machine translation approach to speaking style transformation.
Comput. Speech Lang., 2012
Proceedings of the SIGDIAL 2012 Conference, 2012
Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012
Comparative Analysis of Intensity between Native Speakers and Japanese Speakers of English.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Dereverberation based on Wavelet Packet Filtering for Robust Automatic Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Discriminative approach to lexical entry selection for automatic speech recognition of agglutinative language.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet).
Proceedings of the Twenty-Fourth Conference on Innovative Applications of Artificial Intelligence, 2012
Proceedings of the International Conference on Human-Robot Interaction, 2012
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012
Language Modeling for Spoken Dialogue System based on Filtering using Predicate-Argument Structures.
Proceedings of the COLING 2012, 2012
Language modeling for spoken dialogue system based on sentence transformation and filtering using predicate-argument structures.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012
2011
ACM Trans. Speech Lang. Process., 2011
IEICE Trans. Inf. Syst., 2011
Spoken Dialogue System based on Information Extraction using Similarity of Predicate Argument Structures.
Proceedings of the SIGDIAL 2011 Conference, 2011
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011
Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 2011 Second International Conference on Culture and Computing, 2011
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011
Online Learning of Bayes Risk-Based Optimization of Dialogue Management for Document Retrieval Systems with Speech Interface.
Proceedings of the Spoken Dialogue Systems Technology and Design, 2011
2010
Speech Activity Detection for Multi-Party Conversation Analyses Based on Likelihood Ratio Test on Spatial Magnitude.
IEEE Trans. Speech Audio Process., 2010
Robust Speech Recognition Based on Dereverberation Parameter Optimization Using Acoustic Model Likelihood.
IEEE Trans. Speech Audio Process., 2010
Statistical Transformation of Language and Pronunciation Models for Spontaneous Speech Recognition.
IEEE Trans. Speech Audio Process., 2010
Bayes risk-based dialogue management for document retrieval system with speech interface.
Speech Commun., 2010
IEEE J. Sel. Top. Signal Process., 2010
Online Unsupervised Classification With Model Comparison in the Variational Bayes Framework for Voice Activity Detection.
IEEE J. Sel. Top. Signal Process., 2010
Proceedings of the Spoken Dialogue Systems for Ambient Environments, 2010
Automatic transcription of parliamentary meetings and classroom lectures - A sustainable approach and real system evaluations -.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Semi-automated update of automatic transcription system for the Japanese national congress.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Optimizing spectral subtraction and Wiener filtering for robust speech recognition in reverberant and noisy conditions.
Proceedings of the IEEE International Conference on Acoustics, 2010
Using online model comparison in the Variational Bayes framework for online unsupervised Voice Activity Detection.
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the 10th IEEE-RAS International Conference on Humanoid Robots, 2010
2009
Computer Assisted Language Learning system based on dynamic question generation and error prediction for automatic speech recognition.
Speech Commun., 2009
Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data.
J. Inf. Process., 2009
Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System.
IEICE Trans. Inf. Syst., 2009
Proceedings of the User Modeling, 2009
Japanese CALL system based on dynamic question generation and error prediction for ASR.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Language model transformation applied to lightly supervised training of acoustic model for congress meetings.
Proceedings of the IEEE International Conference on Acoustics, 2009
New perspectives on spoken language understanding: Does machine need to fully understand speech?
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009
2008
IEICE Trans. Inf. Syst., 2008
Proceedings of the International Conference on Language Resources and Evaluation, 2008
A Japanese CALL system based on dynamic question generation and error prediction for ASR.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Aggregated cross-validation and its efficient application to Gaussian mixture optimization.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Predicting ASR errors by exploiting barge-in rate of individual users for spoken dialogue systems.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Statistical speech activity detection based on spatial power distribution for analyses of poster presentations.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Effective error prediction using decision tree for ASR grammar network in CALL system.
Proceedings of the IEEE International Conference on Acoustics, 2008
GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation.
Proceedings of the IEEE International Conference on Acoustics, 2008
Admissible stopping in Viterbi beam search for unit selection in concatenative speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008
Automatic lecture transcription by exploiting presentation slide information for language model adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
2007
Out-of-Domain Utterance Detection Using Classification Confidences of Multiple Topics.
IEEE Trans. Speech Audio Process., 2007
Proceedings of the IEEE 9th Workshop on Multimedia Signal Processing, 2007
Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Bayes risk-based optimization of dialogue management for document retrieval system with speech interface.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
An Interactive Framework for Document Retrieval and Presentation with Question-Answering Function in Restricted Domain.
Proceedings of the New Trends in Applied Artificial Intelligence, 2007
Proceedings of the 2007 workshop on Tagging, 2007
Speech-Based Interactive Information Guidance System using Question-Answering Technique.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Topic-Independent Speaking-Style Transformation of Language Model for Spontaneous Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007
2006
Dialogue strategy to clarify user's queries for document retrieval system with speech interface.
Speech Commun., 2006
Trigger-Based Language Model Adaptation for Automatic Transcription of Panel Discussions.
IEICE Trans. Inf. Syst., 2006
Verification of Speech Recognition Results Incorporating In-domain Confidence and Discourse Coherence Measures.
IEICE Trans. Inf. Syst., 2006
Proceedings of the IEEE 8th Workshop on Multimedia Signal Processing, 2006
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006
Prototyping a CALL system for students of Japanese using dynamic diagram generation and interactive hints.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Decision tree-based training of probabilistic concatenation models for corpus-based speech synthesis.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web texts.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Evaluation of voice activity detection by combining multiple features with weight adaptation.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Sentence boundary detection of spontaneous Japanese using statistical language model and support vector machines.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Efficient Estimation of Language Model Statistics of Spontaneous Speech Via Statistical Transformation Model.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
2005
User Model. User Adapt. Interact., 2005
Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing.
IEEE Trans. Speech Audio Process., 2005
Syst. Comput. Jpn., 2005
Dialogue Speech Recognition by Combining Hierarchical Topic Classification and Language Model Switching.
IEICE Trans. Inf. Syst., 2005
Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions.
IEICE Trans. Inf. Syst., 2005
Proceedings of the HLT/EMNLP 2005, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Minimum Bayes-risk decoding considering word significance for information retrieval system.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Utterance verification incorporating in-domain confidence and discourse coherence measures.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Voice activity detection based on optimally weighted combination of multiple features.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
A New ASR Evaluation Measure and Minimum Bayes-Risk Decoding for Open-domain Speech Understanding.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
Generalized Statistical Modeling of Pronunciation Variations using Variable-length Phone Context.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
2004
Language model and speaking rate adaptation for spontaneous presentation speech recognition.
IEEE Trans. Speech Audio Process., 2004
Automatic indexing of lecture presentations using unsupervised learning of presumed discourse markers.
IEEE Trans. Speech Audio Process., 2004
IEEE Trans. Speech Audio Process., 2004
Bus Information System Based on User Models and Dynamic Generation of VoiceXML Scripts.
Proceedings of the New Frontiers in Artificial Intelligence - JSAI 2003 and JSAI 2004 Conferences and Workshops, Niigata, Japan, June 23-27, 2003 and Kanazawa, Japan, May 31, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Dependency structure analysis and sentence boundary detection in spontaneous Japanese.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Automatic transformation of lecture transcription into document style using statistical framework.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Topic classification and verification modeling for out-of-domain utterance detection.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Practical use of English pronunciation system for Japanese students in the CALL classroom.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the Innovations in Applied Artificial Intelligence, 2004
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004
Speaker indexing and adaptation using speaker clustering based on statistical model selection.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
Automatic indexing of key sentences for lecture archives using statistics of presumed discourse markers.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
Real-time word confidence scoring using local posterior probabilities on tree trellis search.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
Out-of-domain detection based on confidence measures from multiple topic classification.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
Efficient Confirmation Strategy for Large-scale Text Retrieval Systems with Spoken Dialogue Interface.
Proceedings of the COLING 2004, 2004
2003
Flexible Spoken Dialogue System based on User Models and Dynamic Generation of VoiceXML Scripts.
Proceedings of the SIGDIAL 2003 Workshop, 2003
Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Hierarchical topic classification for dialog speech recognition based on language model switching.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Spoken dialogue system for queries on appliance manuals using hierarchical confirmation strategy.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Unsupervised speaker indexing using anchor models and automatic transcription of discussions.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, 2003
Proceedings of the ACL 2003, 2003
2002
Continuous Speech Recognition Consortium: an Open Repository for CSR Tools and Models.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002
Belief network based disambiguation of object reference in spoken dialogue system for robot.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Recognition and verification of English by Japanese students for computer-assisted language learning system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Automatic intelligibility assessment and diagnosis of critical pronunciation errors for computer-assisted pronunciation learning.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Speaking rate compensation based on likelihood criterion in acoustic model training and decoding.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002
Automatic indexing of lecture speech by extracting topic-independent discourse markers.
Proceedings of the IEEE International Conference on Acoustics, 2002
Efficient Dialogue Strategy to Find Users' Intended Items from Information Query Results.
Proceedings of the 19th International Conference on Computational Linguistics, 2002
2001
Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Domain-independent spoken dialogue platform using key-phrase spotting based on combined language model.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Proceedings of the IEEE International Conference on Acoustics, 2001
2000
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000
Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Generating effective confirmation and guidance using two-level confidence measures for dialogue systems.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Modelling of the perception of English sentence stress for computer-assisted language learning.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Proceedings of the IEEE International Conference on Acoustics, 2000
Flexible Mixed-Initiative Dialogue Management using Concept-Level Confidence Measures of Speech Recognizer Output.
Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000
1999
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999
1998
Flexible speech understanding based on combined key-phrase detection and verification.
IEEE Trans. Speech Audio Process., 1998
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Sharable software repository for Japanese large vocabulary continuous speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Speaking-style dependent lexicalized filler model for key-phrase detection and verification.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998
1997
Individual identification by integrating facial image, walking image, and vocal features.
Syst. Comput. Jpn., 1997
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997
Combining key-phrase detection and subword-based verification for flexible speech understanding.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997
1996
Proceedings of the 4th International Conference on Spoken Language Processing, 1996
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996
1994
Continuous speech recognition based on A* search with word-pair constraint as heuristics.
Syst. Comput. Jpn., 1994
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994
1992
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992
1991
Speaker-independent consonant recognition by integrating discriminant analysis and HMM.
Syst. Comput. Jpn., 1991
Unsupervised speaker normalization by speaker Markov model converter for speaker-independent speech recognition.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991
Proceedings of the 1991 International Conference on Acoustics, 1991
1990
Phoneme recognition by combining Bayesian linear discriminations of selected pairs of classes.
Proceedings of the First International Conference on Spoken Language Processing, 1990