Tatsuya Kawahara

Orcid: 0000-0002-2686-2296

Affiliations:
  • Kyoto University, School of Informatics, Japan


According to our database, Tatsuya Kawahara authored at least 391 papers between 1990 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Awards

IEEE Fellow 2017, "For contributions to speech recognition and understanding".

Bibliography

2024
Character expression of a conversational robot for adapting to user personality.
Adv. Robotics, February, 2024

Refining Synthesized Speech Using Speaker Information and Phone Masking for Data Augmentation of Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Waveform-Domain Speech Enhancement Using Spectrogram Encoding for Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

A large-scale television advertising dataset for detailed impression analysis.
Multim. Tools Appl., 2024

Yeah, Un, Oh: Continuous and Real-time Backchannel Prediction with Fine-tuning of Voice Activity Projection.
CoRR, 2024

Efficient and Robust Long-Form Speech Recognition with Hybrid H3-Conformer.
CoRR, 2024

Analysis and Detection of Differences in Spoken User Behaviors between Autonomous and Wizard-of-Oz Systems.
CoRR, 2024

Should RAG Chatbots Forget Unimportant Conversations? Exploring Importance and Forgetting with Psychological Insights.
CoRR, 2024

Serialized Speech Information Guidance with Overlapped Encoding Separation for Multi-Speaker Automatic Speech Recognition.
CoRR, 2024

Benchmarking Japanese Speech Recognition on ASR-LLM Setups with Multi-Pass Augmented Generative Error Correction.
CoRR, 2024

Investigation of Adapter for Automatic Speech Recognition in Noisy Environment.
CoRR, 2024

Evaluation of a semi-autonomous attentive listening system with takeover prompting.
CoRR, 2024

Acknowledgment of Emotional States: Generating Validating Responses for Empathetic Dialogue.
CoRR, 2024

Enhancing Personality Recognition in Dialogue by Data Augmentation and Heterogeneous Conversational Graph Networks.
CoRR, 2024

Real-time and Continuous Turn-taking Prediction Using Voice Activity Projection.
CoRR, 2024

An Analysis of User Behaviors for Objectively Evaluating Spoken Dialogue Systems.
CoRR, 2024

StyEmp: Stylizing Empathetic Response Generation via Multi-Grained Prefix Encoder and Personality Reinforcement.
Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2024

MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction.
Proceedings of the IEEE International Conference on Acoustics, 2024

Zero- and Few-Shot Sound Event Localization and Detection.
Proceedings of the IEEE International Conference on Acoustics, 2024

Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders.
Proceedings of the IEEE International Conference on Acoustics, 2024

Enhancing Two-Stage Finetuning for Speech Emotion Recognition Using Adapters.
Proceedings of the IEEE International Conference on Acoustics, 2024

Multilingual Turn-taking Prediction Using Voice Activity Projection.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023
Finetuning Pretrained Model with Embedding of Domain and Language Information for ASR of Very Low-Resource Settings.
Int. J. Asian Lang. Process., December, 2023

Effect of attentive listening robot on pleasure and arousal change in psychiatric daycare.
Adv. Robotics, November, 2023

Dual variational generative model and auxiliary retrieval for empathetic response generation by conversational robot.
Adv. Robotics, November, 2023

Character expression for spoken dialogue systems with semi-supervised learning using Variational Auto-Encoder.
Comput. Speech Lang., April, 2023

Alignment Knowledge Distillation for Online Streaming Attention-Based Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Reasoning before Responding: Integrating Commonsense-based Causality Explanation for Empathetic Response Generation.
Proceedings of the 24th Meeting of the Special Interest Group on Discourse and Dialogue, 2023

Robotic Backchanneling in Online Conversation Facilitation: A Cross-Generational Study.
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

RealPersonaChat: A Realistic Persona Chat Corpus with Interlocutors' Own Personalities.
Proceedings of the 37th Pacific Asia Conference on Language, 2023

Embedding Articulatory Constraints for Low-resource Speech Recognition Based on Large Pre-trained Model.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Two-stage Finetuning of Wav2vec 2.0 for Speech Emotion Recognition with ASR and Gender Pretraining.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Towards Objective Evaluation of Socially-Situated Conversational Robots: Assessing Human-Likeness through Multimodal User Behaviors.
Proceedings of the International Conference on Multimodal Interaction, 2023

Time-Domain Speech Enhancement Assisted by Multi-Resolution Frequency Encoder and Decoder.
Proceedings of the IEEE International Conference on Acoustics, 2023

Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-Based Speech Recognition of Low-Resource Language.
Proceedings of the IEEE International Conference on Acoustics, 2023

I Know Your Feelings Before You Do: Predicting Future Affective Reactions in Human-Computer Dialogue.
Proceedings of the Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, 2023

2022
Autoregressive Moving Average Jointly-Diagonalizable Spatial Covariance Analysis for Joint Source Separation and Dereverberation.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Computationally-Efficient Overdetermined Blind Source Separation Based on Iterative Source Steering.
IEEE Signal Process. Lett., 2022

Can a robot laugh with you?: Shared laughter generation for empathetic spoken dialogue.
Frontiers Robotics AI, 2022

Distilling the Knowledge of BERT for CTC-based ASR.
CoRR, 2022

Simultaneous Job Interview System Using Multiple Semi-autonomous Agents.
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022

Monaural Speech Enhancement Based on Spectrogram Decomposition for Convolutional Neural Network-sensitive Feature Extraction.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

End-to-end Speech-to-Punctuated-Text Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multimodal Persuasive Dialogue Corpus using Teleoperated Android.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Leveraging Simultaneous Translation for Enhancing Transcription of Low-resource Language via Cross Attention Mechanism.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Non-autoregressive Error Correction for CTC-based ASR with Phone-conditioned Masked LM.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Selective Multi-Task Learning For Speech Emotion Recognition Using Corpora Of Different Styles.
Proceedings of the IEEE International Conference on Acoustics, 2022

Phone-Informed Refinement of Synthesized Mel Spectrogram for Data Augmentation in Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2022

Alzheimer's Dementia Detection through Spontaneous Dialogue with Proactive Robotic Listeners.
Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, 2022

Backchannel Generation Model for a Third Party Listener Agent.
Proceedings of the International Conference on Human-Agent Interaction, 2022

2021
TriECCC: Trilingual Corpus of the Extraordinary Chambers in the Courts of Cambodia for Speech Recognition and Translation Studies.
Int. J. Asian Lang. Process., 2021

Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring.
CoRR, 2021

Intelligent Conversational Android ERICA Applied to Attentive Listening and Job Interview.
CoRR, 2021

Semi-autonomous avatar enabling unconstrained parallel conversations -seamless hybrid of WOZ and autonomous dialogue systems-.
Adv. Robotics, 2021

Multi-Referenced Training for Dialogue Response Generation.
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021

ERICA: An Empathetic Android Companion for Covid-19 Quarantine.
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021

A multi-party attentive listening robot which stimulates involvement from side participants.
Proceedings of the 22nd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2021

Khmer Speech Translation Corpus of the Extraordinary Chambers in the Courts of Cambodia (ECCC).
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021

Source and Target Bidirectional Knowledge Distillation for End-to-end Speech Translation.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

VAD-Free Streaming Hybrid CTC/Attention ASR for Unsegmented Recording.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

StableEmit: Selection Probability Discount for Reducing Emission Latency of Streaming Monotonic Attention ASR.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

ORTHROS: non-autoregressive end-to-end speech translation With dual-decoder.
Proceedings of the IEEE International Conference on Acoustics, 2021

Data Augmentation for ASR Using TTS Via a Discrete Representation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

ASR Rescoring and Confidence Estimation with Electra.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Spectrograms Fusion-based End-to-end Robust Automatic Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

An End-To-End Model from Speech to Clean Transcript for Parliamentary Meetings.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

On the Use of Speaker Information for Automatic Speech Recognition in Speaker-imbalanced Corpora.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Fast Multichannel Nonnegative Matrix Factorization With Directivity-Aware Jointly-Diagonalizable Spatial Covariance Matrices for Blind Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Cross-Lingual Transfer Learning of Non-Native Acoustic Modeling for Pronunciation Error Detection and Diagnosis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Spoken Language Interaction with Virtual Agents and Robots (SLIVAR): Towards Effective and Ethical Interaction (Dagstuhl Seminar 20021).
Dagstuhl Reports, 2020

An Attentive Listening System with Android ERICA: Comparison of Autonomous and WOZ Interactions.
Proceedings of the 21st Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2020

Speech Corpus of Ainu Folklore and End-to-end Speech Recognition for Ainu Language.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

A Character Expression Model Affecting Spoken Dialogue Behaviors.
Proceedings of the Conversational Dialogue Systems for the Next Decade, 2020

Response Generation to Out-of-Database Questions for Example-Based Dialogue Systems.
Proceedings of the Conversational Dialogue Systems for the Next Decade, 2020

Semi-Supervised Learning for Character Expression of Spoken Dialogue Systems.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Generative Adversarial Training Data Adaptation for Very Low-Resource Automatic Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Enhancing Monotonic Multihead Attention for Streaming ASR.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

CTC-Synchronous Training for Monotonic Attention Model.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Distilling the Knowledge of BERT for Sequence-to-Sequence ASR.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Speech Emotion Recognition Combined with Acoustic-to-Word ASR Model.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Speech-to-Dialog-Act Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Prediction of Shared Laughter for Human-Robot Dialogue.
Proceedings of the Companion Publication of the 2020 International Conference on Multimodal Interaction, 2020

Job Interviewer Android with Elaborate Follow-up Question Generation.
Proceedings of the ICMI '20: International Conference on Multimodal Interaction, 2020

Autonomous Dialogue Technologies in Symbiotic Human-robot Interaction.
Proceedings of the Companion of the 2020 ACM/IEEE International Conference on Human-Robot Interaction, 2020

Semi-supervised Multichannel Speech Separation Based on a Phone- and Speaker-Aware Deep Generative Model of Speech Spectrograms.
Proceedings of the 28th European Signal Processing Conference, 2020

Topic-relevant Response Generation using Optimal Transport for an Open-domain Dialog System.
Proceedings of the 28th International Conference on Computational Linguistics, 2020

End-to-end Music-mixed Speech Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Integration of Semi-Blind Speech Source Separation and Voice Activity Detection for Flexible Spoken Dialogue.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Computer-Resource-Aware Deep Speech Separation with a Run-Time-Specified Number of BLSTM Layers.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Designing Precise and Robust Dialogue Response Evaluators.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Semi-Supervised Multichannel Speech Enhancement With a Deep Speech Prior.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Joint dialog act segmentation and recognition in human conversations using attention to dialog context.
Comput. Speech Lang., 2019

Effective Incorporation of Speaker Information in Utterance Encoding in Dialog.
CoRR, 2019

Content Word-based Sentence Decoding and Evaluating for Open-domain Neural Response Generation.
CoRR, 2019

Expressing reactive emotion based on multimodal emotion recognition for natural conversation in human-robot interaction.
Adv. Robotics, 2019

End-to-end Modeling for Selection of Utterance Constructional Units via System Internal States.
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019

Engagement-Based Adaptive Behaviors for Laboratory Guide in Human-Robot Dialogue.
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019

A Job Interview Dialogue System with Autonomous Android ERICA.
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019

Improved End-to-End Speech Emotion Recognition Using Self Attention Mechanism and Multitask Learning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Improving Transformer-Based Speech Recognition Systems with Compressed Structure and Speech Attributes Augmentation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Investigating Radical-Based End-to-End Speech Recognition Systems for Chinese Dialects and Japanese.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Analysis of Effect and Timing of Fillers in Natural Turn-Taking.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Turn-Taking Prediction Based on Detection of Transition Relevance Place.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Articulatory Attribute Modeling for Low-Resource Multilingual Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

ERICA and WikiTalk.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Smooth Turn-taking by a Robot Using an Online Continuous Model to Generate Turn-taking Cues.
Proceedings of the International Conference on Multimodal Interaction, 2019

Multi-speaker Sequence-to-sequence Speech Synthesis for Data Augmentation in Acoustic-to-word Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

Transfer Learning of Language-independent End-to-end ASR with Language Model Fusion.
Proceedings of the IEEE International Conference on Acoustics, 2019

Multilingual End-to-End Speech Translation.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Multi-lingual Transformer Training for Khmer Automatic Speech Recognition.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Typing Tutor: Individualized Tutoring in Text Entry for Older Adults Based on Statistical Input Stumble Detection.
J. Inf. Process., 2018

Exploiting automatic speech recognition errors to enhance partial and synchronized caption for facilitating second language listening.
Comput. Speech Lang., 2018

Leveraging Sequence-to-Sequence Speech Synthesis for Enhancing Acoustic-to-Word Speech Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Improving OOV Detection and Resolution with External Language Models in Acoustic-to-Word ASR.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Improving Very Deep Time-Delay Neural Network With Vertical-Attention For Effectively Training CTC-Based ASR Systems.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

A Unified Neural Architecture for Joint Dialog Act Segmentation and Recognition in Spoken Dialog System.
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

Generating Fillers Based on Dialog Act Pairs for Smooth Turn-Taking by Humanoid Robot.
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Spoken Dialogue System for a Human-like Conversational Robot ERICA.
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Latent Character Model for Engagement Recognition Based on Multimodal Behaviors.
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Voice Input Tutoring System for Older Adults using Input Stumble Detection.
Proceedings of the 23rd International Conference on Intelligent User Interfaces, 2018

Encoder Transfer for Attention-based Acoustic-to-word Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Forward-Backward Attention Decoder.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Improving CTC-based Acoustic Model with Very Deep Residual Time-delay Neural Networks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Engagement Recognition in Spoken Dialogue via Neural Network by Aggregating Different Annotators' Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Prediction of Turn-taking Using Multitask Learning with Prediction of Backchannels and Fillers.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Evaluation of Real-time Deep Learning Turn-taking Models for Multiple Dialogue Scenarios.
Proceedings of the 2018 on International Conference on Multimodal Interaction, 2018

Acoustic-to-Word Attention-Based Model Complemented with Character-Level CTC-Based Model.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Unsupervised Beamforming Based on Multichannel Nonnegative Matrix Factorization for Noisy Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Audio-Visual Conversation Analysis by Smart Posterboard and Humanoid Robot.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

An End-to-End Approach to Joint Social Signal Detection and Automatic Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Efficient Learning of Articulatory Models Based on Multi-Label Training and Label Correction for Pronunciation Learning.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Independent Low-Rank Tensor Analysis for Audio Source Separation.
Proceedings of the 26th European Signal Processing Conference, 2018

Dialogue Behavior Control Model for Expressing a Character of Humanoid Robots.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Bayesian Multichannel Speech Enhancement with a Deep Speech Prior.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Assistive Typing Application for Older Adults Based on Input Stumble Detection.
J. Inf. Process., 2017

Articulatory Modeling for Pronunciation Error Detection without Non-Native Training Data Based on DNN Transfer Learning.
IEICE Trans. Inf. Syst., 2017

Detection of social signals for recognizing engagement in human-robot interaction.
CoRR, 2017

Detecting listening difficulty for second language learners using Automatic Speech Recognition errors.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Transfer Learning based Non-native Acoustic Modeling for Pronunciation Error Detection.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Attentive listening system with backchanneling, response generation and flexible turn-taking.
Proceedings of the 18th Annual SIGdial Meeting on Discourse and Dialogue, 2017

Semi-Blind speech enhancement based on recurrent neural network for source separation and dereverberation.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

A Conversational Dialogue Manager for the Humanoid Robot ERICA.
Proceedings of the Advanced Social Interaction with Agents, 2017

Analysis of the Relationship Between Prosodic Features of Fillers and its Forms or Occurrence Positions.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Combined Multi-Channel NMF-Based Robust Beamforming for Noisy Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Social Signal Detection in Spontaneous Dialogue Using Bidirectional LSTM-CTC.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Joint Learning of Dialog Act Segmentation and Recognition in Spoken Dialog Using Neural Networks.
Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Semi-supervised ensemble DNN acoustic model training.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Bayesian multichannel nonnegative matrix factorization for audio source separation and localization.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Effective articulatory modeling for pronunciation error detection of L2 learner without non-native training data.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Utterance Behavior of Users While Playing Basketball with a Virtual Teammate.
Proceedings of the 9th International Conference on Agents and Artificial Intelligence, 2017

Cross-domain speech recognition using nonparallel corpora with cycle-consistent adversarial networks.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Incremental training and constructing the very deep convolutional residual network acoustic models.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Emotion recognition by combining prosody and sentiment analysis for expressing reactive emotion by humanoid robot.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Automatic meeting transcription system for the Japanese parliament (diet).
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Semi-Supervised Acoustic Model Training by Discriminative Data Selection From Multiple ASR Systems' Hypotheses.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Talking with ERICA, an autonomous android.
Proceedings of the SIGDIAL 2016 Conference, 2016

ERICA: The ERATO Intelligent Conversational Android.
Proceedings of the 25th IEEE International Symposium on Robot and Human Interactive Communication, 2016

Managing Dialog and Joint Actions for Virtual Basketball Teammates.
Proceedings of the Intelligent Virtual Agents - 16th International Conference, 2016

Confidence estimation for speech recognition systems using conditional random fields trained with partially annotated data.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Pronunciation error detection using DNN articulatory model based on multi-lingual and multi-task learning.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Joint Optimization of Denoising Autoencoder and DNN Acoustic Model Based on Multi-Target Learning for Noisy Speech Recognition.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Prediction and Generation of Backchannel Form for Attentive Listening Systems.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Multimodal interaction with the autonomous Android ERICA.
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

Annotation and analysis of listener's engagement based on multi-modal behaviors.
Proceedings of the Workshop on Multimodal Analyses enabling Artificial Agents in Human-Machine Interaction, 2016

Prediction of ice-breaking between participants using prosodic features in the first meeting dialogue.
Proceedings of the 2nd Workshop on Advancements in Social Signal Processing for Multimodal Interaction, 2016

Data selection from multiple ASR systems' hypotheses for unsupervised acoustic model training.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Multi-lingual and multi-task DNN learning for articulatory error detection.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Automatic Speech Recognition Errors as a Predictor of L2 Listening Difficulties.
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity, 2016

Smart Posterboard: Multi-modal Sensing and Analysis of Poster Conversations.
Proceedings of the Human-Harmonized Information Technology, Volume 1 - Vertical Impact, 2016

2015
Automatic Lecture Transcription Based on Discriminative Data Selection for Lightly Supervised Acoustic Model Training.
IEICE Trans. Inf. Syst., 2015

Reverberant speech recognition combining deep neural networks and deep autoencoders augmented with a phone-class feature.
EURASIP J. Adv. Signal Process., 2015

Conversational system for information navigation based on POMDP with user focus tracking.
Comput. Speech Lang., 2015

ASR technology to empower partial and synchronized caption for L2 listening development.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015

Named Entity Recognizer Trainable from Partially Annotated Data.
Proceedings of the Computational Linguistics, 2015

Speech dereverberation using long short-term memory.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Ensemble speaker modeling using speaker adaptive training deep neural network for speaker adaptation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Discriminative data selection for lightly supervised training of acoustic model using closed caption texts.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Enhanced speaker diarization with detection of backchannels using eye-gaze information in poster conversations.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Deep autoencoders augmented with phone-class feature for reverberant speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Language model adaptation for academic lectures using character recognition result of presentation slides.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Synchrony in prosodic and linguistic features between backchannels and preceding utterances in attentive listening.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Automatic classification of usability of ASR result for real-time captioning of lectures.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

News Navigation System Based on Proactive Dialogue Strategy.
Proceedings of the Natural Language Dialog Systems and Intelligent Assistants, 2015

2014
Multiparty Interaction Understanding Using Smart Multimodal Digital Signage.
IEEE Trans. Hum. Mach. Syst., 2014

Lexicon optimization based on discriminative learning for automatic speech recognition of agglutinative language.
Speech Commun., 2014

Information Navigation System Based on POMDP that Tracks User Focus.
Proceedings of the SIGDIAL 2014 Conference, 2014

Corpus and transcription system of Chinese Lecture Room.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Speaker diarization using eye-gaze information in multi-party conversations.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Exploring deep neural networks and deep autoencoders in reverberant speech recognition.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Speaker diarization based on audio-visual integration for smart posterboard.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Unsupervised speaker adaptation of DNN-HMM by selecting similar speakers for lecture transcription.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Japanese-to-English patent translation system based on domain-adapted word segmentation and post-ordering.
Proceedings of the 11th Conference of the Association for Machine Translation in the Americas: MT Researchers Track, 2014

2013
Substring-based machine translation.
Mach. Transl., 2013

Evaluation Framework Design of Spoken Term Detection Study at the NTCIR-9 IR for Spoken Documents Task.
Inf. Media Technol., 2013

Admissible Stopping in Viterbi Beam Search for Unit Selection Speech Synthesis.
IEICE Trans. Inf. Syst., 2013

Overview of the NTCIR-10 SpokenDoc-2 Task.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

The Similar Segments in Social Speech Task.
Proceedings of the MediaEval 2013 Multimedia Benchmark Workshop, 2013

Estimation of interest and comprehension level of audience through multi-modal behaviors in poster conversations.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Predicate Argument Structure Analysis using Partially Annotated Corpora.
Proceedings of the Sixth International Joint Conference on Natural Language Processing, 2013

Hands-free human-robot communication robust to speaker's radial position.
Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013

Incorporating semantic information to selection of web texts for language model of spoken dialogue system.
Proceedings of the IEEE International Conference on Acoustics, 2013

Multi-party Human-Machine Interaction Using a Smart Multimodal Digital Signage.
Proceedings of the Human-Computer Interaction. Interaction Modalities and Techniques, 2013

Smart posterboard: Multi-modal sensing and analysis of poster conversations.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012
Joint Phrase Alignment and Extraction for Statistical Machine Translation.
J. Inf. Process., 2012

Bayesian Learning of a Language Model from Continuous Speech.
IEICE Trans. Inf. Syst., 2012

A monotonic statistical machine translation approach to speaking style transformation.
Comput. Speech Lang., 2012

Multi-modal Sensing and Analysis of Poster Conversations: Toward Smart Posterboard.
Proceedings of the SIGDIAL 2012 Conference, 2012

Designing an Evaluation Framework for Spoken Term Detection and Spoken Document Retrieval at the NTCIR-9 SpokenDoc Task.
Proceedings of the Eighth International Conference on Language Resources and Evaluation, 2012

Comparative Analysis of Intensity between Native Speakers and Japanese Speakers of English.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Prediction of Turn-Taking by Combining Prosodic and Eye-Gaze Information in Poster Conversations.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Dereverberation based on Wavelet Packet Filtering for Robust Automatic Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Automatic Transcription of Lecture Speech using Language Model Based on Speaking-Style Transformation of Proceeding Texts.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Discriminative approach to lexical entry selection for automatic speech recognition of agglutinative language.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Transcription System Using Automatic Speech Recognition for the Japanese Parliament (Diet).
Proceedings of the Twenty-Fourth Conference on Innovative Applications of Artificial Intelligence, 2012

Multi-party human-robot interaction with distant-talking speech recognition.
Proceedings of the International Conference on Human-Robot Interaction, 2012

Group Dynamics and Multimodal Interaction Modeling Using a Smart Digital Signage.
Proceedings of the Computer Vision - ECCV 2012. Workshops and Demonstrations, 2012

Language Modeling for Spoken Dialogue System based on Filtering using Predicate-Argument Structures.
Proceedings of the COLING 2012, 2012

Language modeling for spoken dialogue system based on sentence transformation and filtering using predicate-argument structures.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Hybrid vector space model for flexible voice search.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Machine Translation without Words through Substring Alignment.
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, July 8-14, 2012, Jeju Island, Korea, 2012

2011
Modeling spoken decision support dialogue and optimization of its dialogue strategy.
ACM Trans. Speech Lang. Process., 2011

Probabilistic Concatenation Modeling for Corpus-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2011

Spoken Dialogue System based on Information Extraction using Similarity of Predicate Argument Structures.
Proceedings of the SIGDIAL 2011 Conference, 2011

Overview of the IR for Spoken Documents Task in NTCIR-9 Workshop.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

Combining Slot-based Vector Space Model for Voice Book Search.
Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems, 2011

Denoising Using Optimized Wavelet Filtering for Automatic Speech Recognition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Automatic Comma Insertion of Lecture Transcripts Based on Multiple Annotations.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Japanese Painting Study Tool: A System for Creating Nihonga Portraits.
Proceedings of the 2011 Second International Conference on Culture and Computing, 2011

An Unsupervised Model for Joint Phrase Alignment and Extraction.
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, 2011

Online Learning of Bayes Risk-Based Optimization of Dialogue Management for Document Retrieval Systems with Speech Interface.
Proceedings of the Spoken Dialogue Systems Technology and Design, 2011

2010
Speech Activity Detection for Multi-Party Conversation Analyses Based on Likelihood Ratio Test on Spatial Magnitude.
IEEE Trans. Speech Audio Process., 2010

Robust Speech Recognition Based on Dereverberation Parameter Optimization Using Acoustic Model Likelihood.
IEEE Trans. Speech Audio Process., 2010

Statistical Transformation of Language and Pronunciation Models for Spontaneous Speech Recognition.
IEEE Trans. Speech Audio Process., 2010

Bayes risk-based dialogue management for document retrieval system with speech interface.
Speech Commun., 2010

Gaussian Mixture Optimization Based on Efficient Cross-Validation.
IEEE J. Sel. Top. Signal Process., 2010

Online Unsupervised Classification With Model Comparison in the Variational Bayes Framework for Voice Activity Detection.
IEEE J. Sel. Top. Signal Process., 2010

Spoken Dialogue System Based on Information Extraction from Web Text.
Proceedings of the Spoken Dialogue Systems for Ambient Environments, 2010

Automatic transcription of parliamentary meetings and classroom lectures - A sustainable approach and real system evaluations -.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Learning a language model from continuous speech.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Detection of hot spots in poster conversations based on reactive tokens of audience.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Classroom note-taking system for hearing impaired students using automatic speech recognition adapted to lectures.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Constructing Japanese test collections for spoken term detection.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An improved wavelet-based dereverberation for robust automatic speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Semi-automated update of automatic transcription system for the Japanese national congress.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Improved statistical models for SMT-based speaking style transformation.
Proceedings of the IEEE International Conference on Acoustics, 2010

Optimizing spectral subtraction and Wiener filtering for robust speech recognition in reverberant and noisy conditions.
Proceedings of the IEEE International Conference on Acoustics, 2010

Using online model comparison in the Variational Bayes framework for online unsupervised Voice Activity Detection.
Proceedings of the IEEE International Conference on Acoustics, 2010

Robust hands-free Automatic Speech Recognition for human-machine interaction.
Proceedings of the 10th IEEE-RAS International Conference on Humanoid Robots, 2010

2009
Computer Assisted Language Learning system based on dynamic question generation and error prediction for automatic speech recognition.
Speech Commun., 2009

Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data.
J. Inf. Process., 2009

Effective Prediction of Errors by Non-native Speakers Using Decision Tree for Speech Recognition-Based CALL System.
IEICE Trans. Inf. Syst., 2009

A Model of Temporally Changing User Behaviors in a Deployed Spoken Dialogue System.
Proceedings of the User Modeling, 2009

Japanese CALL system based on dynamic question generation and error prediction for ASR.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2009

Acoustic event detection for spotting "hot spots" in podcasts.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A WFST-based log-linear framework for speaking-style transformation.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Optimization of dereverberation parameters based on likelihood of speech recognizer.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Automatic transcription system for meetings of the Japanese national congress.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Optimal learning of P-Layer additive F0 models with cross-validation.
Proceedings of the IEEE International Conference on Acoustics, 2009

Language model transformation applied to lightly supervised training of acoustic model for congress meetings.
Proceedings of the IEEE International Conference on Acoustics, 2009

New perspectives on spoken language understanding: Does machine need to fully understand speech?
Proceedings of the 2009 IEEE Workshop on Automatic Speech Recognition & Understanding, 2009

2008
Voice Activity Detection Based on High Order Statistics and Online EM Algorithm.
IEICE Trans. Inf. Syst., 2008

Test Collections for Spoken Document Retrieval from Lecture Audio Data.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

A Japanese CALL system based on dynamic question generation and error prediction for ASR.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Aggregated cross-validation and its efficient application to Gaussian mixture optimization.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Extracting word-pronunciation pairs from comparable set of text and speech.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Predicting ASR errors by exploiting barge-in rate of individual users for spoken dialogue systems.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Detection of feeling through back-channels in spoken dialogue.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Multi-modal recording, analysis and indexing of poster sessions.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Statistical speech activity detection based on spatial power distribution for analyses of poster presentations.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Effective error prediction using decision tree for ASR grammar network in CALL system.
Proceedings of the IEEE International Conference on Acoustics, 2008

GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation.
Proceedings of the IEEE International Conference on Acoustics, 2008

Admissible stopping in Viterbi beam search for unit selection in concatenative speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008

Automatic lecture transcription by exploiting presentation slide information for language model adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2008

Using variational Bayes free energy for unsupervised voice activity detection.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Out-of-Domain Utterance Detection Using Classification Confidences of Multiple Topics.
IEEE Trans. Speech Audio Process., 2007

Real-Time Continuous Speech Recognition System on SH-4A Microprocessor.
Proceedings of the IEEE 9th Workshop on Multimedia Signal Processing, 2007

Evaluating and optimizing Japanese tutor system featuring dynamic question generation and interactive guidance.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Gaussian mixture optimization for HMM based on efficient cross-validation.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Bayes risk-based optimization of dialogue management for document retrieval system with speech interface.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Analyzing temporal transition of real user's behaviors in a spoken dialogue system.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Evaluation of real-time voice activity detection based on high order statistics.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

PLSA-based topic detection in meetings for adaptation of lexicon and language model.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

An Interactive Framework for Document Retrieval and Presentation with Question-Answering Function in Restricted Domain.
Proceedings of the New Trends in Applied Artificial Intelligence, 2007

Multi-modal conversational analysis of poster presentations using multiple sensors.
Proceedings of the 2007 workshop on Tagging, 2007

Speech-Based Interactive Information Guidance System using Question-Answering Technique.
Proceedings of the IEEE International Conference on Acoustics, 2007

Automatic Detection of Sentence and Clause Units using Local Syntactic Dependency.
Proceedings of the IEEE International Conference on Acoustics, 2007

Topic-Independent Speaking-Style Transformation of Language Model for Spontaneous Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2007

HMM training based on CV-EM and CV Gaussian mixture optimization.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Dialogue strategy to clarify user's queries for document retrieval system with speech interface.
Speech Commun., 2006

Trigger-Based Language Model Adaptation for Automatic Transcription of Panel Discussions.
IEICE Trans. Inf. Syst., 2006

Verification of Speech Recognition Results Incorporating In-domain Confidence and Discourse Coherence Measures.
IEICE Trans. Inf. Syst., 2006

Embedded Julius: Continuous Speech Recognition Software for Microprocessor.
Proceedings of the IEEE 8th Workshop on Multimedia Signal Processing, 2006

Dependency-structure Annotation to Corpus of Spontaneous Japanese.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Prototyping a CALL system for students of Japanese using dynamic diagram generation and interactive hints.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Decision tree-based training of probabilistic concatenation models for corpus-based speech synthesis.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A bootstrapping approach for developing language model of new spoken dialogue systems by selecting web texts.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Evaluation of voice activity detection by combining multiple features with weight adaptation.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Detection of quotations and inserted clauses and its application to dependency structure analysis in spontaneous Japanese.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Voice activity detector based on enhanced cumulant of LPC residual and on-line EM algorithm.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Sentence boundary detection of spontaneous Japanese using statistical language model and support vector machines.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Efficient Estimation of Language Model Statistics of Spontaneous Speech Via Statistical Transformation Model.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
User Modeling in Spoken Dialogue Systems to Generate Flexible Guidance.
User Model. User Adapt. Interact., 2005

Speaker model selection based on the Bayesian information criterion applied to unsupervised speaker indexing.
IEEE Trans. Speech Audio Process., 2005

Unsupervised speaker indexing of discussions using anchor models.
Syst. Comput. Jpn., 2005

Dialogue Speech Recognition by Combining Hierarchical Topic Classification and Language Model Switching.
IEICE Trans. Inf. Syst., 2005

Language Model Adaptation Based on PLSA of Topics and Speakers for Automatic Transcription of Panel Discussions.
IEICE Trans. Inf. Syst., 2005

Speech-based Information Retrieval System with Clarification Dialogue Strategy.
Proceedings of the HLT/EMNLP 2005, 2005

Trigger-based language model adaptation for automatic meeting transcription.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Minimum Bayes-risk decoding considering word significance for information retrieval system.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Utterance verification incorporating in-domain confidence and discourse coherence measures.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Voice activity detection based on optimally weighted combination of multiple features.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

A New ASR Evaluation Measure and Minimum Bayes-Risk Decoding for Open-domain Speech Understanding.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Incorporating Dialogue Context and Topic Clustering in Out-of-Domain Detection.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Generalized Statistical Modeling of Pronunciation Variations using Variable-length Phone Context.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Language model and speaking rate adaptation for spontaneous presentation speech recognition.
IEEE Trans. Speech Audio Process., 2004

Automatic indexing of lecture presentations using unsupervised learning of presumed discourse markers.
IEEE Trans. Speech Audio Process., 2004

Introduction to the Special Issue on Spontaneous Speech Processing.
IEEE Trans. Speech Audio Process., 2004

Bus Information System Based on User Models and Dynamic Generation of VoiceXML Scripts.
Proceedings of the New Frontiers in Artificial Intelligence - JSAI 2003 and JSAI 2004 Conferences and Workshops, Niigata, Japan, June 23-27, 2003 and Kanazawa, Japan, May 31, 2004

Confirmation strategy for document retrieval systems with spoken dialog interface.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Example-based training of dialogue planning incorporating user and situation models.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Automatic extraction of key sentences from oral presentations using statistical measure based on discourse markers.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Dependency structure analysis and sentence boundary detection in spontaneous Japanese.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Automatic transformation of lecture transcription into document style using statistical framework.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Recent progress of open-source LVCSR engine Julius and Japanese model repository.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Topic classification and verification modeling for out-of-domain utterance detection.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Practical use of English pronunciation system for Japanese students in the CALL classroom.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Language model adaptation based on PLSA of topics and speakers.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Recognition of Emotional States in Spoken Dialogue with a Robot.
Proceedings of the Innovations in Applied Artificial Intelligence, 2004

Automatic audio archiving system for panel discussions.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Speaker indexing and adaptation using speaker clustering based on statistical model selection.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Automatic indexing of key sentences for lecture archives using statistics of presumed discourse markers.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Real-time word confidence scoring using local posterior probabilities on tree trellis search.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Out-of-domain detection based on confidence measures from multiple topic classification.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Efficient Confirmation Strategy for Large-scale Text Retrieval Systems with Spoken Dialogue Interface.
Proceedings of the COLING 2004, 2004

2003
Flexible Spoken Dialogue System based on User Models and Dynamic Generation of VoiceXML Scripts.
Proceedings of the SIGDIAL 2003 Workshop, 2003

Speaker model selection using Bayesian information criterion for speaker indexing and speaker adaptation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Hierarchical topic classification for dialog speech recognition based on language model switching.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

User modeling in spoken dialogue systems for flexible guidance generation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Spoken dialogue system for queries on appliance manuals using hierarchical confirmation strategy.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Unsupervised speaker indexing using anchor models and automatic transcription of discussions.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Unsupervised speaker indexing using speaker model selection based on Bayesian information criterion.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Language model switching based on topic detection for dialog speech recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Flexible Guidance Generation Using User Model in Spoken Dialogue Systems.
Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics, 2003

Dialog Navigator: A Spoken Dialog Q-A System based on Large Text Knowledge Base.
Proceedings of the ACL 2003, 2003

2002
Continuous Speech Recognition Consortium - an Open Repository for CSR Tools and Models.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Belief network based disambiguation of object reference in spoken dialogue system for robot.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Recognition and verification of English by Japanese students for computer-assisted language learning system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Automatic intelligibility assessment and diagnosis of critical pronunciation errors for computer-assisted pronunciation learning.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Speaking rate compensation based on likelihood criterion in acoustic model training and decoding.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Modeling and automatic detection of English sentence stress for computer-assisted English prosody learning system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Speaking-rate dependent decoding and adaptation for spontaneous lecture speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2002

Automatic indexing of lecture speech by extracting topic-independent discourse markers.
Proceedings of the IEEE International Conference on Acoustics, 2002

Efficient Dialogue Strategy to Find Users' Intended Items from Information Query Results.
Proceedings of the 19th International Conference on Computational Linguistics, 2002

2001
Speaking rate dependent acoustic modeling for spontaneous lecture speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Julius - an open source real-time large vocabulary recognition engine.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Domain-independent spoken dialogue platform using key-phrase spotting based on combined language model.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Gaussian mixture selection using context-independent HMM.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
IPA Japanese Dictation Free Software Project.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Computer-assisted English vowel learning system for Japanese speakers using cross language formant structures.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic diagnosis of recognition errors in large vocabulary continuous speech recognition systems.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Generating effective confirmation and guidance using two-level confidence measures for dialogue systems.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Free software toolkit for Japanese large vocabulary continuous speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Automatic transcription of lecture speech using topic-independent language modeling.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Modelling of the perception of English sentence stress for computer-assisted language learning.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Overview of an intelligent system for information retrieval based on human-machine dialogue through spoken language.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A new phonetic tied-mixture model for efficient decoding.
Proceedings of the IEEE International Conference on Acoustics, 2000

Flexible Mixed-Initiative Dialogue Management using Concept-Level Confidence Measures of Speech Recognizer Output.
Proceedings of the COLING 2000, 18th International Conference on Computational Linguistics, Proceedings of the Conference, 2 Volumes, July 31, 2000

1999
Topic independent language model for key-phrase detection and verification.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Flexible speech understanding based on combined key-phrase detection and verification.
IEEE Trans. Speech Audio Process., 1998

Prosodic analysis of fillers and self-repair in Japanese speech.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

An efficient two-pass search algorithm using word trellis index.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Sharable software repository for Japanese large vocabulary continuous speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Speaking-style dependent lexicalized filler model for key-phrase detection and verification.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Automatic pronunciation error detection and guidance for foreign language learning.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1997
Individual identification by integrating facial image, walking image, and vocal features.
Syst. Comput. Jpn., 1997

Task adaptation using MAP estimation in N-gram language modeling.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Combining key-phrase detection and subword-based verification for flexible speech understanding.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
Key-phrase detection and verification for flexible speech understanding.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Concept-based phrase spotting approach for spontaneous speech understanding.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1994
Continuous speech recognition based on A* search with word-pair constraint as heuristics.
Syst. Comput. Jpn., 1994

Keyword and phrase spotting with heuristic language model.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Heuristic search integrating syntactic, semantic and dialog-level constraints.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1992
HMM based on pair-wise Bayes classifiers.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1991
Speaker-independent consonant recognition by integrating discriminant analysis and HMM.
Syst. Comput. Jpn., 1991

Unsupervised speaker normalization by speaker Markov model converter for speaker-independent speech recognition.
Proceedings of the Second European Conference on Speech Communication and Technology, 1991

Phoneme recognition by combining discriminant analysis and HMM.
Proceedings of the 1991 International Conference on Acoustics, 1991

1990
Phoneme recognition by combining Bayesian linear discriminations of selected pairs of classes.
Proceedings of the First International Conference on Spoken Language Processing, 1990

