Ryo Masumura

Orcid: 0000-0002-2415-4149

According to our database¹, Ryo Masumura authored at least 134 papers between 2010 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind.

CoRR, January, 2025

2024

Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding.

CoRR, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR.

CoRR, 2024

Factor-Conditioned Speaking-Style Captioning.

CoRR, 2024

Born-Again Multi-task Self-training for Multi-task Facial Emotion Recognition.

Proceedings of the Pattern Recognition - 27th International Conference, 2024

Talking Face Generation for Impression Conversion Considering Speech Semantics.

Proceedings of the IEEE International Conference on Acoustics, 2024

Block Refinement Learning for Improving Early Exit in Autoregressive ASR.

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

2023

Multi-region CNN-Transformer for Micro-gesture Recognition in Face and Upper Body.

Proceedings of the ACM Multimedia Asia 2023, 2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

End-to-End Joint Target and Non-Target Speakers ASR.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Joint Autoregressive Modeling of End-to-End Multi-Talker Overlapped Speech Recognition and Utterance-level Timestamp Prediction.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

What are differences? Comparing DNN and Human by Their Performance and Characteristics in Speaker Age Estimation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Transcribing Speech as Spoken and Written Dual Text Using an Autoregressive Model.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Audio-Visual Praise Estimation for Conversational Video based on Synchronization-Guided Multimodal Transformer.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Retrieval, Masking, and Generation: Feedback Comment Generation using Masked Comment Examples.

Proceedings of the 16th International Natural Language Generation Conference, 2023

Open-Set Recognition for Facial-Expression Recognition.

Proceedings of the IEEE International Conference on Image Processing, 2023

OnDA-DETR: Online Domain Adaptation for Detection Transformers with Self-Training Framework.

Proceedings of the IEEE International Conference on Image Processing, 2023

Distilling Knowledge of Bidirectional Language Model for Scene Text Recognition.

Proceedings of the IEEE International Conference on Image Processing, 2023

Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Leveraging Language Embeddings for Cross-Lingual Self-Supervised Speech Representation Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Scheduled Sampling for Neural Transducer-Based ASR.

Proceedings of the IEEE International Conference on Acoustics, 2023

Next-Speaker Prediction Based on Non-Verbal Information in Multi-Party Video Conversation.

Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Large Text Corpora For End-To-End Speech Summarization.

Proceedings of the IEEE International Conference on Acoustics, 2023

Modeling Lead-Lag Structure in Facial Expression Synchrony for Social-Psychological Outcome Prediction from Negotiation Interaction.

Proceedings of the IEEE International Conference on Acoustics, 2023

Text-to-Text Pre-Training with Paraphrasing for Improving Transformer-Based Image Captioning.

Proceedings of the 31st European Signal Processing Conference, 2023

2022

Audio Visual Scene-Aware Dialog Generation with Transformer-based Video Representations.

CoRR, 2022

Knowledge Transferred Fine-Tuning: Convolutional Neural Network Is Born Again With Anti-Aliasing Even in Data-Limited Situations.

IEEE Access, 2022

On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis.

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Multimodal Negotiation Corpus with Various Subjective Assessments for Social-Psychological Outcome Prediction from Non-Verbal Cues.

Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Interactive Co-Learning with Cross-Modal Transformer for Audio-Visual Emotion Recognition.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Dialogue Acts Aided Important Utterance Detection Based on Multiparty and Multimodal Information.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Fully Shareable Scene Text Recognition Modeling for Horizontal and Vertical Writing.

Proceedings of the 2022 IEEE International Conference on Image Processing, 2022

Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration.

Proceedings of the IEEE International Conference on Acoustics, 2022

Customer Satisfaction Estimation Using Unsupervised Representation Learning with Multi-Format Prediction Loss.

Proceedings of the IEEE International Conference on Acoustics, 2022

Multi-Perspective Document Revision.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021

Hierarchical Latent Words Language Models for Automatic Speech Recognition.

J. Inf. Process., 2021

Neural candidate-aware language models for speech recognition.

Tomohiro Tanaka

Ryo Masumura

Takanobu Oba

Comput. Speech Lang., 2021

Audiobook Speech Synthesis Conditioned by Cross-Sentence Context-Aware Word Embeddings.

Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Large-Context Conversational Representation Learning: Self-Supervised Learning For Conversational Documents.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages.

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Enrollment-Less Training for Personalized Voice Activity Detection.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Simpleflat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2021

Hierarchical Transformer-Based Large-Context End-To-End ASR with Large-Context Knowledge Distillation.

Proceedings of the IEEE International Conference on Acoustics, 2021

Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss.

Proceedings of the IEEE International Conference on Acoustics, 2021

MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-Training.

Proceedings of the IEEE International Conference on Acoustics, 2021

Speech Emotion Recognition Based on Listener Adaptive Models.

Proceedings of the IEEE International Conference on Acoustics, 2021

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020

Customer Satisfaction Estimation in Contact Center Calls Based on a Hierarchical Multi-Task Model.

IEEE ACM Trans. Audio Speech Lang. Process., 2020

DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus.

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Generating Responses that Reflect Meta Information in User-Generated Question Answer Pairs.

Takashi Kodama

Ryuichiro Higashinaka

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Parallel Corpus for Japanese Spoken-to-Written Style Conversion.

Mana Ihori

Akihiko Takashima

Ryo Masumura

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Domain Adaptation for Dialogue Sequence Labeling Based on Hierarchical Adversarial Training.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Self-Distillation for Improving CTC-Transformer-Based ASR Systems.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A Transformer-Based Audio Captioning Model with Keyword Estimation.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model.

Proceedings of the 13th International Conference on Natural Language Generation, 2020

Distilling Attention Weights for CTC-Based ASR Systems.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Sequence-Level Consistency Training for Semi-Supervised End-to-End Automatic Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Large-Context Pointer-Generator Networks for Spoken-to-Written Style Conversion.

Mana Ihori

Akihiko Takashima

Ryo Masumura

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Sequence-To-One Neural Networks for Japanese Dialect Speech Classification.

Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Unsupervised Domain Adversarial Training in Angular Space for Facial Expression Recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

End-to-End Automatic Speech Recognition with Deep Mutual Learning.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Dialect-Aware Modeling for End-to-End Japanese Dialect Speech Recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Viterbi Approximation of Latent Words Language Models for Automatic Speech Recognition.

J. Inf. Process., 2019

Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition.

IEICE Trans. Inf. Syst., 2019

Recurrent out-of-vocabulary word detection based on distribution of features.

Comput. Speech Lang., 2019

Does Speaking Training Application with Speech Recognition Motivate Junior High School Students in Actual Classroom? - A Case Study.

Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

A Joint End-to-End and DNN-HMM Hybrid Automatic Speech Recognition System with Transferring Sharable Knowledge.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Joint Maximization Decoder with Neural Converters for Fully Neural Network-Based Japanese Speech Recognition.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speech Emotion Recognition Based on Multi-Label Emotion Existence Model.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Large Context End-to-end Automatic Speech Recognition via Extension of Hierarchical Recurrent Encoder-decoder Models.

Proceedings of the IEEE International Conference on Acoustics, 2019

Context-Aware Neural Voice Activity Detection Using Auxiliary Networks for Phoneme Recognition, Speech Enhancement and Acoustic Scene Classification.

Proceedings of the 27th European Signal Processing Conference, 2019

Generalized Large-Context Language Models Based on Forward-Backward Hierarchical Recurrent Encoder-Decoder Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improving Speech-Based End-of-Turn Detection Via Cross-Modal Representation Learning with Punctuated Text Data.

Ryuichiro Higashinaka

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Disfluency Detection Based on Speech-Aware Token-by-Token Sequence Labeling with BLSTM-CRFs and Attention Mechanisms.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Revisiting Dynamic Adjustment of Language Model Scaling Factor for Automatic Speech Recognition.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Can We Simulate Generative Process of Acoustic Modeling Data? Towards Data Restoration for Acoustic Modeling.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Urgent Voicemail Detection Focused on Long-term Temporal Variation.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Likability Estimation of Call-center Agents by Suppressing Annotator Variability.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Domain Adaptation Based on Mixture of Latent Words Language Models for Automatic Speech Recognition.

IEICE Trans. Inf. Syst., 2018

Neural Dialogue Context Online End-of-Turn Detection.

Ryuichiro Higashinaka

Yushi Aono

Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

Neural Error Corrective Language Models for Automatic Speech Recognition.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Role Play Dialogue Aware Language Models Based on Conditional Hierarchical Recurrent Encoder-Decoder.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Automatic Question Detection from Acoustic and Phonetic Features Using Feature-wise Pre-training.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Neural Confnet Classification: Fully Neural Network Based Spoken Utterance Classification Using Word Confusion Networks.

Ryuichiro Higashinaka

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Soft-Target Training with Ambiguous Emotional Utterances for DNN-Based Speech Emotion Classification.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Adversarial Training for Multi-task and Multi-lingual Joint Modeling of Utterance Intent Classification.

Ryo Masumura

Yusuke Shinohara

Ryuichiro Higashinaka

Yushi Aono

Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Multi-task and Multi-lingual Joint Learning of Neural Lexical Utterance Classification based on Partially-shared Modeling.

Ryo Masumura

Tomohiro Tanaka

Ryuichiro Higashinaka

Hirokazu Masataki

Yushi Aono

Proceedings of the 27th International Conference on Computational Linguistics, 2018

Neural Speech-to-Text Language Models for Rescoring Hypotheses of DNN-HMM Hybrid Automatic Speech Recognition Systems.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Progressive Neural Network-based Knowledge Transfer in Acoustic Models.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Online Call Scene Segmentation of Contact Center Dialogues based on Role Aware Hierarchical LSTM-RNNs.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Relevant Phonetic-aware Neural Acoustic Models using Native English and Japanese Speech for Japanese-English Automatic Speech Recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Parallel Hierarchical Attention Networks with Shared Memory Reader for Multi-Stream Conversational Document Classification.

Naoki Sawada

Ryo Masumura

Hiromitsu Nishizaki

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Online End-of-Turn Detection from Speech Based on Stacked Time-Asynchronous Sequential Networks.

Ryuichiro Higashinaka

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Prosody Aware Word-Level Encoder Based on BLSTM-RNNs for DNN-Based Speech Synthesis.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Hierarchical LSTMs with Joint Learning for Estimating Customer Satisfaction from Contact Center Calls.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Improving Neural Text Normalization with Data Augmentation at Character- and Morphological Levels.

Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Hyperspherical Query Likelihood Models with Word Embeddings.

Ryuichiro Higashinaka

Proceedings of the Eighth International Joint Conference on Natural Language Processing, 2017

Parallel phonetically aware DNNs and LSTM-RNNs for frame-by-frame discriminative modeling of spoken language identification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Domain adaptation of DNN acoustic models using knowledge distillation.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Joint unsupervised adaptation of n-gram and RNN language models via LDA-based hybrid mixture modeling.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

N-gram Approximation of Latent Words Language Models for Domain Robust Automatic Speech Recognition.

IEICE Trans. Inf. Syst., 2016

Investigation of Combining Various Major Language Model Technologies including Data Expansion and Adaptation.

IEICE Trans. Inf. Syst., 2016

Mechanism and Control of Whole-Body Electro-Hydrostatic Actuator Driven Humanoid Robot Hydra.

Proceedings of the International Symposium on Experimental Robotics, 2016

Language Identification Based on Generative Modeling of Posteriorgram Sequences Extracted from Frame-by-Frame DNNs and LSTM-RNNs.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Recurrent Out-of-Vocabulary Word Detection Using Distribution of Features.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Enhancement of mechanical strength, computational power, and heat management for fieldwork humanoid robots.

Proceedings of the 16th IEEE-RAS International Conference on Humanoid Robots, 2016

2015

Discourse Relation Recognition by Comparing Various Units of Sentence Expression with Recursive Neural Network.

Ryuichiro Higashinaka

Toshiro Makino

Yoshihiro Matsuo

Proceedings of the 29th Pacific Asia Conference on Language, Information and Computation, 2015

Latent words recurrent neural network language models.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Combinations of various language model technologies including data expansion and adaptation in spontaneous speech recognition.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Training data selection for acoustic modeling via submodular optimization of joint Kullback-Leibler divergence.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Hierarchical Latent Words Language Models for Robust Modeling to Out-Of Domain Tasks.

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

2014

Mixture of latent words language models for domain adaptation.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Read and spontaneous speech classification based on variance of GMM supervectors.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Role play dialogue topic model for language model adaptation in multi-party conversation speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

2013

Viterbi decoding for latent words language models using Gibbs sampling.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Use of latent words language models in ASR: A sampling-based implementation.

Proceedings of the IEEE International Conference on Acoustics, 2013

2011

Language Model Expansion Using Webdata for Spoken Document Retrieval.

Ryo Masumura

Seongjun Hahm

Akinori Ito

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Training a Language Model Using Webdata for Large Vocabulary Japanese Spontaneous Speech Recognition.

Ryo Masumura

Seongjun Hahm

Akinori Ito

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010

Document expansion using relevant web documents for spoken document retrieval.

Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, 2010
