Mana Ihori

Talking Face Generation for Impression Conversion Considering Speech Semantics.

Proceedings of the IEEE International Conference on Acoustics, 2024

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

End-to-End Joint Target and Non-Target Speakers ASR.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Transcribing Speech as Spoken and Written Dual Text Using an Autoregressive Model.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Audio-Visual Praise Estimation for Conversational Video based on Synchronization-Guided Multimodal Transformer.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Retrieval, Masking, and Generation: Feedback Comment Generation using Masked Comment Examples.

Proceedings of the 16th International Natural Language Generation Conference, 2023

Leveraging Language Embeddings for Cross-Lingual Self-Supervised Speech Representation Learning.

Proceedings of the IEEE International Conference on Acoustics, 2023

Text-to-Text Pre-Training with Paraphrasing for Improving Transformer-Based Image Captioning.

Proceedings of the 31st European Signal Processing Conference, 2023

Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Multi-Perspective Document Revision.

Proceedings of the 29th International Conference on Computational Linguistics, 2022

Large-Context Conversational Representation Learning: Self-Supervised Learning For Conversational Documents.

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages.

Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Enrollment-Less Training for Personalized Voice Activity Detection.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Hierarchical Transformer-Based Large-Context End-To-End ASR with Large-Context Knowledge Distillation.

Proceedings of the IEEE International Conference on Acoustics, 2021

Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss.

Proceedings of the IEEE International Conference on Acoustics, 2021

MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-Training.

Proceedings of the IEEE International Conference on Acoustics, 2021

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Parallel Corpus for Japanese Spoken-to-Written Style Conversion.

Mana Ihori, Akihiko Takashima, Ryo Masumura

Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Unsupervised Domain Adaptation for Dialogue Sequence Labeling Based on Hierarchical Adversarial Training.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model.

Proceedings of the 13th International Conference on Natural Language Generation, 2020

Sequence-Level Consistency Training for Semi-Supervised End-to-End Automatic Speech Recognition.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Large-Context Pointer-Generator Networks for Spoken-to-Written Style Conversion.

Mana Ihori, Akihiko Takashima, Ryo Masumura

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Unsupervised Domain Adversarial Training in Angular Space for Facial Expression Recognition.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

End-to-End Automatic Speech Recognition with Deep Mutual Learning.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Generalized Large-Context Language Models Based on Forward-Backward Hierarchical Recurrent Encoder-Decoder Models.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improving Speech-Based End-of-Turn Detection Via Cross-Modal Representation Learning with Punctuated Text Data.

Ryuichiro Higashinaka

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
