Tomohiro Tanaka

Orcid: 0000-0002-8884-9089

According to our database, Tomohiro Tanaka authored at least 85 papers between 2002 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Talking Face Generation for Impression Conversion Considering Speech Semantics.
Proceedings of the IEEE International Conference on Acoustics, 2024

Detection of Circulating Tumor Cells in Blood Using Random Forest.
Proceedings of the International Conference on Electronics, Information, and Communication, 2024

2023
Attention as Annotation: Generating Images and Pseudo-masks for Weakly Supervised Semantic Segmentation with Diffusion.
CoRR, 2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

End-to-End Joint Target and Non-Target Speakers ASR.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Transcribing Speech as Spoken and Written Dual Text Using an Autoregressive Model.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Audio-Visual Praise Estimation for Conversational Video based on Synchronization-Guided Multimodal Transformer.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Retrieval, Masking, and Generation: Feedback Comment Generation using Masked Comment Examples.
Proceedings of the 16th International Natural Language Generation Conference, 2023

Ladder Siamese Network: A Method and Insights for Multi-Level Self-Supervised Learning.
Proceedings of the IEEE International Conference on Image Processing, 2023

Leveraging Language Embeddings for Cross-Lingual Self-Supervised Speech Representation Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Scheduled Sampling for Neural Transducer-Based ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Large Text Corpora For End-To-End Speech Summarization.
Proceedings of the IEEE International Conference on Acoustics, 2023

Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

Text-to-Text Pre-Training with Paraphrasing for Improving Transformer-Based Image Captioning.
Proceedings of the 31st European Signal Processing Conference, 2023

2022
Automatic Spoken Language Acquisition Based on Observation and Dialogue.
IEEE J. Sel. Top. Signal Process., 2022

Control of Spindle Position and Stiffness of Aerostatic-Bearing-Type Air Turbine Spindle.
Int. J. Autom. Technol., 2022

Stochastic optimization of a mixed moving average process for controlling non-Markovian streamflow environments.
CoRR, 2022

Modeling and computation of an integral operator Riccati equation for an infinite-dimensional stochastic differential equation governing streamflow discharge.
Comput. Math. Appl., 2022

Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Strategies to Improve Robustness of Target Speech Extraction to Enrollment Variations.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Circulating Tumor Cells Detection by Brightness Values Analysis and Circularity.
Proceedings of the 7th International Conference on Frontiers of Signal Processing, 2022

Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration.
Proceedings of the IEEE International Conference on Acoustics, 2022

Multi-Perspective Document Revision.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
Neural candidate-aware language models for speech recognition.
Comput. Speech Lang., 2021

Large-Context Conversational Representation Learning: Self-Supervised Learning For Conversational Documents.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Utilizing Resource-Rich Language Datasets for End-to-End Scene Text Recognition in Resource-Poor Languages.
Proceedings of the MMAsia '21: ACM Multimedia Asia, Gold Coast, Australia, December 1, 2021

End-to-End Rich Transcription-Style Automatic Speech Recognition with Semi-Supervised Learning.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Unified Autoregressive Modeling for Joint End-to-End Multi-Talker Overlapped Speech Recognition and Speaker Attribute Estimation.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Enrollment-Less Training for Personalized Voice Activity Detection.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Zero-Shot Joint Modeling of Multiple Spoken-Text-Style Conversion Tasks Using Switching Tokens.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Context-Free TextSpotter for Real-Time and Mobile End-to-End Text Detection and Recognition.
Proceedings of the 16th International Conference on Document Analysis and Recognition, 2021

Simpleflat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Hierarchical Transformer-Based Large-Context End-To-End ASR with Large-Context Knowledge Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2021

Audio-Visual Speech Separation Using Cross-Modal Correspondence Loss.
Proceedings of the IEEE International Conference on Acoustics, 2021

MAPGN: Masked Pointer-Generator Network for Sequence-to-Sequence Pre-Training.
Proceedings of the IEEE International Conference on Acoustics, 2021

Hierarchical Knowledge Distillation for Dialogue Sequence Labeling.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Sound-Image Grounding Based Focusing Mechanism for Efficient Automatic Spoken Language Acquisition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Domain Adaptation for Dialogue Sequence Labeling Based on Hierarchical Adversarial Training.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Self-Distillation for Improving CTC-Transformer-Based ASR Systems.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Phoneme-to-Grapheme Conversion Based Large-Scale Pre-Training for End-to-End Automatic Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Memory Attentive Fusion: External Language Model Integration for Transformer-based Sequence-to-Sequence Model.
Proceedings of the 13th International Conference on Natural Language Generation, 2020

Unsupervised Sound Source Localization From Audio-Image Pairs Using Input Gradient Map.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Distilling Attention Weights for CTC-Based ASR Systems.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Spoken Language Acquisition Based on Reinforcement Learning and Word Unit Segmentation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Unsupervised Domain Adversarial Training in Angular Space for Facial Expression Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

End-to-End Automatic Speech Recognition with Deep Mutual Learning.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
Evolution-Strategy-Based Automation of System Development for High-Performance Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

An automatic domain updating method for fast 2-dimensional flood-inundation modelling.
Environ. Model. Softw., 2019

A Joint End-to-End and DNN-HMM Hybrid Automatic Speech Recognition System with Transferring Sharable Knowledge.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Joint Maximization Decoder with Neural Converters for Fully Neural Network-Based Japanese Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Automatic Speech Recognition with a Reconstruction Criterion Using Speech-to-Text and Text-to-Speech Encoder-Decoders.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Large Context End-to-end Automatic Speech Recognition via Extension of Hierarchical Recurrent Encoder-decoder Models.
Proceedings of the IEEE International Conference on Acoustics, 2019

Efficient Free Keyword Detection Based on CNN and End-to-End Continuous DP-Matching.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Generalized Large-Context Language Models Based on Forward-Backward Hierarchical Recurrent Encoder-Decoder Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Improving Speech-Based End-of-Turn Detection Via Cross-Modal Representation Learning with Punctuated Text Data.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Disfluency Detection Based on Speech-Aware Token-by-Token Sequence Labeling with BLSTM-CRFs and Attention Mechanisms.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Neural Dialogue Context Online End-of-Turn Detection.
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

Micromagnetic Study of Probabilistic Switching Behavior in Sub 20 nm-CoFeB/MgO Magnetic Tunnel Junction.
Proceedings of the Non-Volatile Memory Technology Symposium, 2018

Neural Error Corrective Language Models for Automatic Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Role Play Dialogue Aware Language Models Based on Conditional Hierarchical Recurrent Encoder-Decoder.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Multi-task and Multi-lingual Joint Learning of Neural Lexical Utterance Classification based on Partially-shared Modeling.
Proceedings of the 27th International Conference on Computational Linguistics, 2018

F-Measure Based End-to-End Optimization of Neural Network Keyword Detectors.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Neural Speech-to-Text Language Models for Rescoring Hypotheses of DNN-HMM Hybrid Automatic Speech Recognition Systems.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Online Call Scene Segmentation of Contact Center Dialogues based on Role Aware Hierarchical LSTM-RNNs.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2016
Automated structure discovery and parameter tuning of neural network language model based on evolution strategy.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Development of a passive knee mechanism that realizes level walk and stair ascent functions for transfemoral prosthesis.
Proceedings of the 6th IEEE International Conference on Biomedical Robotics and Biomechatronics, 2016

2015
How Co-translational Folding of Multi-domain Protein Is Affected by Elongation Schedule: Molecular Simulations.
PLoS Comput. Biol., 2015

Development of a Trax Artificial Intelligence algorithm using path and edge.
Proceedings of the 2015 International Conference on Field Programmable Technology, 2015

Time-dependent sleep stage transition model based on heart rate variability.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Automation of system building for state-of-the-art large vocabulary speech recognition using evolution strategy.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
The 10th Generation 16-Core SPARC64™ Processor for Mission Critical UNIX Server.
IEEE J. Solid State Circuits, 2014

Heart rate monitoring through the surface of a drinkware.
Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing, 2014

A study of gait analysis with a smartphone for measurement of hip joint angle.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
A 10th generation 16-core SPARC64 processor for mission-critical UNIX server.
Proceedings of the 2013 IEEE International Solid-State Circuits Conference, 2013

2008
A Large-Scale, Flip-Flop RAM Imitating a Logic LSI for Fast Development of Process Technology.
IEICE Trans. Electron., 2008

2004
Robust F0 Estimation of Speech Signal Using Harmonicity Measure Based on Instantaneous Frequency.
IEICE Trans. Inf. Syst., 2004

2002
Fundamental frequency estimation based on instantaneous frequency amplitude spectrum.
Proceedings of the IEEE International Conference on Acoustics, 2002

