Takanori Ashihara

Orcid: 0009-0003-4322-4127

According to our database¹, Takanori Ashihara authored at least 38 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Timeline

Legend:

Book

In proceedings

Article

PhD thesis

Dataset

Other

Links

On csauthors.net:

Bibliography

2024

Guided Speaker Embedding.

[BibT_eX]

[DOI]

CoRR, 2024

Investigation of Speaker Representation for Target-Speaker Speech Processing.

[BibT_eX]

[DOI]

CoRR, 2024

Boosting Hybrid Autoregressive Transducer-based ASR with Internal Acoustic Model Training and Dual Blank Thresholding.

[BibT_eX]

[DOI]

CoRR, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR.

[BibT_eX]

[DOI]

CoRR, 2024

Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings.

[BibT_eX]

[DOI]

CoRR, 2024

Sentence-wise Speech Summarization: Task, Datasets, and End-to-End Modeling with LM Knowledge Distillation.

[BibT_eX]

[DOI]

CoRR, 2024

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling.

[BibT_eX]

[DOI]

CoRR, 2024

Lightweight Zero-shot Text-to-Speech with Mixture of Adapters.

[BibT_eX]

[DOI]

CoRR, 2024

Applying LLMs for Rescoring N-best ASR Hypotheses of Casual Conversations: Effects of Domain Adaptation and Context Carry-over.

[BibT_eX]

[DOI]

CoRR, 2024

Unveiling the Linguistic Capabilities of a Self-Supervised Speech Model Through Cross-Lingual Benchmark and Layer- Wise Similarity Analysis.

[BibT_eX]

[DOI]

IEEE Access, 2024

Probing Self-Supervised Learning Models With Target Speech Extraction.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

Noise-Robust Zero-Shot Text-to-Speech Synthesis Conditioned on Self-Supervised Speech-Representation Model with Adapters.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

What Do Self-Supervised Speech and Speaker Models Learn? New Findings from a Cross Model Layer-Wise Analysis.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

Downstream Task Agnostic Speech Enhancement with Self-Supervised Representation Loss.

[BibT_eX]

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Knowledge Distillation for Neural Transducer-based Target-Speaker ASR: Exploiting Parallel Mixture/Single-Talker Speech Data.

[BibT_eX]

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Transfer Learning from Pre-trained Language Models Improves End-to-End Speech Summarization.

[BibT_eX]

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

SpeechGLUE: How Well Can Self-Supervised Speech Models Capture Linguistic Knowledge?

[BibT_eX]

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Leveraging Language Embeddings for Cross-Lingual Self-Supervised Speech Representation Learning.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Scheduled Sampling for Neural Transducer-Based ASR.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Large Text Corpora For End-To-End Speech Summarization.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

Zero-Shot Text-to-Speech Synthesis Conditioned Using Self-Supervised Speech Representation Model.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2023

Summarize While Translating: Universal Model With Parallel Decoding for Summarization and Translation.

[BibT_eX]

[DOI]

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022

On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis.

[BibT_eX]

[DOI]

Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Domain Adversarial Self-Supervised Speech Representation Learning for Improving Unknown Domain Downstream Tasks.

[BibT_eX]

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Deep versus Wide: An Analysis of Student Architectures for Task-Agnostic Knowledge Distillation of Self-Supervised Speech Models.

[BibT_eX]

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2022

Customer Satisfaction Estimation Using Unsupervised Representation Learning with Multi-Format Prediction Loss.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Cross-Modal Transformer-Based Neural Correction Models for Automatic Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Streaming End-to-End Speech Recognition for Hybrid RNN-T/Attention Architecture.

[BibT_eX]

[DOI]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Investigating the Impact of Spectral and Temporal Degradation on End-to-End Automatic Speech Recognition Performance.

[BibT_eX]

[DOI]

Takanori Ashihara

Takafumi Moriya

Makio Kashino

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Simpleflat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2021

Speech Emotion Recognition Based on Listener Adaptive Models.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Self-Distillation for Improving CTC-Transformer-Based ASR Systems.

[BibT_eX]

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Distilling Attention Weights for CTC-Based ASR Systems.

[BibT_eX]

[DOI]

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

End-to-End Automatic Speech Recognition with Deep Mutual Learning.

[BibT_eX]

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Neural Whispered Speech Detection with Imbalanced Learning.

[BibT_eX]

[DOI]

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Revisiting Dynamic Adjustment of Language Model Scaling Factor for Automatic Speech Recognition.

[BibT_eX]

[DOI]

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Takanori Ashihara

Timeline

Legend:

Links

On csauthors.net:

Bibliography

Loading...