Atsushi Ando
ORCID: 0000-0002-3971-0654
According to our database, Atsushi Ando authored at least 43 papers between 2015 and 2024.
Bibliography
2024
Speech Rhythm-Based Speaker Embeddings Extraction from Phonemes and Phoneme Duration for Multi-Speaker Speech Synthesis.
IEICE Trans. Inf. Syst., January, 2024
Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings.
CoRR, 2024
SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling.
CoRR, 2024
NTT Speaker Diarization System for CHiME-7: Multi-Domain, Multi-Microphone End-to-End and Vector Clustering Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
Proceedings of the ACM Multimedia Asia 2023, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Joint Autoregressive Modeling of End-to-End Multi-Talker Overlapped Speech Recognition and Utterance-level Timestamp Prediction.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
OnDA-DETR: Online Domain Adaptation for Detection Transformers with Self-Training Framework.
Proceedings of the IEEE International Conference on Image Processing, 2023
Adversarial Finetuning with Latent Representation Constraint to Mitigate Accuracy-Robustness Tradeoff.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
2022
Knowledge Transferred Fine-Tuning: Convolutional Neural Network Is Born Again With Anti-Aliasing Even in Data-Limited Situations.
IEEE Access, 2022
On the Use of Modality-Specific Large-Scale Pre-Trained Encoders for Multimodal Sentiment Analysis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Interactive Co-Learning with Cross-Modal Transformer for Audio-Visual Emotion Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
End-to-End Joint Modeling of Conversation History-Dependent and Independent ASR Systems with Multi-History Training.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Speaker consistency loss and step-wise optimization for semi-supervised joint training of TTS and ASR using unpaired text data.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration.
Proceedings of the IEEE International Conference on Acoustics, 2022
Customer Satisfaction Estimation Using Unsupervised Representation Learning with Multi-Format Prediction Loss.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Speech Emotion Recognition in Real Environments using Characteristics of Emotional Expression and Perception.
PhD thesis, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Phoneme Duration Modeling Using Speech Rhythm-Based Speaker Embeddings for Multi-Speaker Speech Synthesis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Simpleflat: A Simple Whole-Network Pre-Training Approach for RNN Transducer-Based End-to-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
Customer Satisfaction Estimation in Contact Center Calls Based on a Hierarchical Multi-Task Model.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Sequence-Level Consistency Training for Semi-Supervised End-to-End Automatic Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
Improving Conversation-Context Language Models with Multiple Spoken Language Understanding Models.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Improving Speech-Based End-of-Turn Detection Via Cross-Modal Representation Learning with Punctuated Text Data.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018
Role Play Dialogue Aware Language Models Based on Conditional Hierarchical Recurrent Encoder-Decoder.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Automatic Question Detection from Acoustic and Phonetic Features Using Feature-wise Pre-training.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Soft-Target Training with Ambiguous Emotional Utterances for DNN-Based Speech Emotion Classification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Online Call Scene Segmentation of Contact Center Dialogues based on Role Aware Hierarchical LSTM-RNNs.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Hierarchical LSTMs with Joint Learning for Estimating Customer Satisfaction from Contact Center Calls.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Robust children and adults speech identification and confidence measure based on DNN posteriorgram.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016
2015
Agreement and disagreement utterance detection in conversational speech by extracting and integrating local features.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015