Naoyuki Kanda
Orcid: 0000-0002-8628-3288
According to our database1,
Naoyuki Kanda
authored at least 84 papers
between 2005 and 2024.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
On csauthors.net:
Bibliography
2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Laugh Now Cry Later: Controlling Time-Varying Emotional States of Flow-Matching-Based Zero-Shot Text-to-Speech.
CoRR, 2024
CoRR, 2024
NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription.
CoRR, 2024
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
i-Code V2: An Autoregressive Generation Framework over Vision, Language, and Speech Data.
CoRR, 2023
Speaker Diarization for ASR Output with T-vectors: A Sequence Classification Approach.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-To-End Neural Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2023
Vararray Meets T-Sot: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Self-Supervised Learning with Bi-Label Masked Speech Prediction for Streaming Multi-Talker Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
2022
IEEE J. Sel. Top. Signal Process., 2022
Comput. Speech Lang., 2022
CoRR, 2022
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization.
CoRR, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Internal Language Model Adaptation with Text-Only Data for End-to-End Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
IEEE Signal Process. Lett., 2021
CoRR, 2021
Exploring End-to-End Multi-Channel ASR with Bias Information for Meeting Transcription.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Integration of Speech Separation, Diarization, and Recognition for Multi-Speaker Meetings: System Description, Comparison, and Analysis.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Investigation of End-to-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Minimum Word Error Rate Training with Language Model Fusion for End-to-End Speech Recognition.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Large-Scale Pre-Training of End-to-End Multi-Talker ASR for Meeting Transcription with Single Distant Microphone.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Microsoft Speaker Diarization System for the Voxceleb Speaker Recognition Challenge 2020.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Hypothesis Stitcher for End-to-End Speaker-Attributed ASR on Long-Form Multi-Talker Recordings.
Proceedings of the IEEE International Conference on Acoustics, 2021
A Comparative Study of Modular and Joint Approaches for Speaker-Attributed ASR on Monaural Long-Form Audio.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Acoustic Modeling for Distant Multi-talker Speech Recognition with Single- and Multi-channel Branches.
Proceedings of the IEEE International Conference on Acoustics, 2019
Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
Proceedings of the 2018 ACM Multimedia Conference on Multimedia Conference, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Minimum Bayes risk training of CTC acoustic models in maximum a posteriori based decoding framework.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Investigation of lattice-free maximum mutual information-based acoustic models with sequence-level Kullback-Leibler divergence.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
2016
Combination of multiple acoustic models with unsupervised adaptation for lecture speech transcription.
Speech Commun., 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Investigation of Semi-Supervised Acoustic Model Training Based on the Committee of Heterogeneous Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
2015
Training data pseudo-shuffling and direct decoding framework for recurrent neural network based acoustic modeling.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
2014
Open-ended Spoken Language Technology: Studies on Spoken Dialogue Systems and Spoken Document Retrieval Systems.
PhD thesis, 2014
Proceedings of the 11th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2014, 2014
Boundary contraction training for acoustic models based on discrete deep neural networks.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Multiple index combination for Japanese spoken term detection with optimum index selection based on OOV-region classifier.
Proceedings of the IEEE International Conference on Acoustics, 2013
Elastic spectral distortion for low resource speech recognition with deep neural networks.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013
2012
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
2011
A multi-expert model for dialogue and behavior control of conversational robots and agents.
Knowl. Based Syst., 2011
2008
Proceedings of the International Workshop on Multimedia Signal Processing, 2008
2006
Multi-Domain Spoken Dialogue System with Extensibility and Robustness against Speech Recognition Errors.
Proceedings of the SIGDIAL 2006 Workshop, 2006
2005
A two-layer model for behavior and dialogue planning in conversational service robots.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005
Contextual constraints based on dialogue models in database search task for spoken dialogue systems.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005