Kenji Nagamatsu

According to our database, Kenji Nagamatsu authored at least 42 papers between 1996 and 2021.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.




Team Hitachi @ AutoMin 2021: Reference-free Automatic Minuting Pipeline with Argument Structure Construction over Topic-based Summarization.
CoRR, 2021

Emotional Speech Synthesis for Companion Robot to Imitate Professional Caregiver Speech.
CoRR, 2021

Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.
CoRR, 2021

Online End-To-End Neural Diarization with Speaker-Tracing Buffer.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Block-Online Guided Source Separation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semi-Supervised Training with Pseudo-Labeling for End-To-End Neural Diarization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Audio-Visual Speech Emotion Recognition by Disentangling Emotion and Identity Attributes.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Audio-Visual Speech Enhancement Method Conditioned in the Lip Motion and Speaker-Discriminative Embeddings.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

End-To-End Speaker Diarization as Post-Processing.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Building Multi lingual TTS using Cross Lingual Voice Conversion.
CoRR, 2020

Online End-to-End Neural Diarization with Speaker-Tracing Buffer.
CoRR, 2020

Neural Speaker Diarization with Speaker-Wise Chain Rule.
CoRR, 2020

End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification.
CoRR, 2020

Delay Mitigation for Backchannel Prediction in Spoken Dialog System.
Proceedings of the Conversational Dialogue Systems for the Next Decade, 2020

Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Meta-Learning for Speech Emotion Recognition Considering Ambiguity of Emotion Labels.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Anticipating the Start of User Interaction for Service Robot in the Wild.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

Addressing Ambiguity of Emotion Labels Through Meta-Learning.
CoRR, 2019

Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Neural Speaker Diarization with Permutation-Free Objectives.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Acoustic Modeling for Distant Multi-talker Speech Recognition with Single- and Multi-channel Branches.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

End-to-End Neural Speaker Diarization with Self-Attention.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Face-Voice Matching using Cross-modal Embeddings.
Proceedings of the 2018 ACM Multimedia Conference, 2018

Fast Multichannel Nonnegative Matrix Factorization with Constraints on Active Source Candidates.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Lattice-free State-level Minimum Bayes Risk Training of Acoustic Models.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Sequence Distillation for Purely Sequence Trained Acoustic Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, 2018

Local Gaussian model with source-set constraints in audio source separation.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Independent vector analysis with frequency range division and prior switching.
Proceedings of the 25th European Signal Processing Conference, 2017

Investigation of lattice-free maximum mutual information-based acoustic models with sequence-level Kullback-Leibler divergence.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Cycle time based multi-goal path optimization for redundant robotic systems.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

A Pre-Identification Method for Chinese Named Entity Recognition.
J. Softw., 2010

Cascade Chinese Potential Name Recognition.
Proceedings of the International Forum on Information Technology and Applications, 2009

A Hybrid Method of Chinese Prosodic Word Tagging Based on Keyword Anchor and Hidden Markov Model.
Proceedings of the 2009 International Conference on Asian Language Processing, 2009

Scalable Implementation Of Unit Selection Based Text-To-Speech System For Embedded Solutions.
Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, 2006

Unit selection using pitch synchronous cross correlation for Japanese concatenative speech synthesis.
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004

Estimating Point-of-View-based Similarity Using POV Reinforcement and Similarity Propagation.
Proceedings of the 11th Pacific Asia Conference on Language, Information and Computation, 1996
