Shota Horiguchi

Orcid: 0000-0002-3166-4956

According to our database, Shota Horiguchi authored at least 52 papers between 2016 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Guided Speaker Embedding.
CoRR, 2024

Investigation of Speaker Representation for Target-Speaker Speech Processing.
CoRR, 2024

Mamba-based Segmentation Model for Speaker Diarization.
CoRR, 2024

Alignment-Free Training for Transducer-based Multi-Talker ASR.
CoRR, 2024

Recursive Attentive Pooling for Extracting Speaker Embeddings from Multi-Speaker Recordings.
CoRR, 2024

SpeakerBeam-SS: Real-time Target Speaker Extraction with Lightweight Conv-TasNet and State Space Modeling.
CoRR, 2024

Factor-Conditioned Speaking-Style Captioning.
CoRR, 2024

Thresholding Data Shapley for Data Cleansing Using Multi-Armed Bandits.
CoRR, 2024

Streaming Active Learning for Regression Problems Using Regression via Classification.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
Online Neural Diarization of Unlimited Numbers of Speakers Using Global and Local Attractors.
IEEE/ACM Trans. Audio Speech Lang. Process., 2023

CAPTDURE: Captioned Sound Dataset of Single Sources.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Spoofing Attacker Also Benefits from Self-Supervised Pretrained Model.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Synthetic Data Augmentation for ASR with Domain Filtering.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
Encoder-Decoder Based Attractors for End-to-End Neural Diarization.
IEEE/ACM Trans. Audio Speech Lang. Process., 2022

Online Neural Diarization of Unlimited Numbers of Speakers.
CoRR, 2022

Mutual Learning of Single- and Multi-Channel End-to-End Neural Diarization.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Improving the Naturalness of Simulated Conversations for End-to-End Neural Diarization.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Updating Only Encoders Prevents Catastrophic Forgetting of End-to-End ASR Models.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Rethinking Fano's Inequality in Ensemble Learning.
Proceedings of the International Conference on Machine Learning, 2022

Environmental Sound Extraction Using Onomatopoeic Words.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Multi-Channel End-To-End Neural Diarization with Distributed Microphones.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021
Environmental Sound Extraction Using Onomatopoeia.
CoRR, 2021

Encoder-Decoder Based Attractor Calculation for End-to-End Neural Diarization.
CoRR, 2021

The Hitachi-JHU DIHARD III System: Competitive End-to-End Neural Diarization and X-Vector Clustering Systems Combined by DOVER-Lap.
CoRR, 2021

Online End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.
CoRR, 2021

Online End-To-End Neural Diarization with Speaker-Tracing Buffer.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

End-to-End Speaker Diarization Conditioned on Speech Activity and Overlap Detection.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Block-Online Guided Source Separation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Online Streaming End-to-End Neural Diarization Handling Overlapping Speech and Flexible Numbers of Speakers.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Semi-Supervised Training with Pseudo-Labeling for End-To-End Neural Diarization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

End-To-End Speaker Diarization as Post-Processing.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2021

Towards Neural Diarization for Unlimited Numbers of Speakers Using Global and Local Attractors.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

2020
Significance of Softmax-Based Features in Comparison to Distance Metric Learning-Based Features.
IEEE Trans. Pattern Anal. Mach. Intell., 2020

Online End-to-End Neural Diarization with Speaker-Tracing Buffer.
CoRR, 2020

Neural Speaker Diarization with Speaker-Wise Chain Rule.
CoRR, 2020

End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification.
CoRR, 2020

Hitachi at SemEval-2020 Task 8: Simple but Effective Modality Ensemble for Meme Emotion Recognition.
Proceedings of the Fourteenth Workshop on Semantic Evaluation, 2020

Utterance-Wise Meeting Transcription System Using Asynchronous Distributed Microphones.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Speaker Diarization for an Unknown Number of Speakers with Encoder-Decoder Based Attractors.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Anticipating the Start of User Interaction for Service Robot in the Wild.
Proceedings of the 2020 IEEE International Conference on Robotics and Automation, 2020

2019
Omnidirectional Pedestrian Detection by Rotation Invariant Training.
Proceedings of the IEEE Winter Conference on Applications of Computer Vision, 2019

Auxiliary Interference Speaker Loss for Target-Speaker Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Guided Source Separation Meets a Strong ASR Backend: Hitachi/Paderborn University Joint Investigation for Dinner Party ASR.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Multimodal Response Obligation Detection with Unsupervised Online Domain Adaptation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

End-to-End Neural Speaker Diarization with Permutation-Free Objectives.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Acoustic Modeling for Distant Multi-talker Speech Recognition with Single- and Multi-channel Branches.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2019

Simultaneous Speech Recognition and Speaker Diarization for Monaural Dialogue Recordings with Target-Speaker Acoustic Models.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

End-to-End Neural Speaker Diarization with Self-Attention.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

2018
Personalized Classifier for Food Image Recognition.
IEEE Trans. Multim., 2018

Face-Voice Matching using Cross-modal Embeddings.
Proceedings of the 2018 ACM Multimedia Conference, 2018

2016
Food Search Based on User Feedback to Assist Image-based Food Recording Systems.
Proceedings of the 2nd International Workshop on Multimedia Assisted Dietary Management, 2016

The log-normal distribution of the size of objects in daily meal images and its application to the efficient reduction of object proposals.
Proceedings of the 2016 IEEE International Conference on Image Processing, 2016

