Katsutoshi Itoyama

Orcid: 0000-0002-7098-3896

According to our database¹, Katsutoshi Itoyama authored at least 122 papers between 2006 and 2024.

Collaborative distances:

Dijkstra number² of five.
Erdős number³ of three.

Bibliography

2024

Can all variations within the unified mask-based beamformer framework achieve identical peak extraction performance?

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

EURASIP J. Audio Speech Music. Process., December, 2024

SLAM-Based Joint Calibration of Multiple Asynchronous Microphone Arrays and Sound Source Localization.

,

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

,

,

,

,

IEEE Trans. Robotics, 2024

From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution.

Ragib Amin Nihal

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

CoRR, 2024

Real Time Sound Source Localization Using von-Mises ResNet.

Mert Bozkurtlar

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2024

Improving Impressions of Response Delay in AI-based Spoken Dialogue Systems.

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the 33rd IEEE International Conference on Robot and Human Interactive Communication, 2024

Improving Noise Robustness of Automatic Speech Recognition Based on a Parallel Adapter Model with Near-Identity Initialization.

,

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the Advances and Trends in Artificial Intelligence. Theory and Applications, 2024

UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios.

Ragib Amin Nihal

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the Pattern Recognition - 27th International Conference, 2024

A Video Vision Transformer for Sound Source Localization.

,

Mert Bozkurtlar

,

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the 32nd European Signal Processing Conference, 2024

FPGA-based Low Power Acceleration of HARK Sound Source Localization.

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

,

Proceedings of the IEEE Symposium in Low-Power and High-Speed Chips, 2024

2023

Audio-Visual Class Association Based on Two-stage Self-supervised Contrastive Learning towards Robust Scene Analysis.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Assessment of Simultaneous Calibration for Positions, Orientations, and Time Offsets in Multiple Microphone Arrays Systems.

Chishio Sugiyama

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Metric-Based Multimodal Meta-Learning for Human Movement Identification Via Footstep Recognition.

Muhammad Shakeel

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Reconstruction of Depth Scenes Based on Echolocation.

Hidehiko Kishinami

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

FPGA based Power-Efficient Edge Server to Accelerate Speech Interface for Socially Assistive Robotics.

,

Muhammad Shakeel

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

,

,

,

Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

An Ensemble Method for Multiple Speech Enhancement Using Deep Learning.

Masahiko Fujita

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Improving Sign Language Understanding Introducing Label Smoothing.

,

Khan Nabeela Khanum

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Unsupervised Domain Adaptation of Universal Source Separation Based on Neural Full-Rank Spatial Covariance Analysis.

Takahiro Aizawa

,

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

,

Proceedings of the 33rd IEEE International Workshop on Machine Learning for Signal Processing, 2023

miniStreamer: Enhancing Small Conformer with Chunked-Context Masking for Streaming ASR Applications on the Edge.

,

Monikka Roslianna Busto

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Is the Ideal Ratio Mask Really the Best? - Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers.

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022

Outdoor evaluation of sound source localization for drone groups using microphone arrays.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Spotforming by NMF Using Multiple Microphone Arrays.

Yasuhiro Kagimoto

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Weakly-Supervised Neural Full-Rank Spatial Covariance Analysis for a Front-End System of Distant Speech Recognition.

,

Takahiro Aizawa

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2021

Multichannel environmental sound segmentation.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Appl. Intell., 2021

Detecting earthquakes: a novel deep learning-based approach for effective disaster response.

Muhammad Shakeel

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Appl. Intell., 2021

Assessment of a Beamforming Implementation Developed for Surface Sound Source Separation.

,

Muhammad Shakeel

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Sound Source Tracking Using Integrated Direction Likelihood for Drones with Microphone Arrays.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Multi-channel Environmental Sound Segmentation utilizing Sound Source Localization and Separation U-Net.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

EMC: Earthquake Magnitudes Classification on Seismic Signals via Convolutional Recurrent Networks.

Muhammad Shakeel

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Assessment of von Mises-Bernoulli Deep Neural Network in Sound Source Localization.

Katsutoshi Itoyama

,

Yoshiya Morimoto

,

,

,

,

Kazuhiro Nakadai

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

2020

Bayesian Singing Transcription Based on a Hierarchical Generative Model of Keys, Musical Notes, and F0 Trajectories.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

IEEE ACM Trans. Audio Speech Lang. Process., 2020

Sound event aware environmental sound segmentation with Mask U-Net.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Adv. Robotics, 2020

Design and Assessment of a Scan-and-sum Beamformer for Surface Sound Source Separation.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Sound Source Tracking by Drones with Microphone Arrays.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Multi-channel Environmental sound segmentation.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Sound Source Localization Based on von-Mises-Bernoulli Deep Neural Network.

Kazuhiro Nakadai

,

,

,

,

Katsutoshi Itoyama

,

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Audio-Visual 3D Reconstruction Framework for Dynamic Scenes.

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Synchronization of Microphones Based on Rank Minimization of Warped Spectrum for Asynchronous Distributed Recording.

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Calibration of a Microphone Array Based on a Probabilistic Model of Microphone Positions.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices, 2020

Detection of Ball Spin Direction using Hitting Sound in Tennis.

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

Proceedings of the 8th International Conference on Sport Sciences Research and Technology Support, 2020

2019

Development of Tough Snake Robot Systems.

Fumitoshi Matsuno

,

Tetsushi Kamegawa

,

,

Tatsuya Takemori

,

Motoyasu Tanaka

,

Mizuki Nakajima

,

Kenjiro Tadakuma

,

Masahiro Fujita

,

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

,

,

Tomofumi Fujiwara

,

Satoshi Tadokoro

Proceedings of the Disaster Robotics - Results from the ImPACT Tough Robotics Challenge, 2019

ImPACT-TRC Thin Serpentine Robot Platform for Urban Search and Rescue.

,

,

,

,

Satoshi Tadokoro

,

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

,

Takayuki Okatani

,

,

Proceedings of the Disaster Robotics - Results from the ImPACT Tough Robotics Challenge, 2019

Unsupervised Speech Enhancement Based on Multichannel NMF-Informed Beamforming for Noise-Robust Automatic Speech Recognition.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Tatsuya Kawahara

IEEE ACM Trans. Audio Speech Lang. Process., 2019

2D sound source position estimation using microphone arrays and its application to a VR-based bird song analysis system.

,

,

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Adv. Robotics, 2019

Design and assessment of multiple-sound source localization using microphone arrays.

,

,

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the IEEE/SICE International Symposium on System Integration, 2019

Environmental sound segmentation utilizing Mask U-Net.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Joint Transcription of Lead, Bass, and Rhythm Guitars Based on a Factorial Hidden Semi-Markov Model.

Kentaro Shibata

,

,

Satoru Fukayama

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the IEEE International Conference on Acoustics, 2019

Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection.

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

2018

Bayesian Multichannel Audio Source Separation Based on Integrated Source and Spatial Models.

Kousuke Itakura

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Tatsuya Kawahara

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms.

,

Katsutoshi Itoyama

,

,

Satoshi Tadokoro

,

Kazuhiro Nakadai

,

Kazuyoshi Yoshii

,

Tatsuya Kawahara

,

Hiroshi G. Okuno

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Signal Restoration based on Bi-directional LSTM with Spectral Filtering for Robot Audition.

Ryosuke Taniguchi

,

,

Katsutoshi Itoyama

,

,

Kazuhiro Nakadai

Proceedings of the 27th IEEE International Symposium on Robot and Human Interactive Communication, 2018

Interactive Arrangement of Chords and Melodies Based on a Tree-Structured Generative Model.

Hiroaki Tsushima

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 19th International Society for Music Information Retrieval Conference, 2018

Unsupervised Beamforming Based on Multichannel Nonnegative Matrix Factorization for Noisy Speech Recognition.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Tatsuya Kawahara

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Statistical Speech Enhancement Based on Probabilistic Integration of Variational Autoencoder and Non-Negative Matrix Factorization.

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Tatsuya Kawahara

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Sequential Generation of Singing F0 Contours from Musical Note Sequences Based on WaveNet.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Simultaneous Identification and Localization of Still and Mobile Speakers Based on Binaural Robot Audition.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

J. Robotics Mechatronics, 2017

Layout Optimization of Cooperative Distributed Microphone Arrays Based on Estimation of Source Separation Performance.

Kouhei Sekiguchi

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

J. Robotics Mechatronics, 2017

Audio-Visual Beat Tracking Based on a State-Space Model for a Robot Dancer Performing with a Human Dancer.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

J. Robotics Mechatronics, 2017

Low Latency and High Quality Two-Stage Human-Voice-Enhancement System for a Hose-Shaped Rescue Robot.

,

Hiroshi Saruwatari

,

,

,

Katsutoshi Itoyama

,

Daichi Kitamura

,

Masaru Ishimura

,

,

,

,

,

,

,

Satoshi Tadokoro

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

J. Robotics Mechatronics, 2017

Generative Statistical Models with Self-Emergent Grammar of Chord Sequences.

Hiroaki Tsushima

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

CoRR, 2017

Infinite probabilistic latent component analysis for audio source separation.

Kazuyoshi Yoshii

,

,

Katsutoshi Itoyama

,

Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Semi-Blind speech enhancement based on recurrent neural network for source separation and dereverberation.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Tatsuya Kawahara

Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Function- and Rhythm-Aware Melody Harmonization Based on Tree-Structured Parsing and Split-Merge Sampling of Chord Sequences.

Hiroaki Tsushima

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Scale- and Rhythm-Aware Musical Note Estimation for Vocal F0 Trajectories Based on a Semi-Tatum-Synchronous Hierarchical Hidden Semi-Markov Model.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 18th International Society for Music Information Retrieval Conference, 2017

Bayesian multichannel nonnegative matrix factorization for audio source separation and localization.

Kousuke Itakura

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Tatsuya Kawahara

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

2016

Singing Voice Separation and Vocal F0 Estimation Based on Mutual Combination of Robust Principal Component Analysis and Subharmonic Summation.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Sound-based online localization for an in-pipe snake robot.

,

,

Motoyasu Tanaka

,

Tetsushi Kamegawa

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Fumitoshi Matsuno

,

Hiroshi G. Okuno

Proceedings of the 2016 IEEE International Symposium on Safety, 2016

Parallel Speech Corpora of Japanese Dialects.

Koichiro Yoshino

,

,

,

Fumihiko Takahashi

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Student's t multichannel nonnegative matrix factorization for blind source separation.

Koichi Kitamura

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

A Hierarchical Bayesian Model of Chords, Pitches, and Spectrograms for Multipitch Analysis.

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016

Musical Note Estimation for F0 Trajectories of Singing Voices Based on a Bayesian Semi-Beat-Synchronous HMM.

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 17th International Society for Music Information Retrieval Conference, 2016

Online simultaneous localization and mapping of multiple sound sources and asynchronous microphone arrays.

Kouhei Sekiguchi

,

,

Keisuke Nakamura

,

Kazuhiro Nakadai

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Student's T nonnegative matrix factorization and positive semidefinite tensor factorization for single-channel audio source separation.

Kazuyoshi Yoshii

,

Katsutoshi Itoyama

,

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Rhythm transcription of MIDI performances based on hierarchical Bayesian modelling of repetition and modification of musical note patterns.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 24th European Signal Processing Conference, 2016

A unified Bayesian model of time-frequency clustering and low-rank approximation for multi-channel source separation.

Kousuke Itakura

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 24th European Signal Processing Conference, 2016

Variational Bayesian multi-channel robust NMF for human-voice enhancement with a deformable and partially-occluded microphone array.

,

Katsutoshi Itoyama

,

,

Satoshi Tadokoro

,

Kazuhiro Nakadai

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

Proceedings of the 24th European Signal Processing Conference, 2016

2015

Automatic Speech Recognition for Mixed Dialect Utterances by Mixing Dialect Language Models.

,

Koichiro Yoshino

,

Katsutoshi Itoyama

,

,

Hiroshi G. Okuno

IEEE ACM Trans. Audio Speech Lang. Process., 2015

HMM-based Attacks on Google's ReCAPTCHA with Continuous Visual and Audio Symbols.

,

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

J. Inf. Process., 2015

Toward a quizmaster robot for speech-based multiparty interaction.

Izaya Nishimuta

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

Adv. Robotics, 2015

Posture estimation of hose-shaped robot by using active microphone array.

,

,

Takeshi Mizumoto

,

Katsutoshi Itoyama

,

,

Satoshi Tadokoro

,

Kazuhiro Nakadai

,

Hiroshi G. Okuno

Adv. Robotics, 2015

Unified inter- and intra-recording duration model for multiple music audio alignment.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Human-voice enhancement based on online RPCA for a hose-shaped rescue robot with a microphone array.

,

Katsutoshi Itoyama

,

,

Satoshi Tadokoro

,

Kazuhiro Nakadai

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

Proceedings of the 2015 IEEE International Symposium on Safety, 2015

Identification and Localization of One or Two Concurrent Speakers in a Binaural Robotic Context.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 2015 IEEE International Conference on Systems, 2015

Infinite Superimposed Discrete All-Pole Modeling for Multipitch Analysis of Wavelet Spectrograms.

Kazuyoshi Yoshii

,

Katsutoshi Itoyama

,

Proceedings of the 16th International Society for Music Information Retrieval Conference, 2015

Optimizing the layout of multiple mobile robots for cooperative sound source separation.

Kouhei Sekiguchi

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Audio-visual beat tracking based on a state-space model for a music robot dancing with humans.

,

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Microphone-accelerometer based 3D posture estimation for a hose-shaped rescue robot.

,

Katsutoshi Itoyama

,

,

Satoshi Tadokoro

,

Kazuhiro Nakadai

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Bayesian integration of sound source separation and speech recognition: a new approach to simultaneous speech recognition.

Kousuke Itakura

,

Izaya Nishimuta

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A feedback framework for improved chord recognition based on NMF-based approximate note transcription.

,

Kazuyoshi Yoshii

,

Katsutoshi Itoyama

,

,

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Singing voice analysis and editing based on mutually dependent F0 estimation and source separation.

,

Kazuyoshi Yoshii

,

Katsutoshi Itoyama

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Challenges in deploying a microphone array to localize and separate sound sources in real auditory scenes.

,

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

,

,

Hiroshi G. Okuno

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Recognition of In-Field Frog Chorusing Using Bayesian Nonparametric Microphone Array Processing.

,

,

,

Hiromitsu Awano

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Hiroshi Gitchang Okuno

Proceedings of the Computational Sustainability, 2015

2014

Nonparametric Bayesian dereverberation of power spectrograms based on infinite-order autoregressive processes.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

IEEE ACM Trans. Audio Speech Lang. Process., 2014

A sound-based online method for estimating the time-varying posture of a hose-shaped robot.

,

Katsutoshi Itoyama

,

,

Satoshi Tadokoro

,

Kazuhiro Nakadai

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

Proceedings of the 2014 IEEE International Symposium on Safety, 2014

Sound annotation tool for multidirectional sounds based on spatial information extracted by HARK robot audition software.

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

,

Hiroshi G. Okuno

Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014

Bayesian Audio Alignment based on a Unified Model of Music Composition and Performance.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Hiroshi G. Okuno

Proceedings of the 15th International Society for Music Information Retrieval Conference, 2014

Visualization of auditory awareness based on sound source positions estimated by depth sensor and microphone array.

,

,

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014

Transferring Vocal Expression of F0 Contour Using Singing Voice Synthesizer.

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the Modern Advances in Applied Intelligence, 2014

Parameter Estimation of Virtual Musical Instrument Synthesizers.

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the Music Technology meets Philosophy, 2014

Automatic transcription of guitar tablature from audio signals in accordance with player's proficiency.

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the IEEE International Conference on Acoustics, 2014

Transcribing vocal expression from polyphonic music.

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the IEEE International Conference on Acoustics, 2014

A robot quizmaster that can localize, separate, and recognize simultaneous utterances for a fastest-voice-first quiz game.

Izaya Nishimuta

,

,

Kazuyoshi Yoshii

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the 14th IEEE-RAS International Conference on Humanoid Robots, 2014

2013

Robust Multipitch Analyzer against Initialization based on Latent Harmonic Allocation using Overtone Corpus.

,

Katsutoshi Itoyama

,

,

Hiroshi G. Okuno

J. Inf. Process., 2013

Noise correlation matrix estimation for improving sound source localization by multirotor UAV.

Koutarou Furukawa

,

,

,

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

,

Hiroshi G. Okuno

Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Posture estimation of hose-shaped robot using microphone array localization.

,

Takeshi Mizumoto

,

Katsutoshi Itoyama

,

Kazuhiro Nakadai

,

Hiroshi G. Okuno

Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Automatic estimation of dialect mixing ratio for dialect speech recognition.

,

Koichiro Yoshino

,

Katsutoshi Itoyama

,

,

Hiroshi G. Okuno

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Audio-based guitar tablature transcription using multipitch analysis and playability constraints.

,

,

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the IEEE International Conference on Acoustics, 2013

Initialization-robust Bayesian multipitch analyzer based on psychoacoustical and musical criteria.

,

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the IEEE International Conference on Acoustics, 2013

Multiple index combination for Japanese spoken term detection with optimum index selection based on OOV-region classifier.

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the IEEE International Conference on Acoustics, 2013

2012

Automated Violin Fingering Transcription Through Analysis of an Audio Recording.

,

Katsutoshi Itoyama

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

Comput. Music. J., 2012

Bayesian Nonnegative Harmonic-Temporal Factorization and Its Application to Multipitch Analysis.

,

,

Katsutoshi Itoyama

,

Hiroshi G. Okuno

Proceedings of the 13th International Society for Music Information Retrieval Conference, 2012

Automatic Chord Recognition Based on Probabilistic Integration of Acoustic Features, Bass Sounds, and Chord Transition.

Katsutoshi Itoyama

,

,

Hiroshi G. Okuno

Proceedings of the Advanced Research in Applied Artificial Intelligence, 2012

Initialization-robust multipitch estimation based on latent harmonic allocation using overtone corpus.

,

Katsutoshi Itoyama

,

,

Hiroshi G. Okuno

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

A musical mood trajectory estimation method using lyrics and acoustic features.

Naoki Nishikawa

,

Katsutoshi Itoyama

,

Hiromasa Fujihara

,

,

,

Hiroshi G. Okuno

Proceedings of the 1st international ACM workshop on Music information retrieval with user-centered and multimodal strategies, Scottsdale, AZ, USA, November 28, 2011

Simultaneous processing of sound source separation and musical instrument identification using Bayesian spectral modeling.

Katsutoshi Itoyama

,

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Violin Fingering Estimation Based on Violin Pedagogical Fingering Model Constrained by Bowed Sequence Estimation from Audio Input.

,

Katsutoshi Itoyama

,

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

Proceedings of the Trends in Applied Intelligent Systems, 2010

2009

Parameter Estimation for Harmonic and Inharmonic Models by Using Timbre Feature Distributions.

Katsutoshi Itoyama

,

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

J. Inf. Process., 2009

Changing timbre and phrase in existing musical performances as you like: manipulations of single part using harmonic and inharmonic models.

Naoki Yasuraoka

,

,

Katsutoshi Itoyama

,

,

,

Hiroshi G. Okuno

Proceedings of the 17th International Conference on Multimedia 2009, 2009

Bowed String Sequence Estimation of a Violin Based on Adaptive Audio Signal Classification and Context-Dependent Error Correction.

,

Katsutoshi Itoyama

,

,

,

Hiroshi G. Okuno

Proceedings of the 11th IEEE International Symposium on Multimedia, 2009

2008

Automatic Chord Recognition Based on Probabilistic Integration of Chord Transition and Bass Pitch Estimation.

,

Katsutoshi Itoyama

,

Kazuyoshi Yoshii

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

Proceedings of the ISMIR 2008, 2008

Instrument Equalizer for Query-by-Example Retrieval: Improving Sound Source Separation Based on Integrated Harmonic and Inharmonic Models.

Katsutoshi Itoyama

,

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

Proceedings of the ISMIR 2008, 2008

2007

Integration and Adaptation of Harmonic and Inharmonic Models for Separating Polyphonic Musical Signals.

Katsutoshi Itoyama

,

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

Proceedings of the IEEE International Conference on Acoustics, 2007

2006

Automatic Feature Weighting in Automatic Transcription of Specified Part in Polyphonic Music.

Katsutoshi Itoyama

,

Tetsuro Kitahara

,

Kazunori Komatani

,

,

Hiroshi G. Okuno

Proceedings of the ISMIR 2006, 2006
