Kazuhiro Nakadai

ORCID: 0000-0002-6134-4558

According to our database, Kazuhiro Nakadai authored at least 296 papers between 1995 and 2024.

Bibliography

2024
SLAM-Based Joint Calibration of Multiple Asynchronous Microphone Arrays and Sound Source Localization.
IEEE Trans. Robotics, 2024

UAV-Enhanced Combination to Application: Comprehensive Analysis and Benchmarking of a Human Detection Dataset for Disaster Scenarios.
CoRR, 2024

Can all variations within the unified mask-based beamformer framework achieve identical peak extraction performance?
CoRR, 2024

From Blurry to Brilliant Detection: YOLOv5-Based Aerial Object Detection with Super Resolution.
CoRR, 2024

Real Time Sound Source Localization Using von-Mises ResNet.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2024

Improving Impressions of Response Delay in AI-based Spoken Dialogue Systems.
Proceedings of the 33rd IEEE International Conference on Robot and Human Interactive Communication, 2024

Improving Noise Robustness of Automatic Speech Recognition Based on a Parallel Adapter Model with Near-Identity Initialization.
Proceedings of the Advances and Trends in Artificial Intelligence. Theory and Applications, 2024

A Video Vision Transformer for Sound Source Localization.
Proceedings of the 32nd European Signal Processing Conference, 2024

FPGA-based Low Power Acceleration of HARK Sound Source Localization.
Proceedings of the IEEE Symposium in Low-Power and High-Speed Chips, 2024

2023
Extracting Bird Vocalizations from a Complex Natural Soundscape in Forests Using Robot Audition Techniques.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Audio-Visual Class Association Based on Two-stage Self-supervised Contrastive Learning towards Robust Scene Analysis.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Assessment of Simultaneous Calibration for Positions, Orientations, and Time Offsets in Multiple Microphone Arrays Systems.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Metric-Based Multimodal Meta-Learning for Human Movement Identification Via Footstep Recognition.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Reconstruction of Depth Scenes Based on Echolocation.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Observability Analysis of Graph SLAM-Based Joint Calibration of Multiple Microphone Arrays and Sound Source Localization.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

FPGA based Power-Efficient Edge Server to Accelerate Speech Interface for Socially Assistive Robotics.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

An Ensemble Method for Multiple Speech Enhancement Using Deep Learning.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2023

Online Adaptation of Fourier Series Based Acoustic Transfer Function Model to Improve Sound Source Localization and Separation.
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Improving Sign Language Understanding Introducing Label Smoothing.
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023

Unsupervised Domain Adaptation of Universal Source Separation Based on Neural Full-Rank Spatial Covariance Analysis.
Proceedings of the 33rd IEEE International Workshop on Machine Learning for Signal Processing, 2023

Retraining-free Customized ASR for Enharmonic Words Based on a Named-Entity-Aware Model and Phoneme Similarity Estimation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

miniStreamer: Enhancing Small Conformer with Chunked-Context Masking for Streaming ASR Applications on the Edge.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Low power implementation of Geometric High-order Decorrelation-based Source Separation on an FPGA board.
Proceedings of the IEEE Symposium in Low-Power and High-Speed Chips, 2023

Is the Ideal Ratio Mask Really the Best? - Exploring the Best Extraction Performance and Optimal Mask of Mask-based Beamformers.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
Auditory Survey of Endangered Eurasian Bittern Using Microphone Arrays and Robot Audition.
Frontiers Robotics AI, 2022

Outdoor evaluation of sound source localization for drone groups using microphone arrays.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Spotforming by NMF Using Multiple Microphone Arrays.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

Empirical Sampling from Latent Utterance-wise Evidence Model for Missing Data ASR based on Neural Encoder-Decoder Model.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming Automatic Speech Recognition with Re-blocking Processing Based on Integrated Voice Activity Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Weakly-Supervised Neural Full-Rank Spatial Covariance Analysis for a Front-End System of Distant Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

An FPGA off-loading of HARK sound source localization.
Proceedings of the 2022 Tenth International Symposium on Computing and Networking, CANDAR 2022, 2022

2021
Multichannel environmental sound segmentation.
Appl. Intell., 2021

Detecting earthquakes: a novel deep learning-based approach for effective disaster response.
Appl. Intell., 2021

Investigation of Node Pruning Criteria for Neural Networks Model Compression with Non-Linear Function and Non-Uniform Network Topology.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Assessment of a Beamforming Implementation Developed for Surface Sound Source Separation.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Sound Source Tracking Using Integrated Direction Likelihood for Drones with Microphone Arrays.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Visualizing Directional Soundscapes of Bird Vocalizations Using Robot Audition Techniques.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Multi-channel Environmental Sound Segmentation utilizing Sound Source Localization and Separation U-Net.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

EMC: Earthquake Magnitudes Classification on Seismic Signals via Convolutional Recurrent Networks.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Observing Nocturnal Birds Using Localization Techniques.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Fully-Online Always-Adaptation of Transfer Functions and Its Application to Sound Source Localization and Separation.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2021

Assessment of von Mises-Bernoulli Deep Neural Network in Sound Source Localization.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Spatial Normalization to Reduce Positional Complexity in Direction-aided Supervised Binaural Sound Source Separation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Learning Three-dimensional Skeleton Data from Sign Language Video.
ACM Trans. Intell. Syst. Technol., 2020

Recognition of Non-Manual Content in Continuous Japanese Sign Language.
Sensors, 2020

Reactive Chameleon: A Method to Mimic Conversation Partner's Body Sway for a Robot.
Int. J. Soc. Robotics, 2020

Sound event aware environmental sound segmentation with Mask U-Net.
Adv. Robotics, 2020

Multi-hop wireless command and telemetry communication system for remote operation of robots with extending operation area beyond line-of-sight using 920 MHz/169 MHz.
Adv. Robotics, 2020

Robot Audition and Computational Auditory Scene Analysis.
Adv. Intell. Syst., 2020

Design and Assessment of a Scan-and-sum Beamformer for Surface Sound Source Separation.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Sound Source Tracking by Drones with Microphone Arrays.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Design and Implementation of Real-Time Visualization of Sound Source Positions by Drone Audition.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Soundscape Analysis of Bird Songs in Forests Using Microphone Arrays.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Multi-channel Environmental sound segmentation.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Sound Source Localization Based on von-Mises-Bernoulli Deep Neural Network.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Audio-Visual 3D Reconstruction Framework for Dynamic Scenes.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

A Fourier series based Data compression model for Acoustic transfer function.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Synchronization of Microphones Based on Rank Minimization of Warped Spectrum for Asynchronous Distributed Recording.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2020

Calibration of a Microphone Array Based on a Probabilistic Model of Microphone Positions.
Proceedings of the Trends in Artificial Intelligence Theory and Applications. Artificial Intelligence Practices, 2020

Detection of Ball Spin Direction using Hitting Sound in Tennis.
Proceedings of the 8th International Conference on Sport Sciences Research and Technology Support, 2020

Age Classification of Evacuees at Times of Disaster Using a Vibration Sensor.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
Recent R&D Technologies and Future Prospective of Flying Robot in Tough Robotics Challenge.
Proceedings of the Disaster Robotics - Results from the ImPACT Tough Robotics Challenge, 2019

Special issue on robot and human interactive communication.
Adv. Robotics, 2019

2D sound source position estimation using microphone arrays and its application to a VR-based bird song analysis system.
Adv. Robotics, 2019

Close Sound Source Localization incorporating Semi-Supervised Variational Bayesian NMF.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2019

Design and assessment of multiple-sound source localization using microphone arrays.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2019

Environmental sound segmentation utilizing Mask U-Net.
Proceedings of the 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2019

Weakly-Supervised Deep Recurrent Neural Networks for Basic Dance Step Generation.
Proceedings of the International Joint Conference on Neural Networks, 2019

An Integrated Framework for Field Recording, Localization, Classification and Annotation of Birdsongs Using Robot Audition Techniques - Harkbird 2.0.
Proceedings of the IEEE International Conference on Acoustics, 2019

Acoustic Simulation in Dynamic Environments for Robot Audition.
Proceedings of the 27th European Signal Processing Conference, 2019

CNN-based Multichannel End-to-End Speech Recognition for Everyday Home Environments.
Proceedings of the 27th European Signal Processing Conference, 2019

Improvement of DOA Estimation by using Quaternion Output in Sound Event Localization and Detection.
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events 2019 (DCASE 2019), 2019

2018
Speech Enhancement Based on Bayesian Low-Rank and Sparse Decomposition of Multichannel Magnitude Spectrograms.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Assessment of MUSIC-Based Noise-Robust Sound Source Localization with Active Frequency Range Filtering.
J. Robotics Mechatronics, 2018

CNN-based MultiChannel End-to-End Speech Recognition for everyday home environments.
CoRR, 2018

Signal Restoration based on Bi-directional LSTM with Spectral Filtering for Robot Audition.
Proceedings of the 27th IEEE International Symposium on Robot and Human Interactive Communication, 2018

Data-driven development of Virtual Sign Language Communication Agents.
Proceedings of the 27th IEEE International Symposium on Robot and Human Interactive Communication, 2018

Deep JSLC: A Multimodal Corpus Collection for Data-driven Generation of Japanese Sign Language Expressions.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

To animate or anime-te?: Investigating sign avatar comprehensibility.
Proceedings of the 18th International Conference on Intelligent Virtual Agents, 2018

Multi-timescale Feature-extraction Architecture of Deep Neural Networks for Acoustic Model Training from Raw Speech Signal.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Extracting the Relationship between the Spatial Distribution and Types of Bird Vocalizations Using Robot Audition System HARK.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

HARK-Bird-Box: A Portable Real-time Bird Song Scene Analysis System.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

2017
Design of UAV-Embedded Microphone Array System for Sound Source Localization in Outdoor Environments.
Sensors, 2017

Sound Source Localization Using Deep Learning Models.
J. Robotics Mechatronics, 2017

HARKBird: Exploring Acoustic Interactions in Bird Communities Using a Microphone Array.
J. Robotics Mechatronics, 2017

Outdoor Acoustic Event Identification with DNN Using a Quadrotor-Embedded Microphone Array.
J. Robotics Mechatronics, 2017

Editorial: Robot Audition Technologies.
J. Robotics Mechatronics, 2017

Outdoor Sound Source Detection Using a Quadcopter with Microphone Array.
J. Robotics Mechatronics, 2017

Ego-Noise Suppression for Robots Based on Semi-Blind Infinite Non-Negative Matrix Factorization.
J. Robotics Mechatronics, 2017

Development, Deployment and Applications of Robot Audition Open Source Software HARK.
J. Robotics Mechatronics, 2017

Psychologically-Inspired Audio-Visual Speech Recognition Using Coarse Speech Recognition and Missing Feature Theory.
J. Robotics Mechatronics, 2017

Acoustic Monitoring of the Great Reed Warbler Using Multiple Microphone Arrays and Robot Audition.
J. Robotics Mechatronics, 2017

Bird Song Scene Analysis Using a Spatial-Cue-Based Probabilistic Model.
J. Robotics Mechatronics, 2017

Design and Assessment of Sound Source Localization System with a UAV-Embedded Microphone Array.
J. Robotics Mechatronics, 2017

Acoustic model training based on node-wise weight boundary model for fast and small-footprint deep neural networks.
Comput. Speech Lang., 2017

Swarm of micro-quadrocopters for consensus-based sound source localization.
Adv. Robotics, 2017

Development of microphone-array-embedded UAV for search and rescue task.
Proceedings of the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2017

Node Pruning Based on Entropy of Weights and Node Activity for Small-Footprint Acoustic Model Based on Deep Neural Networks.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A Spatial-Cue-Based Probabilistic Model for Bird Song Scene Analysis.
Proceedings of the 2017 IEEE International Conference on Data Science and Advanced Analytics, 2017

2016
Robust Recognition of Simultaneous Speech By a Mobile Robot.
CoRR, 2016

Multimodal Scene Understanding Framework and Its Application to Cooking Recognition.
Appl. Artif. Intell., 2016

Leveraging phantom signals for improved voice-based human-robot interaction.
Proceedings of the 25th IEEE International Symposium on Robot and Human Interactive Communication, 2016

Construction of Japanese Audio-Visual Emotion Database and Its Application in Emotion Recognition.
Proceedings of the Tenth International Conference on Language Resources and Evaluation LREC 2016, 2016

Robust sound source mapping using three-layered selective audio rays for mobile robots.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Online simultaneous localization and mapping of multiple sound sources and asynchronous microphone arrays.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Partially Shared Deep Neural Network in sound source separation and identification using a UAV-embedded microphone array.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Semi-automatic bird song analysis by spatial-cue-based integration of sound source detection, localization, separation, and identification.
Proceedings of the 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2016

Localizing Bird Songs Using an Open Source Robot Audition System with a Microphone Array.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Reduction of Computational Cost Using Two-Stage Deep Neural Network for Training for Denoising and Sound Source Identification.
Proceedings of the Trends in Applied Knowledge-Based Systems and Data Science, 2016

Variational Bayesian multi-channel robust NMF for human-voice enhancement with a deformable and partially-occluded microphone array.
Proceedings of the 24th European Signal Processing Conference, 2016

Designing Speech and Multimodal Interactions for Mobile, Wearable, and Pervasive Applications.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

2015
Beat Tracking for Interactive Dancing Robots.
Int. J. Humanoid Robotics, 2015

Prevention of accomplishing synchronous multi-modal human-robot cooperation by using visual rhythms.
Adv. Robotics, 2015

Posture estimation of hose-shaped robot by using active microphone array.
Adv. Robotics, 2015

Audio-visual speech recognition using deep learning.
Appl. Intell., 2015

Improved sound source localization in horizontal plane for binaural robot audition.
Appl. Intell., 2015

Human-voice enhancement based on online RPCA for a hose-shaped rescue robot with a microphone array.
Proceedings of the 2015 IEEE International Symposium on Safety, 2015

A case study of an automatic volume control interface for a telepresence system.
Proceedings of the 24th IEEE International Symposium on Robot and Human Interactive Communication, 2015

Interactive sound source localization using robot audition for tablet devices.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Robot audition based Acoustic Event Identification using a Bayesian model considering spectral and temporal uncertainties.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Robot-Audition-based Human-Machine Interface for a Car.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Audio-visual scene understanding utilizing text information for a cooking support robot.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Utilizing visual cues in robot audition for sound source discrimination in speech-based human-robot communication.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Microphone-accelerometer based 3D posture estimation for a hose-shaped rescue robot.
Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2015

Dereverberation for active human-robot communication robust to speaker's face orientation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Interactive Interface to Optimize Sound Source Localization with HARK.
Proceedings of the Current Approaches in Applied Artificial Intelligence, 2015

Scene Understanding Based on Sound and Text Information for a Cooking Support Robot.
Proceedings of the Current Approaches in Applied Artificial Intelligence, 2015

On-the-spot calibration of microphone array Transfer Functions for robot audition.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015

Temporal smearing compensation in reverberant environment for speech-based human-robot interaction.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015

Robot audition: Its rise and perspectives.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Interactive interface to optimize sound source localization based on microphone array with coarse-to-fine tuning for humanoids.
Proceedings of the 15th IEEE-RAS International Conference on Humanoid Robots, 2015

Sound source separation for robot audition using deep learning.
Proceedings of the 15th IEEE-RAS International Conference on Humanoid Robots, 2015

Compensating changes in speaker position for improved voice-based human-robot communication.
Proceedings of the 15th IEEE-RAS International Conference on Humanoid Robots, 2015

Acoustic model training based on node-wise weight boundary model increasing speed of discrete neural networks.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Sound Source Orientation Estimation Based on an Orientation-Extended Beamformer.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2014

A sound-based online method for estimating the time-varying posture of a hose-shaped robot.
Proceedings of the 2014 IEEE International Symposium on Safety, 2014

Sound annotation tool for multidirectional sounds based on spatial information extracted by HARK robot audition software.
Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014

Auditory-aware navigation for mobile robots based on reflection-robust sound source localization and visual SLAM.
Proceedings of the 2014 IEEE International Conference on Systems, Man, and Cybernetics, 2014

Making a robot dance to diverse musical genre in noisy environments.
Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014

Improvement in outdoor sound source detection using a quadrotor-embedded microphone array.
Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014

Speech-based human-robot interaction robust to acoustic reflections in real environment.
Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2014

Lipreading using convolutional neural network.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Ego-motion noise suppression for robots based on Semi-Blind Infinite Non-negative Matrix Factorization.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

Improved hands-free automatic speech recognition in reverberant environment condition.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Volume adaptation and visualization by modeling the volume level in noisy environments for telepresence system.
Proceedings of the second international conference on Human-agent interaction, 2014

2013
Sound Source Localization Using Joint Bayesian Estimation With a Hierarchical Noise Model.
IEEE Trans. Speech Audio Process., 2013

A real-time super-resolution robot audition system that improves the robustness of simultaneous speech recognition.
Adv. Robotics, 2013

Footstep detection and classification using distributed microphones.
Proceedings of the 14th International Workshop on Image Analysis for Multimedia Interactive Services, 2013

Real-time super-resolution three-dimensional sound source localization for robots.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Dereverberation robust to speaker's azimuthal orientation in multi-channel human-robot communication.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Noise correlation matrix estimation for improving sound source localization by multirotor UAV.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Posture estimation of hose-shaped robot using microphone array localization.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Improved Sound Source Localization and Front-Back Disambiguation for Humanoid Robots with Two Ears.
Proceedings of the Recent Trends in Applied Artificial Intelligence, 2013

Hands-free human-robot communication robust to speaker's radial position.
Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013

Development of a Sound Source Localization System for Assisting Group Conversation.
Proceedings of the Intelligent Robotics and Applications - 6th International Conference, 2013

Robustness to speaker position in distant-talking automatic speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013

Mitigating the effects of reverberation for effective human-robot interaction in the real world.
Proceedings of the 13th IEEE-RAS International Conference on Humanoid Robots, 2013

Differences in the audio-visual detection of word prominence from Japanese and English speakers.
Proceedings of the Auditory-Visual Speech Processing, 2013

2012
Efficient Blind Dereverberation and Echo Cancellation Based on Independent Component Analysis for Actual Acoustic Signals.
Neural Comput., 2012

A role of multi-modal rhythms in physical interaction and cooperation.
EURASIP J. Audio Speech Music. Process., 2012

Audio-Visual Voice Activity Detection Based on an Utterance State Transition Model.
Adv. Robotics, 2012

SLAM-based Online Calibration for Asynchronous Microphone Array.
Adv. Robotics, 2012

An active audition framework for auditory-driven HRI: Application to interactive robot dancing.
Proceedings of the 21st IEEE International Symposium on Robot and Human Interactive Communication, 2012

Live assessment of beat tracking for robot audition.
Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012

Outdoor auditory scene analysis using a moving microphone array embedded in a quadrocopter.
Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012

Real-time super-resolution Sound Source Localization for robots.
Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012

Online learning for template-based multi-channel ego noise estimation.
Proceedings of the 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2012

Online audio beat tracking for a dancing robot in the presence of ego-motion noise in a real environment.
Proceedings of the IEEE International Conference on Robotics and Automation, 2012

Sound source localization in spatially colored noise using a hierarchical Bayesian model.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Active audio-visual integration for Voice Activity Detection based on a Causal Bayesian Network.
Proceedings of the 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), Osaka, Japan, November 29, 2012

Improvement of audio-visual score following in robot ensemble with human guitarist.
Proceedings of the 12th IEEE-RAS International Conference on Humanoid Robots (Humanoids 2012), Osaka, Japan, November 29, 2012

Multi-party human-robot interaction with distant-talking speech recognition.
Proceedings of the International Conference on Human-Robot Interaction, 2012

Estimation of the number of sources and their locations in colored noise using reversible jump MCMC.
Proceedings of the 20th European Signal Processing Conference, 2012

2011
A multi-expert model for dialogue and behavior control of conversational robots and agents.
Knowl. Based Syst., 2011

Real-Time Audio-to-Score Alignment Using Particle Filter for Coplayer Music Robots.
EURASIP J. Adv. Signal Process., 2011

Whole Body Motion Noise Cancellation of a Robot for Improved Automatic Speech Recognition.
Adv. Robotics, 2011

Ego noise cancellation of a robot using missing feature masks.
Appl. Intell., 2011

Incremental Bayesian Audio-to-Score Alignment with Flexible Harmonic Structure Models.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Intelligent sound source localization and its application to multimodal human tracking.
Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

SLAM-based online calibration of asynchronous microphone array for robot audition.
Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

Incremental learning for ego noise estimation of a robot.
Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

Assessment of single-channel ego noise estimation methods.
Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

HARK based real-time single pane 3D auditory scene visualizer empowered by Speech Arrow.
Proceedings of the 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2011

Bayesian Extension of MUSIC for Sound Source Localization and Tracking.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Robust Intonation Pattern Classification in Human Robot Interaction.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Design and implementation of selectable sound separation on the Texai telepresence system using HARK.
Proceedings of the IEEE International Conference on Robotics and Automation, 2011

Assessment of general applicability of ego noise estimation.
Proceedings of the IEEE International Conference on Robotics and Automation, 2011

Correlation matrix interpolation in Sound Source Localization for a robot.
Proceedings of the IEEE International Conference on Acoustics, 2011

Rhythmic reference of a human while a rope turning task.
Proceedings of the 6th International Conference on Human Robot Interaction, 2011

2010
Blind Source Separation With Parameter-Free Adaptive Step-Size Method for Robot Audition.
IEEE Trans. Speech Audio Process., 2010

Soft missing-feature mask generation for robot audition.
Paladyn J. Behav. Robotics, 2010

Voice-awareness control for a humanoid robot consistent with its body posture and movements.
Paladyn J. Behav. Robotics, 2010

Design and Implementation of Robot Audition System 'HARK' - Open Source Software for Listening to Three Simultaneous Speakers.
Adv. Robotics, 2010

3D sound field recording and reproducing system including sound source orientation.
Proceedings of the 4th International Universal Communication Symposium, 2010

Two-layered audio-visual speech recognition for robots in noisy environments.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Speedup and performance improvement of ICA-based robot audition by parallel and resampling-based block-wise processing.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

An improvement in automatic speech recognition using soft missing feature masks for robot audition.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

An easily-configurable robot audition system using Histogram-based Recursive Level Estimation.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Sound source separation and automatic speech recognition for moving sources.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Human-robot ensemble between robot thereminist and human percussionist using coupled oscillator model.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Multi-talker speech recognition under ego-motion noise using Missing Feature Theory.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Pitch extraction in Human-Robot interaction.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

PROT - An embodied agent for intelligible and user-friendly human-robot interaction.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Two-layered audio-visual integration in voice activity detection and automatic speech recognition for robots.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A robust speech recognition system against the ego noise of a robot.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Applying geometric source separation for improved pitch extraction in human-robot interaction.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An Improvement in Audio-Visual Voice Activity Detection for Automatic Speech Recognition.
Proceedings of the Trends in Applied Intelligent Systems, 2010

Music-Ensemble Robot That Is Capable of Playing the Theremin While Listening to the Accompanied Music.
Proceedings of the Trends in Applied Intelligent Systems, 2010

Robust Ego Noise Suppression of a Robot.
Proceedings of the Trends in Applied Intelligent Systems, 2010

Upper-limit evaluation of robot audition based on ICA-BSS in multi-source, barge-in and highly reverberant conditions.
Proceedings of the IEEE International Conference on Robotics and Automation, 2010

Improvement in listening capability for humanoid robot HRP-2.
Proceedings of the IEEE International Conference on Robotics and Automation, 2010

A hybrid framework for ego noise cancellation of a robot.
Proceedings of the IEEE International Conference on Robotics and Automation, 2010

Robust hands-free Automatic Speech Recognition for human-machine interaction.
Proceedings of the 10th IEEE-RAS International Conference on Humanoid Robots, 2010

Audio-visual speech recognition system for a robot.
Proceedings of the Auditory-Visual Speech Processing, 2010

Design and Implementation of Two-level Synchronization for Interactive Music Robot.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009
Robot Audition: Missing Feature Theory Approach and Active Audition.
Proceedings of the Robotics Research - The 14th International Symposium, 2009

Step-size parameter adaptation of multi-channel semi-blind ICA with piecewise linear model for barge-in-able robot audition.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

Missing-feature-theory-based robust simultaneous speech recognition system with non-clean speech acoustic model.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

Incremental polyphonic audio to score alignment using beat tracking for singer robots.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

Intelligent sound source localization for dynamic environments.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

Real-time sound source orientation estimation using a 96 channel microphone array.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

Ego noise suppression of a robot using template subtraction.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

ICA-based efficient blind dereverberation and echo cancellation method for barge-in-able robot audition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Sound source separation of moving speakers for robot audition.
Proceedings of the IEEE International Conference on Acoustics, 2009

Automatic speech recognition improved by two-layered audio-visual integration for robot audition.
Proceedings of the 9th IEEE-RAS International Conference on Humanoid Robots, 2009

Automatic estimation of reverberation time with robot speech to improve ICA-based robot audition.
Proceedings of the 9th IEEE-RAS International Conference on Humanoid Robots, 2009

Voice quality manipulation for humanoid robots consistent with their head movements.
Proceedings of the 9th IEEE-RAS International Conference on Humanoid Robots, 2009

2008
A Robot Singer with Music Recognition Based on Real-Time Beat Tracking.
Proceedings of the ISMIR 2008, 2008

Barge-in-able robot audition based on ICA and missing feature theory under semi-blind situation.
Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008

High performance sound source separation adaptable to environmental changes for robot audition.
Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008

A robot uses its own microphone to synchronize its steps to musical beats while scatting and singing.
Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008

Soft missing-feature mask generation for simultaneous speech recognition system in robots.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A robot referee for rock-paper-scissors sound games.
Proceedings of the 2008 IEEE International Conference on Robotics and Automation, 2008

Adaptive step-size parameter control for real-world blind source separation.
Proceedings of the IEEE International Conference on Acoustics, 2008

An open source software system for robot audition HARK and its evaluation.
Proceedings of the 8th IEEE-RAS International Conference on Humanoid Robots, 2008

2007
Robust Recognition of Simultaneous Speech by a Mobile Robot.
IEEE Trans. Robotics, 2007

Moving Sound Source Extraction by Time-Variant Beamforming.
Proceedings of the New Frontiers in Artificial Intelligence, 2007

A biped robot that keeps steps in time with musical beats while listening to music with its own ears.
Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 29, 2007

Exploiting known sound source signals to improve ICA-based robot audition in speech separation and recognition.
Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 29, 2007

Coarse speech recognition by audio-visual integration based on missing feature theory.
Proceedings of the 2007 IEEE/RSJ International Conference on Intelligent Robots and Systems, October 29, 2007

The Design of Phoneme Grouping for Coarse Phoneme Recognition.
Proceedings of the New Trends in Applied Artificial Intelligence, 2007

A Navigation System Using Ultrasonic Directional Speaker with Rotating Base.
Proceedings of the Human Interface and the Management of Information. Interacting in Information Environments, 2007

Design and implementation of a robot audition system for automatic speech recognition of simultaneous speech.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Multi-Domain Spoken Dialogue System with Extensibility and Robustness against Speech Recognition Errors.
Proceedings of the SIGDIAL 2006 Workshop, 2006

Recognition of Simultaneous Speech by Estimating Reliability of Separated Signals for Robot Audition.
Proceedings of the PRICAI 2006: Trends in Artificial Intelligence, 2006

Real-Time Robot Audition System That Recognizes Simultaneous Speech in The Real World.
Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006

Real-Time Tracking of Multiple Sound Sources by Integration of In-Room and Robot-Embedded Microphone Arrays.
Proceedings of the 2006 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2006

Leak energy based missing feature mask generation for ICA and GSS and its evaluation with simultaneous speech recognition.
Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2006

Speech recognition for a robot under its motor noises by selective application of missing feature theory and MLLR.
Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audition, 2006

Genetic Algorithm-Based Improvement of Robot Hearing Capabilities in Separating and Recognizing Simultaneous Speech Signals.
Proceedings of the Advances in Applied Artificial Intelligence, 2006

Robust Tracking of Multiple Sound Sources by Spatial Integration of Room And Robot Microphone Arrays.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Speech Recognition for a Humanoid with Motor Noise Utilizing Missing Feature Theory.
Proceedings of the 2006 6th IEEE-RAS International Conference on Humanoid Robots, 2006

A Robot That Can Engage in Both Task-Oriented and Non-Task-Oriented Dialogues.
Proceedings of the 2006 6th IEEE-RAS International Conference on Humanoid Robots, 2006

2005
Making a robot recognize three simultaneous sentences in real-time.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005

A two-layer model for behavior and dialogue planning in conversational service robots.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005

Sound source tracking with directivity pattern estimation using a 64 ch microphone array.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005

Implementation of active direction-pass filter on dynamically reconfigurable processor.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005

Multiple moving speaker tracking by microphone array on mobile robot.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Enhanced Robot Speech Recognition Based on Microphone Array Source Separation and Missing Feature Theory.
Proceedings of the 2005 IEEE International Conference on Robotics and Automation, 2005

Towards New Human-Humanoid Communication: Listening During Speaking by Using Ultrasonic Directional Speaker.
Proceedings of the 2005 IEEE International Conference on Robotics and Automation, 2005

2004
Effects of increasing modalities in recognizing three simultaneous speeches.
Speech Commun., 2004

Improvement of recognition of simultaneous speech signals using AV integration and scattering theory for humanoid robots.
Speech Commun., 2004

Sound and Visual Tracking for Humanoid Robot.
Appl. Intell., 2004

Assessment of general applicability of robot audition system by recognizing three simultaneous speeches.
Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, September 28, 2004

Multimodal expression for humanoid robots by integration of human speech mimicking and facial color.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Improvement of Robot Audition by Interfacing Sound Source Separation and Automatic Speech Recognition with Missing Feature Theory.
Proceedings of the 2004 IEEE International Conference on Robotics and Automation, 2004

2003
Human-robot non-verbal interaction empowered by real-time auditory and visual multiple-talker tracking.
Adv. Robotics, 2003

Real-Time Sound Source Localization and Separation Based on Active Audio-Visual Integration.
Proceedings of the Artificial Neural Nets Problem Solving Methods, 2003

Applying scattering theory to robot audition system: robust sound source localization and extraction.
Proceedings of the 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems, Las Vegas, Nevada, USA, October 27, 2003

Three simultaneous speech recognition by integration of active audition and face recognition for humanoid.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Design and Implementation of Personality of Humanoids in Human Humanoid Non-verbal Interaction.
Proceedings of the Developments in Applied Artificial Intelligence, 2003

Realizing personality in audio-visually triggered non-verbal behaviors.
Proceedings of the 2003 IEEE International Conference on Robotics and Automation, 2003

Robot recognizes three simultaneous speech by active audition.
Proceedings of the 2003 IEEE International Conference on Robotics and Automation, 2003

Improvement of three simultaneous speech recognition by using AV integration and scattering theory for humanoid.
Proceedings of the AVSP 2003, 2003

2002
Real-time Auditory and Visual Multiple-speaker Tracking For Human-robot Interaction.
J. Robotics Mechatronics, 2002

Realizing Audio-Visually Triggered ELIZA-Like Non-verbal Behaviors.
Proceedings of the PRICAI 2002: Trends in Artificial Intelligence, 2002

Auditory fovea based speech separation and its application to dialog system.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, Switzerland, September 30, 2002

Auditory fovea based speech enhancement and its application to human-robot dialog system.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Real-time sound source localization and separation for robot audition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Social Interaction of Humanoid Robot Based on Audio-Visual Tracking.
Proceedings of the Developments in Applied Artificial Intelligence, 2002

Real-Time Speaker Localization and Speech Separation by Audio-Visual Integration.
Proceedings of the 2002 IEEE International Conference on Robotics and Automation, 2002

Exploiting Auditory Fovea in Humanoid-Human Interaction.
Proceedings of the Eighteenth National Conference on Artificial Intelligence and Fourteenth Conference on Innovative Applications of Artificial Intelligence, July 28, 2002

2001
Human-robot interaction through real-time auditory and visual multiple-talker tracking.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2001

Epipolar geometry based sound localization and extraction for humanoid audition.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2001

Separating three simultaneous speeches with two microphones by integrating auditory and visual processing.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Real-time multiple speaker tracking by multi-modal integration for mobile robots.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Real-Time Auditory and Visual Multiple-Object Tracking for Humanoids.
Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001

Graph extraction from color images.
Proceedings of the 9th European Symposium on Artificial Neural Networks, 2001

A computational model of monkey grating cells for oriented repetitive alternating patterns.
Proceedings of the 9th European Symposium on Artificial Neural Networks, 2001

2000
And the Fans Are Going Wild! SIG plus MIKE.
Proceedings of the RoboCup 2000: Robot Soccer World Cup IV, 2000

Humanoid Active Audition System Improved by the Cover Acoustics.
Proceedings of the PRICAI 2000, Topics in Artificial Intelligence, 6th Pacific Rim International Conference on Artificial Intelligence, Melbourne, Australia, August 28, 2000

Active audition system and humanoid exterior design.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2000

Design and architecture of SIG the humanoid: an experimental platform for integrated perception in RoboCup humanoid challenge.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2000

Designing a humanoid head for RoboCup challenge.
Proceedings of the Fourth International Conference on Autonomous Agents, 2000

Active Audition for Humanoid.
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, July 30, 2000

1995
Organization of Hierarchical Perceptual Sounds: Music Scene Analysis with Autonomous Processing Modules and a Quantitative Information Integration Mechanism.
Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995

