Hiroshi Saruwatari

ORCID: 0000-0003-0876-5617

According to our database, Hiroshi Saruwatari authored at least 440 papers between 1999 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Physics-constrained adaptive kernel interpolation for region-to-region acoustic transfer function: a Bayesian approach.
EURASIP J. Audio Speech Music. Process., December, 2024

JNV corpus: A corpus of Japanese nonverbal vocalizations with diverse phrases and emotions.
Speech Commun., January, 2024

Text-Inductive Graphone-Based Language Adaptation for Low-Resource Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Sound Field Estimation Based on Physics-Constrained Kernel Interpolation Adapted to Environment.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

DNN-based ensemble singing voice synthesis with interactions between singers.
CoRR, 2024

The T05 System for The VoiceMOS Challenge 2024: Transfer Learning from Deep Image Classifier to Naturalness MOS Prediction of High-Quality Synthetic Speech.
CoRR, 2024

Cross-Dialect Text-To-Speech in Pitch-Accent Language Incorporating Multi-Dialect Phoneme-Level BERT.
CoRR, 2024

BigCodec: Pushing the Limits of Low-Bitrate Neural Speech Codec.
CoRR, 2024

J-CHAT: Japanese Large-scale Spoken Dialogue Corpus for Spoken Dialogue Language Modeling.
CoRR, 2024

Spatial Voice Conversion: Voice Conversion Preserving Spatial Information and Non-target Signals.
CoRR, 2024

Noise-Robust Voice Conversion by Conditional Denoising Training Using Latent Variables of Recording Quality and Environment.
CoRR, 2024

SRC4VC: Smartphone-Recorded Corpus for Voice Conversion Benchmark.
CoRR, 2024

RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis.
CoRR, 2024

Building speech corpus with diverse voice characteristics for its prompt-based representation.
CoRR, 2024

SpeechBERTScore: Reference-Aware Automatic Evaluation of Speech Generation Leveraging NLP Evaluation Metrics.
CoRR, 2024

JVNV: A Corpus of Japanese Emotional Speech With Verbal Content and Nonverbal Expressions.
IEEE Access, 2024

Localizing Acoustic Energy in Sound Field Synthesis by Directionally Weighted Exterior Radiation Suppression.
Proceedings of the IEEE International Conference on Acoustics, 2024

Do Learned Speech Symbols Follow Zipf's Law?
Proceedings of the IEEE International Conference on Acoustics, 2024

Diversity-Based Core-Set Selection for Text-to-Speech with Linguistic and Acoustic Features.
Proceedings of the IEEE International Conference on Acoustics, 2024

Real-Time Speech Extraction Using Spatially Regularized Independent Low-Rank Matrix Analysis and Rank-Constrained Spatial Covariance Matrix Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
JVNV: A Corpus of Japanese Emotional Speech with Verbal Content and Nonverbal Expressions.
Dataset, October, 2023

PoP-IDLMA: Product-of-Prior Independent Deeply Learned Matrix Analysis for Multichannel Music Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Amplitude Matching for Multizone Sound Field Control.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

SelfRemaster: Self-Supervised Speech Restoration for Historical Audio Resources.
IEEE Access, 2023

Kernel Interpolation of Incident Sound Field in Region Including Scattering Objects.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Perceptual Quality Enhancement of Sound Field Synthesis Based on Combination of Pressure and Amplitude Matching.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

Improving robustness of spontaneous speech synthesis with linguistic speech regularization and pseudo-filled-pause insertion.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Federated Learning for Human-in-the-Loop Many-to-Many Voice Conversion.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Laughter Synthesis using Pseudo Phonetic Tokens with a Large-scale In-the-wild Laughter Corpus.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

HumanDiffusion: diffusion model using perceptual gradients.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

ChatGPT-EDSS: Empathetic Dialogue Speech Synthesis Trained from ChatGPT-derived Context Word Embeddings.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

CALLS: Japanese Empathetic Dialogue Speech Corpus of Complaint Handling and Attentive Listening in Customer Center.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

How Generative Spoken Language Modeling Encodes Noisy Speech: Investigation from Phonetics to Syntactics.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Learning to Speak from Text: Zero-Shot Multilingual Text-to-Speech with Unsupervised Text Pretraining.
Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

Improving Speech Prosody of Audiobook Text-To-Speech Synthesis with Acoustic and Textual Contexts.
Proceedings of the IEEE International Conference on Acoustics, 2023

Mid-Attribute Speaker Generation Using Optimal-Transport-Based Interpolation of Gaussian Mixture Models.
Proceedings of the IEEE International Conference on Acoustics, 2023

Kernel Interpolation of Acoustic Transfer Functions with Adaptive Kernel for Directed and Residual Reverberations.
Proceedings of the IEEE International Conference on Acoustics, 2023

Visual Onoma-to-Wave: Environmental Sound Synthesis from Visual Onomatopoeias and Sound-Source Images.
Proceedings of the IEEE International Conference on Acoustics, 2023

jaCappella Corpus: A Japanese a Cappella Vocal Ensemble Corpus.
Proceedings of the IEEE International Conference on Acoustics, 2023

Spatial Active Noise Control Method Based on Sound Field Interpolation from Reference Microphone Signals.
Proceedings of the IEEE International Conference on Acoustics, 2023

NoisyILRMA: Diffuse-Noise-Aware Independent Low-Rank Matrix Analysis for Fast Blind Source Extraction.
Proceedings of the 31st European Signal Processing Conference, 2023

Multichannel Active Noise Control with Exterior Radiation Suppression Based on Riemannian Optimization.
Proceedings of the 31st European Signal Processing Conference, 2023

Algorithms of Sampling-Frequency-Independent Layers for Non-integer Strides.
Proceedings of the 31st European Signal Processing Conference, 2023

COCO-NUT: Corpus of Japanese Utterance and Voice Characteristics Description for Prompt-Based Control.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

Blind Source Separation Using Independent Low-Rank Matrix Analysis with Spectrogram-Consistency Regularization.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023

2022
Region-Restricted Sensor Placement Based on Gaussian Process for Sound Field Estimation.
IEEE Trans. Signal Process., 2022

Sampling-Frequency-Independent Convolutional Layer and its Application to Audio Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Region-to-Region Kernel Interpolation of Acoustic Transfer Functions Constrained by Physical Properties.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Deficient-basis-complementary rank-constrained spatial covariance matrix estimation based on multivariate generalized Gaussian distribution for blind speech extraction.
EURASIP J. Adv. Signal Process., 2022

Text-to-speech synthesis from dark data with evaluation-in-the-loop data selection.
CoRR, 2022

Spontaneous speech synthesis with linguistic-speech consistency training using pseudo-filled pauses.
CoRR, 2022

Empirical Study Incorporating Linguistic Knowledge on Filled Pauses for Personalized Spontaneous Speech Synthesis.
CoRR, 2022

Multi-Task Adversarial Training Algorithm for Multi-Speaker Neural Text-to-Speech.
CoRR, 2022

Exploring the Effectiveness of Self-supervised Learning and Classifier Chains in Emotion Recognition of Nonverbal Vocalizations.
CoRR, 2022

Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation.
CoRR, 2022

VTTS: Visual-Text To Speech.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Personalized Filled-pause Generation with Group-wise Prediction Models.
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022

Physics-Informed Convolutional Neural Network with Bicubic Spline Interpolation for Sound Field Estimation.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

Head-Related Transfer Function Interpolation From Spatially Sparse Measurements Using Autoencoder With Source Position Conditioning.
Proceedings of the 17th International Workshop on Acoustic Signal Enhancement, 2022

Human-in-the-loop Speaker Adaptation for DNN-based Multi-speaker TTS.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

J-MAC: Japanese multi-speaker audiobook corpus for speech synthesis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

STUDIES: Corpus of Japanese Empathetic Dialogue Speech Towards Friendly Voice Agent.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

UTMOS: UTokyo-SaruLab System for VoiceMOS Challenge 2022.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

SelfRemaster: Self-Supervised Speech Restoration with Analysis-by-Synthesis Approach Using Channel Modeling.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Predicting VQVAE-based Character Acting Style from Quotation-Annotated Text for Audiobook Speech Synthesis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Region-to-Region Kernel Interpolation of Acoustic Transfer Function with Directional Weighting.
Proceedings of the IEEE International Conference on Acoustics, 2022

Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds.
Proceedings of the IEEE International Conference on Acoustics, 2022

Spatial Active Noise Control Based on Individual Kernel Interpolation of Primary and Secondary Sound Fields.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Directionally Weighted Wave Field Estimation Exploiting Prior Information on Source Direction.
IEEE Trans. Signal Process., 2021

Perceptual-Similarity-Aware Deep Speaker Representation Learning for Multi-Speaker Generative Modeling.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Time-Domain Audio Source Separation With Neural Networks Based on Multiresolution Analysis.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Multichannel Blind Source Separation Based on Evanescent-Region-Aware Non-Negative Tensor Factorization in Spherical Harmonic Domain.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Spatial Active Noise Control Based on Kernel Interpolation of Sound Field.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Incremental Text-to-Speech Synthesis Using Pseudo Lookahead With Large Pretrained Language Model.
IEEE Signal Process. Lett., 2021

Deep Gaussian process based multi-speaker speech synthesis with latent speaker representation.
Speech Commun., 2021

Independent deeply learned matrix analysis with automatic selection of stable microphone-wise update and fast sourcewise update of demixing matrix.
Signal Process., 2021

Joint-diagonalizability-constrained multichannel nonnegative matrix factorization based on time-variant multivariate complex sub-Gaussian distribution.
Signal Process., 2021

Real-Time Full-Band Voice Conversion with Sub-Band Modeling and Data-Driven Phase Estimation of Spectral Differentials.
IEICE Trans. Inf. Syst., 2021

DNN-Based Low-Musical-Noise Single-Channel Speech Enhancement Based on Higher-Order-Moments Matching.
IEICE Trans. Inf. Syst., 2021

Noise Robust Acoustic Anomaly Detection System with Nonnegative Matrix Factorization Based on Generalized Gaussian Distribution.
IEICE Trans. Inf. Syst., 2021

Convex and Differentiable Formulation for Inverse Problems in Hilbert Spaces with Nonlinear Clipping Effects.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2021

Binaural rendering from microphone array signals of arbitrary geometry.
CoRR, 2021

Mean-Square-Error-Based Secondary Source Placement in Sound Field Synthesis with Prior Information on Desired Field.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Kernel Learning for Sound Field Estimation with L1 and L2 Regularizations.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021

Accent Modeling of Low-Resourced Dialect in Pitch Accent Language Using Variational Autoencoder.
Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Audiobook Speech Synthesis Conditioned by Cross-Sentence Context-Aware Word Embeddings.
Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021

Cross-Lingual Speaker Adaptation Using Domain Adaptation and Speaker Consistency Loss for Text-To-Speech Synthesis.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Harmonic WaveGAN: GAN-Based Speech Waveform Generation Model with Harmonic Structure Discriminator.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Disentangled Speaker and Language Representations Using Mutual Information Minimization and Domain Adaptation for Cross-Lingual TTS.
Proceedings of the IEEE International Conference on Acoustics, 2021

HumanACGAN: Conditional Generative Adversarial Network with Human-Based Auxiliary Classifier and its Evaluation in Phoneme Perception.
Proceedings of the IEEE International Conference on Acoustics, 2021

Amplitude Matching: Majorization-Minimization Algorithm for Sound Field Control Only with Amplitude Constraint.
Proceedings of the IEEE International Conference on Acoustics, 2021

Deficient Basis Estimation of Noise Spatial Covariance Matrix for Rank-Constrained Spatial Covariance Matrix Estimation Method in Blind Speech Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2021

Sampling-Frequency-Independent Audio Source Separation Using Convolution Layer Based on Impulse Invariant Method.
Proceedings of the 29th European Signal Processing Conference, 2021

Independent Deeply Learned Tensor Analysis for Determined Audio Source Separation.
Proceedings of the 29th European Signal Processing Conference, 2021

Empirical Bayesian Independent Deeply Learned Matrix Analysis For Multichannel Audio Source Separation.
Proceedings of the 29th European Signal Processing Conference, 2021

Low-Latency Incremental Text-to-Speech Synthesis with Distilled Context Prediction Network.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Prior Distribution Design for Music Bleeding-Sound Reduction Based on Nonnegative Matrix Factorization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Speech Enhancement by Noise Self-Supervised Rank-Constrained Spatial Covariance Matrix Estimation via Independent Deeply Learned Matrix Analysis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Emotion-Controllable Speech Synthesis Using Emotion Soft Labels and Fine-Grained Prosody Factors.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Multichannel Audio Source Separation with Independent Deeply Learned Matrix Analysis Using Product of Source Models.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Independent Low-Rank Matrix Analysis Based on Time-Variant Sub-Gaussian Source Model for Determined Blind Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Blind Speech Extraction Based on Rank-Constrained Spatial Covariance Matrix Estimation With Multivariate Generalized Gaussian Distribution.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Acoustic model-based subword tokenization and prosodic-context extraction without language knowledge for text-to-speech synthesis.
Speech Commun., 2020

Reciprocity gap functional in spherical harmonic domain for gridless sound field decomposition.
Signal Process., 2020

Phase reconstruction from amplitude spectrograms based on directional-statistics deep neural networks.
Signal Process., 2020

DNN-Based Full-Band Speech Synthesis Using GMM Approximation of Spectral Envelope.
IEICE Trans. Inf. Syst., 2020

Generative Moment Matching Network-Based Neural Double-Tracking for Synthesized and Natural Singing Voices.
IEICE Trans. Inf. Syst., 2020

JSSS: free Japanese speech corpus for summarization and simplification.
CoRR, 2020

JVS-MuSiC: Japanese multispeaker singing-voice corpus.
CoRR, 2020

Binaural Rendering From Distributed Microphone Signals Considering Loudspeaker Distance in Measurements.
Proceedings of the 22nd IEEE International Workshop on Multimedia Signal Processing, 2020

DNN-based Speech Synthesis Using Abundant Tags of Spontaneous Speech Corpus.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

SMASH Corpus: A Spontaneous Speech Corpus Recording Third-person Audio Commentaries on Gameplay.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Investigating Effective Additional Contextual Factors in DNN-Based Spontaneous Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Cross-Lingual Text-To-Speech Synthesis via Domain Adaptation and Perceptual Similarity Regression in Speaker Space.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Harmonic Lowering for Accelerating Harmonic Convolution for Audio Signals.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Real-Time, Full-Band, Online DNN-Based Voice Conversion System Using a Single CPU.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Multi-Speaker Text-to-Speech Synthesis Using Deep Gaussian Processes.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

End-to-End Text-to-Speech Synthesis with Unaligned Multiple Language Units Based on Attention.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Kernel interpolation of acoustic transfer function between regions considering reciprocity.
Proceedings of the 11th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2020

Lifter Training and Sub-Band Modeling for Computationally Efficient and High-Quality Voice Conversion Using Spectral Differentials.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Utterance-Level Sequential Modeling for Deep Gaussian Process Based Speech Synthesis Using Simple Recurrent Unit.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Convergence-Guaranteed Independent Positive Semidefinite Tensor Analysis Based on Student's t Distribution.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Regularized Fast Multichannel Nonnegative Matrix Factorization with ILRMA-Based Prior Distribution of Joint-Diagonalization Process.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Spatial Active Noise Control Based on Kernel Interpolation with Directional Weighting.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

HumanGAN: Generative Adversarial Network With Human-Based Discriminator And Its Evaluation In Speech Perception Modeling.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Mutual-Information-Based Sensor Placement for Spatial Sound Field Recording.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

DNN-Based Frequency Component Prediction for Frequency-Domain Audio Source Separation.
Proceedings of the 28th European Signal Processing Conference, 2020

Sensor placement in arbitrarily restricted region for field estimation based on Gaussian process.
Proceedings of the 28th European Signal Processing Conference, 2020

Joint-Diagonalizability-Constrained Multichannel Nonnegative Matrix Factorization Based on Multivariate Complex Sub-Gaussian Distribution.
Proceedings of the 28th European Signal Processing Conference, 2020

Joint-Diagonalizability-Constrained Multichannel Nonnegative Matrix Factorization Based on Multivariate Complex Student's t-distribution.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019
Three-Dimensional Sound Field Reproduction Based on Weighted Mode-Matching Method.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Independent Deeply Learned Matrix Analysis for Determined Audio Source Separation.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Bilevel Optimization Using Stationary Point of Lower-Level Objective Function for Discriminative Basis Learning in Nonnegative Matrix Factorization.
IEEE Signal Process. Lett., 2019

Prosody Correction Preserving Speaker Individuality for Chinese-Accented Japanese HMM-Based Text-to-Speech Synthesis.
IEICE Trans. Inf. Syst., 2019

Independent Low-Rank Matrix Analysis Based on Generalized Kullback-Leibler Divergence.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2019

Vocoder-free text-to-speech synthesis incorporating generative adversarial networks using low-/multi-frequency STFT amplitude spectra.
Comput. Speech Lang., 2019

JVS corpus: free Japanese multi-speaker voice corpus.
CoRR, 2019

Two-Dimensional Sound Field Recording With Multiple Circular Microphone Arrays Considering Multiple Scattering.
Proceedings of the 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2019

TransVoice: Real-Time Voice Conversion for Augmenting Near-Field Speech Communication.
Adjunct Proceedings of the 32nd Annual ACM Symposium on User Interface Software and Technology, 2019

DNN-based Speaker Embedding Using Subjective Inter-speaker Similarity for Multi-speaker Modeling in Speech Synthesis.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

V2S attack: building DNN-based voice conversion from automatic speaker verification.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Subword tokenization based on DNN-based acoustic model for end-to-end prosody generation.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Implementation of DNN-based real-time voice conversion and its improvements by audio data augmentation and mask-shaped device.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Generative Moment Matching Network-based Random Modulation Post-filter for DNN-based Singing Voice Synthesis and Neural Double-tracking.
Proceedings of the IEEE International Conference on Acoustics, 2019

Robust Gridless Sound Field Decomposition Based on Structured Reciprocity Gap Functional in Spherical Harmonic Domain.
Proceedings of the IEEE International Conference on Acoustics, 2019

Feedforward Spatial Active Noise Control Based on Kernel Interpolation of Sound Field.
Proceedings of the IEEE International Conference on Acoustics, 2019

Efficient Full-Rank Spatial Covariance Estimation Using Independent Low-Rank Matrix Analysis for Blind Source Separation.
Proceedings of the 27th European Signal Processing Conference, 2019

Evaluation of Multichannel Hearing Aid System by Rank-Constrained Spatial Covariance Matrix Estimation.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Robust Demixing Filter Update Algorithm Based on Microphone-wise Coordinate Descent for Independent Deeply Learned Matrix Analysis.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Acceleration of rank-constrained spatial covariance matrix estimation for blind speech extraction.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Sparse Representation Using Multidimensional Mixed-Norm Penalty With Application to Sound Field Decomposition.
IEEE Trans. Signal Process., 2018

Statistical Parametric Speech Synthesis Incorporating Generative Adversarial Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

Sound Field Recording Using Distributed Microphones Based on Harmonic Analysis of Infinite Order.
IEEE Signal Process. Lett., 2018

Generalized independent low-rank matrix analysis using heavy-tailed distributions for blind source separation.
EURASIP J. Adv. Signal Process., 2018

CPJD Corpus: Crowdsourced Parallel Speech Corpus of Japanese Dialects.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Kernel Ridge Regression with Constraint of Helmholtz Equation for Sound Field Interpolation.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Phase Reconstruction from Amplitude Spectrograms Based on Von-Mises-Distribution Deep Neural Network.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Gridless Sound Field Decomposition Based on Reciprocity Gap Functional in Spherical Harmonic Domain.
Proceedings of the 10th IEEE Sensor Array and Multichannel Signal Processing Workshop, 2018

Sound Field Reproduction with Exterior Cancellation Using Analytical Weighting of Harmonic Coefficients.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Text-to-Speech Synthesis Using STFT Spectra Based on Low-/Multi-Resolution Generative Adversarial Networks.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Vectorwise Coordinate Descent Algorithm for Spatially Regularized Independent Low-Rank Matrix Analysis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Exterior and Interior Sound Field Separation Using Convex Optimization: Comparison of Signal Models.
Proceedings of the 26th European Signal Processing Conference, 2018

Independent Deeply Learned Matrix Analysis for Multichannel Audio Source Separation.
Proceedings of the 26th European Signal Processing Conference, 2018

Generative approach using the noise generation models for DNN-based speech synthesis trained from noisy speech.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Independent Low-Rank Matrix Analysis Based on Time-Variant Sub-Gaussian Source Model.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Prosody-aware subword embedding considering Japanese intonation systems and its application to DNN-based multi-dialect speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Low Latency and High Quality Two-Stage Human-Voice-Enhancement System for a Hose-Shaped Rescue Robot.
J. Robotics Mechatronics, 2017

Voice Conversion Using Input-to-Output Highway Networks.
IEICE Trans. Inf. Syst., 2017

JSUT corpus: free large-scale Japanese speech corpus for end-to-end speech synthesis.
CoRR, 2017

Independent low-rank matrix analysis based on complex Student's t-distribution for blind audio source separation.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017

Sampling-Based Speech Parameter Generation Using Moment-Matching Networks.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Voice Conversion Using Sequence-to-Sequence Learning of Context Posterior Probabilities.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Listening-area-informed sound field reproduction based on circular harmonic expansion.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Training algorithm to deceive Anti-Spoofing Verification for DNN-based speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Spatio-temporal sparse sound field decomposition considering acoustic source signal characteristics.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Blind source separation based on independent low-rank matrix analysis with sparse regularization for time-series activity.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Ego Noise Reduction for Hose-Shaped Rescue Robot Combining Independent Low-Rank Matrix Analysis and Multichannel Noise Cancellation.
Proceedings of the Latent Variable Analysis and Signal Separation, 2017

Listening-area-informed sound field reproduction with Gaussian prior based on circular harmonic expansion.
Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

Experimental analysis of optimal window length for independent low-rank matrix analysis.
Proceedings of the 25th European Signal Processing Conference, 2017

Independent low-rank matrix analysis based on parametric majorization-equalization algorithm.
Proceedings of the 2017 IEEE 7th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, 2017

The UTokyo speech synthesis system for Blizzard Challenge 2017.
Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017

Sound source localization using binaural difference for hose-shaped rescue robot.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Determined Blind Source Separation Unifying Independent Vector Analysis and Nonnegative Matrix Factorization.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Ego-noise reduction for a hose-shaped rescue robot using determined rank-1 multichannel nonnegative matrix factorization.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Discriminative and reconstructive basis training for audio source separation with semi-supervised nonnegative matrix factorization.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Noise reduction using independent vector analysis and noise cancellation for a hose-shaped rescue robot.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Semi-Supervised Joint Enhancement of Spectral and Cepstral Sequences of Noisy Speech.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Sparse sound field decomposition with multichannel extension of complex NMF.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Multichannel blind source separation based on non-negative tensor factorization in wavenumber domain.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Sound field decomposition in reverberant environment using sparse and low-rank signal models.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Music signal separation using supervised NMF with all-pole-model-based discriminative basis deformation.
Proceedings of the 24th European Signal Processing Conference, 2016

Reverberation-robust underdetermined source separation with non-negative tensor double deconvolution.
Proceedings of the 24th European Signal Processing Conference, 2016

Audio signal separation using supervised NMF with time-variant all-pole-model-based basis deformation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Multichannel Signal Separation Combining Directional Clustering and Nonnegative Matrix Factorization with Spectrogram Restoration.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Source-location-informed sound field recording and reproduction with spherical arrays.
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Statistical-model-based speech enhancement with musical-noise-free properties.
Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

Statistical modeling of binaural signal and its application to binaural source separation.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Structured sparse signal models and decomposition algorithm for super-resolution in sound field recording and reproduction.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Efficient multichannel nonnegative matrix factorization exploiting rank-1 spatial model.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Relaxation of rank-1 spatial constraint in overdetermined blind source separation.
Proceedings of the 23rd European Signal Processing Conference, 2015

Sparse sound field decomposition with parametric dictionary learning for super-resolution recording and reproduction.
Proceedings of the 6th IEEE International Workshop on Computational Advances in Multi-Sensor Adaptive Processing, 2015

Sparse sound field decomposition using group sparse Bayesian learning.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

2014
Alaryngeal Speech Enhancement Based on One-to-Many Eigenvoice Conversion.
IEEE ACM Trans. Audio Speech Lang. Process., 2014

Musical-noise-free blind speech extraction integrating microphone array and iterative spectral subtraction.
Signal Process., 2014

Music Signal Separation Based on Supervised Nonnegative Matrix Factorization with Orthogonality and Maximum-Divergence Penalties.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2014

Music signal separation based on Bayesian spectral amplitude estimator with automatic target prior adaptation.
Proceedings of the IEEE International Conference on Acoustics, 2014

Theoretical analysis of biased MMSE short-time spectral amplitude estimator and its extension to musical-noise-free speech enhancement.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Divergence optimization in nonnegative matrix factorization with spectrogram restoration for multichannel signal separation.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Optimized joint noise suppression and dereverberation based on blind signal extraction for hands-free speech recognition system.
Proceedings of the 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays, 2014

Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
Design of multichannel frequency domain statistical-based enhancement systems preserving spatial cues via spectral distances minimization.
Signal Process., 2013

Comparison of Methods for Topic Classification of Spoken Inquiries.
Inf. Media Technol., 2013

Robust music signal separation based on supervised nonnegative matrix factorization with prevention of basis sharing.
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2013

Information-geometric optimization for nonlinear noise reduction systems.
Proceedings of the International Symposium on Intelligent Signal Processing and Communication Systems, 2013

Musical noise analysis for Bayesian minimum mean-square error speech amplitude estimators based on higher-order statistics.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Music signal separation by supervised nonnegative matrix factorization with basis deformation.
Proceedings of the 18th International Conference on Digital Signal Processing, 2013

Superresolution-based stereo signal separation via supervised nonnegative matrix factorization.
Proceedings of the 18th International Conference on Digital Signal Processing, 2013

Toward musical-noise-free blind speech extraction: Concept and its applications.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Semi-blind algorithm for joint noise suppression and dereverberation based on higher-order statistics and acoustic model likelihood.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012
Musical-Noise-Free Speech Enhancement Based on Optimized Iterative Spectral Subtraction.
IEEE Trans. Speech Audio Process., 2012

Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech.
Speech Commun., 2012

Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2012

Theoretical Analysis of Amounts of Musical Noise and Speech Distortion in Structure-Generalized Parametric Blind Spatial Subtraction Array.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2012

Topic Classification of Spoken Inquiries Using Transductive Support Vector Machine.
Proceedings of the Natural Interaction with Robots, 2012

Evaluation of Invalid Input Discrimination Using Bag-of-Words for Speech-Oriented Guidance System.
Proceedings of the Natural Interaction with Robots, 2012

Development of a Toolkit Handling Multiple Speech-Oriented Guidance Agents for Mobile Applications.
Proceedings of the Natural Interaction with Robots, 2012

Musical-Noise-Free Blind Speech Extraction Using ICA-Based Noise Estimation with Channel Selection.
Proceedings of the IWAENC 2012 - International Workshop on Acoustic Signal Enhancement, Proceedings, RWTH Aachen University, Germany, September 4th, 2012

Theoretical Analysis of Musical Noise Generation in Noise Reduction Methods with Decision-Directed a Priori SNR Estimator.
Proceedings of the IWAENC 2012 - International Workshop on Acoustic Signal Enhancement, Proceedings, RWTH Aachen University, Germany, September 4th, 2012

Blind speech extraction for Non-Audible Murmur speech with speaker's movement noise.
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2012

Musical-noise-free blind speech extraction using ICA-based noise estimation and iterative spectral subtraction.
Proceedings of the 11th International Conference on Information Science, 2012

Spoken Inquiry Discrimination Using Bag-of-Words for Speech-Oriented Guidance System.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Evaluation of Many-to-Many Alignment Algorithm by Automatic Pronunciation Annotation Using Web Text Mining.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Statistical approach to voice quality control in esophageal speech enhancement.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Speech kurtosis estimation from observed noisy signal based on generalized Gaussian distribution prior and additivity of cumulants.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Musical-noise-free speech enhancement: Theory and evaluation.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Sound-localization-preserved binaural MMSE STSA estimator with explicit and implicit binaural cues.
Proceedings of the 20th European Signal Processing Conference, 2012

Object-based stereo up-mixer for wave field synthesis based on spatial information clustering.
Proceedings of the 20th European Signal Processing Conference, 2012

Theoretical analysis of musical noise in nonlinear noise reduction based on higher-order statistics.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Real-time semi-blind speech extraction with speaker direction tracking on Kinect.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Response generation based on statistical machine translation for speech-oriented guidance system.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Comparative study on various noise reduction methods with decision-directed a priori SNR estimator via higher-order statistics.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Optimization scheme of joint noise suppression and dereverberation based on higher-order statistics.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Musical Noise Controllable Algorithm of Channelwise Spectral Subtraction and Adaptive Beamforming Based on Higher Order Statistics.
IEEE Trans. Speech Audio Process., 2011

Theoretical Analysis of Musical Noise in Generalized Spectral Subtraction Based on Higher Order Statistics.
IEEE Trans. Speech Audio Process., 2011

Sound Field Reproduction by Wavefront Synthesis Using Directly Aligned Multi Point Control.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2011

Semi-blind speech extraction for robot using visual information and noise statistics.
Proceedings of the 2011 IEEE International Symposium on Signal Processing and Information Technology, 2011

Blind Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Theoretical Analysis of Musical Noise and Speech Distortion in Structure-Generalized Parametric Blind Spatial Subtraction Array.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Speaker-Adaptive Speech Synthesis Based on Eigenvoice Conversion and Language-Dependent Prosodic Conversion in Speech-to-Speech Translation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Automatic musical thumbnailing based on audio object localization and its evaluation.
Proceedings of the IEEE International Conference on Acoustics, 2011

Robust sound field reproduction integrating multi-point sound field control and wave field synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

Theoretical analysis of musical noise in Wiener filtering family via higher-order statistics.
Proceedings of the IEEE International Conference on Acoustics, 2011

An evaluation of alaryngeal speech enhancement methods based on voice conversion techniques.
Proceedings of the IEEE International Conference on Acoustics, 2011

Acoustic model training for non-audible murmur recognition using transformed normal speech data.
Proceedings of the IEEE International Conference on Acoustics, 2011

Efficient blind speech separation suitable for embedded devices.
Proceedings of the 19th European Signal Processing Conference, 2011

Blind noise suppression for Non-Audible Murmur recognition with stereo signal processing.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

2010
Improvements of the One-to-Many Eigenvoice Conversion System.
IEICE Trans. Inf. Syst., 2010

Adaptive Training for Voice Conversion Based on Eigenvoices.
IEICE Trans. Inf. Syst., 2010

Evaluation of Extremely Small Sound Source Signals Used in Speaking-Aid System with Statistical Voice Conversion.
IEICE Trans. Inf. Syst., 2010

Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models.
IEICE Trans. Inf. Syst., 2010

Musical-Noise Analysis in Methods of Integrating Microphone Array and Spectral Subtraction Based on Higher-Order Statistics.
EURASIP J. Adv. Signal Process., 2010

Linear transformation approaches to many-to-one voice conversion.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Improvement of speech recognition performance for spoken-oriented robot dialog system using end-fire array.
Proceedings of the 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2010

Comparison of methods for topic classification in a speech-oriented guidance system.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Adaptive voice-quality control based on one-to-many eigenvoice conversion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Close speaker cancellation for suppression of non-stationary background noise for hands-free speech interface.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Theoretical musical-noise analysis and its generalization for methods of integrating beamforming and spectral subtraction based on higher-order statistics.
Proceedings of the IEEE International Conference on Acoustics, 2010

MMSE STSA estimator with nonstationary noise estimation based on ICA for high-quality speech enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2010

Non-parallel training for many-to-many eigenvoice conversion.
Proceedings of the IEEE International Conference on Acoustics, 2010

Speech enhancement in presence of diffuse background noise: Why using blind signal extraction?
Proceedings of the IEEE International Conference on Acoustics, 2010

Complex Newton algorithm for blind signal extraction of speech in diffuse noise.
Proceedings of the IEEE International Conference on Acoustics, 2010

Statistical approach to enhancing esophageal speech based on Gaussian mixture models.
Proceedings of the IEEE International Conference on Acoustics, 2010

Blind Speech Extraction Combining Generalized MMSE STSA Estimator and ICA-Based Noise and Speech Probability Density Function Estimations.
Proceedings of the Latent Variable Analysis and Signal Separation, 2010

Theoretical analysis of musical noise in generalized spectral subtraction: Why should not use power/amplitude subtraction?
Proceedings of the 18th European Signal Processing Conference, 2010

Blind signal extraction based joint suppression of diffuse background noise and late reverberation.
Proceedings of the 18th European Signal Processing Conference, 2010

Musical noise controllable algorithm of channelwise spectral subtraction and beamforming based on higher-order statistics criterion.
Proceedings of the 2nd International Workshop on Cognitive Information Processing, 2010

2009
Blind Spatial Subtraction Array for Speech Enhancement in Noisy Environment.
IEEE Trans. Speech Audio Process., 2009

Techniques in rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics.
Speech Commun., 2009

Enhancement of speech signals separated from their convolutive mixture by FDICA algorithm.
Digit. Signal Process., 2009

Temporal quantization of spatial information using directional clustering for multichannel audio coding.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2009

Semi-blind suppression of internal noise for hands-free robot spoken dialog system.
Proceedings of the 2009 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2009

Many-to-many eigenvoice conversion with reference voice.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Electrolaryngeal speech enhancement based on statistical voice conversion.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Target Speech Enhancement in Presence of Jammer and Diffuse Background Noise.
Proceedings of the Independent Component Analysis and Signal Separation, 2009

Musical noise generation analysis for noise reduction methods based on spectral subtraction and MMSE STSA estimation.
Proceedings of the IEEE International Conference on Acoustics, 2009

Musical noise analysis based on higher order statistics for microphone array and nonlinear signal processing.
Proceedings of the IEEE International Conference on Acoustics, 2009

Source adaptive blind signal extraction using closed-form ICA for hands-free robot spoken dialogue system.
Proceedings of the IEEE International Conference on Acoustics, 2009

Hands-free speech recognition challenge for real-world speech dialogue systems.
Proceedings of the IEEE International Conference on Acoustics, 2009

Acoustic compensation methods for body transmitted speech conversion.
Proceedings of the IEEE International Conference on Acoustics, 2009

Kernel-based nonlinear independent component analysis for underdetermined blind source separation.
Proceedings of the IEEE International Conference on Acoustics, 2009

Multiple ICA-based real-time blind source extraction applied to handy size microphone.
Proceedings of the IEEE International Conference on Acoustics, 2009

Enhanced wiener post-processing based on partial projection back of the blind signal separation noise estimate.
Proceedings of the 17th European Signal Processing Conference, 2009

2008
Rapid Compensation of Temperature Fluctuation Effect for Multichannel Sound Field Reproduction System.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2008

Fast Convergence Blind Source Separation Using Frequency Subband Interpolation by Null Beamforming.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2008

Building an Effective Speech Corpus by Utilizing Statistical Multidimensional Scaling Method.
IEICE Trans. Inf. Syst., 2008

Cost Reduction of Acoustic Modeling for Real-Environment Applications Using Unsupervised and Selective Training.
IEICE Trans. Inf. Syst., 2008

Development, Long-Term Operation and Portability of a Real-Environment Speech-Oriented Guidance System.
IEICE Trans. Inf. Syst., 2008

Language model for the web search task in a spoken dialogue system for children.
Proceedings of the First Workshop on Child, Computer and Interaction, 2008

Real-time implementation of blind spatial subtraction array for hands-free robot spoken dialogue system.
Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008

An improved permutation solver for blind signal separation based front-ends in robot audition.
Proceedings of the 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2008

Maximum a posteriori adaptation for many-to-one eigenvoice conversion.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Question and answer database optimization using speech recognition results.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Development and evaluation of hands-free spoken dialogue system for railway station guidance.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speaker verification with non-audible murmur segments by combining global alignment kernel and penalized logistic regression machine.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

An improved one-to-many eigenvoice conversion system.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Evaluation of speaking-aid system with voice conversion for laryngectomees toward its use in practical environments.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Hybrid structure of inverse filtering and DOA-parameterized wavefront synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008

Source-oriented localization control of stereo audio signals based on blind source separation.
Proceedings of the IEEE International Conference on Acoustics, 2008

Distant talking robust speech recognition using late reflection components of room impulse response.
Proceedings of the IEEE International Conference on Acoustics, 2008

Frequency domain semi-blind signal separation: application to the rejection of internal noises.
Proceedings of the IEEE International Conference on Acoustics, 2008

Extension of score function difference for frequency domain blind source separation.
Proceedings of the 2008 16th European Signal Processing Conference, 2008

2007
Reducing Computation Time of the Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics.
IEICE Trans. Inf. Syst., 2007

Interface for Barge-in Free Spoken Dialogue System Based on Sound Field Reproduction and Microphone Array.
EURASIP J. Adv. Signal Process., 2007

Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor.
EURASIP J. Adv. Signal Process., 2007

An evaluation of many-to-one voice conversion algorithms with pre-stored speaker data sets.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Regression approaches to voice quality control based on one-to-many eigenvoice conversion.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Voice activity detection applied to hands-free spoken dialogue robot based on decoding using acoustic and language model.
Proceedings of the 1st International Conference on Robot Communication and Coordination, 2007

Robust spatial subtraction array with independent component analysis for speech enhancement.
Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007

Noise-robust hands-free speech recognition using SIMO-model-based blind source separation.
Proceedings of the 9th International Symposium on Signal Processing and Its Applications, 2007

Study on speaker verification with non-audible murmur segments.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Impact of various small sound source signals on voice conversion accuracy in speech communication aid for laryngectomees.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Development of preschool children subsystem for ASR and Q&A in a real-environment speech-oriented guidance task.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Permutation-Robust Structure for ICA-Based Blind Source Extraction.
Proceedings of the IEEE International Conference on Acoustics, 2007

Efficient Blind Source Separation Combining Closed-Form Second-Order ICA and Nonclosed-Form Higher-Order ICA.
Proceedings of the IEEE International Conference on Acoustics, 2007

High-Presence Hearing-Aid System using DSP-Based Real-Time Blind Source Separation Module.
Proceedings of the IEEE International Conference on Acoustics, 2007

Barge-in- and noise-free spoken dialogue interface based on sound field control and semi-blind source separation.
Proceedings of the 15th European Signal Processing Conference, 2007

Development and portability of ASR and Q&A modules for real-environment speech-oriented guidance systems.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

SIMO-Model-Based Blind Source Separation - Principle and its Applications.
Proceedings of the Blind Speech Separation, 2007

2006
Blind source separation based on a fast-convergence algorithm combining ICA and beamforming.
IEEE Trans. Speech Audio Process., 2006

Interface for Barge-in Free Spoken Dialogue System Using Nullspace Based Sound Field Control and Beamforming.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2006

Improving Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics in Noisy Environments Using Multi-Template Models.
IEICE Trans. Inf. Syst., 2006

Utterance-Based Selective Training for the Automatic Creation of Task-Dependent Acoustic Models.
IEICE Trans. Inf. Syst., 2006

Blind Separation of Acoustic Signals Combining SIMO-Model-Based Independent Component Analysis and Binary Masking.
EURASIP J. Adv. Signal Process., 2006

Transcription Cost Reduction for Constructing Acoustic Models Using Acoustic Likelihood Selection Criteria.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Speaker verification with non-audible murmur segments.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Acoustic modeling for spoken dialogue systems based on unsupervised utterance-based selective training.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Blind Source Separation Combining SIMO-ICA and SIMO-Model-Based Binary Masking.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Double-Talk Free Spoken Dialogue Interface Combining Sound Field Control With Semi-Blind Source Separation.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Improving Rapid Unsupervised Speaker Adaptation Based on HMM Sufficient Statistics.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

ICA and Binary-Mask-Based Blind Source Separation with Small Directional Microphones.
Proceedings of the Independent Component Analysis and Blind Signal Separation, 2006

Two-stage blind separation of moving sound sources with pocket-size real-time DSP module.
Proceedings of the 14th European Signal Processing Conference, 2006

2005
Estimation of Shape Parameter of GGD Function by Negentropy Matching.
Neural Process. Lett., 2005

Multistage SIMO-Model-Based Blind Source Separation Combining Frequency-Domain ICA and Time-Domain ICA.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

On-Line Relaxation Algorithm Applicable to Acoustic Fluctuation for Inverse Filter in Multichannel Sound Reproduction System.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

A Self-Generator Method for Initial Filters of SIMO-ICA Applied to Blind Separation of Binaural Sound Mixtures.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

Blind Separation and Deconvolution for Convolutive Mixture of Speech Combining SIMO-Model-Based ICA and Multichannel Inverse Filtering.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

Blind Separation of Speech by Fixed-Point ICA with Source Adaptive Negentropy Approximation.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

Interface for Barge-in Free Spoken Dialogue System Combining Adaptive Sound Field Control and Microphone Array.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

Subband-Based Blind Separation for Convolutive Mixtures of Speech.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

Designing Target Cost Function Based on Prosody of Speech Database.
IEICE Trans. Inf. Syst., 2005

Blind sound scene decomposition for robot audition using SIMO-model-based ICA.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005

Two-stage blind source separation based on ICA and binary masking for real-time robot audition system.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005

Noise-robust hands-free speech recognition based on spatial subtraction array and known noise superimposition.
Proceedings of the 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2005

Speech extraction in a car interior using frequency-domain ICA with rapid filter adaptations.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Applications of NAM microphones in speech recognition for privacy in human-machine communication.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Investigating the role of the Lombard reflex in non-audible murmur (NAM) recognition.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Speech Enhancement Based on Blind Source Separation in Car Environments.
Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

Blind source separation combining SIMO-model-based ICA and adaptive beamforming.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Blind separation of binaural sound mixtures using SIMO-ICA with self-generator for initial filter.
Proceedings of the 13th European Signal Processing Conference, 2005

Two-stage blind source separation combining SIMO-model-based ICA and adaptive beamforming.
Proceedings of the 13th European Signal Processing Conference, 2005

Blind separation of more than two sources based on high-convergence algorithm combining ICA and beamforming.
Proceedings of the 13th European Signal Processing Conference, 2005

Barge-in free spoken dialogue interface using nullspace-based sound field control and beamforming.
Proceedings of the 13th European Signal Processing Conference, 2005

A tissue-conductive acoustic sensor applied in speech recognition for privacy.
Proceedings of the 2005 joint conference on Smart objects and ambient intelligence, 2005

2004
Negentropy based voice-activity detection for noise estimation in very low SNR condition.
IEICE Electron. Express, 2004

Robots that can hear, understand and talk.
Adv. Robotics, 2004

Perceptual Evaluation of Quality Deterioration Owing to Prosody Modification.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

MAP estimation of speech spectral component under GGD a priori.
Proceedings of the ISCA Tutorial and Research Workshop on Statistical and Perceptual Audio Processing, 2004

Noise robust real world spoken dialogue system using GMM based rejection of unintended inputs.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Non-audible murmur (NAM) speech recognition using a stethoscopic NAM microphone.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Robust speech recognition with spectral subtraction in low SNR.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Interface for barge-in free spoken dialogue system using adaptive sound field control.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Multistage SIMO-model-based blind source separation combining frequency-domain ICA and time-domain ICA.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Blind separation of binaural sound mixtures using SIMO-model-based independent component analysis.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Public speech-oriented guidance system with adult and child discrimination capability.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Overdetermined blind separation for convolutive mixtures of speech based on multistage ICA using subarray processing.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Evaluation of Multistage SIMO-Model-Based Blind Source Separation Combining Frequency-Domain ICA and Time-Domain ICA.
Proceedings of the Independent Component Analysis and Blind Signal Separation, 2004

Single Channel Speech Enhancement: MAP Estimation Using GGD Prior Under Blind Setup.
Proceedings of the Independent Component Analysis and Blind Signal Separation, 2004

Stable and Low-Distortion Algorithm Based on Overdetermined Blind Separation for Convolutive Mixtures of Speech.
Proceedings of the Independent Component Analysis and Blind Signal Separation, 2004

Evaluation of blind separation and deconvolution for binaural-sound mixtures using SIMO-model-based ICA.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

On-line adaptive algorithm to acoustic fluctuation for inverse filter relaxation in sound reproduction system.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

Audible (normal) speech and inaudible murmur recognition using NAM microphone.
Proceedings of the 2004 12th European Signal Processing Conference, 2004

2003
The fundamental limitation of frequency domain blind source separation for convolutive mixtures of speech.
IEEE Trans. Speech Audio Process., 2003

Fast-Convergence Algorithm for Blind Source Separation Based on Array Signal Processing.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2003

Stable Learning Algorithm for Blind Separation of Temporally Correlated Acoustic Signals Combining Multistage ICA and Linear Prediction.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2003

Blind Source Separation of Acoustic Signals Based on Multistage ICA Combining Frequency-Domain ICA and Time-Domain ICA.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2003

Blind Source Separation Combining Independent Component Analysis and Beamforming.
EURASIP J. Adv. Signal Process., 2003

Equivalence between Frequency-Domain Blind Source Separation and Frequency-Domain Adaptive Beamforming for Convolutive Mixtures.
EURASIP J. Adv. Signal Process., 2003

Stable learning algorithm for low-distortion blind separation of real speech mixture combining multistage ICA and linear prediction.
Proceedings of the ITRW on Non-Linear Speech Processing, 2003

Blind separation and deconvolution of MIMO system driven by colored inputs using SIMO-model-based ICA with information-geometric learning.
Proceedings of the NNSP 2003, 2003

High-fidelity blind separation for convolutive mixture of acoustic signals using SIMO-model-based independent component analysis.
Proceedings of the Seventh International Symposium on Signal Processing and Its Applications, 2003

Blind separation and deconvolution for convolutive mixture of speech using SIMO-model-based ICA and multichannel inverse filtering.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Unsupervised speaker adaptation based on HMM sufficient statistics in various noisy environments.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Simple designing methods of corpus-based visual speech synthesis.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

GMM-based voice conversion applied to emotional speech synthesis.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Blind source separation based on binaural ICA.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Interface for barge-in free spoken dialogue system based on sound field control and microphone array.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Subband based blind source separation for convolutive mixtures of speech.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Sound Reproduction System Including Adaptive Compensation of Temperature Fluctuation Effect for Broad-Band Sound Control.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2002

Time domain blind source separation of non-stationary convolved signals by utilizing geometric beamforming.
Proceedings of the 12th IEEE Workshop on Neural Networks for Signal Processing, 2002

ASKA: receptionist robot with speech dialogue system.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, Switzerland, September 30, 2002

Spectral subtraction in noisy environments applied to speaker adaptation based on HMM sufficient statistics.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Speech enhancement in car environment using blind source separation.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Selective multi-path acoustic model based on database likelihoods.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Design and collection of acoustic sound data for hands-free speech recognition and sound scene understanding.
Proceedings of the 2002 IEEE International Conference on Multimedia and Expo, 2002

Sound reproduction system with adaptive compensation of temperature fluctuation effect.
Proceedings of the 14th International Conference on Digital Signal Processing, 2002

Blind source separation based on fast-convergence algorithm using ICA and beamforming for real convolutive mixture.
Proceedings of the IEEE International Conference on Acoustics, 2002

Blind source separation based on Multi-Stage ICA combining frequency-domain ICA and time-domain ICA.
Proceedings of the IEEE International Conference on Acoustics, 2002

Equivalence between frequency domain blind source separation and frequency domain adaptive beamforming.
Proceedings of the IEEE International Conference on Acoustics, 2002

Adaptive compensation of temperature fluctuation effect in sound reproduction system.
Proceedings of the 11th European Signal Processing Conference, 2002

Evaluation of fast-convergence algorithm for ICA-based blind source separation of real convolutive mixture.
Proceedings of the 11th European Signal Processing Conference, 2002

Comparison of time-domain ICA, frequency-domain ICA and multistage ICA for blind source separation.
Proceedings of the 11th European Signal Processing Conference, 2002

2001
Unsupervised noisy environment adaptation algorithm using MLLR and speaker selection.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

High quality voice conversion based on Gaussian mixture model with dynamic frequency warping.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Blind source separation for speech based on fast-convergence algorithm with ICA and beamforming.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Automatic n-gram language model creation from web resources.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Equivalence between frequency domain blind source separation and frequency domain adaptive null beamformers.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum.
Proceedings of the IEEE International Conference on Acoustics, 2001

Blind source separation combining frequency-domain ICA and beamforming.
Proceedings of the IEEE International Conference on Acoustics, 2001

Direction of arrival estimation based on nonlinear microphone array.
Proceedings of the IEEE International Conference on Acoustics, 2001

Fundamental limitation of frequency domain blind source separation for convolutive mixture of speech.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Straight-based voice conversion algorithm based on Gaussian mixture model.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Blind source separation based on subband ICA and beamforming.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Speech enhancement using nonlinear microphone array with noise adaptive complementary beamforming.
Proceedings of the IEEE International Conference on Acoustics, 2000

Evaluation of blind signal separation method using directivity pattern under reverberant conditions.
Proceedings of the IEEE International Conference on Acoustics, 2000

Speech enhancement based on noise adaptive nonlinear microphone array.
Proceedings of the 10th European Signal Processing Conference, 2000

1999
Speech enhancement using nonlinear microphone array under nonstationary noise conditions.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Speech enhancement using nonlinear microphone array with complementary beamforming.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

Compensating of room acoustic transfer functions affected by change of room temperature.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

