Tomoki Toda
Orcid: 0000-0001-8146-1279
According to our database, Tomoki Toda authored at least 429 papers between 2000 and 2024.
Bibliography
2024
Dual-Channel Target Speaker Extraction Based on Conditional Variational Autoencoder and Directional Information.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE Signal Process. Lett., 2024
Improved Architecture for High-resolution Piano Transcription to Efficiently Capture Acoustic Characteristics of Music Signals.
CoRR, 2024
Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions.
CoRR, 2024
Improvements of Discriminative Feature Space Training for Anomalous Sound Detection in Unlabeled Conditions.
CoRR, 2024
Quantifying the effect of speech pathology on automatic and human speaker verification.
CoRR, 2024
2DP-2MRC: 2-Dimensional Pointer-based Machine Reading Comprehension Method for Multimodal Moment Retrieval.
CoRR, 2024
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection.
CoRR, 2024
CoRR, 2024
Learning Multidimensional Disentangled Representations of Instrumental Sounds for Musical Similarity Assessment.
CoRR, 2024
Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment.
CoRR, 2024
Fast Neural Speech Waveform Generative Models With Fully-Connected Layer-Based Upsampling.
IEEE Access, 2024
An Investigation of Fundamental Frequency Pattern Prediction for Japanese Electrolaryngeal Speech Enhancement Based on Frame-Wise Phoneme Representations.
IEEE Access, 2024
Electrolaryngeal Speech Intelligibility Enhancement through Robust Linguistic Encoders.
Proceedings of the IEEE International Conference on Acoustics, 2024
ConvNeXt-TTS And ConvNeXt-VC: ConvNeXt-Based Fast End-To-End Sequence-To-Sequence Text-To-Speech And Voice Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2024
FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder With Trainable Finite Impulse Response Filter.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the 32nd European Signal Processing Conference, 2024
Proceedings of the 32nd European Signal Processing Conference, 2024
2023
High-Fidelity and Pitch-Controllable Neural Vocoder Based on Unified Source-Filter Networks.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Harmonic-Net: Fundamental Frequency and Speech Rate Controllable Fast Neural Vocoder.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
On the Effectiveness of ASR Representations in Real-world Noisy Speech Emotion Recognition.
CoRR, 2023
AAS-VC: On the Generalization Ability of Automatic Alignment Search based Non-autoregressive Sequence-to-sequence Voice Conversion.
CoRR, 2023
Directional Target Speaker Extraction under Noisy Underdetermined Conditions through Conditional Variational Autoencoder with Global Style Tokens.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023
Semi-supervised Multimodal Emotion Recognition with Consensus Decision-making and Label Correction.
Proceedings of the 1st International Workshop on Multimodal and Responsible Affective Computing, 2023
Sequence-to-Sequence Network Training Methods for Automatic Guitar Transcription With Tokenized Outputs.
Proceedings of the 24th International Society for Music Information Retrieval Conference, 2023
Analysis of Mean Opinion Scores in Subjective Evaluation of Synthetic Speech Based on Tail Probabilities.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Emotion Awareness in Multi-utterance Turn for Improving Emotion Prediction in Multi-Speaker Conversation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Preference-based training framework for automatic speech quality assessment using deep neural network.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Text-To-Speech Synthesis Based on Latent Variable Conversion Using Diffusion Probabilistic Model and Variational Autoencoder.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Intermediate Fine-Tuning Using Imperfect Synthetic Speech for Improving Electrolaryngeal Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Low-Latency Electrolaryngeal Speech Enhancement Based on FastSpeech2-Based Voice Conversion and Self-Supervised Speech Representation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Sound Field Interpolation with Unsupervised Calibration for Freely Spaced Circular Microphone Array in Rotation-Robust Beamforming.
Proceedings of the 31st European Signal Processing Conference, 2023
A Comparative Study of Voice Conversion Models With Large-Scale Speech and Singing Data: The T13 Systems for the Singing Voice Conversion Challenge 2023.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
ED-CEC: Improving Rare word Recognition Using ASR Postprocessing Based on Error Detection and Context-Aware Error Correction.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Improving Severity Preservation of Healthy-to-Pathological Voice Conversion With Global Style Tokens.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
The VoiceMOS Challenge 2023: Zero-Shot Subjective Speech Quality Prediction for Multiple Domains.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
An Analysis of Personalized Speech Recognition System Development for the Deaf and Hard-of-Hearing.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
2022
Speech Commun., 2022
Investigation of Japanese PnG BERT Language Model in Text-to-Speech Synthesis for Pitch Accent Language.
IEEE J. Sel. Top. Signal Process., 2022
IEEE J. Sel. Top. Signal Process., 2022
Music Similarity Calculation of Individual Instrumental Sounds Using Metric Learning.
CoRR, 2022
A Cyclical Approach to Synthetic and Natural Speech Mismatch Refinement of Neural Post-filter for Low-cost Text-to-speech System.
CoRR, 2022
Two-Stage Training Method for Japanese Electrolaryngeal Speech Enhancement Based on Sequence-to-Sequence Voice Conversion.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Spoken-Text-Style Transfer with Conditional Variational Autoencoder and Content Word Storage.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Investigating Self-supervised Pretraining Frameworks for Pathological Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
An Evaluation of Three-Stage Voice Conversion Framework for Noisy and Reverberant Conditions.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
An Investigation of Streaming Non-Autoregressive sequence-to-sequence Voice Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Modified Sound Field Interpolation Method for Rotation-robust Beamforming with Unequally Spaced Circular Microphone Array.
Proceedings of the 30th European Signal Processing Conference, 2022
Improvement of Serial Approach to Anomalous Sound Detection by Incorporating Two Binary Cross-Entropies for Outlier Exposure.
Proceedings of the 30th European Signal Processing Conference, 2022
Proceedings of the 30th European Signal Processing Conference, 2022
2021
Quasi-Periodic WaveNet: An Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Quasi-Periodic Parallel WaveGAN: A Non-Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE Access, 2021
Low-latency real-time non-parallel voice conversion based on cyclic variational autoencoder and multiband WaveRNN with data-driven linear prediction.
Proceedings of the 11th ISCA Speech Synthesis Workshop, 2021
Singing Fundamental Frequency Contour Generation Using Generalized Command-Response Model and Score-Conditional Variational Autoencoder.
Proceedings of the 2021 IEEE 31st International Workshop on Machine Learning for Signal Processing (MLSP), 2021
Unified Source-Filter GAN: Unified Source-Filter Network Based On Factorization of Quasi-Periodic Parallel WaveGAN.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
High-Fidelity and Low-Latency Universal Neural Vocoder Based on Multiband WaveRNN with Data-Driven Linear Prediction for Discrete Waveform Modeling.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
High-Intelligibility Speech Synthesis for Dysarthric Speakers with LPCNet-Based TTS and CycleVAE-Based VC.
Proceedings of the IEEE International Conference on Acoustics, 2021
Crank: An Open-Source Software for Nonparallel Voice Conversion Based on Vector-Quantized Variational Autoencoder.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 29th European Signal Processing Conference, 2021
An Ensemble Approach to Anomalous Sound Detection Based on Conformer-Based Autoencoder and Binary Classifier Incorporated with Metric Learning.
Proceedings of the 6th Workshop on Detection and Classification of Acoustic Scenes and Events 2021 (DCASE 2021), 2021
Mandarin Electrolaryngeal Speech Voice Conversion with Sequence-to-Sequence Modeling.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Mandarin Electro-Laryngeal Speech Enhancement based on Statistical Voice Conversion and Manual Tone Control.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Investigation of Text-to-Speech-based Synthetic Parallel Data for Sequence-to-Sequence Non-Parallel Voice Conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
2020
Customer Satisfaction Estimation in Contact Center Calls Based on a Hierarchical Multi-Task Model.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Any-to-One Sequence-to-Sequence Voice Conversion using Self-Supervised Discrete Speech Representations.
CoRR, 2020
Non-Parallel Voice Conversion System With WaveNet Vocoder and Collapsed Speech Suppression.
IEEE Access, 2020
A Cyclical Post-Filtering Approach to Mismatch Refinement of Neural Vocoder for Text-to-Speech Systems.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-Autoregressive Pitch-Dependent Dilated Convolution Model for Parametric Speech Generation.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Cyclic Spectral Modeling for Unsupervised Unit Discovery into Voice Conversion with Excitation and Waveform Modeling.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Semi-Supervised Self-Produced Speech Enhancement and Suppression Based on Joint Source Modeling of Air- and Body-Conducted Signals Using Variational Autoencoder.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Voice Transformer Network: Sequence-to-Sequence Voice Conversion Using Transformer with Text-to-Speech Pretraining.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Intelligibility Enhancement Based on Speech Waveform Modification Using Hearing Impairment.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Efficient Shallow WaveNet Vocoder Using Multiple Samples Output Based on Laplacian Distribution and Linear Prediction.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
ESPnet-TTS: Unified, Reproducible, and Integratable Open Source End-to-End Text-to-Speech Toolkit.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Semi-Supervised Enhancement and Suppression of Self-Produced Speech Using Correspondence between Air- and Body-Conducted Signals.
Proceedings of the 28th European Signal Processing Conference, 2020
Implementation of low-latency electrolaryngeal speech enhancement based on multi-task CLDNN.
Proceedings of the 28th European Signal Processing Conference, 2020
Conformer-Based Sound Event Detection with Semi-Supervised Learning and Data Augmentation.
Proceedings of the 5th Workshop on Detection and Classification of Acoustic Scenes and Events 2020 (DCASE 2020), 2020
Baseline System of Voice Conversion Challenge 2020 with Cyclic Variational Autoencoder and Parallel WaveGAN.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
The NU Voice Conversion System for the Voice Conversion Challenge 2020: On the Effectiveness of Sequence-to-sequence Models and Autoregressive Neural Vocoders.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Voice Conversion Challenge 2020 -- Intra-lingual semi-parallel and cross-lingual voice conversion --.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Cross-Lingual Voice Conversion using a Cyclic Variational Auto-encoder and a WaveNet Vocoder.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Phoneme Embeddings on Predicting Fundamental Frequency Pattern for Electrolaryngeal Speech.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
Speech-to-Singing Voice Conversion: The Challenges and Strategies for Improving Vocal Conversion Processes.
IEEE Signal Process. Mag., 2019
Voice Conversion With CycleRNN-Based Spectral Mapping and Finely Tuned WaveNet Vocoder.
IEEE Access, 2019
Underdetermined Source Separation Based on Generalized Multichannel Variational Autoencoder.
IEEE Access, 2019
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019
Generalization of Spectrum Differential based Direct Waveform Modification for Voice Conversion.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019
An Investigation of Features for Fundamental Frequency Pattern Prediction in Electrolaryngeal Speech Enhancement.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019
Improving Singing Aid System for Laryngectomees With Statistical Voice Conversion and VAE-SPACE.
Proceedings of the 20th International Society for Music Information Retrieval Conference, 2019
Quasi-Periodic WaveNet Vocoder: A Pitch Dependent Dilated Convolution Model for Parametric Speech Generation.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Robustness of Statistical Voice Conversion Based on Direct Waveform Modification Against Background Sounds.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Voice Conversion with Cyclic Recurrent Neural Network and Fine-tuned WaveNet Vocoder.
Proceedings of the IEEE International Conference on Acoustics, 2019
Investigations of Real-time Gaussian FFTNet and Parallel WaveNet Neural Vocoders with Simple Acoustic Features.
Proceedings of the IEEE International Conference on Acoustics, 2019
Scene-dependent Anomalous Acoustic-event Detection Based on Conditional WaveNet and I-vector.
Proceedings of the IEEE International Conference on Acoustics, 2019
Generalized Multichannel Variational Autoencoder for Underdetermined Source Separation.
Proceedings of the 27th European Signal Processing Conference, 2019
Proceedings of the 27th European Signal Processing Conference, 2019
Development of a Real-time Bionic Voice Generation System based on Statistical Excitation Prediction.
Proceedings of the 21st International ACM SIGACCESS Conference on Computers and Accessibility, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Tacotron-Based Acoustic Model Using Phoneme Alignment for Practical Neural Text-to-Speech Systems.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
2018
Intra-gender statistical singing voice conversion with direct waveform modification using log-spectral differential.
Speech Commun., 2018
Mach. Transl., 2018
Stereophonic Music Separation Based on Non-Negative Tensor Factorization with Cepstral Distance Regularization.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2018
Daily Activity Recognition with Large-Scaled Real-Life Recording Datasets Based on Deep Neural Network Using Multi-Modal Signals.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2018
Frequency domain variants of velvet noise and their application to speech processing and synthesis: with appendices.
CoRR, 2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Audio-visual Voice Conversion Using Deep Canonical Correlation Analysis for Deep Bottleneck Features.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Frequency Domain Variants of Velvet Noise and Their Application to Speech Processing and Synthesis.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Designing a Pneumatic Bionic Voice Prosthesis - A Statistical Approach for Source Excitation Generation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
An Investigation of Noise Shaping with Perceptual Weighting for WaveNet-Based Speech Generation.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
An Investigation of Subband WaveNet Vocoder Covering Entire Audible Frequency Range with Limited Acoustic Features.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Connectionist Temporal Classification-based Sound Event Encoder for Converting Sound Events into Onomatopoeic Representations.
Proceedings of the 26th European Signal Processing Conference, 2018
Electrolaryngeal Speech Enhancement with Statistical Voice Conversion based on CLDNN.
Proceedings of the 26th European Signal Processing Conference, 2018
Proceedings of the 26th European Signal Processing Conference, 2018
Development of "KamiRepo" system with automatic student identification to handle handwritten assignments on LMS.
Proceedings of the 2018 IEEE Global Engineering Education Conference, 2018
Self-Produced Speech Enhancement and Suppression Method using Air- and Body-Conductive Microphones.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
2017
Articulatory Controllable Speech Modification Based on Statistical Inversion and Production Mappings.
IEEE ACM Trans. Audio Speech Lang. Process., 2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
A Vibration Control Method of an Electrolarynx Based on Statistical F0 Pattern Prediction.
IEICE Trans. Inf. Syst., 2017
A modulation property of time-frequency derivatives of filtered phase and its application to aperiodicity and fo estimation.
CoRR, 2017
Missing component restoration for masked speech signals based on time-domain spectrogram factorization.
Proceedings of the 27th IEEE International Workshop on Machine Learning for Signal Processing, 2017
Physically Constrained Statistical F0 Prediction for Electrolaryngeal Speech Enhancement.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Speech Enhancement Using Non-Negative Spectrogram Models with Mel-Generalized Cepstral Regularization.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
A New Cosine Series Antialiasing Function and its Application to Aliasing-Free Glottal Source Models for Speech and Singing Synthesis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
A Modulation Property of Time-Frequency Derivatives of Filtered Phase and its Application to Aperiodicity and fo Estimation.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
A noise suppression method for body-conducted soft speech based on non-negative tensor factorization of air- and body-conducted signals.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
BLSTM-HMM hybrid system combined with sound activity detection network for polyphonic Sound Event Detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Stereophonic music separation based on non-negative tensor factorization with cepstrum regularization.
Proceedings of the 25th European Signal Processing Conference, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
An investigation of recurrent neural network for daily activity recognition using multi-modal signals.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
An investigation of how to design control parameters for statistical voice timbre control.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Accurate estimation of f0 and aperiodicity based on periodicity detector residuals and deviations of phase derivatives.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
ACM Trans. Interact. Intell. Syst., 2016
Anti-Spoofing for Text-Independent Speaker Verification: An Initial Database, Comparison of Countermeasures, and Human Performance.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Postfilters to Modify the Modulation Spectrum for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Speech Commun., 2016
A Statistical Sample-Based Approach to GMM-Based Voice Conversion Using Tied-Covariance Acoustic Models.
IEICE Trans. Inf. Syst., 2016
Non-Native Text-to-Speech Preserving Speaker Individuality Based on Partial Correction of Prosodic and Phonetic Characteristics.
IEICE Trans. Inf. Syst., 2016
Enhancing Event-Related Potentials Based on Maximum a Posteriori Estimation with a Spatial Correlation Prior.
IEICE Trans. Inf. Syst., 2016
Improvements of Voice Timbre Control Based on Perceived Age in Singing Voice Conversion.
IEICE Trans. Inf. Syst., 2016
Nonaudible murmur enhancement based on statistical voice conversion and noise suppression with external noise monitoring.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016
F0 transformation techniques for statistical voice conversion with direct waveform modification with spectral differential.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
Proceedings of the Dialogues with Social Robots, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Acoustic-to-Articulatory Inversion Mapping Based on Latent Trajectory Gaussian Mixture Model.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Model Integration for HMM- and DNN-Based Speech Synthesis Using Product-of-Experts Framework.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
A Hybrid System for Continuous Word-Level Emphasis Modeling Based on HMM State Clustering and Adaptive Training.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
An estimation method of voice timbre evaluation values using feature extraction with Gaussian mixture model based on reference singer.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Statistical F0 prediction for electrolaryngeal speech enhancement considering generative process of F0 contours within product of experts framework.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Noise suppression method for body-conducted soft speech enhancement based on external noise monitoring.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Implementation of F0 transformation for statistical singing voice conversion based on direct waveform modification.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Real-time vibration control of an electrolarynx based on statistical F0 contour prediction.
Proceedings of the 24th European Signal Processing Conference, 2016
Removing noise from event-related potentials using a probabilistic generative model with grouped covariance matrices.
Proceedings of the 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2016
Proceedings of the Workshop on Detection and Classification of Acoustic Scenes and Events, 2016
2015
Trans. Assoc. Comput. Linguistics, 2015
IEICE Trans. Inf. Syst., 2015
An Investigation of Machine Translation Evaluation Metrics in Cross-lingual Question Answering.
Proceedings of the Tenth Workshop on Statistical Machine Translation, 2015
Construction and analysis of social-affective interaction corpus in English and Indonesian.
Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015
Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015
Learning to Generate Pseudo-Code from Source Code Using Statistical Machine Translation (T).
Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering, 2015
Proceedings of the 30th IEEE/ACM International Conference on Automated Software Engineering, 2015
Improving translation of emphasis with pause prediction in speech-to-speech translation systems.
Proceedings of the 12th International Workshop on Spoken Language Translation: Papers, 2015
Proceedings of the 20th International Conference on Intelligent User Interfaces, 2015
Articulatory controllable speech modification based on Gaussian mixture models with direct waveform modification using spectrum differential.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Modulation spectrum-constrained trajectory training algorithm for HMM-based speech synthesis.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Non-audible murmur enhancement based on statistical conversion using air- and body-conductive microphones in noisy environments.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Non-native speech synthesis preserving speaker individuality based on partial correction of prosodic and phonetic characteristics.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Statistical singing voice conversion based on direct waveform modification with global variance.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Preserving word-level emphasis in speech-to-speech translation using linear regression HSMMs.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Combination of two-dimensional cochleogram and spectrogram features for deep learning-based ASR.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Modulation spectrum-constrained trajectory training algorithm for GMM-based Voice Conversion.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Parameter generation algorithm considering Modulation Spectrum for HMM-based speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
EEG signal enhancement using multi-channel Wiener filter with a spatial correlation prior.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
An evaluation of EEG ocular artifact removal with a multi-channel Wiener filter based on probabilistic generative model.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015
Proceedings of the Blizzard Challenge 2015, 2015
An Enhanced Electrolarynx with Automatic Fundamental Frequency Control based on Statistical Prediction.
Proceedings of the 17th International ACM SIGACCESS Conference on Computers & Accessibility, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
A study of social-affective communication: Automatic prediction of emotion triggers and responses in television talk shows.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
The NAIST ASR system for the 2015 Multi-Genre Broadcast challenge: On combination of deep learning systems using a rank-score function.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
Aliasing-free implementation of discrete-time glottal source models and their applications to speech synthesis and F0 extractor evaluation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015
Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents.
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015
Proceedings of the Natural Language Dialog Systems and Intelligent Assistants, 2015
Proceedings of the Natural Language Dialog Systems and Intelligent Assistants, 2015
Proceedings of the Natural Language Dialog Systems and Intelligent Assistants, 2015
Proceedings of the Natural Language Dialog Systems and Intelligent Assistants, 2015
Proceedings of the Natural Language Dialog Systems and Intelligent Assistants, 2015
2014
IEEE ACM Trans. Audio Speech Lang. Process., 2014
Parameter Generation Methods With Rich Context Models for High-Quality and Flexible Text-To-Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014
A Hybrid Approach to Electrolaryngeal Speech Enhancement Based on Noise Reduction and Statistical Excitation Generation.
IEICE Trans. Inf. Syst., 2014
Utilizing Human-to-Human Conversation Examples for a Multi Domain Chat-Oriented Dialog System.
IEICE Trans. Inf. Syst., 2014
Structured Adaptive Regularization of Weight Vectors for a Robust Grapheme-to-Phoneme Conversion Model.
IEICE Trans. Inf. Syst., 2014
IEICE Trans. Inf. Syst., 2014
Proceedings of SSST@EMNLP 2014, 2014
Improving the robustness of example-based dialog retrieval using recursive neural network paraphrase identification.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and A Network-based ASR System.
Proceedings of the Ninth International Conference on Language Resources and Evaluation, 2014
Proceedings of the Situated Dialog in Speech-Based Human-Computer Interaction, 2014
Proceedings of the Situated Dialog in Speech-Based Human-Computer Interaction, 2014
Articulatory controllable speech modification based on statistical feature mapping with Gaussian mixture models.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Direct F0 control of an electrolarynx based on statistical excitation feature prediction and its evaluation through simulation.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Data-driven generation of text balloons based on linguistic and acoustic features of a comics-anime corpus.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Structured soft margin confidence weighted learning for grapheme-to-phoneme conversion.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Statistical singing voice conversion with direct waveform modification based on the spectrum differential.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Excitation source analysis for high-quality speech manipulation systems based on an interference-free representation of group delay with minimum phase response compensation.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
A hearing impairment simulation method using audiogram-based approximation of auditory characteristics.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
An evaluation of excitation feature prediction in a hybrid approach to electrolaryngeal speech enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014
Proceedings of the 2014 IEEE Global Conference on Signal and Information Processing, 2014
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, 2014
Proceedings of the COLING 2014, 2014
Proceedings of the COLING 2014, 2014
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
An evaluation of target speech for a nonaudible murmur enhancement system in noisy environments.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
An inter-speaker evaluation through simulation of electrolarynx control based on statistical F0 prediction.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Recursive neural network paraphrase identification for example-based dialog retrieval.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Gender-dependent spectrum differential models for perceived age control based on direct waveform modification in singing voice conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Excitation source design for high-quality speech manipulation systems based on a temporally static group delay representation of periodic signals.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, 2014
Linguistic and Acoustic Features for Automatic Identification of Autism Spectrum Disorders in Children's Narrative.
Proceedings of the Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality, 2014
2013
Investigation of intra-speaker spectral parameter variation and its prediction towards improvement of spectral conversion metric.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013
Proceedings of the 10th International Workshop on Spoken Language Translation: Papers, 2013
Proceedings of the 10th International Workshop on Spoken Language Translation: Evaluation Campaign@IWSLT 2013, 2013
A hybrid approach to electrolaryngeal speech enhancement based on spectral subtraction and statistical voice conversion.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Improvements to HMM-based speech synthesis based on parameter generation with rich context models.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
A digital signal processor implementation of silent/electrolaryngeal speech enhancement based on real-time statistical voice conversion.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
An investigation of acoustic features for singing voice conversion based on perceptual age.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Beyond bandlimited sampling of speech spectral envelope imposed by the harmonic structure of voiced sounds.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Simple, lexicalized choice of translation timing for simultaneous speech translation.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Evaluation of a singing voice conversion method based on many-to-many eigenvoice conversion.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Modality and contextual differences in computer based non-verbal communication training.
Proceedings of the IEEE 4th International Conference on Cognitive Infocommunications, 2013
Proceedings of the Working Notes for CLEF 2013 Conference, 2013
Inter-Sentence Features and Thresholded Minimum Error Rate Training: NAIST at CLEF 2013 QA4MRE.
Proceedings of the Working Notes for CLEF 2013 Conference, 2013
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013
Proceedings of the First Workshop on Natural Language Processing for Medical and Healthcare Fields@IJCNLP 2013, 2013
2012
Statistical Voice Conversion Techniques for Body-Conducted Unvoiced Speech Enhancement.
IEEE Trans. Speech Audio Process., 2012
Speech Commun., 2012
J. Intell. Robotic Syst., 2012
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012
Proceedings of the 2012 International Workshop on Spoken Language Translation, 2012
Proceedings of the Natural Interaction with Robots, 2012
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
An Evaluation of Parameter Generation Methods with Rich Context Models in HMM-Based Speech Synthesis.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Proceedings of the IEEE 3rd International Conference on Cognitive Infocommunications, 2012
Singing voice conversion method based on many-to-many eigenvoice conversion and training data generation using a singing-to-singing synthesis system.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012
2011
Speaker-Adaptive Speech Synthesis Based on Eigenvoice Conversion and Language-Dependent Prosodic Conversion in Speech-to-Speech Translation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
An evaluation of alaryngeal speech enhancement methods based on voice conversion techniques.
Proceedings of the IEEE International Conference on Acoustics, 2011
Acoustic model training for non-audible murmur recognition using transformed normal speech data.
Proceedings of the IEEE International Conference on Acoustics, 2011
Blind noise suppression for Non-Audible Murmur recognition with stereo signal processing.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011
2010
IEEE Trans. Speech Audio Process., 2010
Speech Commun., 2010
IEICE Trans. Inf. Syst., 2010
IEICE Trans. Inf. Syst., 2010
Evaluation of Extremely Small Sound Source Signals Used in Speaking-Aid System with Statistical Voice Conversion.
IEICE Trans. Inf. Syst., 2010
Esophageal Speech Enhancement Based on Statistical Voice Conversion with Gaussian Mixture Models.
IEICE Trans. Inf. Syst., 2010
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
The use of air-pressure sensor in electrolaryngeal speech enhancement based on statistical voice conversion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Statistical approach to enhancing esophageal speech based on Gaussian mixture models.
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010
2009
IEEE Trans. Speech Audio Process., 2009
Techniques in rapid unsupervised speaker adaptation based on HMM-Sufficient Statistics.
Speech Commun., 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Technologies for processing body-conducted speech detected with non-audible murmur microphone.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
The NICT Entry for the Blizzard Challenge 2009: an Enhanced HMM-based Speech Synthesis System with Trajectory Training considering Global Variance and State-Dependent Mixed Excitation.
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009
2008
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model.
Speech Commun., 2008
IEICE Trans. Inf. Syst., 2008
Building an Effective Speech Corpus by Utilizing Statistical Multidimensional Scaling Method.
IEICE Trans. Inf. Syst., 2008
Cost Reduction of Acoustic Modeling for Real-Environment Applications Using Unsupervised and Selective Training.
IEICE Trans. Inf. Syst., 2008
Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTS Conversion Systems.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Evaluation of speaking-aid system with voice conversion for laryngectomees toward its use in practical environments.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Low-delay voice conversion based on maximum likelihood estimation of spectral parameter trajectory.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Performance evaluation of the speaker-independent HMM-based speech synthesis system "HTS 2007" for the Blizzard Challenge 2007.
Proceedings of the IEEE International Conference on Acoustics, 2008
Statistical approach to vocal tract transfer function estimation based on factor analyzed trajectory HMM.
Proceedings of the IEEE International Conference on Acoustics, 2008
On the state definition for a trainable excitation model in HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008
The HTS-2008 System: Yet Another Evaluation of the Speaker-Adaptive HMM-based Speech Synthesis System in The 2008 Blizzard Challenge.
Proceedings of the Blizzard Challenge 2008, 2008
Proceedings of the Blizzard Challenge 2008, 2008
2007
Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory.
IEEE Trans. Speech Audio Process., 2007
Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005.
IEICE Trans. Inf. Syst., 2007
A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2007
Reducing Computation Time of the Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics.
IEICE Trans. Inf. Syst., 2007
Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
An evaluation of many-to-one voice conversion algorithms with pre-stored speaker data sets.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Regression approaches to voice quality control based on one-to-many eigenvoice conversion.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Speaker adaptive training for one-to-many eigenvoice conversion based on Gaussian mixture model.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Impact of various small sound source signals on voice conversion accuracy in speech communication aid for laryngectomees.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Development of preschool children subsystem for ASR and Q&A in a real-environment speech-oriented guidance task.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Speaker-independent HMM-based speech synthesis system - HTS-2007 system for the Blizzard Challenge 2007.
Proceedings of the Evaluation of text-to-speech systems: Blizzard Challenge 2007, 2007
Proceedings of the Evaluation of text-to-speech systems: Blizzard Challenge 2007, 2007
2006
An evaluation of cost functions sensitively capturing local degradation of naturalness for segment selection in concatenative speech synthesis.
Speech Commun., 2006
Improving Rapid Unsupervised Speaker Adaptation Based on HMM-Sufficient Statistics in Noisy Environments Using Multi-Template Models.
IEICE Trans. Inf. Syst., 2006
Utterance-Based Selective Training for the Automatic Creation of Task-Dependent Acoustic Models.
IEICE Trans. Inf. Syst., 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Speaking aid system for total laryngectomees using voice conversion of body transmitted artificial speech.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Acoustic modeling for spoken dialogue systems based on unsupervised utterance-based selective training.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
On the Use of Phonetic Information for Mapping from Articulatory Movements to Vocal Tract Spectrum.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Developing a Test Bed of English Text-to-Speech System XIMERA for the Blizzard Challenge 2006.
Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006, 2006
2005
IEICE Trans. Inf. Syst., 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
2004
Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis.
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Optimizing sub-cost functions for segment selection based on perceptual evaluations in concatenative speech synthesis.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
2003
Optimizing integrated cost function for segment selection in concatenative speech synthesis based on perceptual evaluations.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Segment selection considering local degradation of naturalness in concatenative speech synthesis.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
2002
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002
Evaluation of cross-language voice conversion using bilingual and non-bilingual databases.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Designing Japanese speech database covering wide range in prosody for hybrid speech synthesizer.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
Unit selection algorithm for Japanese speech synthesis based on both phoneme unit and diphone unit.
Proceedings of the IEEE International Conference on Acoustics, 2002
2001
High quality voice conversion based on Gaussian mixture model with dynamic frequency warping.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001
Voice conversion algorithm based on Gaussian mixture model with dynamic frequency warping of STRAIGHT spectrum.
Proceedings of the IEEE International Conference on Acoustics, 2001
2000
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000