Keiichi Tokuda

Orcid: 0000-0001-6143-0133

According to our database, Keiichi Tokuda authored at least 273 papers between 1990 and 2024.

Awards

IEEE Fellow 2014, "For contributions to hidden Markov model-based speech synthesis".

Bibliography

2024
LIMMITS'24: Multi-Speaker, Multi-Lingual Indic TTS with Voice Cloning.
Proceedings of the IEEE International Conference on Acoustics, 2024

PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Singing voice synthesis based on frame-level sequence-to-sequence models considering vocal timing deviation.
CoRR, 2023

SPTK4: An Open-Source Software Toolkit for Speech Signal Processing.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023

Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System.
Proceedings of the IEEE International Conference on Acoustics, 2023

Lightweight, Multi-Speaker, Multi-Lingual Indic Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023

Singing Voice Synthesis Based on a Musical Note Position-Aware Attention Mechanism.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Autoregressive Variational Autoencoder with a Hidden Semi-Markov Model-Based Structured Attention for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

PeriodNet: A Non-Autoregressive Raw Waveform Generative Model With a Structure Separating Periodic and Aperiodic Components.
IEEE Access, 2021

PeriodNet: A Non-Autoregressive Waveform Generation Model with a Structure Separating Periodic and Aperiodic Components.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Fast and High-Quality Singing Voice Synthesis System Based on Convolutional Neural Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Semi-Supervised Learning Based on Hierarchical Generative Models for End-to-End Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Singing voice synthesis based on convolutional neural networks.
CoRR, 2019

Low computational cost speech synthesis based on deep neural networks using hidden semi-Markov model structures.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Deep neural network based real-time speech vocoder with periodic and aperiodic inputs.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Impacts of input linguistic feature representation on Japanese end-to-end speech synthesis.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Statistical Approach to Speech Synthesis: Past, Present and Future.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speaker-dependent WaveNet-based Delay-free ADPCM Speech Coding.
Proceedings of the IEEE International Conference on Acoustics, 2019

Singing Voice Synthesis Based on Generative Adversarial Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Mel-Cepstrum-Based Quantization Noise Shaping Applied to Neural-Network-Based Speech Waveform Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

WaveNet-Based Zero-Delay Lossless Speech Coding.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Statistical Voice Conversion Based on WaveNet.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Image Recognition Based on Separable Lattice HMMs Using a Deep Neural Network for Output Probability Distributions.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

The NITech text-to-speech system for the Blizzard Challenge 2018.
Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018

Discriminative Feature Extraction Based on Sequential Variational Autoencoder for Speaker Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Singing Voice Conversion Using Posted Waveform Data on Music Social Media.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Speaker Adaptation for Speech Synthesis Based on Deep Neural Networks Using Hidden Semi-Markov Model Structures.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Image Recognition Based on Convolutional Neural Networks Using Features Generated from Separable Lattice Hidden Markov Models.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Recent Development of the DNN-based Singing Voice Synthesis System - Sinsy.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Speech Synthesis Using WaveNet Vocoder Based on Periodic/Aperiodic Decomposition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Simultaneous Optimization of Multiple Tree-Based Factor Analyzed HMM for Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Articulatory Text-to-Speech Synthesis Using the Digital Waveguide Mesh Driven by a Deep Neural Network.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Image recognition based on discriminative models using features generated from separable lattice HMMs.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

The NITech text-to-speech system for the Blizzard Challenge 2017.
Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017

The Blizzard machine learning challenge 2017.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Generalization of Thai tone contour in HMM-based speech synthesis.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

User Generated Dialogue Systems: uDialogue.
Proceedings of the Human-Harmonized Information Technology, Volume 2, 2017

2016
A Bayesian Approach to Image Recognition Based on Separable Lattice Hidden Markov Models.
IEICE Trans. Inf. Syst., 2016

Temporal modeling in neural network based statistical parametric speech synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

A Hierarchical Predictor of Synthetic Speech Naturalness Using Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Singing Voice Synthesis Based on Deep Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Voice Conversion Based on Trajectory Model Training of Neural Networks Considering Global Variance.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Redefining the Linguistic Context Feature Set for HMM and DNN TTS Through Position and Parsing.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Directly modeling voiced and unvoiced components in speech waveforms by neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Trajectory training considering global variance for speech synthesis based on neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

The NITech text-to-speech system for the Blizzard Challenge 2016.
Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016, 2016

2015
Using speaker adaptive training to realize Mandarin-Tibetan cross-lingual speech synthesis.
Multim. Tools Appl., 2015

Simultaneous optimization of multiple tree structures for factor analyzed HMM-based speech synthesis.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

The effect of neural networks in statistical parametric speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

The NITECH HMM-based text-to-speech system for the Blizzard Challenge 2015.
Proceedings of the Blizzard Challenge 2015, 2015


2014
Introduction to the Issue on Statistical Parametric Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014

Contextual Additive Structure for HMM-Based Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014

Image Recognition Based on Separable Lattice Trajectory 2-D HMMs.
IEICE Trans. Inf. Syst., 2014

Integration of Spectral Feature Extraction and Modeling for HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2014

A mel-cepstral analysis technique restoring high frequency components from low-sampling-rate speech.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014

HMM-Based singing voice synthesis and its application to Japanese and English.
Proceedings of the IEEE International Conference on Acoustics, 2014

Voice interaction system with 3D-CG virtual agent for stand-alone smartphones.
Proceedings of the second international conference on Human-agent interaction, 2014

Overview of NITECH HMM-based text-to-speech system for Blizzard Challenge 2014.
Proceedings of the Blizzard Challenge 2014, Singapore, Singapore, September 19, 2014, 2014

The Blizzard Challenge 2014.
Proceedings of the Blizzard Challenge 2014, Singapore, Singapore, September 19, 2014, 2014

2013
Speech Synthesis Based on Hidden Markov Models.
Proc. IEEE, 2013

A Bayesian Framework Using Multiple Model Structures for Speech Recognition.
IEICE Trans. Inf. Syst., 2013

Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis.
Comput. Speech Lang., 2013

Cross-lingual speaker adaptation based on factor analysis using bilingual speech data for HMM-based speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Contextual partial additive structure for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

Integration of acoustic modeling and mel-cepstral analysis for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

Separable lattice 2-D HMMs introducing state duration control for recognition of images with various variations.
Proceedings of the IEEE International Conference on Acoustics, 2013

MMDAgent - A fully open-source toolkit for voice interaction systems.
Proceedings of the IEEE International Conference on Acoustics, 2013

Overview of NITECH HMM-based speech synthesis system for Blizzard Challenge 2013.
Proceedings of the Blizzard Challenge 2013, 2013

Realizing Tibetan speech synthesis by speaker adaptive training.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Image recognition based on hidden Markov eigen-image models using variational Bayesian method.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012
Product of Experts for Statistical Parametric Speech Synthesis.
IEEE Trans. Speech Audio Process., 2012

Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping.
Speech Commun., 2012

Impacts of machine translation and speech synthesis on speech-to-speech translation.
Speech Commun., 2012

An Extension of Separable Lattice 2-D HMMs for Rotational Data Variations.
IEICE Trans. Inf. Syst., 2012

Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A model structure integration based on a Bayesian framework for speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Face recognition based on separable lattice 2-D HMMs using variational Bayesian method.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Pitch adaptive training for HMM-based singing voice synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Face recognition based on extended separable lattice 2-D HMMs.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2012.
Proceedings of the Blizzard Challenge 2012, Portland, OR, USA, September 14, 2012, 2012

2011
Continuous Stochastic Feature Mapping Based on Trajectory HMMs.
IEEE Trans. Speech Audio Process., 2011

Bayesian Context Clustering Using Cross Validation for Speech Recognition.
IEICE Trans. Inf. Syst., 2011

Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

GMM-Based Missing-Feature Reconstruction on Multi-Frame Windows.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Large-Scale Subjective Evaluations of Speech Rate Control Methods for HMM-Based Speech Synthesizers.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

An optimization algorithm of independent mean and variance parameter tying structures for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

Global variance modeling on frequency domain delta LSP for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

An analysis of machine translation and speech synthesis in speech-to-speech translation system.
Proceedings of the IEEE International Conference on Acoustics, 2011

Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2011.
Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011, 2011

2010
Thousands of Voices for HMM-Based Speech Synthesis - Analysis and Application of TTS Systems Built on Various ASR Corpora.
IEEE Trans. Speech Audio Process., 2010

A Covariance-Tying Technique for HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2010

Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Spectral modeling with contextual additive structure for HMM-based speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Recent development of the HMM-based singing voice synthesis system - Sinsy.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Bayesian speech synthesis framework integrating training and synthesis processes.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Voice activity detection based on conditional random fields using multiple features.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

HMM-based singing voice synthesis system using pitch-shifted pseudo training data.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Speaker adaptation based on nonlinear spectral transform for speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Statistical parametric speech synthesis based on product of experts.
Proceedings of the IEEE International Conference on Acoustics, 2010

Face recognition based on separable lattice 2-D HMM with state duration modeling.
Proceedings of the IEEE International Conference on Acoustics, 2010

Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

Factor analyzed voice models for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

A Deterministic Annealing-Based Training Algorithm For Statistical Machine Translation Models.
Proceedings of the 14th Annual conference of the European Association for Machine Translation, 2010

NICT Blizzard Challenge 2010 Entry.
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010

Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2010.
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010

2009
Robust Speaker-Adaptive HMM-Based Text-to-Speech Synthesis.
IEEE Trans. Speech Audio Process., 2009

Statistical parametric speech synthesis.
Speech Commun., 2009

A Reordering Model Using a Source-Side Parse-Tree for Statistical Machine Translation.
IEICE Trans. Inf. Syst., 2009

Reordering Model Using Syntactic Information of a Source Tree for Statistical Machine Translation.
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation, 2009

Thousands of voices for HMM-based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

An improved minimum generation error based model adaptation for HMM-based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Deterministic annealing based training algorithm for Bayesian speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systems.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A decision tree-based clustering approach to state definition in an excitation modeling framework for HMM-based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A Bayesian approach to Hidden Semi-Markov Model based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Stereo-based stochastic noise compensation based on trajectory GMMs.
Proceedings of the IEEE International Conference on Acoustics, 2009

Voice conversion based on simultaneous modelling of spectrum and F0.
Proceedings of the IEEE International Conference on Acoustics, 2009

Minimum generation error training by using original spectrum as reference for log spectral distortion measure.
Proceedings of the IEEE International Conference on Acoustics, 2009

Full covariance state duration modeling for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2009

A Bayesian approach to HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2009

Overview of NIT HMM-based speech synthesis system for Blizzard Challenge 2009.
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009

The NICT Entry for the Blizzard Challenge 2009: an Enhanced HMM-based Speech Synthesis System with Trajectory Training considering Global Variance and State-Dependent Mixed Excitation.
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009

2008
Statistical mapping between articulatory movements and acoustic spectrum using a Gaussian mixture model.
Speech Commun., 2008

The Nitech-NAIST HMM-Based Speech Synthesis System for the Blizzard Challenge 2006.
IEICE Trans. Inf. Syst., 2008

A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System.
IEICE Trans. Inf. Syst., 2008

Cross-Lingual Speaker Adaptation for HMM-Based Speech Synthesis.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTS Conversion Systems.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Probabilistic feature mapping based on trajectory HMMs.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Probabilistic answer selection based on conditional random fields for spoken dialog system.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Minimum generation error training with direct log spectral distortion on LSPs for HMM-based speech synthesis.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Acoustic modeling based on model structure annealing for speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Unsupervised adaptation for HMM-based speech synthesis.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speaker recognition based on variational Bayesian method.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Bayesian context clustering using cross valid prior distribution for HMM-based speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Performance evaluation of the speaker-independent HMM-based speech synthesis system "HTS 2007" for the Blizzard Challenge 2007.
Proceedings of the IEEE International Conference on Acoustics, 2008

Statistical approach to vocal tract transfer function estimation based on factor analyzed trajectory HMM.
Proceedings of the IEEE International Conference on Acoustics, 2008

Acoustic modeling with contextual additive structure for HMM-based speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

On the state definition for a trainable excitation model in HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008

The HTS-2008 System: Yet Another Evaluation of the Speaker-Adaptive HMM-based Speech Synthesis System in The 2008 Blizzard Challenge.
Proceedings of the Blizzard Challenge 2008, 2008

The NICT/ATR speech synthesis system for the Blizzard Challenge 2008.
Proceedings of the Blizzard Challenge 2008, 2008

2007
Voice Conversion Based on Maximum-Likelihood Estimation of Spectral Parameter Trajectory.
IEEE Trans. Speech Audio Process., 2007

Vector quantization of mel-cepstral coefficients using distortion measure for spectral analysis.
Syst. Comput. Jpn., 2007

Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005.
IEICE Trans. Inf. Syst., 2007

A Hidden Semi-Markov Model-Based Speech Synthesis System.
IEICE Trans. Inf. Syst., 2007

State Duration Modeling for HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2007

A Speech Parameter Generation Algorithm Considering Global Variance for HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2007

Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences.
Comput. Speech Lang., 2007

The HMM-based speech synthesis system (HTS) version 2.0.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Communicative speech synthesis with XIMERA: a first step.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Spectral conversion based on statistical models including time-sequence matching.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

An excitation model for HMM-based speech synthesis based on residual modeling.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Model-space MLLR for trajectory HMMs.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

A trainable excitation model for HMM-based speech synthesis.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Face Recognition using Hidden Markov Eigenface Models.
Proceedings of the IEEE International Conference on Acoustics, 2007

Speaker-independent HMM-based speech synthesis system - HTS-2007 system for the Blizzard Challenge 2007.
Proceedings of the Evaluation of text-to-speech systems: Blizzard Challenge 2007, 2007

ATRECSS - ATR English speech corpus for speech synthesis.
Proceedings of the Evaluation of text-to-speech systems: Blizzard Challenge 2007, 2007

2006
Very low bit rate speech coding based on HMM with speaker adaptation.
Syst. Comput. Jpn., 2006

An HMM-Based Approach to Flexible Speech Synthesis.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Speaker adaptation of trajectory HMMs using feature-space MLLR.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Voice conversion based on mixtures of factor analyzers.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

An HMM-based singing voice synthesis system.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Reducing computation on parallel decoding using frame-wise confidence scores.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Estimating Trajectory HMM Parameters Using Monte Carlo EM With Gibbs Sampler.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Hidden Semi-Markov Model Based Speech Recognition System using Weighted Finite-State Transducer.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

On the Use of Phonetic Information for Mapping from Articulatory Movements to Vocal Tract Spectrum.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Face Recognition Based on Separable Lattice HMMs.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Developing a Test Bed of English Text-to-Speech System XIMERA for the Blizzard Challenge 2006.
Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006, 2006

2005
Simultaneous clustering of phonetic context, dimension, and state position for acoustic modeling using decision trees.
Syst. Comput. Jpn., 2005

Incorporating a mixed excitation model and postfilter into HMM-based text-to-speech synthesis.
Syst. Comput. Jpn., 2005

Parameter Sharing in Mixture of Factor Analyzers for Speaker Identification.
IEICE Trans. Inf. Syst., 2005

Continuous Speech Recognition Based on General Factor Dependent Acoustic Models.
IEICE Trans. Inf. Syst., 2005

Applying Sparse KPCA for Feature Extraction in Speech Recognition.
IEICE Trans. Inf. Syst., 2005

Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition.
IEICE Trans. Inf. Syst., 2005

The Blizzard Challenge - 2005: evaluating corpus-based speech synthesis on common datasets.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

HMM-based European Portuguese TTS system.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Spectral Conversion Based on Maximum Likelihood Estimation Considering Global Variance of Converted Parameter.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Sparse KPCA for Feature Extraction in Speech Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Minimum Classification Error Interactive Training for Speaker Identification.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents.
Proceedings of the Life-like characters - tools, affective functions, and applications, 2004

Speaker adaptation of pitch and spectrum for HMM-based speech synthesis.
Syst. Comput. Jpn., 2004

On the Use of Kernel PCA for Feature Extraction in Speech Recognition.
IEICE Trans. Inf. Syst., 2004

An introduction of trajectory model into HMM-based speech synthesis.
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004

Mapping from articulatory movements to vocal tract spectrum with Gaussian mixture model for articulatory speech synthesis.
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004

XIMERA: a new TTS from ATR based on corpus-based technologies.
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004

Hidden semi-Markov model based speech synthesis.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Constructing emotional speech synthesizers with limited speech database.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Acoustic-to-articulatory inversion mapping with Gaussian mixture model.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Decision-tree backing-off in HMM-based speech synthesis.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Deterministic annealing EM algorithm in parameter estimation for acoustic model.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

A Viterbi algorithm for a trajectory model derived from HMM with explicit relationship between static and dynamic features.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Parameter sharing and minimum classification error training of mixtures of factor analyzers for speaker identification.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
A Training Method of Average Voice Model for HMM-Based Speech Synthesis.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2003

Mixture Density Models Based on Mel-Cepstral Representation of Gaussian Process.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2003

Decision tree-based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Towards the development of a Brazilian Portuguese text-to-speech system based on HMM.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A training method for average voice model based on shared decision tree context clustering and speaker adaptive training.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Speech recognition using voice-characteristic-dependent acoustic models.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

Improving the performance of HMM-based very low bit rate speech coding.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
Pitch pattern generation using multispace probability distribution HMM.
Syst. Comput. Jpn., 2002

Decision tree distribution tying based on a dimensional split technique.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A context clustering technique for average voice model in HMM-based speech synthesis.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Eigenvoices for HMM-based speech synthesis.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A new algorithm for updating adaptive system.
Proceedings of the IEEE Asia Pacific Conference on Circuits and Systems 2002, 2002

2001
A new approach to designing a feature extractor in speaker identification based on discriminative feature extraction.
Speech Commun., 2001

Very low bit rate speech coding based on HMMs.
Syst. Comput. Jpn., 2001

Fast convergence transversal adaptive filtering algorithm for impulsive environment based on t-distribution assumption.
Proceedings of the 2001 International Symposium on Circuits and Systems, 2001

Mixed excitation for HMM-based speech synthesis.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Text-to-speech synthesis with arbitrary speaker's voice from average voice.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A robust speaker verification system against imposture using an HMM-based speech synthesis system.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Minimum classification error training for speaker identification using Gaussian mixture models based on multi-space probability distribution.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR.
Proceedings of the IEEE International Conference on Acoustics, 2001

Speaker identification using Gaussian mixture models based on multi-space probability distribution.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
RLS-type two-dimensional adaptive filter with a t-distribution assumption.
Signal Process., 2000

HMM-based text-to-audio-visual speech synthesis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Audio-visual speech recognition using MCE-based HMMs and model-dependent stream weights.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Imposture using synthetic speech against speaker verification based on spectrum and pitch.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Normalized Training for HMM-Based Visual Speech Recognition.
Proceedings of the 2000 International Conference on Image Processing, 2000

Speech parameter generation algorithms for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2000

Robust estimation of an AR multi-channel model by using t-distribution assumption.
Proceedings of the 10th European Signal Processing Conference, 2000

1999
A New Robust Two Dimensional Spectral Estimation Based on an AR Model Excited by a t Distribution Process and its QR-Decomposition Recursive Algorithm.
J. Circuits Syst. Comput., 1999

LMS-like two dimensional adaptive filter with t-distribution assumption and nonsymmetrical half-plane support.
Proceedings of the IEEE-EURASIP Workshop on Nonlinear Signal and Image Processing (NSIP'99), 1999

Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Intensity- and location-normalized training for HMM-based visual speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

On the security of HMM-based speaker verification systems against imposture using synthetic speech.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Location Normalization of HMM-Based Lip Reading: Experiments for the M2VTS Database.
Proceedings of the 1999 International Conference on Image Processing, 1999

Image Modeling Using Two Dimensional Exponential Systems.
Proceedings of the 1999 International Conference on Image Processing, 1999

Hidden Markov models based on multi-space probability distribution for pitch pattern modeling.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999

1998
Robust recursive time series modeling based on an AR model excited by a t-distribution process.
IEEE Trans. Signal Process., 1998

Speaker adaptation for HMM-based speech synthesis system using MLLR.
Proceedings of the Third ESCA/COCOSDA Workshop on Speech Synthesis, 1998

Duration modeling for HMM-based speech synthesis.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

HMM-based visual speech recognition using intensity and location normalization.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A very low bit rate speech coder using HMM with speaker adaptation.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A 16 kbit/s wideband CELP coder using mel-generalized cepstral analysis and its subjective evaluation.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

A very low bit rate speech coder using HMM-based speech recognition/synthesis techniques.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Text-to-visual speech synthesis based on parameter generation from HMM.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

A wideband CELP speech coder at 16 kbit/s based on mel-generalized cepstral analysis.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Visual Speech Synthesis Based on Parameter Generation From HMM: Speech-Driven and Text-And-Speech-Driven Approaches.
Proceedings of the Auditory-Visual Speech Processing, 1998

1997
Speaker interpolation in HMM-based speech synthesis system.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

HMM compensation for noisy speech recognition based on cepstral parameter generation.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Voice characteristics conversion for HMM-based speech synthesis system.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Efficient encoding of mel-generalized cepstrum for CELP coders.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

1996
CELP coding system based on mel-generalized cepstral analysis.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Robust two dimensional spectral estimation based on AR model excited by a t-distribution process.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Speech synthesis using HMMs with dynamic features.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

1995
Adaptive cepstral analysis of speech.
IEEE Trans. Speech Audio Process., 1995

An algorithm for speech parameter generation from continuous mixture HMMs with dynamic features.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

Speech parameter generation from HMM using dynamic features.
Proceedings of the 1995 International Conference on Acoustics, 1995

CELP coding based on mel-cepstral analysis.
Proceedings of the 1995 International Conference on Acoustics, 1995

1994
AR Spectrum Estimation Based on Wavelet Representation.
Proceedings of the 1994 IEEE International Symposium on Circuits and Systems, ISCAS 1994, London, England, UK, May 30, 1994

Mel-generalized cepstral analysis - a unified approach to speech spectral estimation.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Speech coding based on adaptive mel-cepstral analysis for noisy channels.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Speech coding based on adaptive mel-cepstral analysis.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

Robust recursive spectral estimation based on AR model excited by a t-distribution process.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1992
Spectral estimation based on AR-model excited by t-distribution process.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

Design of stable two-dimensional IIR digital filters with arbitrary magnitude function.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

An adaptive algorithm for mel-cepstral analysis of speech.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1990
Generalized cepstral analysis of speech - unified approach to LPC and cepstral method.
Proceedings of the First International Conference on Spoken Language Processing, 1990

Adaptive filtering based on cepstral representation-adaptive cepstral analysis of speech.
Proceedings of the 1990 International Conference on Acoustics, 1990

