Yoshihiko Nankaku

According to our database, Yoshihiko Nankaku authored at least 124 papers between 1999 and 2024.

Bibliography

2024
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Singing voice synthesis based on frame-level sequence-to-sequence models considering vocal timing deviation.
CoRR, 2023

Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System.
Proceedings of the IEEE International Conference on Acoustics, 2023

Singing Voice Synthesis Based on a Musical Note Position-Aware Attention Mechanism.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
End-to-End Text-to-Speech Based on Latent Representation of Speaking Styles Using Spontaneous Dialogue.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Autoregressive Variational Autoencoder with a Hidden Semi-Markov Model-Based Structured Attention for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022

Enhancing Social Telepresence on Text Communication Using Robot Avatar that Reflects User's Chatting States.
Proceedings of the 11th IEEE Global Conference on Consumer Electronics, 2022

2021
Sinsy: A Deep Neural Network-Based Singing Voice Synthesis System.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

PeriodNet: A Non-Autoregressive Raw Waveform Generative Model With a Structure Separating Periodic and Aperiodic Components.
IEEE Access, 2021

PeriodNet: A Non-Autoregressive Waveform Generation Model with a Structure Separating Periodic and Aperiodic Components.
Proceedings of the IEEE International Conference on Acoustics, 2021

2020
Hierarchical Multi-Grained Generative Model for Expressive Speech Synthesis.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Fast and High-Quality Singing Voice Synthesis System Based on Convolutional Neural Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Semi-Supervised Learning Based on Hierarchical Generative Models for End-to-End Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Singing voice synthesis based on convolutional neural networks.
CoRR, 2019

Low computational cost speech synthesis based on deep neural networks using hidden semi-Markov model structures.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Deep neural network based real-time speech vocoder with periodic and aperiodic inputs.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Impacts of input linguistic feature representation on Japanese end-to-end speech synthesis.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Speaker-dependent WaveNet-based Delay-free ADPCM Speech Coding.
Proceedings of the IEEE International Conference on Acoustics, 2019

Singing Voice Synthesis Based on Generative Adversarial Networks.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Mel-Cepstrum-Based Quantization Noise Shaping Applied to Neural-Network-Based Speech Waveform Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018

WaveNet-Based Zero-Delay Lossless Speech Coding.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

Statistical Voice Conversion Based on WaveNet.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Image Recognition Based on Separable Lattice HMMs Using a Deep Neural Network for Output Probability Distributions.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

The NITech text-to-speech system for the Blizzard Challenge 2018.
Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018

Discriminative Feature Extraction Based on Sequential Variational Autoencoder for Speaker Recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Singing Voice Conversion Using Posted Waveform Data on Music Social Media.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Speaker Adaptation for Speech Synthesis Based on Deep Neural Networks Using Hidden Semi-Markov Model Structures.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Image Recognition Based on Convolutional Neural Networks Using Features Generated from Separable Lattice Hidden Markov Models.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Recent Development of the DNN-based Singing Voice Synthesis System - Sinsy.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Speech Synthesis Using WaveNet Vocoder Based on Periodic/Aperiodic Decomposition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Simultaneous Optimization of Multiple Tree-Based Factor Analyzed HMM for Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Articulatory Text-to-Speech Synthesis Using the Digital Waveguide Mesh Driven by a Deep Neural Network.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Image recognition based on discriminative models using features generated from separable lattice HMMs.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

User Generated Dialogue Systems: uDialogue.
Proceedings of the Human-Harmonized Information Technology, Volume 2, 2017

2016
A Bayesian Approach to Image Recognition Based on Separable Lattice Hidden Markov Models.
IEICE Trans. Inf. Syst., 2016

Temporal modeling in neural network based statistical parametric speech synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Singing Voice Synthesis Based on Deep Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Voice Conversion Based on Trajectory Model Training of Neural Networks Considering Global Variance.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Redefining the Linguistic Context Feature Set for HMM and DNN TTS Through Position and Parsing.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Trajectory training considering global variance for speech synthesis based on neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

2015
Simultaneous optimization of multiple tree structures for factor analyzed HMM-based speech synthesis.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Prosodically-enhanced recurrent neural network language models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

The effect of neural networks in statistical parametric speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Contextual Additive Structure for HMM-Based Speech Synthesis.
IEEE J. Sel. Top. Signal Process., 2014

Image Recognition Based on Separable Lattice Trajectory 2-D HMMs.
IEICE Trans. Inf. Syst., 2014

Integration of Spectral Feature Extraction and Modeling for HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2014

A mel-cepstral analysis technique restoring high frequency components from low-sampling-rate speech.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Integration of speaker and pitch adaptive training for HMM-based singing voice synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014

HMM-based singing voice synthesis and its application to Japanese and English.
Proceedings of the IEEE International Conference on Acoustics, 2014

2013
Speech Synthesis Based on Hidden Markov Models.
Proc. IEEE, 2013

A Bayesian Framework Using Multiple Model Structures for Speech Recognition.
IEICE Trans. Inf. Syst., 2013

Cross-lingual speaker adaptation based on factor analysis using bilingual speech data for HMM-based speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Contextual partial additive structure for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

Integration of acoustic modeling and mel-cepstral analysis for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013

Separable lattice 2-D HMMs introducing state duration control for recognition of images with various variations.
Proceedings of the IEEE International Conference on Acoustics, 2013

Image recognition based on hidden Markov eigen-image models using variational Bayesian method.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012
Product of Experts for Statistical Parametric Speech Synthesis.
IEEE Trans. Speech Audio Process., 2012

An Extension of Separable Lattice 2-D HMMs for Rotational Data Variations.
IEICE Trans. Inf. Syst., 2012

Cross-lingual Speaker Adaptation for HMM-based Speech Synthesis based on Perceptual Characteristics and Speaker Interpolation.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A Bayesian Approach to Speaker Recognition Based on GMMs Using Multiple Model Structures.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A model structure integration based on a Bayesian framework for speech recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Face recognition based on separable lattice 2-D HMMs using variational Bayesian method.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Pitch adaptive training for HMM-based singing voice synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Face recognition based on extended separable lattice 2-D HMMs.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011
Continuous Stochastic Feature Mapping Based on Trajectory HMMs.
IEEE Trans. Speech Audio Process., 2011

Bayesian Context Clustering Using Cross Validation for Speech Recognition.
IEICE Trans. Inf. Syst., 2011

GMM-Based Missing-Feature Reconstruction on Multi-Frame Windows.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Evaluation of Tree-Trellis Based Decoding in Over-Million LVCSR.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

An optimization algorithm of independent mean and variance parameter tying structures for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

Global variance modeling on frequency domain delta LSP for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
A Covariance-Tying Technique for HMM-Based Speech Synthesis.
IEICE Trans. Inf. Syst., 2010

Spectral modeling with contextual additive structure for HMM-based speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Recent development of the HMM-based singing voice synthesis system - Sinsy.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Bayesian speech synthesis framework integrating training and synthesis processes.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Voice activity detection based on conditional random fields using multiple features.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

HMM-based singing voice synthesis system using pitch-shifted pseudo training data.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Speaker adaptation based on nonlinear spectral transform for speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Statistical parametric speech synthesis based on product of experts.
Proceedings of the IEEE International Conference on Acoustics, 2010

Face recognition based on separable lattice 2-D HMM with state duration modeling.
Proceedings of the IEEE International Conference on Acoustics, 2010

Factor analyzed voice models for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010

A Deterministic Annealing-Based Training Algorithm For Statistical Machine Translation Models.
Proceedings of the 14th Annual conference of the European Association for Machine Translation, 2010

2009
State mapping based method for cross-lingual speaker adaptation in HMM-based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Deterministic annealing based training algorithm for Bayesian speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systems.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A Bayesian approach to Hidden Semi-Markov Model based speech synthesis.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Stereo-based stochastic noise compensation based on trajectory GMMs.
Proceedings of the IEEE International Conference on Acoustics, 2009

Voice conversion based on simultaneous modelling of spectrum and F0.
Proceedings of the IEEE International Conference on Acoustics, 2009

A Bayesian approach to HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
A Fully Consistent Hidden Semi-Markov Model-Based Speech Recognition System.
IEICE Trans. Inf. Syst., 2008

Simultaneous Acoustic, Prosodic, and Phrasing Model Training for TTS Conversion Systems.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Probabilistic feature mapping based on trajectory HMMs.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Simultaneous conversion of duration and spectrum based on statistical models including time-sequence matching.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Probabilistic answer selection based on conditional random fields for spoken dialog system.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Acoustic modeling based on model structure annealing for speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Speaker recognition based on variational Bayesian method.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Bayesian context clustering using cross valid prior distribution for HMM-based speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Acoustic modeling with contextual additive structure for HMM-based speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Spectral conversion based on statistical models including time-sequence matching.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

An excitation model for HMM-based speech synthesis based on residual modeling.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007

Model-space MLLR for trajectory HMMs.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

A trainable excitation model for HMM-based speech synthesis.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Face Recognition using Hidden Markov Eigenface Models.
Proceedings of the IEEE International Conference on Acoustics, 2007

2006
Speaker adaptation of trajectory HMMs using feature-space MLLR.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Voice conversion based on mixtures of factor analyzers.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

An HMM-based singing voice synthesis system.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Reducing computation on parallel decoding using frame-wise confidence scores.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Estimating Trajectory HMM Parameters Using Monte Carlo EM With Gibbs Sampler.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Hidden Semi-Markov Model Based Speech Recognition System using Weighted Finite-State Transducer.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

On the Use of Phonetic Information for Mapping from Articulatory Movements to Vocal Tract Spectrum.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Face Recognition Based on Separable Lattice HMMs.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Parameter Sharing in Mixture of Factor Analyzers for Speaker Identification.
IEICE Trans. Inf. Syst., 2005

Continuous Speech Recognition Based on General Factor Dependent Acoustic Models.
IEICE Trans. Inf. Syst., 2005

Applying Sparse KPCA for Feature Extraction in Speech Recognition.
IEICE Trans. Inf. Syst., 2005

Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition.
IEICE Trans. Inf. Syst., 2005

Sparse KPCA for Feature Extraction in Speech Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
On the Use of Kernel PCA for Feature Extraction in Speech Recognition.
IEICE Trans. Inf. Syst., 2004

Deterministic annealing EM algorithm in parameter estimation for acoustic model.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Parameter sharing and minimum classification error training of mixtures of factor analyzers for speaker identification.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Speech recognition using voice-characteristic-dependent acoustic models.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2000
Normalized Training for HMM-Based Visual Speech Recognition.
Proceedings of the 2000 International Conference on Image Processing, 2000

1999
Intensity- and location-normalized training for HMM-based visual speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

