Heiga Zen
Orcid: 0000-0002-8959-5471
According to our database, Heiga Zen authored at least 122 papers between 2002 and 2024.
Bibliography
2024
CoRR, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
IEEE Signal Process. Mag., July, 2023
Extracting representative subset from extensive text data for training pre-trained language models.
Inf. Process. Manag., May, 2023
Guest Editorial: Special Issue on Affective Speech and Language Synthesis, Generation, and Conversion.
IEEE Trans. Affect. Comput., 2023
Miipher: A Robust Speech Restoration Model Integrating Self-Supervised Speech and Text Representations.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023
FiPPiE: A Computationally Efficient Differentiable method for Estimating Fundamental Frequency From Spectrograms.
Proceedings of the 12th ISCA Speech Synthesis Workshop, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the Conference on Robot Learning, 2023
2022
WaveFit: An Iterative and Non-Autoregressive Neural Vocoder Based on Fixed-Point Iteration.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the Thirteenth Language Resources and Evaluation Conference, 2022
SpecGrad: Diffusion Probabilistic Model based Neural Vocoder with Adaptive Noise Spectral Shaping.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Training Text-To-Speech Systems From Synthetic Data: A Practical Approach For Accent Transfer Tasks.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 9th International Conference on Learning Representations, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
Non-Attentive Tacotron: Robust and Controllable Neural TTS Synthesis Including Unsupervised Duration Modeling.
CoRR, 2020
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior.
CoRR, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Speech Processing for Digital Home Assistants: Combining signal processing with deep-learning techniques.
IEEE Signal Process. Mag., 2019
CoRR, 2019
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
Proceedings of the 7th International Conference on Learning Representations, 2019
2018
Sequence-to-sequence Neural Network Model with 2D Attention for Learning Japanese Pitch Accents.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 35th International Conference on Machine Learning, 2018
Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018
2017
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017
2016
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016
Using instantaneous frequency and aperiodicity detection to estimate F0 for high-quality speech synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016
Fast, Compact, and High Quality LSTM-RNN Based Statistical Parametric Speech Synthesizers for Mobile Devices.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Multi-Language Multi-Speaker Acoustic Modeling for LSTM-RNN Based Statistical Parametric Speech Synthesis.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Directly modeling voiced and unvoiced components in speech waveforms by neural networks.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends.
IEEE Signal Process. Mag., 2015
Unidirectional long short-term memory recurrent neural network with recurrent output layer for low-latency speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Directly modeling speech waveforms by neural networks for statistical parametric speech synthesis.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
2014
Deep mixture density networks for acoustic modeling in statistical parametric speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014
2013
IEEE Trans. Speech Audio Process., 2013
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
2012
IEEE Trans. Speech Audio Process., 2012
IEEE Trans. Speech Audio Process., 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Cepstral analysis based on the glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
IEEE Trans. Speech Audio Process., 2011
Context adaptive training with factorized decision trees for HMM-based statistical parametric speech synthesis.
Speech Commun., 2011
IEICE Trans. Inf. Syst., 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Decision tree-based context clustering based on cross validation and hierarchical priors.
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
IEICE Trans. Inf. Syst., 2010
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010
Statistical parametric speech synthesis with joint estimation of acoustic and excitation model parameters.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Context adaptive training with factorized decision trees for HMM-based speech synthesis.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
An implementation of decision tree-based context clustering on graphics processing units.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
IEEE Trans. Speech Audio Process., 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Tying covariance matrices to reduce the footprint of HMM-based speech synthesis systems.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
2008
IEICE Trans. Inf. Syst., 2008
IEICE Trans. Inf. Syst., 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Bayesian context clustering using cross valid prior distribution for HMM-based speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Performance evaluation of the speaker-independent HMM-based speech synthesis system "HTS 2007" for the Blizzard Challenge 2007.
Proceedings of the IEEE International Conference on Acoustics, 2008
Acoustic modeling with contextual additive structure for HMM-based speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008
The HTS-2008 System: Yet Another Evaluation of the Speaker-Adaptive HMM-based Speech Synthesis System in The 2008 Blizzard Challenge.
Proceedings of the Blizzard Challenge 2008, 2008
2007
Details of the Nitech HMM-Based Speech Synthesis System for the Blizzard Challenge 2005.
IEICE Trans. Inf. Syst., 2007
IEICE Trans. Inf. Syst., 2007
IEICE Trans. Inf. Syst., 2007
Reformulating the HMM as a trajectory model by imposing explicit relationships between static and dynamic feature vector sequences.
Comput. Speech Lang., 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Improved average-voice-based speech synthesis using gender-mixed modeling and a parameter generation algorithm considering GV.
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Proceedings of the Sixth ISCA Workshop on Speech Synthesis, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
Speaker-independent HMM-based speech synthesis system - HTS-2007 system for the Blizzard Challenge 2007.
Proceedings of the Evaluation of text-to-speech systems: Blizzard Challenge 2007, 2007
2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Hidden Semi-Markov Model Based Speech Recognition System using Weighted Finite-State Transducer.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
2005
Simultaneous clustering of phonetic context, dimension, and state position for acoustic modeling using decision trees.
Syst. Comput. Jpn., 2005
IEICE Trans. Inf. Syst., 2005
IEICE Trans. Inf. Syst., 2005
Deterministic Annealing EM Algorithm in Acoustic Modeling for Speaker and Speech Recognition.
IEICE Trans. Inf. Syst., 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
On building a concatenative speech synthesis system from the Blizzard Challenge speech databases.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
2004
IEICE Trans. Inf. Syst., 2004
Proceedings of the Fifth ISCA ITRW on Speech Synthesis, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
A Viterbi algorithm for a trajectory model derived from HMM with explicit relationship between static and dynamic features.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
2003
Decision tree-based simultaneous clustering of phonetic contexts, dimensions, and state positions for acoustic modeling.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Trajectory modeling based on HMMs with the explicit relationship between static and dynamic features.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Towards the development of a Brazilian Portuguese text-to-speech system based on HMM.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003
2002
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002