Shinji Takaki
Orcid: 0000-0001-7294-7699
According to our database1,
Shinji Takaki
authored at least 52 papers
between 2010 and 2023.
Collaborative distances:
Collaborative distances:
Timeline
Legend:
Book In proceedings Article PhD thesis Dataset OtherLinks
On csauthors.net:
Bibliography
2023
Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System.
Proceedings of the IEEE International Conference on Acoustics, 2023
2021
PeriodNet: A Non-Autoregressive Raw Waveform Generative Model With a Structure Separating Periodic and Aperiodic Components.
IEEE Access, 2021
Periodnet: A Non-Autoregressive Waveform Generation Model with a Structure Separating Periodic and Aperiodic Components.
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F<sub>0</sub> Model for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Modeling of Rakugo Speech and Its Limitations: Toward Speech Synthesis That Entertains Audiences.
IEEE Access, 2020
Fast and High-Quality Singing Voice Synthesis System Based on Convolutional Neural Networks.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Semi-Supervised Learning Based on Hierarchical Generative Models for End-to-End Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Transformation of low-quality device-recorded speech to high-quality speech using improved SEGAN model.
CoRR, 2019
Training a Neural Speech Waveform Model using Spectral Losses of Short-Time Fourier Transform and Continuous Wavelet Transform.
CoRR, 2019
Rakugo speech synthesis using segment-to-segment neural transduction and style tokens - toward speech synthesis for entertaining audiences.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019
Does the Lombard Effect Improve Emotional Communication in Noise? - Analysis of Emotional Speech Acted in Noise.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Investigation of Enhanced Tacotron Text-to-speech Synthesis Systems with Self-attention for Pitch Accent Language.
Proceedings of the IEEE International Conference on Acoustics, 2019
Neural Source-filter-based Waveform Model for Statistical Parametric Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Speech Commun., 2018
Investigating different representations for modeling and controlling multiple emotions in DNN-based speech synthesis.
Speech Commun., 2018
Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra.
CoRR, 2018
Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder.
IEEE Access, 2018
Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
A Comparison of Recent Waveform Generation and Acoustic Modeling Methods for Neural-Network-Based Speech Synthesis.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
2017
An RNN-Based Quantized F0 Model with Multi-Tier Feedback Links for Text-to-Speech Synthesis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Direct Modeling of Frequency Spectra and Waveform Generation Based on Phase Recovery for DNN-Based Speech Synthesis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
2016
Constructing a Deep Neural Network Based Spectral Model for Statistical Speech Synthesis.
Proceedings of the Recent Advances in Nonlinear Speech Processing, 2016
Investigation of Using Continuous Representation of Various Linguistic Units in Neural Network Based Text-to-Speech Synthesis.
IEICE Trans. Inf. Syst., 2016
A Comparative Study of the Performance of HMM, DNN, and RNN based Speech Synthesis Systems Trained on Very Large Speaker-Dependent Corpora.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016
Speaker Adaptation of Various Components in Deep Neural Network based Speech Synthesis.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016
Enhance the Word Vector with Prosodic Information for the Recurrent Neural Network Based TTS System.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Speech Enhancement for a Noise-Robust Text-to-Speech Synthesis System Using Deep Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Using Text and Acoustic Features in Predicting Glottal Excitation Waveforms for Parametric Speech Synthesis with Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
A deep auto-encoder based low-dimensional feature extraction from FFT spectral envelopes for statistical parametric speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the Blizzard Challenge 2016, Cuppertino, CA, USA, September 16, 2016, 2016
2015
Proceedings of the Mathematics and Computation in Music - 5th International Conference, 2015
Multiple feed-forward deep neural networks for statistical parametric speech synthesis.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
2014
IEEE J. Sel. Top. Signal Process., 2014
Proceedings of the Blizzard Challenge 2014, Singapore, Singapore, September 19, 2014, 2014
2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Separable lattice 2-D HMMS introducing state duration control for recognition of images with various variations.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the Blizzard Challenge 2013, 2013
2012
Proceedings of the Blizzard Challenge 2012, Portland, OR, USA, September 14, 2012, 2012
2011
An optimization algorithm of independent mean and variance parameter tying structures for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011, 2011
2010
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010