Toru Nakashika

Orcid: 0000-0003-1863-6771

According to our database, Toru Nakashika authored at least 51 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







An Investigation on the Speech Recovery from EEG Signals Using Transformer.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

Gamma-VAE: Speech representation based on VAE assuming gamma distribution for both latent variables and observation.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

DDPMVC: Non-parallel any-to-many voice conversion using diffusion encoder.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

Gamma Boltzmann Machine for Audio Modeling.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Acoustic Scenery Recognition Using CWT and Deep Neural Network.
Proceedings of the New Trends in Intelligent Software Methodologies, Tools and Techniques, 2021

Speech Chain VC: Linking Linguistic and Acoustic Levels via Latent Distinctive Features for RBM-Based Voice Conversion.
IEICE Trans. Inf. Syst., 2020

Many-to-Many Symbolic Multi-Track Music Genre Transfer.
Proceedings of the Knowledge Innovation Through Intelligent Software Methodologies, Tools and Techniques, 2020

Complex-Valued Variational Autoencoder: A Novel Deep Generative Model for Direct Representation of Complex Spectra.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Simultaneous Conversion of Speaker Identity and Emotion Based on Multiple-Domain Adaptive RBM.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Gamma Boltzmann Machine for Simultaneously Modeling Linear- and Log-amplitude Spectra.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

Complex-Valued Restricted Boltzmann Machine for Speaker-Dependent Speech Parameterization From Complex Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Pre-Training of DNN-Based Speech Synthesis Based on Bidirectional Conversion between Text and Speech.
IEICE Trans. Inf. Syst., 2019

Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition.
EURASIP J. Audio Speech Music. Process., 2019

STFT Spectral Loss for Training a Neural Speech Waveform Model.
Proceedings of the IEEE International Conference on Acoustics, 2019

Deep Relational Model: A Joint Probabilistic Model with a Hierarchical Structure for Bidirectional Estimation of Image and Labels.
IEICE Trans. Inf. Syst., 2018

Complex-Valued Restricted Boltzmann Machine for Direct Speech Parameterization from Complex Spectra.
CoRR, 2018

Bidirectional Voice Conversion Based on Joint Training Using Gaussian-Gaussian Deep Relational Model.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

DNN-based Speech Synthesis for Small Data Sets Considering Bidirectional Speech-Text Conversion.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

LSTBM: A Novel Sequence Representation of Speech Spectra Using Restricted Boltzmann Machine with Long Short-Term Memory.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Parallel-Data-Free Dictionary Learning for Voice Conversion Using Non-Negative Tucker Decomposition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Speaker-adaptive-trainable Boltzmann machine and its application to non-parallel voice conversion.
EURASIP J. Audio Speech Music. Process., 2017

Complex-Valued Restricted Boltzmann Machine for Direct Learning of Frequency Spectra.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

CAB: An Energy-Based Speaker Clustering Model for Rapid Adaptation in Non-Parallel Voice Conversion.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Generative Acoustic-Phonemic-Speaker Model Based on Three-Way Restricted Boltzmann Machine.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Modeling deep bidirectional relationships for image classification and generation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Speaker adaptive model based on Boltzmann machine for non-parallel training in voice conversion.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

3WRBM-based speech factor modeling for arbitrary-source and non-parallel voice conversion.
Proceedings of the 24th European Signal Processing Conference, 2016

Selection of an optimum random matrix using a genetic algorithm for acoustic feature extraction.
Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Voice conversion using speaker-dependent conditional restricted Boltzmann machine.
EURASIP J. Audio Speech Music. Process., 2015

Small-parallel exemplar-based voice conversion in noisy environments using affine non-negative matrix factorization.
EURASIP J. Audio Speech Music. Process., 2015

Content-based Image Retrieval Using Rotation-invariant Histograms of Oriented Gradients.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Sparse nonlinear representation for voice conversion.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Feature extraction using pre-trained convolutive bottleneck nets for dysarthric speech recognition.
Proceedings of the 23rd European Signal Processing Conference, 2015

Noise-robust voice conversion using a small parallel data based on non-negative matrix factorization.
Proceedings of the 23rd European Signal Processing Conference, 2015

Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines.
IEICE Trans. Inf. Syst., 2014

High-order sequence modeling using speaker-dependent recurrent temporal restricted boltzmann machines for voice conversion.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Error correction of automatic speech recognition based on normalized web distance.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

3D-Object Recognition Based on LLC Using Depth Spatial Pyramid.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Voice conversion in time-invariant speaker-independent space.
Proceedings of the IEEE International Conference on Acoustics, 2014

Voice conversion based on Non-negative matrix factorization using phoneme-categorized dictionary.
Proceedings of the IEEE International Conference on Acoustics, 2014

High-Frequency Restoration Using Deep Belief Nets for Super-resolution.
Proceedings of the Ninth International Conference on Signal-Image Technology & Internet-Based Systems, 2013

Voice conversion in high-order eigen space using deep belief nets.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Sparse representation for outliers suppression in semi-supervised image annotation.
Proceedings of the IEEE International Conference on Acoustics, 2013

A Combination of Hand-Crafted and Hierarchical High-Level Learnt Feature Extraction for Music Genre Classification.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2013, 2013

Local-feature-map Integration Using Convolutional Neural Networks for Music Genre Classification.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Constrained Spectrum Generation Using A Probabilistic Spectrum Envelope for Mixed Music Analysis.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Probabilistic Spectrum Envelope: Categorized Audio-Features Representation for NMF-Based Sound Decomposition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Generic object recognition using automatic region extraction and dimensional feature integration utilizing multiple kernel learning.
Proceedings of the IEEE International Conference on Acoustics, 2011

Speech synthesis by modeling harmonics structure with multiple function.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
