Proceedings of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2019

Analysis of Native Listeners' Facial Microexpressions While Shadowing Non-Native Speech - Potential of Shadowers' Facial Expressions for Comprehensibility Prediction.

Tasavat Trisitichoke

Shintaro Ando

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cooking State Recognition based on Acoustic Event Detection.

Yusaku Korematsu

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 11th Workshop on Multimedia for Cooking and Eating Activities, 2019

The UTokyo speech synthesis system for Blizzard Challenge 2019.

Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019, 2019

Speech representation based on tensor factor analysis and its application to speaker recognition and language identification.

Daisuke Saito

So Suzuki

Nobuaki Minematsu

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Experimental investigation on the efficacy of Affine-DTW in the quality of voice conversion.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

DNN-based Statistical Parametric Speech Synthesis Incorporating Non-negative Matrix Factorization.

Shunsuke Goto

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder.

IEEE Access, 2018

Noise Reduction Method for Intra-Body Communication by Using Compensation Electrode.

Proceedings of the TENCON 2018, 2018

DNN-Based Scoring of Language Learners' Proficiency Using Learners' Shadowings and Native Listeners' Responsive Shadowings.

Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods.

Fernando Villavicencio

Tomi Kinnunen

Zhen-Hua Ling

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment.

Fernando Villavicencio

Zhen-Hua Ling

Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Comparative Study of Statistical Conversion of Face to Voice Based on Their Subjective Impressions.

Yasuhito Ohsugi

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Study of Objective Measurement of Comprehensibility through Native Speakers' Shadowing of Learners' Utterances.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Analysis of Transient Signal Due to Person Movement in Gate System Using Intra-Body Communication.

Proceedings of the 12th International Conference on Sensing Technology, 2018

Analysis of Unintentional Signal Propagation in Intra-Body Communication.

Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018

A Revisit to Feature Handling for High-quality Voice Conversion Based on Gaussian Mixture Model.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Comparison of Text-Based and Visual-Based Programming Input Methods for First-Time Learners.

Daisuke Saito

Hironori Washizaki

Yoshiaki Fukazawa

J. Inf. Technol. Educ. Res., 2017

Quantitative learning effect evaluation of programming learning tools.

Proceedings of the IEEE 6th International Conference on Teaching, 2017

Development and Maintenance of Practical and In-service Systems for Recording Shadowing Utterances and Their Assessment.

Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

New Features and Effectiveness of Suzuki-kun, the First and Only Prosodic Reading Tutor of Tokyo Japanese.

Nobuaki Minematsu

Daisuke Saito

Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Automatic Scoring of Shadowing Speech Based on DNN Posteriors and Their DTW.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Acoustic-to-Articulatory Mapping Based on Mixture of Probabilistic Canonical Correlation Analysis.

Hidetsugu Uchida

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Use of Global and Acoustic Features Associated with Contextual Factors to Adapt Language Models for Spontaneous Speech Recognition.

Shohei Toyama

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Parallel-Data-Free Many-to-Many Voice Conversion Based on DNN Integrated with Eigenspace Using a Non-Parallel Speech Corpus.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

The UTokyo speech synthesis system for Blizzard Challenge 2017.

Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017

2016

Anti-Spoofing for Text-Independent Speaker Verification: An Initial Database, Comparison of Countermeasures, and Human Performance.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Prosodic Reading Tutor of Japanese, Suzuki-kun: The first and only educational tool to teach the formal Japanese.

Nobuaki Minematsu

Daisuke Saito

Nobuyuki Nishizawa

Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Improved prediction of the accent gap between speakers of English for individual-based clustering of World Englishes.

Fumiya Shiozawa

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Influence of the Programming Environment on Programming Education.

Daisuke Saito

Hironori Washizaki

Yoshiaki Fukazawa

Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education, 2016

Speaker Representations for Speaker Adaptation in Multiple Speakers' BLSTM-RNN-Based Speech Synthesis.

Yi Zhao

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Voice Conversion Based on Matrix Variate Gaussian Mixture Model Using Multiple Frame Features.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker.

Hidetsugu Uchida

Daisuke Saito

Nobuaki Minematsu

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

The Voice Conversion Challenge 2016.

Tomoki Toda

Ling-Hui Chen

Daisuke Saito

Fernando Villavicencio

Mirjam Wester

Zhizheng Wu

Junichi Yamagishi

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Automatic Assessment and Error Detection of Shadowing Speech: Case of English Spoken by Japanese Learners.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Divergence estimation based on deep neural networks and its use for language identification.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

The UTokyo System for Blizzard Challenge 2016.

Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016, 2016

Arbitrary speaker conversion based on speaker space bases constructed by deep neural networks.

Tetsuya Hashimoto

Daisuke Saito

Nobuaki Minematsu

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

Work in progress: A comparison of programming way: Illustration-based programming and text-based programming.

Daisuke Saito

Hironori Washizaki

Yoshiaki Fukazawa

Proceedings of the IEEE International Conference on Teaching, 2015

Automatic prediction of intelligibility of English words spoken with Japanese accents - comparative study of features and models used for prediction.

Teeraphon Pongkittiphan

Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015

Development of a prosodic reading tutor of Japanese - effective use of TTS and F0 contour modeling techniques for CALL.

Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015

Noise-robust and stress-free visualization of pronunciation diversity of World Englishes using a learner's self-centered viewpoint.

Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015

Statistical acoustic-to-articulatory mapping unified with speaker normalization based on voice conversion.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A measure of phonetic similarity to quantify pronunciation variation by using ASR technology.

Tianze Shi

Shun Kasahara

Teeraphon Pongkittiphan

Nobuaki Minematsu

Daisuke Saito

Keikichi Hirose

Proceedings of the 18th International Congress of Phonetic Sciences, 2015

SAS: A speaker verification spoofing database containing diverse attacks.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014

Speaker-basis Accent Clustering Using Invariant Structure Analysis and the Speech Accent Archive.

Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Visualization of pronunciation diversity of world Englishes from a speaker's self-centered viewpoint.

Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014

Minecraft-based preparatory training for software development project.

Daisuke Saito

Akira Takebayashi

Tsuneo Yamaura

Proceedings of the 2014 IEEE International Professional Communication Conference, 2014

Application of matrix variate Gaussian mixture model to statistical voice conversion.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Semi-supervised noise dictionary adaptation for exemplar-based noise robust speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2014

Improved and robust prediction of pronunciation distance for individual-basis clustering of World Englishes pronunciation.

Proceedings of the IEEE International Conference on Acoustics, 2014

A turning control of electric wheeled walker device by PSD camera information.

Daisuke Saito

Toshiyuki Murakami

Proceedings of the IEEE 13th International Workshop on Advanced Motion Control, 2014

2013

A New Approach to Programming Language Education for Beginners with Top-Down Learning.

Daisuke Saito

Tsuneo Yamaura

Int. J. Eng. Pedagog., 2013

Text-to-speech synthesizer based on combination of composite wavelet and hidden Markov models.

Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Probabilistic speech F0 contour model incorporating statistical vocabulary model of phrase-accent command sequence.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Adaptive template adjustment for personalized gesture recognition based on a finger-worn device.

Yinghui Zhou

Daisuke Saito

Lei Jing

Proceedings of the International Joint Conference on Awareness Science and Technology & Ubi-Media Computing, 2013

Discriminative piecewise linear transformation based on deep learning for noise robust automatic speech recognition.

Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012

Statistical Voice Conversion Based on Noisy Channel Model.

IEEE Trans. Speech Audio Process., 2012

Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion.

Daisuke Saito

Nobuaki Minematsu

Keikichi Hirose

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Assistance for Novice Users on Creating Songs from Japanese Lyrics.

Satoru Fukayama

Daisuke Saito

Shigeki Sagayama

Proceedings of the Non-Cochlear Sound: Proceedings of the 38th International Computer Music Conference, 2012

A tandem connectionist model using combination of multi-scale spectro-temporal features for acoustic event detection.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Correcting for non-uniform illumination when photographing the mural in the royal tomb of Amenophis III (III) Correcting mural images.

Proceedings of the 6th European Conference on Colour in Graphics, Imaging, and Vision, 2012

2011

One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Gesture Design of Hand-to-Speech Converter Derived from Speech-to-Hand Converter Based on Probabilistic Integration Model.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Adaptation of Prosody in Speech Synthesis by Changing Command Values of the Generation Process Model of Fundamental Frequency.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

High accurate model-integration-based voice conversion using dynamic features and model structure optimization.

Proceedings of the IEEE International Conference on Acoustics, 2011

2010

Improved generation of prosodic features in HMM-based Mandarin speech synthesis.

Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Probabilistic integration of joint density model and speaker model for voice conversion.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

HMM-based sequence-to-frame mapping for voice conversion.

Yu Qiao

Daisuke Saito

Nobuaki Minematsu

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

A numerical method for solving the Vlasov-Poisson equation based on the conservative IDO scheme.

J. Comput. Phys., 2009

Optimal event search using a structural cost function - improvement of structure to speech conversion.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008

Decomposition of rotational distortion caused by VTL difference using eigenvalues of its transformation matrix.

Daisuke Saito

Nobuaki Minematsu

Keikichi Hirose

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Structure to speech conversion - speech generation based on infant-like vocal imitation.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Directional dependency of cepstrum on vocal tract length.

Proceedings of the IEEE International Conference on Acoustics, 2008

2006

The effect of Age on Web-safe Color Visibility for a White Background.

Proceedings of the 28th International Conference of the IEEE Engineering in Medicine and Biology Society, 2006
