Daisuke Saito

Orcid: 0000-0003-4263-5453

According to our database, Daisuke Saito authored at least 130 papers between 2006 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Measuring Complexity in Visual Programming for Elementary School Students.
J. Inf. Process., 2024

A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker's Shadowings.
CoRR, 2024

Simulating Native Speaker Shadowing for Nonnative Speech Assessment with Latent Speech Representations.
CoRR, 2024

A Pilot Study of GSLM-based Simulation of Foreign Accentuation Only Using Native Speech Corpora.
CoRR, 2024

Enhancing Programming Education through Game-Based Learning: Design and Implementation of a Puyo Puyo-Inspired Teaching Tool.
Proceedings of the 55th ACM Technical Symposium on Computer Science Education, 2024

Cost-Effective Capacity Enhancement of Survivable Optical Networks by Supplemental Band Expansion and Backup Resource Sharing.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2024

Do Learned Speech Symbols Follow Zipf's Law?
Proceedings of the IEEE International Conference on Acoustics, 2024

Evaluating Preschoolers' Block Programming Using Complexity and Personality Traits.
Proceedings of the 36th International Conference on Software Engineering Education and Training, 2024

2023
Improving Semi-Supervised Differentiable Synthesizer Sound Matching for Practical Applications.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

Quality-diversity for Synthesizer Sound Matching.
J. Inf. Process., 2023

Work-in-Progress: Relating Logical Thinking Skills to Program Complexity in Children's Programming Education.
Proceedings of the IEEE International Conference on Teaching, 2023

Density and Entropy of Spoken Syllables in American English and Japanese English Estimated with Acoustic Word Embeddings.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Learners' Prosodic Control in the Task of Expressive Storytelling and Predicted Native Listeners' Impressions of the Learners' Speech.
Proceedings of the 9th Workshop on Speech and Language Technology in Education, 2023

Gender Characteristics and Computational Thinking in Scratch.
Proceedings of the 54th ACM Technical Symposium on Computer Science Education, Volume 2, 2023

Cost-effective Network Capacity Enhancement with Multi-band Virtual Bypass Links.
Proceedings of the Optical Fiber Communications Conference and Exhibition, 2023

Automatic Prediction of Language Learners' Listenability Using Speech and Text Features Extracted from Listening Drills.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multiple Acoustic Features Speech Emotion Recognition Using Cross-Attention Transformer.
Proceedings of the IEEE International Conference on Acoustics, 2023

Programming Education for Young People using the Falling-Puzzle Game, "Puyo Puyo".
Proceedings of the IEEE Global Engineering Education Conference, 2023

2022
Singer Diarization for Polyphonic Music With Unison Singing.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Voice Conversion Based on Deep Neural Networks for Time-Variant Linear Transformations.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

INmfCA Algorithm for Training of Nonparallel Voice Conversion Systems Based on Non-Negative Matrix Factorization.
IEICE Trans. Inf. Syst., 2022

Scratch Project Analysis: Relationship Between Gender and Computational Thinking Skill.
Proceedings of the IEEE International Conference on Teaching, 2022

Automatic Prediction of Intelligibility of Words and Phonemes Produced Orally by Japanese Learners of English.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Detection of Learners' Listening Breakdown with Oral Dictation and Its Use to Model Listening Skill Improvement Exclusively Through Shadowing.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Text-to-speech synthesis using spectral modeling based on non-negative autoencoder.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Quantifying Discriminability between NMF Bases.
Proceedings of the IEEE International Conference on Acoustics, 2022

2021
Comparing Participants' Brainwaves During Solo, Pair, and Mob Programming.
Proceedings of the Agile Processes in Software Engineering and Extreme Programming, 2021

Analog In-memory Computing in FeFET-based 1T1R Array for Edge AI Applications.
Proceedings of the 2021 Symposium on VLSI Circuits, Kyoto, Japan, June 13-19, 2021

Development of a Game to Foster Programming Thinking for Learning through Reading Program.
Proceedings of the 2021 IEEE International Conference on Engineering, 2021

Optimized Prediction of Fluency of L2 English Based on Interpretable Network Using Quantity of Phonation and Quality of Pronunciation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Synthesizer Sound Matching with Differentiable DSP.
Proceedings of the 22nd International Society for Music Information Retrieval Conference, 2021

Lexical Density Analysis of Word Productions in Japanese English Using Acoustic Word Embeddings.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Work-in-Progress: Analysis of the use of Mentoring with Online Mob Programming.
Proceedings of the IEEE Global Engineering Education Conference, 2021

Preliminary Literature Review of Machine Learning System Development Practices.
Proceedings of the IEEE 45th Annual Computers, Software, and Applications Conference, 2021

Automated Educational Program Mapping on Learning Standards in Computer Science.
Proceedings of the IEEE 45th Annual Computers, Software, and Applications Conference, 2021

Multi-Granularity Annotation of Instantaneous Intelligibility of Learners' Utterances Based on Shadowing Techniques.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Acoustic Simulation of Body-conducted Speech and Its Use to Convert One's Recorded Voices to One's Own Voices.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Rubric for Measuring and Visualizing the Effects of Learning Computer Programming for Elementary School Students.
J. Inf. Technol. Educ. Innov. Pract., 2020

Tensor Factor Analysis for Arbitrary Speaker Conversion.
IEICE Trans. Inf. Syst., 2020

Assessing Elementary School Students' Programming Thinking Skills using Rubrics.
Proceedings of the IEEE International Conference on Teaching, 2020

Interpretable Driver Models Discovery in Data.
Proceedings of the 23rd IEEE International Conference on Intelligent Transportation Systems, 2020

Nonparallel Training of Exemplar-Based Voice Conversion System Using INCA-Based Alignment Technique.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Discriminative Method to Extract Coarse Prosodic Structure and its Application for Statistical Phrase/Accent Command Estimation.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Shadowability Annotation with Fine Granularity on L2 Utterances and its Improvement with Native Listeners' Script-Shadowing.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Attention-Based Speaker Embeddings for One-Shot Voice Conversion.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

2019
Many-to-Many and Completely Parallel-Data-Free Voice Conversion Based on Eigenspace DNN.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Preliminary Systematic Literature Review of Machine Learning System Development Process.
CoRR, 2019

Learning Effects in Programming Learning Using Python and Raspberry Pi: Case Study with Elementary School Students.
Proceedings of the IEEE International Conference on Engineering, Technology and Education, 2019

Voice Conversion without Explicit Separation of Source and Filter Components Based on Non-negative Matrix Factorization.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Generative Modeling of F0 Contours Leveraged by Phrase Structure and Its Application to Statistical Focus Control.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Voice conversion based on full-covariance mixture density networks for time-variant linear transformations.
Proceedings of the 10th ISCA Speech Synthesis Workshop, 2019

Native Listeners' Shadowing of Non-native Utterances as Spoken Annotation Representing Comprehensibility of the Utterances.
Proceedings of the 8th ISCA International Workshop on Speech and Language Technology in Education, 2019

Rubric to Evaluate Programming Learning of Elementary School Students.
Proceedings of the 50th ACM Technical Symposium on Computer Science Education, 2019

A Large Collection of Sentences Read Aloud by Vietnamese Learners of Japanese and Native Speaker's Reverse Shadowings.
Proceedings of the 22nd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2019

Analysis of Native Listeners' Facial Microexpressions While Shadowing Non-Native Speech - Potential of Shadowers' Facial Expressions for Comprehensibility Prediction.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Cooking State Recognition based on Acoustic Event Detection.
Proceedings of the 11th Workshop on Multimedia for Cooking and Eating Activities, 2019

The UTokyo speech synthesis system for Blizzard Challenge 2019.
Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019

Speech representation based on tensor factor analysis and its application to speaker recognition and language identification.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Experimental investigation on the efficacy of Affine-DTW in the quality of voice conversion.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

DNN-based Statistical Parametric Speech Synthesis Incorporating Non-negative Matrix Factorization.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Wasserstein GAN and Waveform Loss-Based Acoustic Model Training for Multi-Speaker Text-to-Speech Synthesis Systems Using a WaveNet Vocoder.
IEEE Access, 2018

Noise Reduction Method for Intra-Body Communication by Using Compensation Electrode.
Proceedings of the TENCON 2018, 2018

DNN-Based Scoring of Language Learners' Proficiency Using Learners' Shadowings and Native Listeners' Responsive Shadowings.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018

A Comparative Study of Statistical Conversion of Face to Voice Based on Their Subjective Impressions.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Study of Objective Measurement of Comprehensibility through Native Speakers' Shadowing of Learners' Utterances.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Analysis of Transient Signal Due to Person Movement in Gate System Using Intra-Body Communication.
Proceedings of the 12th International Conference on Sensing Technology, 2018

Analysis of Unintentional Signal Propagation in Intra-Body Communication.
Proceedings of the IEEE 7th Global Conference on Consumer Electronics, 2018

A Revisit to Feature Handling for High-quality Voice Conversion Based on Gaussian Mixture Model.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Comparison of Text-Based and Visual-Based Programming Input Methods for First-Time Learners.
J. Inf. Technol. Educ. Res., 2017

Quantitative learning effect evaluation of programming learning tools.
Proceedings of the IEEE 6th International Conference on Teaching, 2017

Development and Maintenance of Practical and In-service Systems for Recording Shadowing Utterances and Their Assessment.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

New Features and Effectiveness of Suzuki-kun, the First and Only Prosodic Reading Tutor of Tokyo Japanese.
Proceedings of the 7th ISCA International Workshop on Speech and Language Technology in Education, 2017

Automatic Scoring of Shadowing Speech Based on DNN Posteriors and Their DTW.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Acoustic-to-Articulatory Mapping Based on Mixture of Probabilistic Canonical Correlation Analysis.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Use of Global and Acoustic Features Associated with Contextual Factors to Adapt Language Models for Spontaneous Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Parallel-Data-Free Many-to-Many Voice Conversion Based on DNN Integrated with Eigenspace Using a Non-Parallel Speech Corpus.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

The UTokyo speech synthesis system for Blizzard Challenge 2017.
Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017

2016
Anti-Spoofing for Text-Independent Speaker Verification: An Initial Database, Comparison of Countermeasures, and Human Performance.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

Prosodic Reading Tutor of Japanese, Suzuki-kun: The first and only educational tool to teach the formal Japanese.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Improved prediction of the accent gap between speakers of English for individual-based clustering of World Englishes.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Influence of the Programming Environment on Programming Education.
Proceedings of the 2016 ACM Conference on Innovation and Technology in Computer Science Education, 2016

Speaker Representations for Speaker Adaptation in Multiple Speakers' BLSTM-RNN-Based Speech Synthesis.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Voice Conversion Based on Matrix Variate Gaussian Mixture Model Using Multiple Frame Features.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Prediction of the Articulatory Movements of Unseen Phonemes of a Speaker Using the Speech Structure of Another Speaker.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

The Voice Conversion Challenge 2016.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Automatic Assessment and Error Detection of Shadowing Speech: Case of English Spoken by Japanese Learners.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Divergence estimation based on deep neural networks and its use for language identification.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

The UTokyo System for Blizzard Challenge 2016.
Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016

Arbitrary speaker conversion based on speaker space bases constructed by deep neural networks.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015
Work in progress: A comparison of programming way: Illustration-based programming and text-based programming.
Proceedings of the IEEE International Conference on Teaching, 2015

Automatic prediction of intelligibility of English words spoken with Japanese accents - comparative study of features and models used for prediction.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015

Development of a prosodic reading tutor of Japanese - effective use of TTS and F0 contour modeling techniques for CALL.
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015

Noise-robust and stress-free visualization of pronunciation diversity of World Englishes using a learner's self-centered viewpoint.
Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015

Statistical acoustic-to-articulatory mapping unified with speaker normalization based on voice conversion.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A measure of phonetic similarity to quantify pronunciation variation by using ASR technology.
Proceedings of the 18th International Congress of Phonetic Sciences, 2015

SAS: A speaker verification spoofing database containing diverse attacks.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

2014
Speaker-basis Accent Clustering Using Invariant Structure Analysis and the Speech Accent Archive.
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Visualization of pronunciation diversity of world Englishes from a speaker's self-centered viewpoint.
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014

Minecraft-based preparatory training for software development project.
Proceedings of the 2014 IEEE International Professional Communication Conference, 2014

Application of matrix variate Gaussian mixture model to statistical voice conversion.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Semi-supervised noise dictionary adaptation for exemplar-based noise robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2014

Improved and robust prediction of pronunciation distance for individual-basis clustering of World Englishes pronunciation.
Proceedings of the IEEE International Conference on Acoustics, 2014

A turning control of electric wheeled walker device by PSD camera information.
Proceedings of the IEEE 13th International Workshop on Advanced Motion Control, 2014

2013
A New Approach to Programming Language Education for Beginners with Top-Down Learning.
Int. J. Eng. Pedagog., 2013

Text-to-speech synthesizer based on combination of composite wavelet and hidden Markov models.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Probabilistic speech F0 contour model incorporating statistical vocabulary model of phrase-accent command sequence.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Adaptive template adjustment for personalized gesture recognition based on a finger-worn device.
Proceedings of the International Joint Conference on Awareness Science and Technology & Ubi-Media Computing, 2013

Discriminative piecewise linear transformation based on deep learning for noise robust automatic speech recognition.
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013

2012
Statistical Voice Conversion Based on Noisy Channel Model.
IEEE Trans. Speech Audio Process., 2012

Hidden Markov Convolutive Mixture Model for Pitch Contour Analysis of Speech.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Effects of Speaker Adaptive Training on Tensor-based Arbitrary Speaker Conversion.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Assistance for Novice Users on Creating Songs from Japanese Lyrics.
Proceedings of the Non-Cochlear Sound: Proceedings of the 38th International Computer Music Conference, 2012

A tandem connectionist model using combination of multi-scale spectro-temporal features for acoustic event detection.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Correcting for non-uniform illumination when photographing the mural in the royal tomb of Amenophis III (III) Correcting mural images.
Proceedings of the 6th European Conference on Colour in Graphics, Imaging, and Vision, 2012

2011
One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Gesture Design of Hand-to-Speech Converter Derived from Speech-to-Hand Converter Based on Probabilistic Integration Model.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Adaptation of Prosody in Speech Synthesis by Changing Command Values of the Generation Process Model of Fundamental Frequency.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

High accurate model-integration-based voice conversion using dynamic features and model structure optimization.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Improved generation of prosodic features in HMM-based Mandarin speech synthesis.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Probabilistic integration of joint density model and speaker model for voice conversion.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

HMM-based sequence-to-frame mapping for voice conversion.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
A numerical method for solving the Vlasov-Poisson equation based on the conservative IDO scheme.
J. Comput. Phys., 2009

Optimal event search using a structural cost function - improvement of structure to speech conversion.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

2008
Decomposition of rotational distortion caused by VTL difference using eigenvalues of its transformation matrix.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Structure to speech conversion - speech generation based on infant-like vocal imitation.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Directional dependency of cepstrum on vocal tract length.
Proceedings of the IEEE International Conference on Acoustics, 2008

2006
The effect of Age on Web-safe Color Visibility for a White Background.
Proceedings of the 28th International Conference of the IEEE Engineering in Medicine and Biology Society, 2006

