Akinori Ito

ORCID: 0000-0002-8835-7877

According to our database, Akinori Ito authored at least 196 papers between 1990 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.







Development of a Personal Guide Robot That Leads a Guest Hand-in-Hand While Keeping a Distance.
Sensors, April, 2024

Preserving Speaker Information in Direct Speech-to-Speech Translation with Non-Autoregressive Generation and Pretraining.
CoRR, 2024

Embedding Digital Signature into CSV Files Using Data Hiding.
CoRR, 2024

Multilingual Meta-Transfer Learning for Low-Resource Speech Recognition.
IEEE Access, 2024

A Replaceable Curiosity-Driven Candidate Agent Exploration Approach for Task-Oriented Dialog Policy Learning.
IEEE Access, 2024

Scheduled Curiosity-Deep Dyna-Q: Efficient Exploration for Dialog Policy Learning.
IEEE Access, 2024

Speaker Intimacy Estimation in Chat-Talks Based on Verbal and Non-Verbal Information.
IEEE Access, 2024

Character Expressions in Meta-Learning for Extremely Low Resource Language Speech Recognition.
Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 2024

Evaluation of Environmental Sound Classification using Vision Transformer.
Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 2024

Toward Photo-Realistic Facial Animation Generation Based on Keypoint Features.
Proceedings of the 2024 16th International Conference on Machine Learning and Computing, 2024

Improving Speaker Consistency in Speech-to-Speech Translation Using Speaker Retention Unit-to-Mel Techniques.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

A Study on Variable Embedding Locations of Reversible Spectral Speech Watermarking.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

LLM as decoder: Investigating Lattice-based Speech Recognition Hypotheses Rescoring Using LLM.
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2024

Confidence-based Utterance Selection for a Recognizer-free Spoken Dialogue System.
Proceedings of the 15th International Conference on Machine Learning and Computing, 2023

Multimodal Expressive Embodied Conversational Agent Design.
Proceedings of the HCI International 2023 Posters, 2023

Development of a Teleoperated Play Tag Robot with Semi-Automatic Play.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2022

Path Following Algorithm with Small Error for Guide Robot.
Proceedings of the Robot Intelligence Technology and Applications 7, 2022

A Light-weight Hand-waving Gesture Recognition Method Using Kinect V2 and Frequency Analysis.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2021

Multimodal Dialogue Response Timing Estimation Using Dialogue Context Encoder.
Proceedings of the Conversational AI for Natural Human-Centric Interaction, 2021

Neural Spoken-Response Generation Using Prosodic and Linguistic Context for Conversational Systems.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Improvement of Automatic English Pronunciation Assessment with Small Number of Utterances Using Sentence Speakability.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Analysis of Feature Extraction by Convolutional Neural Network for Speech Emotion Recognition.
Proceedings of the 10th IEEE Global Conference on Consumer Electronics, 2021

Automatic assessment of English proficiency for Japanese learners without reference sentences based on deep neural network acoustic models.
Speech Commun., 2020

A Symbol-level Melody Completion Based on a Convolutional Neural Network with Generative Adversarial Learning.
J. Inf. Process., 2020

Evaluation of Person Tracking Methods for Human-Robot Physical Play.
Proceedings of the 2020 IEEE/SICE International Symposium on System Integration, 2020

Construction and Analysis of a Multimodal Chat-talk Corpus for Dialog Systems Considering Interpersonal Closeness.
Proceedings of The 12th Language Resources and Evaluation Conference, 2020

Multi-Stream Attention-Based BLSTM with Feature Segmentation for Speech Emotion Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Filler Prediction Based on Bidirectional LSTM for Generation of Natural Response of Spoken Dialog.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Incremental Response Generation Using Prefix-to-Prefix Model for Dialogue System.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Successive Japanese Lyrics Generation Based on Encoder-Decoder Model.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Analysis and Estimation of Sentence Speakability for English Pronunciation Evaluation.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Spoken Term Detection Based on Acoustic Models Trained in Multiple Languages for Zero-Resource Language.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

CycleGAN-Based High-Quality Non-Parallel Voice Conversion with Spectrogram and WaveRNN.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Improving Pronunciation Clarity of Dysarthric Speech Using CycleGAN with Multiple Speakers.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

A Study on Minimum Spectral Error Analysis of Speech.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

LJSing: Large-Scale Singing Voice Corpus of Single Japanese Singer.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Integration of Accent Sandhi and Prosodic Features Estimation for Japanese Text-to-Speech Synthesis.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

Improving human scoring of prosody using parametric speech synthesis.
Speech Commun., 2019

A Pedestrian Avoidance Method Considering Personal Space for a Guide Robot.
Robotics, 2019

Realization of a Robot System That Plays "Darumasan-Ga-Koronda" Game with Humans.
Robotics, 2019

Latent Words Recurrent Neural Network Language Models for Automatic Speech Recognition.
IEICE Trans. Inf. Syst., 2019

Domain Adaptation Based on Mixture of Latent Words Language Models for Automatic Speech Recognition.
IEICE Trans. Inf. Syst., 2018

IEICE Trans. Inf. Syst., 2018

Improving User Impression in Spoken Dialog System with Gradual Speech Form Control.
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

An Analysis of the Effect of Emotional Speech Synthesis on Non-Task-Oriented Dialogue System.
Proceedings of the 19th Annual SIGdial Meeting on Discourse and Dialogue, 2018

Analyzing Effect of Physical Expression on English Proficiency for Multimodal Computer-Assisted Language Learning.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

A Study on a Spoken Dialogue System with Cooperative Emotional Speech Synthesis Using Acoustic and Linguistic Information.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Melody Completion Based on Convolutional Neural Networks and Generative Adversarial Learning.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Two-Stage Sequence-to-Sequence Neural Voice Conversion with Low-to-High Definition Spectrogram Mapping.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Comparison of Speech Recognition Performance Between Kaldi and Google Cloud Speech API.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Leveraging a Small Corpus by Different Frame Shifts for Training of a Speech Recognizer.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Muting Machine Speech Using Audio Watermarking.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

DNN-Based Talking Movie Generation with Face Direction Consideration.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Segmental Pitch Control Using Speech Input Based on Differential Contexts and Features for Customizable Neural Speech Synthesis.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Data Collection and Analysis for Automatically Generating Record of Human Behaviors by Environmental Sound Recognition.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Evaluation of English Speech Recognition for Japanese Learners Using DNN-Based Acoustic Models.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Improvement of Accent Sandhi Rules Based on Japanese Accent Dictionaries.
Proceedings of the Recent Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2018

Spoken Term Detection of Zero-Resource Language using Machine Learning.
Proceedings of the 2018 International Conference on Intelligent Information Technology, 2018

Effect of Mutual Self-Disclosure in Spoken Dialog System on User Impression.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

A Crowd Avoidance Method Using Circular Avoidance Path for Robust Person Following.
J. Robotics, 2017

Cluster-based approach to discriminate the user's state whether a user is embarrassed or thinking to an answer to a prompt.
J. Multimodal User Interfaces, 2017

Guest Editorial: Introduction to the Special Issue on the Enrichment of Sound, Speech and Music Media.
J. Inf. Hiding Multim. Signal Process., 2017

Manipulating Vocal Signal in Mixed Music Sounds using Side Information based on the Fundamental Frequency.
J. Inf. Hiding Multim. Signal Process., 2017

Enrichment of Audio Signal using Side Information.
J. Inf. Hiding Multim. Signal Process., 2017

Enhancement of person detection and tracking for a robot that plays with human.
Proceedings of the IEEE/SICE International Symposium on System Integration, 2017

Development and Evaluation of Julius-Compatible Interface for Kaldi ASR.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Response Selection of Interview-Based Dialog System Using User Focus and Semantic Orientation.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

A Study on 2D Photo-Realistic Facial Animation Generation Using 3D Facial Feature Points and Deep Neural Networks.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Evaluation of Nonlinear Tempo Modification Methods Based on Sinusoidal Modeling.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Dialog-Based Interactive Movie Recommendation: Comparison of Dialog Strategies.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Voice Conversion from Arbitrary Speakers Based on Deep Neural Networks with Adversarial Learning.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Detection of Singing Mistakes from Singing Voice.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Collection of Example Sentences for Non-task-Oriented Dialog Using a Spoken Dialog System and Comparison with Hand-Crafted DB.
Proceedings of the HCI International 2017 - Posters' Extended Abstracts, 2017

Analysis of efficient multimodal features for estimating user's willingness to talk: Comparison of human-machine and human-human dialog.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Investigation of Combining Various Major Language Model Technologies including Data Expansion and Adaptation.
IEICE Trans. Inf. Syst., 2016

Effectiveness of Game Jam-based iterative program for game production in Japan.
Comput. Graph., 2016

Estimation of User's Willingness to Talk About the Topic: Analysis of Interviews Between Humans.
Proceedings of the Dialogues with Social Robots, 2016

Improvements of iSuperColliderKit and its Applications.
Proceedings of the 2016 International Computer Music Conference, 2016

Multiple description vector quantizer design based on redundant representation of central code.
Proceedings of the 24th European Signal Processing Conference, 2016

Game jam based iterative curriculum for game production in Japan.
Proceedings of the SIGGRAPH Asia 2015 Symposium on Education, 2015

Playing with a Robot: Realization of "Red Light, Green Light" Using a Laser Range Finder.
Proceedings of the Third International Conference on Robot, Vision and Signal Processing, 2015

Development of a mobile robot moving on a handrail - Control for preceding a person keeping a distance.
Proceedings of the 24th IEEE International Symposium on Robot and Human Interactive Communication, 2015

Entropy-based sentence selection for speech synthesis using phonetic and prosodic contexts.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Latent words recurrent neural network language models.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Combinations of various language model technologies including data expansion and adaptation in spontaneous speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Conversion of Speaker's Face Image Using PCA and Animation Unit for Video Chatting.
Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

Tempo Modification of Mixed Music Signal by Nonlinear Time Scaling and Sinusoidal Modeling.
Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

iSuperColliderKit: A Toolkit for iOS Using an Internal SuperCollider Server as a Sound Engine.
Proceedings of the Looking Back, 2015

On Appropriateness and Estimation of the Emotion of Synthesized Response Speech in a Spoken Dialogue System.
Proceedings of the HCI International 2015 - Posters' Extended Abstracts, 2015

Hierarchical Latent Words Language Models for Robust Modeling to Out-Of Domain Tasks.
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Packet Loss Concealment of Voice-over IP Packet using Redundant Parameter Transmission Under Severe Loss Conditions.
J. Inf. Hiding Multim. Signal Process., 2014

Automatic evaluation of singing enthusiasm for karaoke.
Comput. Speech Lang., 2014

User Modeling by Using Bag-of-Behaviors for Building a Dialog System Sensitive to the Interlocutor's Internal State.
Proceedings of the SIGDIAL 2014 Conference, 2014

Analysis of spectral enhancement using global variance in HMM-based speech synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Analysis of English Pronunciation of Singing Voices Sung by Japanese Speakers.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Assessing the Intended Enthusiasm of Singing Voice Using Energy Variance.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Robot: Have I done something wrong? - Analysis of prosodic features of speech commands under the robot's unintended behavior.
Proceedings of the International Conference on Audio, 2014

A study on the effect of speech rate on perception of spoken easy Japanese using speech synthesis.
Proceedings of the International Conference on Audio, 2014

Subjective evaluation of packet loss recovery techniques for voice over IP.
Proceedings of the International Conference on Audio, 2014

Controlling Switching Pause Using an AR Agent for Interactive CALL System.
Proceedings of the HCI International 2014 - Posters' Extended Abstracts, 2014

Speech recognition in a home environment using parallel decoding with GMM-based noise modeling.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

A Packet Loss Recovery of G.729 Speech Using Discriminative Model and N-Gram.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

Acoustic Features and Auditory Impressions of Death Growl and Screaming Voice.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

Evaluation of Sinusoidal Modeling for Polyphonic Music Signal.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

Multi-modal Voice Activity Detection by Embedding Image Features into Speech Signal.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

ASAHI: OK for failure: a robot for supporting daily life, equipped with a robot avatar.
Proceedings of the ACM/IEEE International Conference on Human-Robot Interaction, 2013

Estimation of User's State during a Dialog Turn with Sequential Multi-modal Features.
Proceedings of the HCI International 2013 - Posters' Extended Abstracts, 2013

Speech recognition under noisy environments using multiple microphones based on asynchronous and intermittent measurements.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Round-Robin Duel Discriminative Language Models.
IEEE Trans. Speech Audio Process., 2012

Model Shrinkage for Discriminative Language Models.
IEICE Trans. Inf. Syst., 2012

Estimating a User's Internal State before the First Input Utterance.
Adv. Hum. Comput. Interact., 2012

Packet loss concealment of VoIP under severe loss conditions.
Proceedings of the 15th International Symposium on Wireless Personal Multimedia Communications, 2012

Spoken document retrieval by discriminative modeling in a high dimensional feature space.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Effect of Robot Height on Comfortableness of Spoken Dialog.
Proceedings of the 2012 5th International Conference on Human System Interactions, 2012

Estimation of User's Internal State before the User's First Utterance Using Acoustic Features and Face Orientation.
Proceedings of the 2012 5th International Conference on Human System Interactions, 2012

A packet loss recovery of G.729 speech under severe packet loss condition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

A spoken dialogue system using virtual conversational agent with augmented reality.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Recognition of utterances with grammatical mistakes based on optimization of language model towards interactive CALL systems.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

A Japanese lyrics writing support system for amateur songwriters.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

A System for Evaluating Singing Enthusiasm for Karaoke.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Language Model Expansion Using Webdata for Spoken Document Retrieval.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Training a Language Model Using Webdata for Large Vocabulary Japanese Spontaneous Speech Recognition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Evaluation of Abnormal Sound Detection using Multi-Stage GMM in Various Environments.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Manipulating Vocal Signal in Mixed Music Sounds Using Small Amount of Side Information.
Proceedings of the Seventh International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2011

Round-robin duel discriminative language models in one-pass decoding with on-the-fly error correction.
Proceedings of the IEEE International Conference on Acoustics, 2011

Bit rate reduction of the MELP coder using Lempel-Ziv segment quantization.
Proceedings of the IEEE International Conference on Acoustics, 2011

An Analysis of Indonesian Traditional "Wayang Kulit" Puppet 3D Shapes Based on Their Roles in the Story.
Proceedings of the 2011 Second International Conference on Culture and Computing, 2011

Multiple Description Coding Using Time Domain Division for MP3 coded Sound Signal.
J. Inf. Hiding Multim. Signal Process., 2010

Designing Side Information of Multiple Description Coding.
J. Inf. Hiding Multim. Signal Process., 2010

Information Hiding for G.711 Speech Based on Substitution of Least Significant Bits and Estimation of Tolerable Distortion.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2010

Speech Recognition under Multiple Noise Environment Based on Multi-Mixture HMM and Weight Optimization by the Aspect Model.
IEICE Trans. Inf. Syst., 2010

Improved Reference Speaker Weighting Using Aspect Model.
IEICE Trans. Inf. Syst., 2010

Construction trial of a practical education curriculum for game development by industry-university collaboration in Japan.
Comput. Graph., 2010

Document expansion using relevant web documents for spoken document retrieval.
Proceedings of the 6th International Conference on Natural Language Processing and Knowledge Engineering, 2010

An effect of formant amplitude in vowel perception.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Improvement of Packet Loss Concealment for MP3 Audio Based on Switching of Concealment Method and Estimation of MDCT Signs.
Proceedings of the Sixth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2010), 2010

Aspect-model-based reference speaker weighting.
Proceedings of the IEEE International Conference on Acoustics, 2010

A speaker adaptation method for non-native speech using learners' native utterances for computer-assisted language learning systems.
Speech Commun., 2009

Novel Tonal Feature and Statistical User Modeling for Query-by-Humming.
J. Inf. Process., 2009

Dictation of Japanese Speech Based on Kana and Kanji Character String.
Int. J. Comput. Process. Orient. Lang., 2009

Automatic Query Generation and Query Relevance Measurement for Unsupervised Language Model Adaptation of Speech Recognition.
EURASIP J. Audio Speech Music. Process., 2009

Construction trial of a practical education curriculum for game development by industry/university collaboration.
Proceedings of the International Conference on Computer Graphics and Interactive Techniques, 2009

Detailed description of triphone model using SSS-free algorithm.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Relative importance of formant and whole-spectral cues for vowel perception.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Evaluation of English intonation based on combination of multiple evaluation scores.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Data Hiding is a Better Way for Transmitting Side Information for MP3 Bitstream.
Proceedings of the Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), 2009

A Band Extension of G.711 Speech with Low Computational Cost for Data Hiding Application.
Proceedings of the Fifth International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2009), 2009

Detection of Abnormal Sound Using Multi-stage GMM for Surveillance Microphone.
Proceedings of the Fifth International Conference on Information Assurance and Security, 2009

Multiple description coding of an audio stream by optimum recovery transforms.
J. Digit. Inf. Manag., 2008

Selection of Optimum Vocabulary and Dialog Strategy for Noise-Robust Spoken Dialog Systems.
IEICE Trans. Inf. Syst., 2008

Automatic clustering of part-of-speech for vocabulary divided PLSA language model.
Proceedings of the 4th International Conference on Natural Language Processing and Knowledge Engineering, 2008

Intonation evaluation of English utterances using synthesized speech for Computer-Assisted Language Learning.
Proceedings of the 4th International Conference on Natural Language Processing and Knowledge Engineering, 2008

Recognition of English utterances with grammatical and lexical mistakes for dialogue-based CALL system.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Discrimination of task-related words for vocabulary design of spoken dialog systems.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A fast speaker adaptation method using aspect model.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Packet Loss Concealment for MDCT-Based Audio Codec Using Correlation-Based Side Information.
Proceedings of the 4th International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2008), 2008

A New Segment Quantization Using Lempel-Ziv Algorithm and Its Application to Quantization of Line Spectral Frequencies.
IEEE Trans. Commun., 2007

Music Information Retrieval from a Singing Voice Using Lyrics and Melody Information.
EURASIP J. Adv. Signal Process., 2007

Increasing Correlation using a Few Bits for Multiple Description Coding.
Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007), 2007

An effective music information retrieval method using three-dimensional continuous DP.
IEEE Trans. Multim., 2006

A grammatical error detection method for dialogue-based CALL system.
Inf. Media Technol., 2006

Music Information Retrieval from a Singing Voice Based on Verification of Recognized Hypotheses.
Proceedings of the ISMIR 2006, 2006

Unsupervised language model adaptation based on automatic text collection from WWW.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A user simulator based on voiceXML for evaluation of spoken dialog systems.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Multiple Description Coding of an Audio Stream by Optimum Recovery Transform.
Proceedings of the Second International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2006), 2006

Lyrics Recognition from a Singing Voice Based on Finite State Automaton for Music Information Retrieval.
Proceedings of the ISMIR 2005, 2005

Construction method of acoustic models dealing with various background noises based on combination of HMMs.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Pronunciation error detection method based on error rule clustering using a decision tree.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Internal noise suppression for speech recognition by small robots.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Smile and Laughter Recognition using Speech Processing and Face Recognition from Conversation Video.
Proceedings of the 4th International Conference on Cyberworlds (CW 2005), 2005

Comparison Of Features For DP-Matching Based Query-by-Humming System.
Proceedings of the ISMIR 2004, 2004

Speaker adaptation method for CALL system using bilingual speakers' utterances.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

A Japanese dialogue-based CALL system with mispronunciation and grammar error detection.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

A spoken dialog system based on automatic grammar generation and template-based weighting for autonomous mobile robots.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Noise adaptive spoken dialog system based on selection of multiple dialog strategies.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Three-dimensional continuous DP algorithm for multiple pitch candidates in a music information retrieval system.
Proceedings of the ISMIR 2003, 2003

An optimized multi-duration HMM for spontaneous speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Error Tolerant Melody Matching Method in Music Information Retrieval.
Proceedings of the Adaptive Multimedia Retrieval: First International Workshop, 2003

Erratum: Language modeling by stochastic dependency grammar for Japanese speech recognition.
Syst. Comput. Jpn., 2002

Construction and evaluation of language models based on stochastic context-free grammar for speech recognition.
Syst. Comput. Jpn., 2002

Continuous Speech Recognition Consortium: an Open Repository for CSR Tools and Models.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Language modeling by stochastic dependency grammar for Japanese speech recognition.
Syst. Comput. Jpn., 2001

New state clustering of hidden Markov network with Korean phonological rules for speech recognition.
Proceedings of the Fourth IEEE Workshop on Multimedia Signal Processing, 2001

IPA Japanese Dictation Free Software Project.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Free software toolkit for Japanese large vocabulary continuous speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

A new metric for stochastic language model evaluation.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

N-gram language model adaptation using small corpus for spoken dialog recognition.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Language modeling by string pattern n-gram for Japanese speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

A New HMnet Construction Algorithm Requiring No Contextual Factors.
IEICE Trans. Inf. Syst., 1995

Performance prediction of word recognition using the transition information between phonemes or between characters.
Syst. Comput. Jpn., 1994

A Continuous Speech Recognition System Using A Modified LVQ2 Method and A Dependency Grammar with Semantic Constraints.
Int. J. Pattern Recognit. Artif. Intell., 1994

The performance prediction method on sentence recognition system using a finite state automaton.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

A new word pre-selection method based on an extended redundant hash addressing for continuous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 1993

Word pre-selection using a redundant hash addressing method for continuous speech recognition.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

A Japanese text dictation system based on phoneme recognition and a dependency grammar.
Proceedings of the 1991 International Conference on Acoustics, 1991

A Japanese text dictation system based on phoneme recognition using a modified LVQ2 method.
Proceedings of the First International Conference on Spoken Language Processing, 1990
