Katunobu Itou

Affiliations:
  • Hosei University, Tokyo, Japan
  • Nagoya University, Graduate School of Information Science, Japan
  • National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
  • Tokyo Institute of Technology, Japan


According to our database, Katunobu Itou authored at least 91 papers between 1990 and 2022.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2022
Homophonic Music Composition Using Pipelined LSTMs for Melody and Harmony Generation.
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2022

Cross-Lingual Transfer Learning Approach to Phoneme Error Detection via Latent Phonetic Representation.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2020
F0 Estimation Using Blind Source Separation for Analyzing Noh Singing.
Proceedings of the 30th IEEE International Workshop on Machine Learning for Signal Processing, 2020

2019
Voice authentication by text dependent single utterance for in-car environment.
Proceedings of the Tenth International Symposium on Information and Communication Technology, 2019

2018
DNN-Based Near- and Far-Field Source Separation Using Spherical-Harmonic-Analysis-Based Acoustic Features.
Proceedings of the 16th International Workshop on Acoustic Signal Enhancement, 2018

Automatic Electronic Organ Reduction System Based on Melody Clustering Considering Melodic and Instrumental Characteristics.
Proceedings of the 2018 IEEE International Symposium on Multimedia, 2018

2016
Prominence detection for presentation training system.
Proceedings of the Seventh Symposium on Information and Communication Technology, 2016

2014
Intra-note segmentation via sticky HMM with DP emission.
Proceedings of the IEEE International Conference on Acoustics, 2014

2012
Discriminant analysis of the utterance state while singing.
Proceedings of the IEEE International Symposium on Signal Processing and Information Technology, 2012

2010
Speaker model updating by the conversational sounds in speaker verification.
Proceedings of the iiWAS'2010, 2010

2009
Construction of a Test Collection for Spoken Document Retrieval from Lecture Audio Data.
J. Inf. Process., 2009

The use of acoustically detected filled and silent pauses in spontaneous speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Effect of the Topic Dependent Translation Models for Patent Translation - Experiment at NTCIR-7.
Proceedings of the 7th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2008

In-car Speech Data Collection along with Various Multimodal Signals.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

Test Collections for Spoken Document Retrieval from Lecture Audio Data.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

2007
Language model adaptation for fixed phrases by amplifying partial n-gram sequences.
Syst. Comput. Jpn., 2007

Driver Modeling Based on Driving Behavior and Its Evaluation in Driver Identification.
Proc. IEEE, 2007

Non-factoid Question Answering Experiments at NTCIR-6: Towards Answer Type Detection for Realworld Questions.
Proceedings of the 6th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2007

Statistical Machine Translation based Passage Retrieval for Cross-Lingual Question Answering --- Experiments at NTCIR-6.
Proceedings of the 6th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2007

A Stochastic Representation of the Dynamics of Sung Melody.
Proceedings of the 8th International Conference on Music Information Retrieval, 2007

Statistical segmentation and recognition of fingertip trajectories for a gesture interface.
Proceedings of the 9th International Conference on Multimodal Interfaces, 2007

2006
LODEM: A system for on-demand video lectures.
Speech Commun., 2006

Driver Identification Using Driving Behavior Signals.
IEICE Trans. Inf. Syst., 2006

Single-Channel Multiple Regression for In-Car Speech Enhancement.
IEICE Trans. Inf. Syst., 2006

Statistical Analysis for Thesaurus Construction using an Encyclopedic Corpus.
Proceedings of the Fifth International Conference on Language Resources and Evaluation, 2006

Characterizing in-Car Conversational Speech of Different Dialogue Modes.
Proceedings of the First International Conference on Innovative Computing, Information and Control (ICICIC 2006), 30 August, 2006

Cepstral Analysis of Driving Behavioral Signals for Driver Identification.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Adaptive Regression Based Framework for In-Car Speech Recognition.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

Development of Micro-Dodecahedral Loudspeaker for Measuring Head-Related Transfer Functions in The Proximal region.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Construction and Evaluation of a Large In-Car Speech Corpus.
IEICE Trans. Inf. Syst., 2005

Multiple Regression of Log Spectra for In-Car Speech Recognition Using Multiple Distributed Microphones.
IEICE Trans. Inf. Syst., 2005

Adaptive Nonlinear Regression Using Multiple Distributed Microphones for In-Car Speech Recognition.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2005

Speech Recognition Using Finger Tapping Timings.
IEICE Trans. Inf. Syst., 2005

Cyclone: an encyclopedic web search site.
Proceedings of the 14th international conference on World Wide Web, 2005

Bi-directional Cross Language Question Answering using a Single Monolingual QA System.
Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

Question Answering Experiments at NTCIR-5: Acquisition of Answer Evaluation Patterns and Context Processing using Passage Retrieval.
Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

Exploiting Anchor Text for the Navigational Web Retrieval at NTCIR-5.
Proceedings of the Fifth NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2005

Modeling of individualities in driving through spectral analysis of behavioral signals.
Proceedings of the Eighth International Symposium on Signal Processing and Its Applications, 2005

Data collection and evaluation of speech recognition for motorbike riders.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Discrimination between singing and speaking voices.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Subjective and objective quality assessment of regression-enhanced speech in real car environments.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Improved Noise Spectra Estimation and Log-spectral Regression for In-car Speech Recognition.
Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

Analysis of a large in-car speech corpus.
Proceedings of the 21st International Conference on Data Engineering Workshops, 2005

Two-stage Noise Spectra Estimation and Regression based In-car Speech Recognition using Single Distant Microphone.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Analysis of a large in-car speech corpus and its application to the multimodel ASR.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Galatea: Open-Source Software for Developing Anthropomorphic Spoken Dialog Agents.
Proceedings of the Life-like characters - tools, affective functions, and applications, 2004

In-Car Speech Recognition Using Distributed Multiple Microphones.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Question Answering Using "Common Sense" and Utility Maximization Principle.
Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

Experiments on Web Retrieval Driven by Spontaneously Spoken Queries.
Proceedings of the Fourth NTCIR Workshop on Research in Information Access Technologies Information Retrieval, 2004

Collecting Spontaneously Spoken Queries for Information Retrieval.
Proceedings of the Fourth International Conference on Language Resources and Evaluation, 2004

Speech-Recognition Interfaces for Music Information Retrieval: 'Speech Completion' and 'Speech Spotter'.
Proceedings of the ISMIR 2004, 2004

Recent progress of open-source LVCSR engine julius and Japanese model repository.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Effects of language modeling on speech-driven question answering.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Speech spotter: on-demand speech recognition in human-human conversation on the telephone or in face-to-face situations.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Analysis of in-car speech recognition experiments using a large-scale multi-mode dialogue corpus.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Unsupervised topic adaptation for lecture speech retrieval.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Speech recognition using synchronization between speech and finger tapping.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Biometric identification using driving behavioral signals.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

2003
Speech starter: noise-robust endpoint detection by using filled pauses.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Speech shift: direct speech-input-mode switching through intentional control of voice pitch.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

A cross-media retrieval system for lecture videos.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Building a test collection for speech-driven web retrieval.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Adapting language models for frequent fixed phrases by emphasizing n-gram subsets.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Language Modeling for Multi-Domain Speech-Driven Text Retrieval.
CoRR, 2002

Evaluating Speech-Driven IR in the NTCIR-3 Web Retrieval Task.
Proceedings of the Third NTCIR Workshop on Research in Information Retrieval, 2002

Towards Speech-Driven Question Answering: Experiments Using the NTCIR-3 Question Answering Collection.
Proceedings of the Third NTCIR Workshop on Research in Information Retrieval, 2002

Continuous Speech Recognition Consortium an Open Repository for CSR Tools and Models.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Producing a Large-scale Encyclopedic Corpus over the Web.
Proceedings of the Third International Conference on Language Resources and Evaluation, 2002

Speech completion: on-demand completion assistance using filled pauses for speech input interfaces.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Selective back-off smoothing for incorporating grammatical constraints into the n-gram language model.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A Method for Open-Vocabulary Speech-Driven Text Retrieval.
Proceedings of the 2002 Conference on Empirical Methods in Natural Language Processing, 2002

2001
Jijo-2: An Office Robot that Communicates and Learns.
IEEE Intell. Syst., 2001

Speech-Driven Text Retrieval: Using Target IR Collections for Statistical Language Model Adaptation in Speech Recognition.
Proceedings of the Information Retrieval Techniques for Speech Applications [this book is based on the workshop "Information Retrieval Techniques for Speech Applications"], 2001

Spoken Language Interface of the Jijo-2 Office Robot.
Proceedings of the Robotics Research, The Tenth International Symposium, 2001

Real-time sound source localization and separation system and its application to automatic speech recognition.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

A structured statistical language model conditioned by arbitrarily abstracted grammatical categories based on GLR parsing.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

2000
IPA Japanese Dictation Free Software Project.
Proceedings of the Second International Conference on Language Resources and Evaluation, 2000

Free software toolkit for Japanese large vocabulary continuous speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Semi-automatic language model acquisition without large corpora.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

1999
A real-time filled pause detection system for spontaneous speech recognition.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

1998
Sharable software repository for Japanese large vocabulary continuous speech recognition.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

The design of the newspaper-based Japanese large vocabulary continuous speech recognition corpus.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

1996
RWC multimodal database for interactions by integration of spoken language and visual information.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

1995
Active Agent Oriented Multimodal Interface System.
Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995

1994
Annotating illocutionary force types and phonological features into a spontaneous dialogue corpus: an experimental study.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Collecting and analyzing nonverbal elements for maintenance of dialog using a wizard of oz simulation.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

1993
Detection of unknown words in large vocabulary speech recognition.
Proceedings of the Third European Conference on Speech Communication and Technology, 1993

1992
Detection of unknown words and automatic estimation of their transcriptions in continuous speech recognition.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

A spoken language dialogue system for automatic collection of spontaneous speech.
Proceedings of the Second International Conference on Spoken Language Processing, 1992

Continuous speech recognition by context-dependent phonetic HMM and an efficient algorithm for finding N-Best sentence hypotheses.
Proceedings of the 1992 IEEE International Conference on Acoustics, 1992

1990
Japanese phonetic typewriter using HMM phone units and syllable trigrams.
Proceedings of the First International Conference on Spoken Language Processing, 1990

