Takahiro Shinozaki

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Unsupervised Spoken Term Discovery Using wav2vec 2.0.

Yu Iwamoto

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Automated Development of DNN Based Spoken Language Systems Using Evolutionary Algorithms.

Shinji Watanabe

Kevin Duh

Proceedings of the Deep Neural Evolution - Deep Learning with Evolutionary Computation, 2020

Time-Domain Target-Speaker Speech Separation with Waveform-Based Speaker Embedding.

Jianshu Zhao

Shengzhou Gao

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Sound-Image Grounding Based Focusing Mechanism for Efficient Automatic Spoken Language Acquisition.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Pronunciation Erroneous Tendency Detection with Language Adversarial Represent Learning.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Sound Source Localization From Audio-Image Pairs Using Input Gradient Map.

Tomohiro Tanaka

Proceedings of the 25th International Conference on Pattern Recognition, 2020

Spoken Language Acquisition Based on Reinforcement Learning and Word Unit Segmentation.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Dual Inheritance Evolution Strategy for Deep Neural Network Optimization.

Proceedings of the IEEE Congress on Evolutionary Computation, 2020

2019

Evolution-Strategy-Based Automation of System Development for High-Performance Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2019

Effective and Stable Neuron Model Optimization Based on Aggregated CMA-ES.

Han Xu

Ryota Kobayashi

Proceedings of the IEEE International Conference on Acoustics, 2019

Efficient Free Keyword Detection Based on CNN and End-to-End Continuous DP-Matching.

Tomohiro Tanaka

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Cross-Domain Speaker Recognition using Cycle-Consistent Adversarial Networks.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection.

Taku Kato

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

F-Measure Based End-to-End Optimization of Neural Network Keyword Detectors.

Tomohiro Tanaka

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Reward Only Training of Encoder-Decoder Digit Recognition Systems Based on Policy Gradient Methods.

Yilong Peng

Hayato Shibata

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Evolution Strategy Based Automatic Tuning of Neural Machine Translation Systems.

Hao Qin

Kevin Duh

Proceedings of the 14th International Conference on Spoken Language Translation, 2017

Semi-Supervised Learning of a Pronunciation Dictionary from Disjoint Phonemic Transcripts and Text.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Development and Evaluation of Julius-Compatible Interface for Kaldi ASR.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

A Study on 2D Photo-Realistic Facial Animation Generation Using 3D Facial Feature Points and Deep Neural Networks.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Voice Conversion from Arbitrary Speakers Based on Deep Neural Networks with Adversarial Learning.

Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Composite embedding systems for ZeroSpeech2017 Track1.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016

Improving Eye Motion Sequence Recognition Using Electrooculography Based on Context-Dependent HMM.

Comput. Intell. Neurosci., 2016

Automated structure discovery and parameter tuning of neural network language model based on evolution strategy.

Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

2015

Conversion of Speaker's Face Image Using PCA and Animation Unit for Video Chatting.

Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

Structure discovery of deep neural network based on evolutionary algorithms.

Shinji Watanabe

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Automation of system building for state-of-the-art large vocabulary speech recognition using evolution strategy.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014

Accent type and phrase boundary estimation using acoustic and language models for automatic prosodic labeling.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Automatic scoring method for open answer task in the SJ-CAT speaking test considering utterance difficulty level.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

An automatic input protocol recommendation method for tailored switch-to-speech communication aid systems.

Fuming Fang

Takao Kobayashi

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

A statistical approach for person verification using human behavioral patterns.

Felipe Gómez-Caballero

Koichi Shinoda

EURASIP J. Image Video Process., 2013

Reverberant speech recognition based on denoising autoencoder.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Statistical Person Verification Using Behavioral Patterns from Complex Human Motion.

Felipe Gómez-Caballero

Koichi Shinoda

Proceedings of the New Trends in Image Analysis and Processing - ICIAP 2013, 2013

2012

Distance-based Factor Graph Linearization and Sampled Max-sum Algorithm for Efficient 3D Potential Decoding of Macromolecules.

Inf. Media Technol., 2012

HMM Based Continuous EOG Recognition for Eye-input Speech Interface.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Unsupervised CV language model adaptation based on direct likelihood maximization sentence selection.

Yasuo Horiuchi

Shingo Kuroiwa

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Pipeline decomposition of speech decoders and their implementation based on delayed evaluation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Open answer scoring for S-CAT automated speaking test system using support vector regression.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011

Pseudo speaker models for text-independent speaker verification using rank threshold.

Proceedings of the 7th International Conference on Natural Language Processing and Knowledge Engineering, 2011

Person authentication using 3D human motion.

Felipe Gómez-Caballero

Koichi Shinoda

Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding, 2011

Sentence Selection by Direct Likelihood Maximization for Language Model Adaptation.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010

Unsupervised Acoustic Model Adaptation Based on Ensemble Methods.

Yu Kubota

IEEE J. Sel. Top. Signal Process., 2010

Gaussian Mixture Optimization Based on Efficient Cross-Validation.

IEEE J. Sel. Top. Signal Process., 2010

Investigations on ensemble based unsupervised adaptation methods.

Yu Kubota

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Target speech GMM-based spectral compensation for noise robust speech recognition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Unsupervised cross-validation adaptation algorithms for improved adaptation performance.

Yu Kubota

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

Cross-validation and aggregated EM training for robust parameter estimation.

Mari Ostendorf

Comput. Speech Lang., 2008

Aggregated cross-validation and its efficient application to Gaussian mixture optimization.

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation.

Proceedings of the IEEE International Conference on Acoustics, 2008

2007

Gaussian mixture optimization for HMM based on efficient cross-validation.

Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Cross-Validation EM Training for Robust Parameter Estimation.

Mari Ostendorf

Proceedings of the IEEE International Conference on Acoustics, 2007

Model Complexity Selection and Cross-Validation EM Training for Robust Speaker Diarization.

Proceedings of the IEEE International Conference on Acoustics, 2007

HMM training based on CV-EM and CV Gaussian mixture optimization.

Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006

Investigation on Mandarin broadcast news speech recognition.

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

HMM State Clustering Based on Efficient Cross-Validation.

Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005

Pushing the envelope - aside [speech recognition].

IEEE Signal Process. Mag., 2005

Data sampling for improved speech recognizer training.

Mari Ostendorf

Les E. Atlas

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Cluster-based modeling for ubiquitous speech recognition.

Tomohisa Ichiba

Edward W. D. Whittaker

Koji Iwano

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004

Dynamic Bayesian Network-Based Acoustic Models Incorporating Speaking Rate Effects.

IEICE Trans. Inf. Syst., 2004

Spontaneous speech recognition using a massively parallel decoder.

Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2003

Time adjustable mixture weights for speaking rate fluctuation.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Unsupervised class-based language model adaptation for spontaneous speech recognition.

Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002

A new lexicon optimization method for LVCSR based on linguistic and acoustic characteristics of words.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Analysis on individual differences in automatic transcription of spontaneous presentations.

Proceedings of the IEEE International Conference on Acoustics, 2002

2001

Towards automatic transcription of spontaneous presentations.

Chiori Hori