Takahiro Shinozaki

ORCID: 0000-0001-8114-8450

According to our database, Takahiro Shinozaki authored at least 96 papers between 2000 and 2024.

Bibliography

2024
Self-Supervised Syllable Discovery Based on Speaker-Disentangled HuBERT.
CoRR, 2024

Deep Generic Representations for Domain-Generalized Anomalous Sound Detection.
CoRR, 2024

Self-Supervised Speaker Verification with Adaptive Threshold and Hierarchical Training.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Streaming End-to-End Target-Speaker Automatic Speech Recognition and Activity Detection.
IEEE Access, 2023

Memory Network-Based End-To-End Neural ES-KMeans for Improved Word Segmentation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Continuous Action Space-Based Spoken Language Acquisition Agent Using Residual Sentence Embedding and Transformer Decoder.
Proceedings of the IEEE International Conference on Acoustics, 2023

Multi-Domain Dialogue State Tracking with Disentangled Domain-Slot Attention.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022
Exploiting Adapters for Cross-Lingual Low-Resource Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Automatic Spoken Language Acquisition Based on Observation and Dialogue.
IEEE J. Sel. Top. Signal Process., 2022

USB: A Unified Semi-supervised Learning Benchmark.
CoRR, 2022

FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning.
CoRR, 2022

Multi-Domain Dialogue State Tracking with Top-K Slot Self Attention.
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022

Self-Adaptive Multilingual ASR Rescoring with Language Identification and Unified Language Model.
Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

USB: A Unified Semi-supervised Learning Benchmark for Classification.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Censer: Curriculum Semi-supervised Learning for Speech Recognition Based on Self-supervised Pre-training.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Augmented Adversarial Self-Supervised Learning for Early-Stage Alzheimer's Speech Detection.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Self-Supervised Learning with Multi-Target Contrastive Coding for Non-Native Acoustic Modeling of Mispronunciation Verification.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Streaming Target-Speaker ASR with Neural Transducer.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration.
Proceedings of the IEEE International Conference on Acoustics, 2022

Exploiting Unlabeled Data for Target-Oriented Opinion Words Extraction.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

Margin Calibration for Long-Tailed Visual Recognition.
Proceedings of the Asian Conference on Machine Learning, 2022

2021
Non-native acoustic modeling for mispronunciation verification based on language adversarial representation learning.
Neural Networks, 2021

Unsupervised Acoustic-to-Articulatory Inversion Neural Network Learning Based on Deterministic Policy Gradient.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021

Self-Supervised Spoken Question Understanding and Speaking with Automatic Vocabulary Learning.
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021

FlexMatch: Boosting Semi-Supervised Learning with Curriculum Pseudo Labeling.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Cross-Domain Speech Recognition with Unsupervised Character-Level Distribution Matching.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Meta-Adapter: Efficient Cross-Lingual Adaptation With Meta-Learning.
Proceedings of the IEEE International Conference on Acoustics, 2021

Low-Resource Mandarin Prosodic Structure Prediction Using Self-Training.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Unsupervised Spoken Term Discovery Using wav2vec 2.0.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020
Automated Development of DNN Based Spoken Language Systems Using Evolutionary Algorithms.
Proceedings of the Deep Neural Evolution - Deep Learning with Evolutionary Computation, 2020

Time-Domain Target-Speaker Speech Separation with Waveform-Based Speaker Embedding.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Sound-Image Grounding Based Focusing Mechanism for Efficient Automatic Spoken Language Acquisition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Pronunciation Erroneous Tendency Detection with Language Adversarial Represent Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Large-Scale End-to-End Multilingual Speech Recognition and Language Identification with Multi-Task Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Unsupervised Sound Source Localization From Audio-Image Pairs Using Input Gradient Map.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Spoken Language Acquisition Based on Reinforcement Learning and Word Unit Segmentation.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Dual Inheritance Evolution Strategy for Deep Neural Network Optimization.
Proceedings of the IEEE Congress on Evolutionary Computation, 2020

2019
Evolution-Strategy-Based Automation of System Development for High-Performance Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Effective and Stable Neuron Model Optimization Based on Aggregated CMA-ES.
Proceedings of the IEEE International Conference on Acoustics, 2019

Efficient Free Keyword Detection Based on CNN and End-to-End Continuous DP-Matching.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

Cross-Domain Speaker Recognition using Cycle-Consistent Adversarial Networks.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
Reinforcement Learning of Speech Recognition System Based on Policy Gradient and Hypothesis Selection.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

F-Measure Based End-to-End Optimization of Neural Network Keyword Detectors.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Reward Only Training of Encoder-Decoder Digit Recognition Systems Based on Policy Gradient Methods.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Evolution Strategy Based Automatic Tuning of Neural Machine Translation Systems.
Proceedings of the 14th International Conference on Spoken Language Translation, 2017

Semi-Supervised Learning of a Pronunciation Dictionary from Disjoint Phonemic Transcripts and Text.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Development and Evaluation of Julius-Compatible Interface for Kaldi ASR.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

A Study on 2D Photo-Realistic Facial Animation Generation Using 3D Facial Feature Points and Deep Neural Networks.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Voice Conversion from Arbitrary Speakers Based on Deep Neural Networks with Adversarial Learning.
Proceedings of the Advances in Intelligent Information Hiding and Multimedia Signal Processing, 2017

Composite embedding systems for ZeroSpeech2017 Track1.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

2016
Improving Eye Motion Sequence Recognition Using Electrooculography Based on Context-Dependent HMM.
Comput. Intell. Neurosci., 2016

Automated structure discovery and parameter tuning of neural network language model based on evolution strategy.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

2015
Conversion of Speaker's Face Image Using PCA and Animation Unit for Video Chatting.
Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

Structure discovery of deep neural network based on evolutionary algorithms.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Automation of system building for state-of-the-art large vocabulary speech recognition using evolution strategy.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

2014
Accent type and phrase boundary estimation using acoustic and language models for automatic prosodic labeling.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Automatic scoring method for open answer task in the SJ-CAT speaking test considering utterance difficulty level.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

An automatic input protocol recommendation method for tailored switch-to-speech communication aid systems.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
A statistical approach for person verification using human behavioral patterns.
EURASIP J. Image Video Process., 2013

Reverberant speech recognition based on denoising autoencoder.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Statistical Person Verification Using Behavioral Patterns from Complex Human Motion.
Proceedings of the New Trends in Image Analysis and Processing - ICIAP 2013, 2013

2012
Distance-based Factor Graph Linearization and Sampled Max-sum Algorithm for Efficient 3D Potential Decoding of Macromolecules.
Inf. Media Technol., 2012

HMM Based Continuous EOG Recognition for Eye-input Speech Interface.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Unsupervised CV language model adaptation based on direct likelihood maximization sentence selection.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Pipeline decomposition of speech decoders and their implementation based on delayed evaluation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Open answer scoring for S-CAT automated speaking test system using support vector regression.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

2011
Pseudo speaker models for text-independent speaker verification using rank threshold.
Proceedings of the 7th International Conference on Natural Language Processing and Knowledge Engineering, 2011

Person authentication using 3D human motion.
Proceedings of the 2011 joint ACM workshop on Human gesture and behavior understanding, 2011

Sentence Selection by Direct Likelihood Maximization for Language Model Adaptation.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

2010
Unsupervised Acoustic Model Adaptation Based on Ensemble Methods.
IEEE J. Sel. Top. Signal Process., 2010

Gaussian Mixture Optimization Based on Efficient Cross-Validation.
IEEE J. Sel. Top. Signal Process., 2010

Investigations on ensemble based unsupervised adaptation methods.
Proceedings of the IEEE International Conference on Acoustics, 2010

2009
Target speech GMM-based spectral compensation for noise robust speech recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Unsupervised cross-validation adaptation algorithms for improved adaptation performance.
Proceedings of the IEEE International Conference on Acoustics, 2009

2008
Cross-validation and aggregated EM training for robust parameter estimation.
Comput. Speech Lang., 2008

Aggregated cross-validation and its efficient application to Gaussian mixture optimization.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

GMM and HMM training by aggregated EM algorithm with increased ensemble sizes for robust parameter estimation.
Proceedings of the IEEE International Conference on Acoustics, 2008

2007
Gaussian mixture optimization for HMM based on efficient cross-validation.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Cross-Validation EM Training for Robust Parameter Estimation.
Proceedings of the IEEE International Conference on Acoustics, 2007

Model Complexity Selection and Cross-Validation EM Training for Robust Speaker Diarization.
Proceedings of the IEEE International Conference on Acoustics, 2007

HMM training based on CV-EM and CV Gaussian mixture optimization.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

2006
Investigation on Mandarin broadcast news speech recognition.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

HMM State Clustering Based on Efficient Cross-Validation.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Pushing the envelope - aside [speech recognition].
IEEE Signal Process. Mag., 2005

Data sampling for improved speech recognizer training.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Cluster-based modeling for ubiquitous speech recognition.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004
Dynamic Bayesian Network-Based Acoustic Models Incorporating Speaking Rate Effects.
IEICE Trans. Inf. Syst., 2004

Spontaneous speech recognition using a massively parallel decoder.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

2003
Time adjustable mixture weights for speaking rate fluctuation.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Unsupervised class-based language model adaptation for spontaneous speech recognition.
Proceedings of the 2003 IEEE International Conference on Acoustics, 2003

2002
A new lexicon optimization method for LVCSR based on linguistic and acoustic characteristics of words.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Analysis on individual differences in automatic transcription of spontaneous presentations.
Proceedings of the IEEE International Conference on Acoustics, 2002

2001
Towards automatic transcription of spontaneous presentations.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Ubiquitous speech processing.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Toward the realization of spontaneous speech recognition - introduction of a Japanese priority program and preliminary results -.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

