Satoshi Tamura

According to our database, Satoshi Tamura authored at least 73 papers between 2001 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.



2024 Country Report Timor Leste.
Proceedings of the 27th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2024

Speech Recognition for Indigenous Language Using Self-Supervised Learning and Natural Language Processing.
Proceedings of the 13th International Conference on Pattern Recognition Applications and Methods, 2024

Few-Shot Anomalous Sound Detection Based on Anomaly Map Estimation Using Pseudo Abnormal Data.
Proceedings of the IEEE International Conference on Acoustics, 2024

Speech Recognition for Minority Languages Using HuBERT and Model Adaptation.
Proceedings of the 12th International Conference on Pattern Recognition Applications and Methods, 2023

Visual-only Voice Activity Detection using Human Motion in Conference Video.
Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods, 2022

Efficient Multi-angle Audio-visual Speech Recognition using Parallel WaveGAN based Scene Classifier.
Proceedings of the 11th International Conference on Pattern Recognition Applications and Methods, 2022

Multi-Angle Lipreading with Angle Classification-Based Feature Extraction and Its Application to Audio-Visual Speech Recognition.
Future Internet, 2021

GAMVA: A Japanese Audio-Visual Multi-Angle Speech Corpus.
Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021

Combination of temporal and spatial denoising methods for cine MRI.
Proceedings of the 3rd IEEE Global Conference on Life Sciences and Technologies, 2021

Speech Recognition using Deep Canonical Correlation Analysis in Noisy Environments.
Proceedings of the 10th International Conference on Pattern Recognition Applications and Methods, 2021

Anomalous Sound Detection Based On Attention Mechanism.
Proceedings of the 29th European Signal Processing Conference, 2021

Multi-view Convolution for Lipreading.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

Multi-angle lipreading using angle classification and angle-specific feature integration.
Proceedings of the International Conference on Communications, 2020

Feature Extraction Methods Proposed for Speech Recognition Are Effective on Road Condition Monitoring Using Smartphone Inertial Sensors.
Sensors, 2019

A Deep Learning-Based Approach for Road Pothole Detection in Timor Leste.
Proceedings of the 2018 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI), Singapore, Singapore, July 31, 2018

An Automatic Survey System for Paved and Unpaved Road Classification and Road Anomaly Detection using Smartphone Sensor.
Proceedings of the 2018 IEEE International Conference on Service Operations and Logistics, and Informatics (SOLI), Singapore, Singapore, July 31, 2018

Audio-visual Voice Conversion Using Deep Canonical Correlation Analysis for Deep Bottleneck Features.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Toward a High Performance Piano Practice Support System for Beginners.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Lipreading using deep bottleneck features for optical and depth images.
Proceedings of the 14th International Conference on Auditory-Visual Speech Processing, 2017

Toward effective noise reduction for sub-Nyquist high-frame-rate MRI techniques with deep learning.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Swallowing function evaluation using deep-learning-based acoustic signal processing.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Investigation of DNN-Based Audio-Visual Speech Recognition.
IEICE Trans. Inf. Syst., 2016

A fully integrated GaN-based power IC including gate drivers for high-efficiency DC-DC Converters.
Proceedings of the 2016 IEEE Symposium on VLSI Circuits, 2016

Spoken Document Retrieval Using Neighboring Documents and Extended Language Models for Query Likelihood Model.
Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

Investigation of clinical process visualization using EMR data in clinics.
Proceedings of the AMIA 2016, 2016

Integration of deep bottleneck features for audio-visual speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multi-modal service operation estimation using DNN-based acoustic bag-of-features.
Proceedings of the 23rd European Signal Processing Conference, 2015

Stream weight estimation using higher order statistics in multi-modal speech recognition.
Proceedings of the Auditory-Visual Speech Processing, 2015

Audio-visual speech recognition using deep bottleneck features and high-performance lipreading.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Data collection for mobile audio-visual speech recognition in various environments.
Proceedings of the 2014 17th Oriental Chapter of the International Committee for the Co-ordination and Standardization of Speech Databases and Assessment Techniques (COCOSDA), 2014

Segmented Spoken Document Retrieval Using Word Co-occurrence Information.
Proceedings of the 11th NTCIR Conference on Evaluation of Information Access Technologies, 2014

Audio-visual voice conversion using noise-robust features.
Proceedings of the IEEE International Conference on Acoustics, 2014

Improvement of utterance clustering by using employees' sound and area data.
Proceedings of the IEEE International Conference on Acoustics, 2014

Analysis of customer communication by employee in restaurant and lead time estimation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Probabilistic expression of Polynomial Semantic Indexing and its application for classification.
Pattern Recognit. Lett., 2013

Measurement and analysis of speech data toward improving service in restaurant.
Proceedings of the 2013 International Conference Oriental COCOSDA held jointly with 2013 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2013

Spoken Document Retrieval Using Extended Query Model and Web Documents.
Proceedings of the 10th NTCIR Conference on Evaluation of Information Access Technologies, 2013

Hidden Markov Model for Analyzing Time-Series Health Checkup Data.
Proceedings of the MEDINFO 2013, 2013

Improvement of lipreading performance using discriminative feature and speaker adaptation.
Proceedings of the Auditory-Visual Speech Processing, 2013

Audio-visual interaction in sparse representation features for noise robust audio-visual speech recognition.
Proceedings of the Auditory-Visual Speech Processing, 2013

Confidence estimation and keyword extraction from speech recognition result based on Web information.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

Time-series analysis of health checkup data using Hidden-Markov model.
Proceedings of the AMIA 2013, 2013

Improvement of Lip Reading Performance in Real Environments Using Speaker and Environmental Adaptation.
Proceedings of the 2nd IAPR Asian Conference on Pattern Recognition, 2013

Visual Analysis of Health Checkup Data Using Multidimensional Scaling.
J. Adv. Comput. Intell. Intell. Informatics, 2012

Sparse representation of audio features for sputum detection from lung sounds.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

GIF-LR: GA-based informative feature for lipreading.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

GIF-SP: GA-based informative feature for noisy speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Multi-stream acoustic model adaptation for noisy speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Feature reconstruction using sparse imputation for noise robust audio-visual speech recognition.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Statistical voice conversion using GA-based informative feature.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Toward polyphonic musical instrument identification using example-based sparse representation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Toward improvement of SDR accuracy using LDA and query expansion for SpokenDoc.
Proceedings of the 9th NTCIR Workshop Meeting on Evaluation of Information Access Technologies: Information Retrieval, 2011

A robust audio-visual speech recognition using audio-visual voice activity detection.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Template-based spectral estimation using microphone array for speech recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

CENSREC-1-AV: an audio-visual corpus for noisy bimodal speech recognition.
Proceedings of the Auditory-Visual Speech Processing, 2010

Decision fusion by boosting method for multi-modal voice activity detection.
Proceedings of the Auditory-Visual Speech Processing, 2010

Evaluation of real-time audio-visual speech recognition.
Proceedings of the Auditory-Visual Speech Processing, 2010

Voice activity detection based on fusion of audio and visual information.
Proceedings of the Auditory-Visual Speech Processing, 2009

Evaluation Framework for Distant-talking Speech Recognition under Reverberant Environments: newest Part of the CENSREC Series.
Proceedings of the International Conference on Language Resources and Evaluation, 2008

CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

CENSREC-AV: evaluation frameworks for audio-visual speech recognition.
Proceedings of the International Conference on Auditory-Visual Speech Processing 2008, 2008

Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images.
EURASIP J. Audio Speech Music. Process., 2007

GEMSIS - a novel application of speech recognition to emergency and disaster medicine.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Development of VAD evaluation framework CENSREC-1-C and investigation of relationship between VAD and speech recognition performance.
Proceedings of the IEEE Workshop on Automatic Speech Recognition & Understanding, 2007

Note-Taking Support for Nurses Using Digital Pen Character Recognition System.
Proceedings of the Interactive Technologies and Sociotechnical Systems, 2006

Automatic metadata generation and video editing based on speech and image recognition for medical education contents.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

A Stream-Weight Optimization Method for Multi-Stream HMMs Based on Likelihood Value Normalization.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

Multi-Modal Speech Recognition Using Optical-Flow Analysis for Lip Images.
J. VLSI Signal Process., 2004

A stream-weight optimization method for audio-visual speech recognition using multi-stream HMMs.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

Audio-visual speech recognition using lip movement extracted from side-face images.
Proceedings of the AVSP 2003, 2003

Multi-Modal Temporal Asynchronicity Modeling by Product HMMs for Robust.
Proceedings of the 4th IEEE International Conference on Multimodal Interfaces (ICMI 2002), 2002

Robust bi-modal speech recognition based on state synchronous modeling and stream weight optimization.
Proceedings of the IEEE International Conference on Acoustics, 2002

Ubiquitous speech processing.
Proceedings of the IEEE International Conference on Acoustics, 2001
