Yasuo Ariki

ORCID: 0000-0003-3473-2026

According to our database, Yasuo Ariki authored at least 238 papers between 1984 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Integrating Textual and Financial Time Series Data for Enhanced Forecasting.
Proceedings of the 16th IIAI International Congress on Advanced Applied Informatics, 2024

2023
Reversible designs for extreme memory cost reduction of CNN training.
EURASIP J. Image Video Process., 2023

Rule-based Fact Verification Utilizing Knowledge Graphs.
Proceedings of the Workshop, 2023

2022
Building a Knowledge-Based Dialogue System with Text Infilling.
Proceedings of the 23rd Annual Meeting of the Special Interest Group on Discourse and Dialogue, 2022

2021
Unsupervised domain adaptation for lip reading based on cross-modal knowledge distillation.
EURASIP J. Audio Speech Music. Process., 2021

2020
Dysarthric Speech Recognition Based on Deep Metric Learning.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Two-Step Acoustic Model Adaptation for Dysarthric Speech Recognition.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Convolutional neural networks Memory optimization Inference with Splitting Image.
Proceedings of the 9th IEEE Global Conference on Consumer Electronics, 2020

FasterRCNN Monitoring of Road Damages: Competition and Deployment.
Proceedings of the 2020 IEEE International Conference on Big Data (IEEE BigData 2020), 2020

2019
Emotional Voice Conversion Using Dual Supervised Adversarial Networks With Continuous Wavelet Transform F0 Features.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Semantic embeddings of generic objects for zero-shot learning.
EURASIP J. Image Video Process., 2019

Non-parallel dictionary learning for voice conversion using non-negative Tucker decomposition.
EURASIP J. Audio Speech Music. Process., 2019

Reversible designs for extreme memory cost reduction of CNN training.
CoRR, 2019

Knowledge Transferability Between the Speech Data of Persons With Dysarthria Speaking Different Languages for Dysarthric Speech Recognition.
IEEE Access, 2019

Generation of Objections Using Topic and Claim Information in Debate Dialogue System.
Proceedings of the Increasing Naturalness and Flexibility in Spoken Dialogue Interaction, 2019

Layer-Wise Invertibility for Extreme Memory Cost Reduction of CNN Training.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Assisting human experts in the interpretation of their visual process: A case study on assessing copper surface adhesive potency.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

End-to-end Dysarthric Speech Recognition Using Multiple Databases.
Proceedings of the IEEE International Conference on Acoustics, 2019

On Zero-Shot Recognition of Generic Objects.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Debate Dialog for News Question Answering System 'NetTv'-Debate Based on Claim and Reason Estimation-.
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Chat Response Generation Based on Semantic Prediction Using Distributed Representations of Words.
Proceedings of the 9th International Workshop on Spoken Dialogue System Technology, 2018

Parallel-Data-Free Dictionary Learning for Voice Conversion Using Non-Negative Tucker Decomposition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

User's Intention Understanding in Question-Answering System Using Attention-based LSTM.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017
Rotation-reversal invariant HOG cascade for facial expression recognition.
Signal Image Video Process., 2017

Emotional voice conversion using neural networks with arbitrary scales F0 based on wavelet transform.
EURASIP J. Audio Speech Music. Process., 2017

Visual-to-speech conversion based on maximum likelihood estimation.
Proceedings of the Fifteenth IAPR International Conference on Machine Vision Applications, 2017

Emotional Voice Conversion with Adaptive Scales F0 Based on Wavelet Transform Using Limited Amount of Emotional Data.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Phoneme-Discriminative Features for Dysarthric Speech Conversion.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Facial Expression Recognition with deep age.
Proceedings of the 2017 IEEE International Conference on Multimedia & Expo Workshops, 2017

Spatiotemporal properties of magnetic fields induced by auditory speech sound imagery and perception.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

2016
Multiple Non-Negative Matrix Factorization for Many-to-Many Voice Conversion.
IEEE ACM Trans. Audio Speech Lang. Process., 2016

LLC Revisit: Scene Classification with k-Farthest Neighbours.
IEICE Trans. Inf. Syst., 2016

Multithreading cascade of SURF for facial expression recognition.
EURASIP J. Image Video Process., 2016

Emotional Voice Conversion Using Neural Networks with Different Temporal Scales of F0 based on Wavelet Transform.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

Audio-Visual Speech Recognition Using Bimodal-Trained Bottleneck Features for a Person with Severe Hearing Loss.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Parallel Dictionary Learning for Voice Conversion Using Discriminative Graph-Embedded Non-Negative Matrix Factorization.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Modeling deep bidirectional relationships for image classification and generation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Semi-non-negative matrix factorization using alternating direction method of multipliers for voice conversion.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Expression Recognition with Ri-HOG Cascade.
Proceedings of the Computer Vision - ACCV 2016 Workshops, 2016

Emotional voice conversion using deep neural networks with MCC and F0 features.
Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

Lip reading using a dynamic feature of lip images and convolutional neural networks.
Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

Selection of an optimum random matrix using a genetic algorithm for acoustic feature extraction.
Proceedings of the 15th IEEE/ACIS International Conference on Computer and Information Science, 2016

2015
Voice Conversion Using RNN Pre-Trained by Recurrent Temporal Restricted Boltzmann Machines.
IEEE ACM Trans. Audio Speech Lang. Process., 2015

Individuality-Preserving Voice Conversion for Articulation Disorders Using Phoneme-Categorized Exemplars.
ACM Trans. Access. Comput., 2015

Audio-Visual Speech Recognition Using Convolutive Bottleneck Networks for a Person with Severe Hearing Loss.
IPSJ Trans. Comput. Vis. Appl., 2015

Discriminating Unknown Objects from Known Objects Using Image and Speech Information.
IEICE Trans. Inf. Syst., 2015

A robust SVM classification framework using PSM for multi-class recognition.
EURASIP J. Image Video Process., 2015

Voice conversion using speaker-dependent conditional restricted Boltzmann machine.
EURASIP J. Audio Speech Music. Process., 2015

Multimodal voice conversion based on non-negative matrix factorization.
EURASIP J. Audio Speech Music. Process., 2015

Small-parallel exemplar-based voice conversion in noisy environments using affine non-negative matrix factorization.
EURASIP J. Audio Speech Music. Process., 2015

Many-to-one voice conversion using exemplar-based sparse representation.
Proceedings of the 2015 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2015

Individuality-Preserving Spectrum Modification for Articulation Disorders Using Phone Selective Synthesis.
Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, 2015

Content-based Image Retrieval Using Rotation-invariant Histograms of Oriented Gradients.
Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Many-to-many voice conversion based on multiple non-negative matrix factorization.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Word-Error Correction of Continuous Speech Recognition Based on Normalized Relevance Distance.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Individuality-Preserving Voice Reconstruction for Articulation Disorders Using Text-to-Speech Synthesis.
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction, Seattle, WA, USA, November 09, 2015

Sparse nonlinear representation for voice conversion.
Proceedings of the 2015 IEEE International Conference on Multimedia and Expo, 2015

Multithreading AdaBoost framework for object recognition.
Proceedings of the 2015 IEEE International Conference on Image Processing, 2015

Activity-mapping non-negative matrix factorization for exemplar-based voice conversion.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Color saliency for object identification.
Proceedings of the 21st Korea-Japan Joint Workshop on Frontiers of Computer Vision, 2015

Estimation of object functions using deformable part model.
Proceedings of the 21st Korea-Japan Joint Workshop on Frontiers of Computer Vision, 2015

Feature extraction using pre-trained convolutive bottleneck nets for dysarthric speech recognition.
Proceedings of the 23rd European Signal Processing Conference, 2015

Noise-robust voice conversion using a small parallel data based on non-negative matrix factorization.
Proceedings of the 23rd European Signal Processing Conference, 2015

Detection of facial parts via deformable part model using part annotation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Rotation-invariant histograms of oriented gradients for local patch robust representation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Facial expression recognition with multithreaded cascade of rotation-invariant HOG.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
Voice Conversion Based on Speaker-Dependent Restricted Boltzmann Machines.
IEICE Trans. Inf. Syst., 2014

Noise-Robust Voice Conversion Based on Sparse Spectral Mapping Using Non-negative Matrix Factorization.
IEICE Trans. Inf. Syst., 2014

A preliminary demonstration of exemplar-based voice conversion for articulation disorders using an individuality-preserving dictionary.
EURASIP J. Audio Speech Music. Process., 2014

Individuality-preserving Voice Conversion for Articulation Disorders Using Dictionary Selective Non-negative Matrix Factorization.
Proceedings of the 5th Workshop on Speech and Language Processing for Assistive Technologies, 2014

High-order sequence modeling using speaker-dependent recurrent temporal restricted boltzmann machines for voice conversion.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Multimodal exemplar-based voice conversion using lip features in noisy environments.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Error correction of automatic speech recognition based on normalized web distance.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Selection of Unknown Objects Specified by Speech Using Models Constructed from Web Images.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

3D-Object Recognition Based on LLC Using Depth Spatial Pyramid.
Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Selection of an Object Requested by Speech Based on Generic Object Recognition.
Proceedings of the 2014 Workshop on Multimodal, 2014

Voice conversion in time-invariant speaker-independent space.
Proceedings of the IEEE International Conference on Acoustics, 2014

Multimodal voice conversion using non-negative matrix factorization in noisy environments.
Proceedings of the IEEE International Conference on Acoustics, 2014

Voice conversion based on Non-negative matrix factorization using phoneme-categorized dictionary.
Proceedings of the IEEE International Conference on Acoustics, 2014

Exemplar-based emotional voice conversion using non-negative matrix factorization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Task-Driven Saliency Detection on Music Video.
Proceedings of the Computer Vision - ACCV 2014 Workshops, 2014

A Robust Learning Framework Using PSM and Ameliorated SVMs for Emotional Recognition.
Proceedings of the Computer Vision - ACCV 2014 Workshops, 2014

2013
Exemplar-Based Voice Conversion Using Sparse Representation in Noisy Environments.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2013

Noise-robust voice conversion based on spectral mapping on sparse space.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013

Robust Feature Extraction to Utterance Fluctuation of Articulation Disorders Based on Random Projection.
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies, 2013

Individuality-Preserving Voice Conversion for Articulation Disorders Using Locality-Constrained NMF.
Proceedings of the Fourth Workshop on Speech and Language Processing for Assistive Technologies, 2013

High-Frequency Restoration Using Deep Belief Nets for Super-resolution.
Proceedings of the Ninth International Conference on Signal-Image Technology & Internet-Based Systems, 2013

Event Detection and Recognition Using HMM with Whistle Sounds.
Proceedings of the Ninth International Conference on Signal-Image Technology & Internet-Based Systems, 2013

Acoustic feature selection utilizing multiple kernel learning for classification of children with autism spectrum and typically developing children.
Proceedings of the 2013 IEEE/SICE International Symposium on System Integration, 2013

Voice conversion based on Non-negative Matrix Factorization in noisy environments.
Proceedings of the 2013 IEEE/SICE International Symposium on System Integration, 2013

Unknown Object Identification Using Category Visual Words with Rejection Function.
Proceedings of the 13th IAPR International Conference on Machine Vision Applications, 2013

Robust facial expressions recognition using 3D average face and ameliorated adaboost.
Proceedings of the ACM Multimedia Conference, 2013

Two-step correction of speech recognition errors based on n-gram and long contextual information.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Voice conversion in high-order eigen space using deep belief nets.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Exemplar-based individuality-preserving voice conversion for articulation disorders in noisy environments.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Prediction of unlearned position based on local regression for single-channel talker localization using acoustic transfer function.
Proceedings of the IEEE International Conference on Acoustics, 2013

Sparse representation for outliers suppression in semi-supervised image annotation.
Proceedings of the IEEE International Conference on Acoustics, 2013

Individuality-preserving voice conversion for articulation disorders based on non-negative matrix factorization.
Proceedings of the IEEE International Conference on Acoustics, 2013

Object Recognition by Integrated Information Using Web Images.
Proceedings of the 2nd IAPR Asian Conference on Pattern Recognition, 2013

2012
Exemplar-based voice conversion in noisy environment.
Proceedings of the 2012 IEEE Spoken Language Technology Workshop (SLT), 2012

Robust AAM-based audio-visual speech recognition against face direction changes.
Proceedings of the 20th ACM Multimedia Conference, MM '12, Nara, Japan, October 29, 2012

Super-resolution Using GMM and PLS Regression.
Proceedings of the 2012 IEEE International Symposium on Multimedia, 2012

Estimation of Talker's Head Orientation Based on Discrimination of the Shape of Cross-power Spectrum Phase Coefficients.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

3D tracking of soccer players using time-situation graph in monocular image sequence.
Proceedings of the 21st International Conference on Pattern Recognition, 2012

Acoustic model transformations based on random projections.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

A new multiple-kernel-learning weighting method for localizing human brain magnetic activity.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Super-resolution by GMM based conversion using self-reduction image.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Generic object recognition by graph structural expression.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Towards Domain Independent Why Text Segment Classification Based on Bag of Function Words.
Proceedings of the AI 2012: Advances in Artificial Intelligence, 2012

Robust feature extraction to utterance fluctuations due to articulation disorders based on sparse expression.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

An adaboost-based weighting method for localizing human brain magnetic activity.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Consonant enhancement for articulation disorders based on non-negative matrix factorization.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2012

Disambiguation in Unknown Object Detection by Integrating Image and Speech Recognition Confidences.
Proceedings of the Computer Vision - ACCV 2012, 2012

2011
A Low-Power Real-Time SIFT Descriptor Generation Engine for Full-HDTV Video Recognition.
IEICE Trans. Electron., 2011

Topic tracking language model for speech recognition.
Comput. Speech Lang., 2011

Audio-Visual Speech Recognition Based on AAM Parameter and Phoneme Analysis of Visual Feature.
Proceedings of the Advances in Image and Video Technology - 5th Pacific Rim Symposium, 2011

Image Annotation with Concept Level Feature Using PLSA+CCA.
Proceedings of the Advances in Multimedia Modeling, 2011

Constrained Spectrum Generation Using A Probabilistic Spectrum Envelope for Mixed Music Analysis.
Proceedings of the 12th International Society for Music Information Retrieval Conference, 2011

Single-Channel Head Orientation Estimation Based on Discrimination of Acoustic Transfer Function.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Probabilistic Spectrum Envelope: Categorized Audio-Features Representation for NMF-Based Sound Decomposition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Feature selection based on Multiple Kernel Learning for single-channel sound source localization using the acoustic transfer function.
Proceedings of the IEEE International Conference on Acoustics, 2011

Generic object recognition using automatic region extraction and dimensional feature integration utilizing multiple kernel learning.
Proceedings of the IEEE International Conference on Acoustics, 2011

2010
Sudden Noise Reduction Based on GMM with Noise Power Estimation.
J. Softw. Eng. Appl., 2010

3D Human Pose Estimation from a Monocular Image Using Model Fitting in Eigenspaces.
J. Softw. Eng. Appl., 2010

Application of topic tracking model to language model adaptation and meeting analysis.
Proceedings of the 2010 IEEE Spoken Language Technology Workshop, 2010

Multimodal speech recognition of a person with articulation disorders using AAM and MAF.
Proceedings of the 2010 IEEE International Workshop on Multimedia Signal Processing, 2010

Speech synthesis by modeling harmonics structure with multiple function.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Learning an Efficient and Robust Graph Matching Procedure for Specific Object Recognition.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Generic Object Recognition by Tree Conditional Random Field Based on Hierarchical Segmentation.
Proceedings of the 20th International Conference on Pattern Recognition, 2010

Structuring a gene network using a multiresolution independence test.
Proceedings of the IEEE International Conference on Acoustics, 2010

Evaluation of random-projection-based feature combination on speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

HMM-based separation of acoustic transfer function for single-channel sound source localization.
Proceedings of the IEEE International Conference on Acoustics, 2010

Why Text Segment Classification Based on Part of Speech Feature Selection.
Proceedings of the Discovery Science - 13th International Conference, 2010

Scale-invariant proximity graph for fast probabilistic object recognition.
Proceedings of the 9th ACM International Conference on Image and Video Retrieval, 2010

Gaze Estimation Using Regression Analysis and AAMs Parameters Selected Based on Information Criterion.
Proceedings of the Computer Vision - ACCV 2010 Workshops, 2010

2009
Integration of Metamodel and Acoustic Model for Dysarthric Speech Recognition.
J. Multim., 2009

Graph Cuts Segmentation by Using Local Texture Features of Multiresolution Analysis.
IEICE Trans. Inf. Syst., 2009

Single-Channel Talker Localization Based on Discrimination of Acoustic Transfer Functions.
EURASIP J. Adv. Signal Process., 2009

Integrated Phoneme Subspace Method for Speech Feature Extraction.
EURASIP J. Audio Speech Music. Process., 2009

System request detection in human conversation based on multi-resolution Gabor wavelet features.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Monaural sound-source-direction estimation using the acoustic transfer function of an active microphone.
Proceedings of the 12th International Conference on Information Fusion, 2009

Human Action Recognition Using HDP by Integrating Motion and Location Information.
Proceedings of the Computer Vision, 2009

2008
Language Modeling Using PLSA-Based Topic HMM.
IEICE Trans. Inf. Syst., 2008

Human-Robot Interface Using System Request Utterance Detection Based on Acoustic Features.
Proceedings of the 2008 International Conference on Multimedia and Ubiquitous Engineering (MUE 2008), 2008

Audio-Based Video Editing with Two-Channel Microphone.
Proceedings of the 2008 International Conference on Multimedia and Ubiquitous Engineering (MUE 2008), 2008

Speaker Independent Phoneme Recognition Based on Fisher Weight Map.
Proceedings of the 2008 International Conference on Multimedia and Ubiquitous Engineering (MUE 2008), 2008

Tagging Video Contents with Positive/Negative Interest Based on User's Facial Expression.
Proceedings of the Advances in Multimedia Modeling, 2008

Integration of metamodel and acoustic model for speech recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Object recognition and segmentation using SIFT and Graph Cuts.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

3D human posture estimation using the HOG features from monocular image.
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008

Graph cuts by using local texture features of wavelet coefficient for image segmentation.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

Digital camera work for soccer video production with event recognition and accurate ball tracking by switching search method.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008

2007
Two-channel-based Noise Reduction in a Complex Spectrum Plane for Hands-free Communication System.
J. VLSI Signal Process., 2007

Combination of GMM-based speech estimation method and temporal domain SVD-based speech enhancement for noise robust speech recognition.
Syst. Comput. Jpn., 2007

PCA-Based Speech Enhancement for Distorted Speech Recognition.
J. Multim., 2007

Voice activity detection by lip shape tracking using EBGM.
Proceedings of the 15th International Conference on Multimedia 2007, 2007

System request detection in conversation based on acoustic and speaker alternation features.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

PCA-based feature extraction for fluctuation in speaking style of articulation disorders.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Fast and cheap object recognition by linear combination of views.
Proceedings of the 6th ACM International Conference on Image and Video Retrieval, 2007

2006
Acoustic Model Adaptation Using First-Order Linear Prediction for Reverberant Speech.
IEICE Trans. Inf. Syst., 2006

Automatic Production System of Soccer Sports Video by Digital Camera Work Based on Situation Recognition.
Proceedings of the Eigth IEEE International Symposium on Multimedia (ISM 2006), 2006

Phoneme recognition based on fisher weight map to higher-order local auto-correlation.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

Online Training-Oriented Video Shooting Navigation System Based on Real-Time Camerawork Evaluation.
Proceedings of the 2006 IEEE International Conference on Multimedia and Expo, 2006

Robust Feature Extraction using Kernel PCA.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006

2005
Recognition of speech from live sports coverage using acoustic and language model adaptation.
Syst. Comput. Jpn., 2005

Recognition of hands-free speech and hand pointing action for conversational TV.
Proceedings of the 13th ACM International Conference on Multimedia, 2005

Situation based speech recognition for structuring baseball live games.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Structuring Baseball Live Games Based on Speech Recognition Using Task Dependent Knowledge and Emotion State Recognition.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005

2004
Speech recognition in a noisy environment using a speech signal estimation method based on the Kalman filter.
Syst. Comput. Jpn., 2004

A Method of Digital Camera Work Focused on Players and a Ball: - Toward Automatic Contents Production System of Commentary Soccer Video by Digital Shooting.
Proceedings of the Advances in Multimedia Information Processing - PCM 2004, 5th Pacific Rim Conference on Multimedia, Tokyo, Japan, November 30, 2004

Structuring of baseball live games based on speech recognition using task dependant knowledge.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004

Video shooting navigation system by real-time useful shot discrimination based on video grammar.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Automatic extraction of PC scenes based on feature mining for a real time delivery system of baseball highlight scenes.
Proceedings of the 2004 IEEE International Conference on Multimedia and Expo, 2004

Robust speech recognition in additive and channel noise environments using GMM and EM algorithm.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

2003
Highlight scene extraction in real time from baseball live video.
Proceedings of the 5th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2003

Topic segmentation and retrieval system for lecture videos based on spontaneous speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Syllable-based acoustic modeling for Japanese spontaneous speech recognition.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Combination of temporal domain SVD based speech enhancement and GMM based speech estimation for ASR in noise - evaluation on the AURORA2 task -.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Live speech recognition in sports games by adaptation of acoustic model and language model.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Automatic Useful Shot Extraction for a Video Editing Support System.
Proceedings of the IAPR Conference on Machine Vision Applications (IAPR MVA 2002), 2002

Unsupervised acoustic model adaptation based on phoneme error minimization.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Evaluation of noisy speech recognition based on noise reduction and acoustic model adaptation on the Aurora2 tasks.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

English call system with functions of speech segmentation and pronunciation evaluation using speech recognition technology.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Video Editing Support System Based on Video Grammar and Content Analysis.
Proceedings of the 16th International Conference on Pattern Recognition, 2002

Noise robust hands-free speech recognition using microphone array and Kalman filter as front-end system of conversational TV.
Proceedings of the IEEE 5th Workshop on Multimedia Signal Processing, 2002

2001
Segmentation of goods catalog video based on video caption.
Proceedings of the 2001 ACM workshops on Multimedia: multimedia information retrieval, Ottawa, ON, Canada, September 30, 2001

Improved speech recognition using iterative decoding based on confidence measures.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Speaker recognition by separating phonetic space and speaker space.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Speech recognition under musical environments using kalman filter and iterative MLLR adaptation.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Summarization Of News Speech With Unknown Topic Boundary.
Proceedings of the 2001 IEEE International Conference on Multimedia and Expo, 2001

Continuous speech recognition under non-stationary musical environments based on speech state transition model.
Proceedings of the IEEE International Conference on Acoustics, 2001

2000
Automatic classification of TV sports news video by multiple subspace method.
Syst. Comput. Jpn., 2000

Multimedia Technologies for Structuring and Retrieval of TV News.
New Gener. Comput., 2000

Study on New Term Weighting Method and New Vector Space Model Based on Word Space in Spoken Document Retrieval.
Proceedings of the Computer-Assisted Information Retrieval (Recherche d'Information et ses Applications), 2000

Topic segmentation of news speech using word similarity.
Proceedings of the 8th ACM International Conference on Multimedia 2000, Los Angeles, CA, USA, October 30, 2000

Organization and retrieval of continuous media.
Proceedings of the ACM Multimedia 2000 Workshops, Los Angeles, CA, USA, October 30, 2000

Expanded vector space model based on word space in cross media retrieval of news speech data.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

An efficient lexical tree search for large vocabulary continuous speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Speaker verification by integrating dynamic and static features using subspace method.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Large vocabulary continuous speech recognition under real environments using adaptive sub-band spectral subtraction.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Noisy speech recognition using noise reduction method based on Kalman filter.
Proceedings of the IEEE International Conference on Acoustics, 2000

An Advanced Processing Environment for Managing the Continuous and Semistructured Features of Multimedia Content.
Proceedings of the Current Issues in Databases and Information Systems, 2000

1999
Effectiveness of KL-transformation in spectral delta expansion.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Speaker Indexing for News Articles, Debates and Drama in Broadcasted TV Programs.
Proceedings of the IEEE International Conference on Multimedia Computing and Systems, 1999

Automatic Classification of TV News Articles Based on Telop Character Recognition.
Proceedings of the IEEE International Conference on Multimedia Computing and Systems, 1999

Telop and Flip Frame Detection and Character Extraction from TV News Articles.
Proceedings of the Fifth International Conference on Document Analysis and Recognition, 1999

1998
Scene cut detection and article extraction in news video based on clustering of DCT features.
Syst. Comput. Jpn., 1998

Indexing and classification of TV news articles based on speech dictation using word bigram.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Real time speaker indexing based on subspace method - application to TV news articles and debate.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Classification of TV sports news by DCT features using multiple subspace method.
Proceedings of the Fourteenth International Conference on Pattern Recognition, 1998

Unsupervised speaker normalization using canonical correlation analysis.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Face Indexing on Video Data - Extraction, Recognition, Tracking and Modeling.
Proceedings of the 3rd International Conference on Face & Gesture Recognition (FG '98), 1998

News Dictation and Article Classification Using Automatically Extracted Announcer Utterance.
Proceedings of the Advanced Multimedia Content Processing, First International Conference, 1998

Human Information Retrieval by Face Extraction and Recognition on TV News Images Using Subspace Method.
Proceedings of the Computer Vision, 1998

1997
Indexing and Classification of TV News Articles Based on Telop Recognition.
Proceedings of the 4th International Conference Document Analysis and Recognition (ICDAR '97), 1997

Effectiveness of speaker normalized HMM by projection to speaker subspace.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

A TV News Retrieval System with Interactive Query Function.
Proceedings of the Second IFCIS International Conference on Cooperative Information Systems, 1997

1996
An enquiring system of unknown words in TV news by spontaneous repetition (application of speaker normalization by speaker subspace projection).
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

Integration of face and speaker recognition by subspace method.
Proceedings of the 13th International Conference on Pattern Recognition, 1996

Extraction of TV news articles based on scene cut detection using DCT clustering.
Proceedings of the 1996 International Conference on Image Processing, 1996

Speaker recognition and speaker normalization by projection to speaker subspace.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Article Extraction and Classification of TV News Using Image and Speech Processing.
Proceedings of the International Symposium on Cooperative Database Systems for Advanced Applications, 1996

1995
Segmentation and recognition of handwritten characters using subspace method.
Proceedings of the Third International Conference on Document Analysis and Recognition, 1995

1994
Simultaneous spotting of phonemes and words in continuous speech.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Speaker recognition based on subspace methods.
Proceedings of the 3rd International Conference on Spoken Language Processing, 1994

Phoneme recognition improvement by restricting training section in concatenated HMM training.
Proceedings of ICASSP '94: IEEE International Conference on Acoustics, 1994

1990
Optimisation of English phoneme recognition based on HMM.
Proceedings of the First International Conference on Spoken Language Processing, 1990

Phoneme probability presentation of continuous speech.
Proceedings of the First International Conference on Spoken Language Processing, 1990

OSPREY: a transputer based continuous speech recognition system.
Proceedings of the 1990 International Conference on Acoustics, 1990

1989
Word and monosyllable recognition using lifters on two-dimensional cepstrum.
Syst. Comput. Jpn., 1989

Enhancement and optimisation of a speech recognition front end based on hidden Markov models.
Proceedings of the First European Conference on Speech Communication and Technology, 1989

Hierarchical phoneme discrimination by hidden Markov modelling using cepstrum and formant information.
Proceedings of the IEEE International Conference on Acoustics, 1989

1987
High-speed transformation of drawing images based on structure description.
Syst. Comput. Jpn., 1987

Continuous speech understanding by keyword extraction in a voice mail system.
Proceedings of the European Conference on Speech Technology, 1987

Spoken word recognition using statistic and dynamic information obtained by two-dimensional cepstrum analysis.
Proceedings of the European Conference on Speech Technology, 1987

Uncertainty Reduction Paradigm Using Structural Knowledge in Line-Drawing Understanding.
Proceedings of the 10th International Joint Conference on Artificial Intelligence. Milan, 1987

1986
Acoustic noise reduction by two dimensional spectral smoothing and spectral amplitude transformation.
Proceedings of the IEEE International Conference on Acoustics, 1986

1984
Speaker-independent word recognition in connected speech on the basis of phoneme recognition.
Inf. Sci., 1984

