Yonghong Yan

ORCID: 0000-0001-6907-5770

  • Chinese Academy of Sciences, Institute of Acoustics / Xinjiang Technical Institute of Physics and Chemistry, China

According to our database, Yonghong Yan authored at least 369 papers between 1995 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





GALD-SE: Guided Anisotropic Lightweight Diffusion for Efficient Speech Enhancement.
IEEE Signal Process. Lett., 2025

Semi-supervised sound event detection with dynamic convolution and confidence-aware mean teacher.
Digit. Signal Process., 2025

Enhancing spatial auditory attention decoding with wavelet-based prototype training.
Biomed. Signal Process. Control., 2025

A novel semi-blind source separation framework towards maximum signal-to-interference ratio.
Signal Process., April, 2024

MCRSpell: A metric learning of correct representation for Chinese spelling correction.
Expert Syst. Appl., March, 2024

Boosting Cross-Domain Speech Recognition With Self-Supervision.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Interrelate Training and Clustering for Online Speaker Diarization.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

Conversational Short-Phrase Speaker Diarization via Self-Adjusting Speech Segmentation and Embedding Extraction.
IEEE Signal Process. Lett., 2024

Novel audio characteristic-dependent feature extraction and data augmentation methods for cough-based respiratory disease classification.
Comput. Biol. Medicine, 2024

CosDiff: Code-Switching TTS Model Based on A Multi-Task DDIM.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024

One-Epoch Training with Single Test Sample in Test Time for Better Generalization of Cough-Based Covid-19 Detection Model.
Proceedings of the IEEE International Conference on Acoustics, 2024

BMMSNet: Bidirectional Mapping and Multilevel Similarity Comparison for EEG-Speech Match-Mismatch Problem.
Proceedings of the IEEE International Conference on Acoustics, 2024

Snore Sound Features Based on Percussive Enhancing and Positional Encoding Combined with Multi-Task Learning for Osahs Detection.
Proceedings of the IEEE International Conference on Acoustics, 2024

The effect of source sparsity on independent vector analysis for blind source separation.
Signal Process., December, 2023

SFA: Searching faster architectures for end-to-end automatic speech recognition models.
Comput. Speech Lang., June, 2023

Reminding the incremental language model via data-free self-distillation.
Appl. Intell., April, 2023

Alternative Pseudo-Labeling for Semi-Supervised Automatic Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

First coarse, fine afterward: A lightweight two-stage complex approach for monaural speech enhancement.
Speech Commun., 2023

ForkNet: Simultaneous Time and Time-Frequency Domain Modeling for Speech Enhancement.
CoRR, 2023

Speech Corpora Divergence Based Unsupervised Data Selection for ASR.
CoRR, 2023

Piecewise Position Encoding in Convolutional Neural Network for Cough-Based Covid-19 Detection.
Proceedings of the IEEE International Conference on Acoustics, 2023

Self-Supervised Pre-Training for Attention-Based Encoder-Decoder ASR Model.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

Alleviating ASR Long-Tailed Problem by Decoupling the Learning of Representation and Classification.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

ETEH: Unified Attention-Based End-to-End ASR and KWS Architecture.
IEEE ACM Trans. Audio Speech Lang. Process., 2022

An E2E-ASR-Based Iteratively-Trained Timestamp Estimator.
IEEE Signal Process. Lett., 2022

A Secondary Path-Decoupled Active Noise Control Algorithm Based on Deep Learning.
IEEE Signal Process. Lett., 2022

Underwater Detection of Small-Volume Weak Target Echo in Harbor Scene Under Multisource Interference.
IEEE Geosci. Remote. Sens. Lett., 2022

Modeling knowledge proficiency using multi-hierarchical capsule graph neural network.
Appl. Intell., 2022

Sequence Distribution Matching for Unsupervised Domain Adaptation in ASR.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Summary On The ISCSLP 2022 Chinese-English Code-Switching ASR Challenge.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Decoupled Federated Learning for ASR with Non-IID Data.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Wav2vec-S: Semi-Supervised Pre-Training for Low-Resource ASR.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Robust Cough Feature Extraction and Classification Method for COVID-19 Cough Detection Based on Vocalization Characteristics.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving Recognition of Out-of-vocabulary Words in E2E Code-switching ASR by Fusing Speech Generation Methods.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational(RAMC) Speech Dataset.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Knowledge Distillation For CTC-based Speech Recognition Via Consistent Acoustic Representation Learning.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

NAS-SCAE: Searching Compact Attention-based Encoders For End-to-end Automatic Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Improving Streaming End-to-End ASR on Transformer-based Causal Models with Encoder States Revision Strategies.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Interrelate Training and Searching: A Unified Online Clustering Framework for Speaker Diarization.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Keyword Search Using Attention-Based End-to-End ASR and Frame-Synchronous Phoneme Alignments.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

Estimation Reliability Function Assisted Sound Source Localization With Enhanced Steering Vector Phase Difference.
IEEE ACM Trans. Audio Speech Lang. Process., 2021

FSCNet: Feature-Specific Convolution Neural Network for Real-Time Speech Enhancement.
IEEE Signal Process. Lett., 2021

A unified system for multilingual speech recognition and language identification.
Speech Commun., 2021

A Two-Stage Attention Based Modality Fusion Framework for Multi-Modal Speech Emotion Recognition.
IEICE Trans. Inf. Syst., 2021

Reminding the Incremental Language Model via Data-Free Self-Distillation.
CoRR, 2021

Wav2vec-S: Semi-Supervised Pre-Training for Speech Recognition.
CoRR, 2021

The Source Model Towards Maximizing The Output Signal-To-Interference Ratio For Independent Vector Analysis.
CoRR, 2021

The HCCL Speaker Verification System for Far-Field Speaker Verification Challenge.
CoRR, 2021

Improved Conformer-based End-to-End Speech Recognition Using Neural Architecture Search.
CoRR, 2021

A New Method for Improving Generative Adversarial Networks in Speech Enhancement.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Improves Neural Acoustic Word Embeddings Query by Example Spoken Term Detection with Wav2vec Pretraining and Circle Loss.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Context-dependent Label Smoothing Regularization for Attention-based End-to-End Code-Switching Speech Recognition.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Non-autoregressive Deliberation-Attention based End-to-End ASR.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021

Cough-based COVID-19 Detection with Multi-band Long-Short Term Memory and Convolutional Neural Networks.
Proceedings of the ISAIMS 2021: 2nd International Symposium on Artificial Intelligence for Medicine Sciences, Beijing, China, October 29, 2021

LinearSpeech: Parallel Text-to-Speech with Linear Complexity.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Incorporating Cross-Speaker Style Transfer for Multi-Language Text-to-Speech.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Residual Echo and Noise Cancellation with Feature Attention Module and Multi-Domain Loss Function.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Pre-Training Transformer Decoder for End-to-End ASR Model with Unpaired Text Data.
Proceedings of the IEEE International Conference on Acoustics, 2021

History Utterance Embedding Transformer LM for Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2021

Decomposing Complex Questions Makes Multi-Hop QA Easier and More Interpretable.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

Using Cognitive Interest Graph and Knowledge-activated Attention for Learning Resource Recommendation.
Proceedings of the IEEE 45th Annual Computers, Software, and Applications Conference, 2021

SI-Net: Multi-Scale Context-Aware Convolutional Block for Speaker Verification.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

Far-Field Speech Recognition Based on Complex-Valued Neural Networks and Inter-Frame Similarity Difference Method.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2021

A Model Compression Method With Matrix Product Operators for Speech Enhancement.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture.
IEEE ACM Trans. Audio Speech Lang. Process., 2020

Improving generative adversarial networks for speech enhancement through regularization of latent representations.
Speech Commun., 2020

Session-based Recommendation Algorithm Based on Recurrent Temporal Convolutional Network.
计算机科学 (Computer Science), 2020

A Two-Stage Phase-Aware Approach for Monaural Multi-Talker Speech Separation.
IEICE Trans. Inf. Syst., 2020

A New Time-Frequency Attention Tensor Network for Language Identification.
Circuits Syst. Signal Process., 2020

Lingual-Agnostic Meta-Learning for Low-Resource Part-of-Speech Tagging.
Proceedings of the ICIT 2020, 2020

Transformer-Based Online CTC/Attention End-To-End Speech Recognition Architecture.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

Tailoring an Interpretable Neural Language Model.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Multiple Source Localization in a Shallow Water Waveguide Exploiting Subarray Beamforming and Deep Neural Networks.
Sensors, 2019

Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system.
IEEE CAA J. Autom. Sinica, 2019

Integrating the Data Augmentation Scheme with Various Classifiers for Acoustic Scene Modeling.
CoRR, 2019

Multi-Accent Adaptation Based on Gate Mechanism.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speaker-Invariant Feature-Mapping for Distant Speech Recognition via Adversarial Teacher-Student Learning.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

A New Time-Frequency Attention Mechanism for TDNN and CNN-LSTM-TDNN, with Application to Language Identification.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Online Hybrid CTC/Attention Architecture for End-to-End Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Character-Aware Sub-Word Level Language Modeling for Uyghur and Turkish ASR.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Target Speaker Recovery and Recognition Network with Average x-Vector and Global Training.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Audio Scene Classification with Discriminatively-Trained Segment-Level Features.
Proceedings of the IEEE International Conference on Multimedia & Expo Workshops, 2019

A Subband Energy Modification Method for Elevation Control in Median Plane.
Proceedings of the IEEE International Conference on Acoustics, 2019

Multiple Temporal Scales Based Speaker Embeddings Learning for Text-dependent Speaker Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Deep Learning Based Binaural Speech Enhancement Approach with Spatial Cues Preservation.
Proceedings of the IEEE International Conference on Acoustics, 2019

Self-attention Based Prosodic Boundary Prediction for Chinese Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

An Audio Scene Classification Framework with Embedded Filters and a DCT-based Temporal Module.
Proceedings of the IEEE International Conference on Acoustics, 2019

A Novel Method for Automatic Heart Murmur Diagnosis Using Phonocardiogram.
Proceedings of the 2019 International Conference on Artificial Intelligence and Advanced Manufacturing, 2019

Rank-1 constrained Multichannel Wiener Filter for speech recognition in noisy environments.
Comput. Speech Lang., 2018

Deep Dynamic Network Embedding for Link Prediction.
IEEE Access, 2018

Restricted Boltzmann Machine-Based Approaches for Link Prediction in Dynamic Networks.
IEEE Access, 2018

Improved Conditional Generative Adversarial Net Classification For Spoken Language Recognition.
Proceedings of the 2018 IEEE Spoken Language Technology Workshop, 2018

HCCL at SemEval-2018 Task 8: An End-to-End System for Sequence Labeling from Cybersecurity Reports.
Proceedings of The 12th International Workshop on Semantic Evaluation, 2018

Discriminating between Similar Languages on Imbalanced Conversational Texts.
Proceedings of the Eleventh International Conference on Language Resources and Evaluation, 2018

Text-dependent Speaker Verification Using Word-based Scoring.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Space-Time Residual LSTM Architecture for Distant Speech Recognition.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Bidirectional LSTM with Extended Input Context.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Improving Language Modeling with an Adversarial Critic for Automatic Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Multi-talker Speech Separation Based on Permutation Invariant Training and Beamforming.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Cross-Lingual Multi-Task Neural Architecture for Spoken Language Understanding.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Investigation on the Combination of Batch Normalization and Dropout in BLSTM-based Acoustic Modeling for ASR.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Output-Gate Projected Gated Recurrent Unit for Speech Recognition.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Deep Convolutional Neural Network with Scalogram for Audio Scene Modeling.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Effect of Steering Vector Estimation on MVDR Beamformer for Noisy Speech Recognition.
Proceedings of the 23rd IEEE International Conference on Digital Signal Processing, 2018

Improving Multichannel Speech Recognition with Generalized Cross Correlation Inputs and Multitask Learning.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

On SDW-MWF and Variable Span Linear Filter with Application to Speech Recognition in Noisy Environments.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Semi-Supervised Learning with Deep Neural Networks for Relative Transfer Function Inverse Regression.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

A Deep Neural Network Based Method of Source Localization in a Shallow Water Environment.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Window-Dominant Signal Subspace Methods for Multiple Short-Term Speech Source Localization.
IEEE ACM Trans. Audio Speech Lang. Process., 2017

Stream Attention for far-field multi-microphone ASR.
CoRR, 2017

Relative Transfer Function Inverse Regression from Low Dimensional Manifold.
CoRR, 2017

HCCL at SemEval-2017 Task 2: Combining Multilingual Word Embeddings and Transliteration Model for Semantic Similarity.
Proceedings of the 11th International Workshop on Semantic Evaluation, 2017

Attention-Based LSTM with Multi-Task Learning for Distant Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Ideal Ratio Mask Estimation Using Deep Neural Networks for Monaural Speech Segregation in Noisy Reverberant Conditions.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Time Delay Histogram Based Speech Source Separation Using a Planar Array.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Joint Training of Multi-Channel-Condition Dereverberation and Acoustic Modeling of Microphone Array Speech for Robust Distant Speech Recognition.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An Exploration of Dropout with LSTMs.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An improved lexicon generation method for mandarin speech recognition.
Proceedings of the 13th International Conference on Natural Computation, 2017

Fast variable-frame-rate decoding of speech recognition based on deep neural networks.
Proceedings of the 13th International Conference on Natural Computation, 2017

Deep neural network based wake-up-word speech recognition with two-stage detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Cross Array and Rank-1 MUSIC Algorithm for Acoustic Highway Lane Detection.
IEEE Trans. Intell. Transp. Syst., 2016

Structural Optimization and Online Evolutionary Learning for Spoken Dialog Management.
IEEE Signal Process. Lett., 2016

Speeding up Deep Neural Networks in Speech Recognition with Piecewise Quantized Sigmoidal Activation Function.
IEICE Trans. Inf. Syst., 2016

Improved End-to-End Speech Recognition Using Adaptive Per-Dimensional Learning Rate Methods.
IEICE Trans. Inf. Syst., 2016

Policy Optimization for Spoken Dialog Management Using Genetic Algorithm.
IEICE Trans. Inf. Syst., 2016

Short Text Classification Based on Distributional Representations of Words.
IEICE Trans. Inf. Syst., 2016

Multi-Task Learning in Deep Neural Networks for Mandarin-English Code-Mixing Speech Recognition.
IEICE Trans. Inf. Syst., 2016

Robust Uncertainty Control of the Simplified Kalman Filter for Acoustic Echo Cancelation.
Circuits Syst. Signal Process., 2016

An unsupervised vocabulary selection technique for Chinese automatic speech recognition.
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016

Oracle performance investigation of the ideal masks.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Dynamic group sparsity for non-negative matrix factorization with application to unsupervised source separation.
Proceedings of the IEEE International Workshop on Acoustic Signal Enhancement, 2016

Improvement of mask-based speech source separation using DNN.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Speech intelligibility enhancement in noisy reverberant conditions.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Robust multiple speech source localization based on phase difference regression.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A DNN-HMM Approach to Non-Negative Matrix Factorization Based Speech Enhancement.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Adaptive Group Sparsity for Non-Negative Matrix Factorization with Application to Unsupervised Source Separation.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Enhancing Link Prediction Using Gradient Boosting Features.
Proceedings of the Intelligent Computing Theories and Application, 2016

Effective utilization of multiple examples in query-by-example spoken term detection.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Robust multiple speech source localization using time delay histogram.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Phonotactic language recognition using dynamic pronunciation and language branch discriminative information.
Speech Commun., 2015

A reverberation robust target speech detection method using dual-microphone in distant-talking scene.
Speech Commun., 2015

A Hybrid Approach for Reverberation Simulation.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2015

Discriminative Pronunciation Modeling Using the MPE Criterion.
IEICE Trans. Inf. Syst., 2015

Noise Robust IOA/CAS Speech Separation and Recognition System For The Third 'CHIME' Challenge.
CoRR, 2015

An Acoustic Traffic Monitoring System: Design and Implementation.
Proceedings of the 2015 IEEE 12th Intl Conf on Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associated Workshops (UIC-ATC-ScalCom), 2015

Predicting Who Will Retweet or Not in Microblogs Network.
Proceedings of the Social Media Processing - 4th National Conference, 2015

IOA: Improving SVM Based Sentiment Classification Through Post Processing.
Proceedings of the 9th International Workshop on Semantic Evaluation, 2015

Distributional Representations of Words for Short Text Classification.
Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, 2015

Spectrographic speech mask estimation using the time-frequency correlation of speech presence.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Robust localization of single sound source based on phase difference regression.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Equalization of Sound Reproduction System Based on the Human Perception Characteristics.
Proceedings of the 2015 International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2015

How to Detect Communities in Large Networks.
Proceedings of the Intelligent Computing Theories and Methodologies, 2015

Continuous speech recognition based on convolutional neural network.
Proceedings of the Seventh International Conference on Digital Image Processing, 2015

A closed-form method of spatial de-aliasing for multiple speech source localization.
Proceedings of the 2015 IEEE Global Conference on Signal and Information Processing, 2015

Restoration of instantaneous amplitude and phase of speech signal in noisy reverberant environments.
Proceedings of the 23rd European Signal Processing Conference, 2015

A Shallow Discourse Parsing System Based On Maximum Entropy Model.
Proceedings of the 19th Conference on Computational Natural Language Learning: Shared Task, 2015

Reverberation robust multi-channel post-filtering using modified signal presence probability.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Improving HMM/DNN in ASR of under-resourced languages using probabilistic sampling.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Two-stage ASGD framework for parallel training of DNN acoustic models using Ethernet.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Optimizing human-interpretable dialog management policy using genetic algorithm.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

Acoustic Echo Control with Frequency-Domain Stage-Wise Regression.
IEEE Signal Process. Lett., 2014

Coalescence Type based Confidence Warping for Agglutinative Language Keyword Spotting.
J. Softw., 2014

Voice biometrics using linear Gaussian model.
IET Biom., 2014

Smoothing Method for Improved Minimum Phone Error Linear Regression.
IEICE Trans. Inf. Syst., 2014

Markovian discriminative modeling for cross-domain dialog state tracking.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

Markovian Discriminative Modeling for Dialog State Tracking.
Proceedings of the SIGDIAL 2014 Conference, 2014

The role of auditory feedback in speech production: Implications for speech perception in the hearing impaired.
Proceedings of the 2014 International Symposium on Integrated Circuits (ISIC), 2014

Direction-of-arrival estimation of multiple speakers using a planar array.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A robust step-size control algorithm for frequency domain acoustic echo cancellation.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

On the Performance and Robustness of Crosstalk Cancelation with Multiple Loudspeakers.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Enhanced Out of Vocabulary Word Detection Using Local Acoustic Information.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Melody Extraction for Vocal Polyphonic Music Based on Bayesian Framework.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Boosted Hybrid DNN/HMM System Based on Correlation-Generated Targets.
Proceedings of the 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2014

Speeding up deep neural networks for speech recognition on ARM Cortex-A series processors.
Proceedings of the 10th International Conference on Natural Computation, 2014

Improved mandarin spoken term detection by using deep neural network for keyword verification.
Proceedings of the 10th International Conference on Natural Computation, 2014

Language recognition system using language branch discriminative information.
Proceedings of the IEEE International Conference on Acoustics, 2014

Reverberation robust two-microphone Target Signal Detection algorithm with coherent interference.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

An efficient time varying hybrid reverberator for room acoustic simulation.
Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Noise Estimation Using a Constrained Sequential Hidden Markov Model in the Log-Spectral Domain.
IEEE Trans. Speech Audio Process., 2013

Robust and Fast Localization of Single Speech Source Using a Planar Array.
IEEE Signal Process. Lett., 2013

Spoken Term Detection Based on Improved Index Structure.
J. Softw., 2013

Mixing-attack-proof Randomized Embedding Audio Watermarking System.
J. Comput., 2013

A Novel Discriminative Method for Pronunciation Quality Assessment.
IEICE Trans. Inf. Syst., 2013

Speaker Recognition Using Sparse Probabilistic Linear Discriminant Analysis.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2013

Fuzzy Matching of Semantic Class in Chinese Spoken Language Understanding.
IEICE Trans. Inf. Syst., 2013

Discriminative Approach to Build Hybrid Vocabulary for Conversational Telephone Speech Recognition of Agglutinative Languages.
IEICE Trans. Inf. Syst., 2013

Dialog State Tracking using Conditional Random Fields.
Proceedings of the SIGDIAL 2013 Conference, 2013

Discriminative pronunciation modeling based on minimum phone error training.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Prefix tree based n-best list re-scoring for recurrent neural network language model used in speech recognition system.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Comparative investigation of objective speech intelligibility prediction measures for noise-reduced signals in Mandarin and Japanese.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Effect of linguistic masker on the intelligibility of Mandarin sentences.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Head-Related Transfer Function Modeling Based on Finite-Impulse Response.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

Hybrid Reverberator Using Multiple Impulse Responses for Audio Rendering Improvement.
Proceedings of the Ninth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, 2013

A novel discriminative method for pronunciation quality assessment.
Proceedings of the IEEE International Conference on Acoustics, 2013

A Computer-Assist Algorithm to Detect Repetitive Stuttering Automatically.
Proceedings of the 2013 International Conference on Asian Language Processing, 2013

Automatic Allophone Deriving for Korean Speech Recognition.
Proceedings of the Ninth International Conference on Computational Intelligence and Security, 2013

Automatic Vocal Segments Detection in Popular Music.
Proceedings of the Ninth International Conference on Computational Intelligence and Security, 2013

Web-Based Language Model Domain Adaptation for Real World Voice Retrieval.
Proceedings of the Ninth International Conference on Computational Intelligence and Security, 2013

Direction of arrival estimation based on weighted minimum mean square error.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

Objective Japanese intelligibility prediction for noisy speech signals before and after noise-reduction processing.
Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

A Novel Similarity Measure to Induce Semantic Classes and Its Application for Language Model Adaptation in a Dialogue System.
J. Comput. Sci. Technol., 2012

Logarithmic Adaptive Quantization Projection for Audio Watermarking.
IEICE Trans. Inf. Syst., 2012

A Forced Alignment Based Approach for English Passage Reading Assessment.
IEICE Trans. Inf. Syst., 2012

Factor Analysis of Neighborhood-Preserving Embedding for Speaker Verification.
IEICE Trans. Inf. Syst., 2012

Two-Microphone Noise Reduction Using Spatial Information-Based Spectral Amplitude Estimation.
IEICE Trans. Inf. Syst., 2012

Noise Robust Feature Scheme for Automatic Speech Recognition Based on Auditory Perceptual Mechanisms.
IEICE Trans. Inf. Syst., 2012

Maximum A Posteriori Linear Regression for language recognition.
Expert Syst. Appl., 2012

Low-dimensional representation of Gaussian mixture model supervector for language recognition.
EURASIP J. Adv. Signal Process., 2012

Automatic Scoring on English Passage Reading Quality.
Proceedings of the Advances in Swarm Intelligence - Third International Conference, 2012

A fast two-microphone noise reduction algorithm based on power level ratio for mobile phone.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Sparse Probabilistic Linear Discriminant Analysis for Speaker Verification.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

A Initial Attempt on Task-Specific Adaptation for Deep Neural Network-based Large Vocabulary Continuous Speech Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Discriminative Decision Function Based Scoring Method in Joint Factor Analysis for Speaker Verification.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Speaker Verification Using Neighborhood Preserving Embedding.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Recurrent neural network language model in Mandarin voice input system.
Proceedings of the Eighth International Conference on Natural Computation, 2012

A two microphone-based approach for speech enhancement in adverse environments.
Proceedings of the IEEE International Conference on Consumer Electronics, 2012

Target speech detection based on microphone array using inter-channel phase differences.
Proceedings of the IEEE International Conference on Consumer Electronics, 2012

Noise estimation using a constrained sequential HMM in log-spectral domain.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Factor analysis of Laplacian approach for speaker recognition.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Evaluation of objective intelligibility prediction measures for noise-reduced signals in Mandarin.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

A two-microphone based voice activity detection for distant-talking speech in wide range of direction of arrival.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Improved acoustic models for Conversational Telephone Speech recognition.
Proceedings of the 9th International Conference on Fuzzy Systems and Knowledge Discovery, 2012

Optimized large vocabulary WFST speech recognition system.
Proceedings of the 9th International Conference on Fuzzy Systems and Knowledge Discovery, 2012

An Improved Mandarin Voice Input System Using Recurrent Neural Network Language Model.
Proceedings of the Eighth International Conference on Computational Intelligence and Security, 2012

Parallel implementation of neural networks training on graphic processing unit.
Proceedings of the 5th International Conference on BioMedical Engineering and Informatics, 2012

Voice Activity Detection Based on an Unsupervised Learning Framework.
IEEE ACM Trans. Audio Speech Lang. Process., 2011

Towards precise and robust automatic synchronization of live speech and its transcripts.
Speech Commun., 2011

Speaker Verification Using Sparse Representations on Total Variability i-vectors.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

A Spread Spectrum Audio Watermarking System with High Perceptual Quality.
Proceedings of the Third International Conference on Communications and Mobile Computing, 2011

Development of a Chinese song name recognition system.
Proceedings of the Seventh International Conference on Natural Computation, 2011

Robust understanding of spoken Chinese through character-based tagging and prior knowledge exploitation.
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011

Language recognition with language total variability.
Proceedings of the 2011 International Conference on Innovative Computing and Cloud Computing, 2011

Quantization Index Modulation audio watermarking system using a psychoacoustic model.
Proceedings of the 8th International Conference on Information, 2011

Development of a Mandarin-English Bilingual Speech Recognition System with Unified Acoustic Models.
J. Inf. Sci. Eng., 2010

A Hybrid Speech Emotion Recognition System Based on Spectral and Prosodic Features.
IEICE Trans. Inf. Syst., 2010

Enhancing the Robustness of the Posterior-Based Confidence Measures Using Entropy Information for Speech Recognition.
IEICE Trans. Inf. Syst., 2010

Acoustic Feature Optimization Based on F-Ratio for Robust Speech Recognition.
IEICE Trans. Inf. Syst., 2010

A Bayesian logistic regression approach to spoken language identification.
IEICE Electron. Express, 2010

A new linguistic feature for Automated Essay Scoring.
Proceedings of the 4th International Universal Communication Symposium, 2010

Forward optimal measures for automatic mispronunciation detection.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Intelligibility investigation of single-channel noise reduction algorithms for Chinese and Japanese.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Large vocabulary Uyghur continuous speech recognition based on stems and suffixes.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Speaker recognition using the resynthesized speech via spectrum modeling.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Speech enhancement using improved generalized sidelobe canceller in frequency domain with multi-channel postfiltering.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Robust character based tagging with domain lexical features for Chinese spoken language understanding.
Proceedings of the Sixth International Conference on Natural Computation, 2010

Maximum a posteriori linear regression for speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010

Improved modeling for F0 generation and V/U decision in HMM-based TTS.
Proceedings of the IEEE International Conference on Acoustics, 2010

Automatic Synchronization of live speech and its Transcripts based on a frame-synchronous likelihood ratio test.
Proceedings of the IEEE International Conference on Acoustics, 2010

Subset selection for articulatory feature based confidence measures.
Proceedings of the Third International Workshop on Advanced Computational Intelligence, 2010

TBNR: the ThinkIT Broadcast News speech Recognition system.
Proceedings of the Third International Workshop on Advanced Computational Intelligence, 2010

Semantic class induction and its application for a Chinese voice search system.
Proceedings of the CIPS-SIGHAN Joint Conference on Chinese Language Processing, 2010

Using a Kind of Novel Phonotactic Information for SVM Based Speaker Recognition.
IEICE Trans. Inf. Syst., 2009

Approximate Decision Function and Optimization for GMM-UBM Based Speaker Verification.
IEICE Trans. Inf. Syst., 2009

An LVCSR Based Reading Miscue Detection System Using Knowledge of Reference and Error Patterns.
IEICE Trans. Inf. Syst., 2009

Automatic Singing Performance Evaluation for Untrained Singers.
IEICE Trans. Inf. Syst., 2009

WAPS: An Audio Program Surveillance System for Large Scale Web Data Stream.
Proceedings of the Web Information Systems and Mining, International Conference, 2009

Nonnative Speech Recognition Based on Bilingual Model Modification at State Level.
Proceedings of the Sixth International Symposium on Neural Networks, 2009

A Novel Fuzzy-Based Automatic Speaker Clustering Algorithm.
Proceedings of the Advances in Neural Networks, 2009

Dynamic Multiple Pronunciation Incorporation in a Refined Search Space for Reading Miscue Detection.
Proceedings of the Sixth International Symposium on Neural Networks, 2009

Improving Voice Search Using Forward-Backward LVCSR System Combination.
Proceedings of the Sixth International Symposium on Neural Networks, 2009

An SVM-Based Mandarin Pronunciation Quality Assessment System.
Proceedings of the Sixth International Symposium on Neural Networks, 2009

Simultaneous Synchronization of Text and Speech for Broadcast News Subtitling.
Proceedings of the Advances in Neural Networks, 2009

Physiologically-inspired feature extraction for emotion recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Tonal articulatory feature for Mandarin and its application to conversational LVCSR.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A one-step tone recognition approach using MSD-HMM for continuous speech.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Online detecting end times of spoken utterances for synchronization of live speech and its transcripts.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Chinese Prosody Structure Prediction Based on Conditional Random Fields.
Proceedings of the Fifth International Conference on Natural Computation, 2009

Nonnative speech recognition based on bilingual model modification.
Proceedings of the FUZZ-IEEE 2009, 2009

Emotion Recognition and Conversion for Mandarin Speech.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009

Investigations to Minimum Phone Error Training in Bilingual Speech Recognition.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009

Sample-Based Automatic Dictionary Generation for Keyword Spotting System.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009

Improving Automatic Speech Recognizer of Voice Search Using System Combination.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009

Improved Lattice-Based Confidence Measure for Speech Recognition via a Lattice Cutoff Procedure.
Proceedings of the Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009

Automatic Detection of Pathological Voices Using GMM-SVM Method.
Proceedings of the 2nd International Conference on BioMedical Engineering and Informatics, 2009

Automatic Detection of Pathological Voices Using GMM-MLLR Approach.
Proceedings of the 2nd International Conference on BioMedical Engineering and Informatics, 2009

Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval.
IEICE Trans. Inf. Syst., 2008

Robust Speaker Clustering Using Affinity Propagation.
IEICE Trans. Inf. Syst., 2008

Speech Enhancement Using Improved Adaptive Null-Forming in Frequency Domain with Postfilter.
IEICE Trans. Fundam. Electron. Commun. Comput. Sci., 2008

Effects of the Temporal Fine Structure in Different Frequency Bands on Mandarin Tone Perception.
IEICE Trans. Inf. Syst., 2008

Melody Track Selection Using Discriminative Language Model.
IEICE Trans. Inf. Syst., 2008

Automatic Language Identification with Discriminative Language Characterization Based on SVM.
IEICE Trans. Inf. Syst., 2008

A One-Pass Real-Time Decoder Using Memory-Efficient State Network.
IEICE Trans. Inf. Syst., 2008

Effective Acoustic Modeling for Pronunciation Quality Scoring of Strongly Accented Mandarin Speech.
IEICE Trans. Inf. Syst., 2008

Using SVM as Back-End Classifier for Language Identification.
EURASIP J. Audio Speech Music. Process., 2008

Speaker Recognition using a Kind of Novel Phonotactic Information.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Improved Semi-Parametric Mean Trajectory Model Using Discriminatively Trained Centroids.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Using Reference to Tune Language Model for Detection of Reading Miscues.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Efficient System Combination for Syllable-Confusion-Network-Based Chinese Spoken Term Detection.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

A Synchronous Method for Automatic Scoring of Language Learning.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Nonnative speech recognition based on state-candidate bilingual model modification.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

A frequency domain approach for speech enhancement with directionality using compact microphone array.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Towards vocabulary-independent speech indexing for large-scale repositories.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Cochannel speech separation using multi-pitch estimation and model based voiced sequential grouping.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Forward optimal modeling of acoustic confusions in Mandarin CALL system.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Robust speaker change detection using Kernel-Gaussian model.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

An objective singing evaluation approach by relating acoustic measurements to perceptual ratings.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Recognizing named entities in spoken Chinese dialogues with a character-level maximum entropy tagger.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Wide-Band Low-Noise Quadrature VCO Design.
Proceedings of the 4th International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2008), 2008

Using Discriminative Training Techniques in Practical Intelligent Music Retrieval System.
Proceedings of the Fourth International Conference on Natural Computation, 2008

Application of LVCSR to the Detection of Chinese Mandarin Reading Miscues.
Proceedings of the Fourth International Conference on Natural Computation, 2008

Spoken Term Detection Using Dynamic Match Subword Confusion Network.
Proceedings of the Fourth International Conference on Natural Computation, 2008

Mandarin-English bilingual Speech Recognition for real world music retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2008

A novel speaker clustering algorithm via supervised affinity propagation.
Proceedings of the IEEE International Conference on Acoustics, 2008

Mandarin vowel pronunciation quality evaluation by a novel formant classification method and its combination with traditional algorithms.
Proceedings of the IEEE International Conference on Acoustics, 2008

New Machine Scores and Their Combinations for Automatic Mandarin Phonetic Pronunciation Quality Assessment.
Proceedings of the Knowledge-Based Intelligent Information and Engineering Systems, 2007

Singing Melody Extraction in Polyphonic Music by Harmonic Tracking.
Proceedings of the 8th International Conference on Music Information Retrieval, 2007

Contributions of temporal fine structure cues to Chinese speech recognition in cochlear implant simulation.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

A fast fuzzy keyword spotting algorithm based on syllable confusion network.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Mandarin vowel pronunciation quality evaluation by using formant pattern recognition.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Spoken language identification using score vector modeling and support vector machine.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Robust voice activity detection based on adaptive sub-band energy sequence analysis and harmonic detection.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007

Authentication and Quality Monitoring based on Audio Watermark for Analog AM Shortwave Broadcasting.
Proceedings of the 3rd International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2007), 2007

Large Vocabulary Mandarin Continuous Speech Recognition under Noisy Environment.
Proceedings of the Third International Conference on Natural Computation, 2007

Keyword Spotting Based on Syllable Confusion Network.
Proceedings of the Third International Conference on Natural Computation, 2007

The Design of Backend Classifiers in PPRLM System for Language Identification.
Proceedings of the Third International Conference on Natural Computation, 2007

Real Context Model for Tone Recognition in Mandarin Conversational Telephone Speech.
Proceedings of the Third International Conference on Natural Computation, 2007

Mandarin Accent Analysis Based on Formant Frequencies.
Proceedings of the IEEE International Conference on Acoustics, 2007

Audio Segmentation via Tri-Model Bayesian Information Criterion.
Proceedings of the IEEE International Conference on Acoustics, 2007

A Decision-Tree-Based Online Speaker Clustering.
Proceedings of the Pattern Recognition and Image Analysis, Third Iberian Conference, 2007

A Spoken Dialogue System Based on Keyword Spotting Technology.
Proceedings of the Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments, 2007

High Quality Voice Conversion through Phoneme-Based Linear Mapping Functions with STRAIGHT for Mandarin.
Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery, 2007

Keyword Spotting Based on Phoneme Confusion Matrix.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Adaptive Null-Forming Algorithm with Auditory Sub-bands.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

A Top-down Approach to Melody Match in Pitch Contour for Query by Humming.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Syllable Based Audio Search Using Confusion Network Arc as Indexing Unit.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Improvements in Tone Pronunciation Scoring for Strongly Accented Mandarin Speech.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Speaker Diarization System Based on GMM and BIC.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Speech Endpoint Detection Based on Sub-band Energy and Harmonic Structure of Voice.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

An Efficient and Robust Approach to Audio ID Identification.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Automatic Scoring of Flat Tongue and Raised Tongue in Computer-assisted Mandarin Learning.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

A Novel Audio Watermarking in Wavelet Domain.
Proceedings of the Second International Conference on Intelligent Information Hiding and Multimedia Signal Processing (IIH-MSP 2006), 2006

Fast confidence measure algorithm for continuous speech recognition.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

Speaker adaptation using constrained transformation.
IEEE Trans. Speech Audio Process., 2004

Robust state clustering using phonetic decision trees.
Speech Commun., 2004

Automatic assessment of pronunciation quality.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004

Fusion based speech segmentation in DARPA SPINE2 task.
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004

A dynamic cross-reference pruning strategy for multiple feature fusion at decoder run time.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

Run time information fusion in speech recognition.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

A context adaptation approach for building context dependent models in LVCSR.
Proceedings of the EUROSPEECH 2001 Scandinavia, 2001

Develop Telephony Speech Recognition Systems for Real-world Application.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Word Error Rate Reduction by Bottom-Up Tone Integration to Chinese Continuous Speech Recognition System.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Toward Making Speech Part of People's Daily Life.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Efficiently using speaker adaptation data.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Improvements in search algorithm for large vocabulary continuous speech recognition.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Effective vector quantization for a highly compact acoustic model for LVCSR.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

An orthogonal GMM based speaker verification system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Speaker change detection using minimum message length criterion.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Dynamic threshold setting via Bayesian information criterion (BIC) in HMM training.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Office message center - a spoken dialogue system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Vocabulary-based acoustic model trim down and task adaptation.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Keyword spotting in auto-attendant system.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

Linear regression under maximum a posteriori criterion with Markov random field prior.
Proceedings of the IEEE International Conference on Acoustics, 2000

Markov Random Field Linear Regression.
Proceedings of the 10th European Signal Processing Conference, 2000

Understanding speech recognition using correlation-generated neural network targets.
IEEE Trans. Speech Audio Process., 1999

Development of the 1998 OGI-FONIX broadcast news transcription system.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

High accuracy acoustic modeling using two-level decision-tree based state-tying.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

High accuracy acoustic modeling based on multi-stage decision tree.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999

Universal speech tools: the CSLU toolkit.
Proceedings of the 5th International Conference on Spoken Language Processing, Incorporating The 7th Australian International Speech Science and Technology Conference, Sydney Convention Centre, Sydney, Australia, 30th November, 1998

Accessible technology for interactive systems: a new approach to spoken language research.
Proceedings of the 1998 IEEE International Conference on Acoustics, 1998

Toward new language adaptation for language identification.
Speech Commun., 1997

Matching training and testing criteria in hybrid speech recognition systems.
Proceedings of the Fifth European Conference on Speech Communication and Technology, 1997

Speech recognition using neural networks with forward-backward probability generated targets.
Proceedings of the 1997 IEEE International Conference on Acoustics, 1997

Development of an approach to automatic language identification based on phone recognition.
Comput. Speech Lang., 1996

The influence of bigram constraints on word recognition by humans: implications for computer speech recognition.
Proceedings of the 4th International Conference on Spoken Language Processing, 1996

The contribution of consonants versus vowels to word recognition in fluent speech.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

Experiments for an approach to language identification with conversational telephone speech.
Proceedings of the 1996 IEEE International Conference on Acoustics, 1996

An approach to language identification with enhanced language model.
Proceedings of the Fourth European Conference on Speech Communication and Technology, 1995

An approach to automatic language identification based on language-dependent phone recognition.
Proceedings of the 1995 International Conference on Acoustics, 1995
