Li-Rong Dai

Comput. Speech Lang., 2017

RAN: Radical analysis networks for zero-shot learning of Chinese characters.

CoRR, 2017

Exploring Question Understanding and Adaptation in Neural-Network-Based Question Answering.

CoRR, 2017

A Maximum Likelihood Approach to Deep Neural Network Based Nonlinear Spectral Mapping for Single-Channel Speech Separation.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

End-to-End Language Identification Using High-Order Utterance Representation with Bilinear Pooling.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Gaussian Prediction Based Attention for Online End-to-End Speech Recognition.

Junfeng Hou

Shiliang Zhang

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

An investigation of high-resolution modeling units of deep neural networks for acoustic scene classification.

Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

A GRU-Based Encoder-Decoder Approach with Attention for Online Handwritten Mathematical Expression Recognition.

Jianshu Zhang

Lirong Dai

Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition, 2017

Extracting structural spectral features using what-where auto-encoders for statistical parametric speech synthesis.

Ya-Jun Hu

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Adaptation of PLDA for multi-source text-independent speaker verification.

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Joint noise and mask aware training for DNN-based speech enhancement with sub-band features.

Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

Multiple-target deep learning for LSTM-RNN based speech enhancement.

Proceedings of the Hands-free Speech Communications and Microphone Arrays, 2017

The USTC System for Blizzard Challenge 2017.

Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017

The USTC system for Blizzard Machine Learning Challenge 2017-ES2.

Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017

Feedforward sequential memory networks based encoder-decoder model for machine translation.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Learning the number of nodes in DNNs with activation mask.

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Emotional statistical parametric speech synthesis using LSTM-RNNs.

Shumin An

Zhenhua Ling

Lirong Dai

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Speaker Adaptation of Hybrid NN/HMM Model for Speech Recognition Based on Singular Value Decomposition.

J. Signal Process. Syst., 2016

Exploration of Local Variability in Text-Independent Speaker Verification.

J. Signal Process. Syst., 2016

A Regression Approach to Single-Channel Speech Separation Via High-Resolution Deep Neural Networks.

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Modeling F0 trajectories in hierarchically structured deep neural networks.

Speech Commun., 2016

Hybrid Orthogonal Projection and Estimation (HOPE): A New Framework to Learn Neural Networks.

Shiliang Zhang

Hui Jiang

J. Mach. Learn. Res., 2016

Joint training of DNNs by incorporating an explicit dereverberation structure for distant speech recognition.

EURASIP J. Adv. Signal Process., 2016

Concept-to-Speech generation with knowledge sharing for acoustic modelling and utterance filtering.

Xin Wang

Comput. Speech Lang., 2016

Image classification with CNN-based Fisher vector coding.

Proceedings of the 2016 Visual Communications and Image Processing, 2016

Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

LID-senone Extraction via Deep Neural Networks for End-to-End Language Identification.

Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016

USTC at NTCIR-12 STC Task.

Proceedings of the 12th NTCIR Conference on Evaluation of Information Access Technologies, 2016

Rapid speaker adaptation based on D-code extracted from BLSTM-RNN in LVCSR.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A speaker-dependent deep learning approach to joint speech separation and acoustic modeling for multi-talker automatic speech recognition.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Mismatched training data enhancement for automatic recognition of children's speech using DNN-HMM.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Cluster-based senone selection for the efficient calculation of deep neural network acoustic models.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Unsupervised speaker adaptation of BLSTM-RNN for LVCSR based on speaker code.

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Learning FOFE based FNN-LMs with noise contrastive estimation and part-of-speech features.

Junfeng Hou

Shiliang Zhang

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

A regression approach to binaural speech segregation via deep neural network.

Nana Fan

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

RNN-BLSTM Based Multi-Pitch Estimation.

Jianshu Zhang

Jian Tang

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Compact Feedforward Sequential Memory Networks for Large Vocabulary Continuous Speech Recognition.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Future Context Attention for Unidirectional LSTM Based Acoustic Model.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Articulatory-to-Acoustic Conversion with Cascaded Prediction of Spectral and Excitation Features Using Neural Networks.

Zheng-Chen Liu

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks.

Yu Gu

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

The USTC System for Voice Conversion Challenge 2016: Neural Network Based Approaches for Spectrum, Aperiodicity and F0 Conversion.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Modeling spectral envelopes using deep conditional restricted Boltzmann machines for statistical parametric speech synthesis.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Modulation spectrum compensation for HMM-based speech synthesis using line spectral pairs.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Compact convolutional neural network transfer learning for small-scale image classification.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Speaker adaptation of RNN-BLSTM for speech recognition based on speaker code.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Deep belief network-based post-filtering for statistical parametric speech synthesis.

Ya-Jun Hu

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Content-aware local variability vector for speaker verification with short utterance.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

The USTC System for Blizzard Challenge 2016.

Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016, 2016

Unsupervised single-channel speech separation via deep neural network for different gender mixtures.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Boosting DNN-based speech enhancement via explicit transformations.

Qing Wang

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

State-Clustering Based Multiple Deep Neural Networks Modeling Approach for Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

A Regression Approach to Speech Enhancement Based on Deep Neural Networks.

IEEE ACM Trans. Audio Speech Lang. Process., 2015

Quasi-Factorial Prior for i-vector Extraction.

IEEE Signal Process. Lett., 2015

Statistical parametric speech synthesis using a hidden trajectory model.

Ming-Qi Cai

Speech Commun., 2015

Feedforward Sequential Memory Networks: A New Structure to Learn Long-term Dependency.

CoRR, 2015

A Fixed-Size Encoding Method for Variable-Length Sequences with its Application to Neural Network Language Models.

CoRR, 2015

Feedforward Sequential Memory Neural Networks without Recurrent Feedback.

CoRR, 2015

Deep Bottleneck Feature for Image Classification.

Ian McLoughlin

Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, 2015

Rectified linear neural networks with tied-scalar regularization for LVCSR.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Multi-objective learning and mask-based post-processing for deep neural network based speech enhancement.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

High-resolution acoustic modeling and compact language modeling of language-universal speech attributes for spoken language identification.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A universal VAD based on jointly trained deep neural networks.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Deep bottleneck network based i-vector representation for language identification.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Automatic phrase boundary labeling of speech synthesis database using context-dependent HMMs and n-gram prior distributions.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Phone-centric local variability vector for text-constrained speaker verification.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Writer adaptive feature extraction based on convolutional neural networks for online handwritten Chinese character recognition.

Proceedings of the 13th International Conference on Document Analysis and Recognition, 2015

Unsupervised speaker adaptation of deep neural network based on the combination of speaker codes and singular value decomposition for speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Speech Separation based on signal-noise-dependent deep neural networks for robust speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improved language identification using deep bottleneck network.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Spectral conversion using deep neural networks trained with multi-source speakers.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Joint training of front-end and back-end deep neural networks for robust speech recognition.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Channel adaptation of PLDA for text-independent speaker verification.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Improving Deep Neural Network Based Speech Enhancement in Low SNR Environments.

Proceedings of the Latent Variable Analysis and Signal Separation, 2015

Lip movement generation using restricted Boltzmann machines for visual speech synthesis.

Zheng-Chen Liu

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

A unified speaker-dependent speech separation and enhancement system based on deep neural networks.

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

An information fusion approach to recognizing microphone array speech in the CHiME-3 challenge based on a deep learning framework.

Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015

The Fixed-Size Ordinally-Forgetting Encoding Method for Neural Network Language Models.

Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015

2014

Fast adaptation of deep neural network based on discriminant codes for speech recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2014

Voice conversion using deep neural networks with layer-wise generative training.

IEEE ACM Trans. Audio Speech Lang. Process., 2014

An Experimental Study on Speech Enhancement Based on Deep Neural Networks.

IEEE Signal Process. Lett., 2014

HMM-based unit selection speech synthesis using log likelihood ratios derived from perceptual data.

Speech Commun., 2014

Unsupervised Prosodic Labeling of Speech Synthesis Databases Using Context-Dependent HMMs.

Chen-Yu Yang

IEICE Trans. Inf. Syst., 2014

Local Variability Modeling for Text-Independent Speaker Verification.

Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014

Speaker adaptation of hybrid NN/HMM model for speech recognition based on singular value decomposition.

Shaofei Xue

Hui Jiang

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Cross-language transfer learning for deep neural network based speech enhancement.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A fusion approach to spoken language identification based on combining multiple phone recognizers and speech attribute detectors.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Speech separation based on improved deep neural networks with dual outputs of speech features for both target and interfering speakers.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Integrating global variance of log power spectrum derived from LSPs into MGE training for HMM-based parametric speech synthesis.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Speaker adaptive bottleneck features extraction for LVCSR based on discriminative learning of speaker codes.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Performance evaluation of deep bottleneck features for spoken language identification.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Improving F0 prediction using bidirectional associative memories and syllable-level F0 features for HMM-based Mandarin speech synthesis.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Local variability vector for text-independent speaker verification.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Dynamic noise aware training for speech enhancement based on deep neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Concept-to-speech generation by integrating syntagmatic features into HMM-based speech synthesis.

Xin Wang

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Task-aware deep bottleneck features for spoken language identification.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Robust speech recognition with speech enhanced deep neural networks.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes.

Ling-Hui Chen

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Formant-controlled speech synthesis using hidden trajectory model.

Ming-Qi Cai

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

A Study of Designing Compact Classifiers Using Deep Neural Networks for Online Handwritten Chinese Character Recognition.

Proceedings of the 22nd International Conference on Pattern Recognition, 2014

Writer Adaptation Using Bottleneck Features and Discriminative Linear Regression for Online Handwritten Chinese Character Recognition.

Proceedings of the 14th International Conference on Frontiers in Handwriting Recognition, 2014

Sequence training of multiple deep neural networks for better performance and faster training speed.

Pan Zhou

Hui Jiang

Proceedings of the IEEE International Conference on Acoustics, 2014

Improving deep neural networks for LVCSR using dropout and shrinking structure.

Proceedings of the IEEE International Conference on Acoustics, 2014

Spectral modeling using neural autoregressive distribution estimators for statistical parametric speech synthesis.

Xiang Yin

Proceedings of the IEEE International Conference on Acoustics, 2014

Direct adaptation of hybrid DNN/HMM model for fast speaker adaptation in LVCSR based on speaker code.

Proceedings of the IEEE International Conference on Acoustics, 2014

Lattice based optimization of bottleneck feature extractor with linear transformation.

Proceedings of the IEEE International Conference on Acoustics, 2014

Using bidirectional associative memories for joint spectral envelope modeling in voice conversion.

Proceedings of the IEEE International Conference on Acoustics, 2014

Synthesized stereo mapping via deep neural networks for noisy speech recognition.

Qiang Huo

Proceedings of the IEEE International Conference on Acoustics, 2014

Minimum divergence estimation of speaker prior in multi-session PLDA scoring.

Proceedings of the IEEE International Conference on Acoustics, 2014

A spectral based visual matching method for image classification.

Proceedings of the International Conference on Audio, 2014

Global variance equalization for improving deep neural network based speech enhancement.

Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

2013

Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion.

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

A cluster-based multiple deep neural networks method for large vocabulary continuous speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

Unsupervised prosodic phrase boundary labeling of Mandarin speech synthesis database using context-dependent HMM.

Chen-Yu Yang

Proceedings of the IEEE International Conference on Acoustics, 2013

Exemplar based language recognition method for short-duration speech segments.

Proceedings of the IEEE International Conference on Acoustics, 2013

Phoneme variation based synthesized speech discrimination for speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2013

Incoherent training of deep neural networks to de-correlate bottleneck features for speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2013

The USTC System for Blizzard Challenge 2013.

Proceedings of the Blizzard Challenge 2013, 2013

2012

Minimum Kullback-Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis.

IEEE Trans. Speech Audio Process., 2012

Spoken term detection for OOV terms based on triphone confusion matrix.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

A hybrid fragment / syllable-based system for improved OOV term detection.

Yong Xu

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Improved unit selection speech synthesis method utilizing subjective evaluation results on synthetic speech.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Intra-conversation intra-speaker variability compensation for speaker clustering.

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Cross-stream dependency modeling using continuous F0 model for HMM-based speech synthesis.

Xin Wang

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Exemplar-Based Sparse Representation for Language Recognition on I-Vectors.

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

The USTC System for Blizzard Challenge 2012.

Proceedings of the Blizzard Challenge 2012, Portland, OR, USA, September 14, 2012, 2012

2011

Trust Region-Based Optimization for Maximum Mutual Information Estimation of HMMs in Speech Recognition.

IEEE ACM Trans. Audio Speech Lang. Process., 2011

Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Formant-Controlled HMM-Based Speech Synthesis.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech Synthesis.

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Factored covariance modeling for text-independent speaker verification.

Proceedings of the IEEE International Conference on Acoustics, 2011

Building HMM based unit-selection speech synthesis system using synthetic speech naturalness evaluation score.

Proceedings of the IEEE International Conference on Acoustics, 2011

Speaker characterization using spectral subband energy ratio based on Harmonic plus Noise Model.

Proceedings of the IEEE International Conference on Acoustics, 2011

Preserve ordering property of generated LSPs for minimum generation error training in HMM-based speech synthesis.

Ming Lei

Proceedings of the IEEE International Conference on Acoustics, 2011

Non-parallel training for voice conversion based on FT-GMM.

Ling-Hui Chen

Proceedings of the IEEE International Conference on Acoustics, 2011

The USTC System for Blizzard Challenge 2011.

Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011, 2011

Effective image representation based on bi-layer visual codebook.

Proceedings of the First Asian Conference on Pattern Recognition, 2011

2010

Cross-Validation and Minimum Generation Error based Decision Tree Pruning for HMM-based Speech Synthesis.

Int. J. Comput. Linguistics Chin. Lang. Process., 2010

Minimum generation error training for HMM-based prediction of articulatory movements.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Automatic phrase boundary labeling for Mandarin TTS corpus using context-dependent HMM.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

The description of iFlyTek Speech Lab system for NIST2009 Language Recognition Evaluation.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Phonetic clustering based confidence measure for embedded speech recognition.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Factor analysis based spatial correlation modeling for speaker verification.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Non-negative matrix factorization based discriminative features for speaker verification.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Statistical modeling of syllable-level F0 features for HMM-based unit selection speech synthesis.

Zhiguo Wang

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

GMM-based voice conversion with explicit modelling on feature transform.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Speaker verification against synthetic speech.

LianWu Chen

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

The estimation and kernel metric of spectral correlation for text-independent speaker verification.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifier.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Effects of the phonological relevance in speaker verification.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis.

Yu Hu

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

A hierarchical F0 modeling method for HMM-based speech synthesis.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Multiple instance learning using visual phrases for object classification.

Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

A bounded trust region optimization for discriminative training of HMMs in speech recognition.

Proceedings of the IEEE International Conference on Acoustics, 2010

Minimum generation error training with weighted Euclidean distance on LSP for HMM-based speech synthesis.

Ming Lei

Proceedings of the IEEE International Conference on Acoustics, 2010

N-gram nearest neighbor algorithm for voice password system.

Proceedings of the IEEE International Conference on Acoustics, 2010

HMM-based pseudo-clean speech synthesis for SPLICE algorithm.

Proceedings of the IEEE International Conference on Acoustics, 2010

The USTC System for Blizzard Challenge 2010.

Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010

2009

Semi-supervised kernel density estimation for video annotation.

Comput. Vis. Image Underst., 2009

Asynchronous F0 and spectrum modeling for HMM-based speech synthesis.

Cheng-Cheng Wang

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

An automatic language identification method based on subspace analysis.

Ren-Hua Wang

Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009

Full covariance state duration modeling for HMM-based speech synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2009

Exploiting prosodic information for Speaker Recognition.

Proceedings of the IEEE International Conference on Acoustics, 2009

The I4U system in NIST 2008 speaker recognition evaluation.

Proceedings of the IEEE International Conference on Acoustics, 2009

iFLY system for the NIST 2008 speaker recognition evaluation.

Proceedings of the IEEE International Conference on Acoustics, 2009

The USTC System for Blizzard Challenge 2009.

Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009

2008

Investigation on Adaptation Using Different Discriminative Training Criteria Based Linear Regression and MAP.

[BibT_eX]

[DOI]

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Multi-Layer F0 Modeling for HMM-Based Speech Synthesis.

[BibT_eX]

[DOI]

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Parallel Phone Recognizer based MLLR Speaker Recognition.

[BibT_eX]

[DOI]

Eryu Wang

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

A Sample and Feature Selection Scheme for GMM-SVM Based Language Recognition.

[BibT_eX]

[DOI]

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Interfusing the Confused Region Score of Speaker Verification Systems.

[BibT_eX]

[DOI]

Yanhua Long

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Exploiting Non-Target Region Information for Confidence Measure Based on Bayesian Information Criterion.

[BibT_eX]

[DOI]

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Double Gauss Based Unsupervised Score Normalization in Speaker Verification.

[BibT_eX]

[DOI]

Ren-Hua Wang

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

The Adaptation Schemes In PR-SVM Based Language Recognition.

[BibT_eX]

[DOI]

Xu Bing

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Heteronym Verification for Mandarin Speech Synthesis.

[BibT_eX]

[DOI]

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Minimum generation error criterion considering global/local variance for HMM-based speech synthesis.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2008

Minimum generation error linear regression based model adaptation for HMM-based speech synthesis.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2008

The USTC System for Blizzard Challenge 2008.

[BibT_eX]

[DOI]

Proceedings of the Blizzard Challenge 2008, 2008

2007

Interactive Video Annotation by Multi-Concept Multi-Modality Active Learning.

[BibT_eX]

[DOI]

Int. J. Semantic Comput., 2007

Multi-Concept Multi-Modality Active Learning for Interactive Video Annotation.

[BibT_eX]

[DOI]

Proceedings of the First IEEE International Conference on Semantic Computing (ICSC 2007), 2007

An Efficient Automatic Video Shot Size Annotation Scheme.

[BibT_eX]

[DOI]

Proceedings of the Advances in Multimedia Modeling, 2007

Video annotation by graph-based learning with neighborhood similarity.

[BibT_eX]

[DOI]

Proceedings of the 15th International Conference on Multimedia 2007, 2007

Optimizing multi-graph learning: towards a unified video annotation scheme.

[BibT_eX]

[DOI]

Proceedings of the 15th International Conference on Multimedia 2007, 2007

Multi-Graph Semi-Supervised Learning for Video Semantic Feature Extraction.

[BibT_eX]

[DOI]

Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

Lazy Learning Based Efficient Video Annotation.

[BibT_eX]

[DOI]

Proceedings of the 2007 IEEE International Conference on Multimedia and Expo, 2007

An Interactive Video Annotation Framework with Multiple Modalities.

[BibT_eX]

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2007

Angle of Models Distance as Test Algorithm in Speaker Verification.

[BibT_eX]

[DOI]

Proceedings of the Fourth International Conference on Fuzzy Systems and Knowledge Discovery, 2007

The USTC and iflytek speech synthesis systems for Blizzard Challenge 2007.

[BibT_eX]

[DOI]

Proceedings of the Evaluation of text-to-speech systems: Blizzard Challenge 2007, 2007

2006

Efficient semantic annotation method for indexing large personal video database.

[BibT_eX]

[DOI]

Proceedings of the 8th ACM SIGMM International Workshop on Multimedia Information Retrieval, 2006

Two-layer Distance Scheme in Matching Engine for Query by Humming System.

[BibT_eX]

[DOI]

Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006

Feature Extraction and Test Algorithm for Speaker Verification.

[BibT_eX]

[DOI]