Bin Ma
ORCID: 0000-0002-9223-9654
Affiliations:
- Alibaba Group, Speech Lab, Singapore
- Nanyang Technological University, School of Computer Science and Engineering, Singapore
- Institute for Infocomm Research, A*STAR, Singapore (since 2004)
- University of Hong Kong, Hong Kong (PhD 2000)
According to our database, Bin Ma authored at least 268 papers between 1999 and 2024.
Bibliography
2024
IEEE Signal Process. Lett., 2024
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions.
CoRR, 2024
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs.
CoRR, 2024
Proceedings of the Odyssey 2024: The Speaker and Language Recognition Workshop, 2024
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
CoRR, 2023
deHuBERT: Disentangling Noise in a Self-supervised Model for Robust Speech Recognition.
CoRR, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Dual-Memory Multi-Modal Learning for Continual Spoken Keyword Spotting with Confidence Selection and Diversity Enhancement.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Dual Acoustic Linguistic Self-supervised Representation Learning for Cross-Domain Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Small Footprint Multi-channel Network for Keyword Spotting with Centroid Based Awareness.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Adapter-tuning with Effective Token-dependent Representation Shift for Automatic Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MossFormer: Pushing the Performance Limit of Monaural Speech Separation Using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions.
Proceedings of the IEEE International Conference on Acoustics, 2023
D2Former: A Fully Complex Dual-Path Dual-Decoder Conformer Network Using Joint Complex Masking and Complex Spectral Mapping for Monaural Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
De'hubert: Disentangling Noise in a Self-Supervised Model for Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the Asia Pacific Signal and Information Processing Association Annual Summit and Conference, 2023
2022
CoRR, 2022
I2CR: Improving Noise Robustness on Keyword Spotting Using Inter-Intra Contrastive Regularization.
CoRR, 2022
Learning Disentangled Representations for Counterfactual Regression via Mutual Information Minimization.
Proceedings of the SIGIR '22: The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, July 11, 2022
FRCRN: Boosting Feature Representation Using Frequency Recurrence for Monaural Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2022
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
End-to-End Complex-Valued Multidilated Convolutional Neural Network for Joint Acoustic Echo Cancellation and Noise Suppression.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
Leveraging Text Data Using Hybrid Transformer-LSTM Based End-to-End ASR in Transfer Learning.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Towards Natural and Controllable Cross-Lingual Voice Conversion Based on Neural TTS Model and Phonetic Posteriorgram.
Proceedings of the IEEE International Conference on Acoustics, 2021
Monaural Speech Enhancement with Complex Convolutional Block Attention Module and Joint Time Frequency Losses.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Proceedings of the CIKM '21: The 30th ACM International Conference on Information and Knowledge Management, Virtual Event, Queensland, Australia, November 1, 2021
2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Towards Natural Bilingual and Code-Switched Speech Synthesis Based on Mix of Monolingual Recordings and Cross-Lingual Voice Conversion.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Query-by-Example Speech Search Using Recurrent Neural Acoustic Word Embeddings With Temporal Context.
IEEE Access, 2019
Fast Learning for Non-Parallel Many-to-Many Voice Conversion with Residual Star Generative Adversarial Networks.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Multi-Task Multi-Network Joint-Learning of Deep Residual Networks and Cycle-Consistency Generative Adversarial Networks for Robust Speech Recognition.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Constrained Output Embeddings for End-to-End Code-Switching Speech Recognition with Only Monolingual Data.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Robust Audio-visual Speech Recognition Using Bimodal DFSMN with Multi-condition Training and Dropout Regularization.
Proceedings of the IEEE International Conference on Acoustics, 2019
2018
Proceedings of the 15th International Conference on Spoken Language Translation, 2018
Learning Acoustic Word Embeddings with Temporal Context for Query-by-Example Speech Search.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
2017
Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News.
IEEE ACM Trans. Audio Speech Lang. Process., 2017
IEEE J. Sel. Top. Signal Process., 2017
Filtering for Malice Through the Data Ocean: Large-Scale PHA Install Detection at the Communication Service Provider Level.
Proceedings of the Research in Attacks, Intrusions, and Defenses, 2017
Multi-Task Learning for Mispronunciation Detection on Singapore Children's Mandarin Speech.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
An Integrated Solution for Snoring Sound Classification Using Bhattacharyya Distance Based GMM Supervectors with SVM, Feature Selection with Random Forest and Spectrogram with CNN.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Pairwise learning using multi-lingual bottleneck features for low-resource query-by-example spoken term detection.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Efficient methods to train multilingual bottleneck feature extractors for low resource keyword search.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the 2017 International Conference on Asian Language Processing, 2017
Improving air traffic control speech intelligibility by reducing speaking rate effectively.
Proceedings of the 2017 International Conference on Asian Language Processing, 2017
Extracting bottleneck features and word-like pairs from untranscribed speech for feature representation.
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Convolutional neural network with multi-task learning scheme for acoustic scene classification.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Low-resource spoken keyword search strategies in Georgian inspired by distinctive feature theory.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
2016
J. Signal Process. Syst., 2016
Large-scale characterization of non-native Mandarin Chinese spoken by speakers of European origin: Analysis on iCALL.
Speech Commun., 2016
Proceedings of the Odyssey 2016: The Speaker and Language Recognition Workshop, 2016
Proceedings of the Working Notes Proceedings of the MediaEval 2016 Workshop, 2016
Learning Neural Network Representations Using Cross-Lingual Bottleneck Features with Word-Pair Information.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Toward High-Performance Language-Independent Query-by-Example Spoken Term Detection for MediaEval 2015: Post-Evaluation Analysis.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
The 2015 NIST Language Recognition Evaluation: The Shared View of I2R, Fantastic4 and SingaMS.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Unsupervised Bottleneck Features for Low-Resource Query-by-Example Spoken Term Detection.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Discriminatively trained joint speaker and environment representations for adaptation of deep neural network acoustic models.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Approximate search of audio queries by using DTW with phone time boundary and data augmentation.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Cross-lingual deep neural network based submodular unbiased data selection for low-resource keyword search.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Content-aware local variability vector for speaker verification with short utterance.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
2015
IEEE ACM Trans. Audio Speech Lang. Process., 2015
Proceedings of the ISCA International Workshop on Speech and Language Technology in Education, 2015
Proceedings of the Working Notes Proceedings of the MediaEval 2015 Workshop, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Investigation of parametric rectified linear units for noise robust speech recognition.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Parallel inference of Dirichlet process Gaussian mixture models for unsupervised acoustic modeling: a feasibility study.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Language independent query-by-example spoken term detection using N-best phone sequences and partial matching.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Submodular data selection with acoustic and phonetic features for automatic speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Unsupervised data selection and word-morph mixed language model for Tamil low-resource keyword search.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
2014
Speech Commun., 2014
Proceedings of the International Conference on Security and Privacy in Communication Networks, 2014
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014
Proceedings of the Odyssey 2014: The Speaker and Language Recognition Workshop, 2014
Proceedings of the Working Notes Proceedings of the MediaEval 2014 Workshop, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Intrinsic spectral analysis based on temporal context features for query-by-example spoken term detection.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
A graph-based Gaussian component clustering approach to unsupervised acoustic modeling.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
On the use of Bhattacharyya based GMM distance and neural net features for identification of cognitive load levels.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
2013
IEEE Trans. Speech Audio Process., 2013
IEEE Trans. Speech Audio Process., 2013
IEEE Signal Process. Lett., 2013
I4U submission to NIST SRE 2012: a large-scale collaborative effort for noise-robust speaker verification.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Unsupervised mining of acoustic subword units with segment-level Gaussian posteriorgrams.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Large-scale characterization of Mandarin pronunciation errors made by native speakers of European languages.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
A study on GMM-SVM with adaptive relevance factor and its comparison with i-vector and JFA for speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013
Using parallel tokenizers with DTW matrix combination for low-resource spoken term detection.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Phonetically-constrained PLDA modeling for text-dependent speaker verification with multiple short utterances.
Proceedings of the IEEE International Conference on Acoustics, 2013
Joint analysis of vocal tract length and temporal information for robust speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013
Speaker clustering using vector representation with long-term feature for lecture speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the IEEE International Conference on Acoustics, 2013
Broadcast News Story Segmentation Using Manifold Learning on Latent Topic Distributions.
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, 2013
2012
Speaker Clustering and Cluster Purification Methods for RT07 and RT09 Evaluation Meeting Data.
IEEE Trans. Speech Audio Process., 2012
Discriminative feature extraction for speech recognition using continuous output codes.
Pattern Recognit. Lett., 2012
Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features.
IEICE Trans. Inf. Syst., 2012
Bhattacharyya-based GMM-SVM system with adaptive relevance factor for pair language recognition.
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012
Proceedings of the Odyssey 2012: The Speaker and Language Recognition Workshop, 2012
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Phonotactic spoken language recognition: Using diversely adapted acoustic models in parallel phone recognizers.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Effect of Relevance Factor of Maximum a posteriori Adaptation for GMM-SVM in Speaker and Language Recognition.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
RSR2015: Database for Text-Dependent Speaker Verification using Multiple Pass-Phrases.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
IEEE Trans. Speech Audio Process., 2011
IEICE Trans. Inf. Syst., 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Study of Overlapped Speech Detection for NIST SRE Summed Channel Speaker Recognition.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Joint Application of Speech and Speaker Recognition for Automation and Security in Smart Home.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the IEEE International Conference on Acoustics, 2011
Score fusion and calibration in multiple language detectors with large performance variation.
Proceedings of the IEEE International Conference on Acoustics, 2011
2010
IEEE Signal Process. Mag., 2010
Autonomous acoustic model adaptation for multilingual meeting transcription involving high- and low-resourced languages.
Proceedings of the 2nd Workshop on Spoken Language Technologies for Under-Resourced Languages, 2010
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010
Proceedings of the Odyssey 2010: The Speaker and Language Recognition Workshop, Brno, Czech Republic, June 28, 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Non-negative matrix factorization based discriminative features for speaker verification.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Building topic mixture language models using the document soft classification notion of topic models.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
The estimation and kernel metric of spectral correlation for text-independent speaker verification.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Incorporating MAP estimation and covariance transform for SVM based speaker recognition.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 20th International Conference on Pattern Recognition, 2010
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010
Soft margin estimation of Gaussian mixture model parameters for spoken language recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010
2009
IEEE Trans. Speech Audio Process., 2009
Int. J. Asian Lang. Process., 2009
Int. J. Asian Lang. Process., 2009
Proceedings of the IEEE International Conference on Robotics and Biomimetics, 2009
Large margin estimation of Gaussian mixture model parameters with extended Baum-Welch for spoken language recognition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the 2009 IEEE International Conference on Multimedia and Expo, 2009
Joint MAP adaptation of feature transformation and Gaussian Mixture Model for speaker recognition.
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Evaluation of a fused FM and cepstral-based speaker recognition system on the NIST 2008 SRE.
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the IEEE International Conference on Acoustics, 2009
Proceedings of the 2009 International Conference on Asian Language Processing, 2009
A Lattice-Based Phonotactic Language Recognition System with CMLLR Adaptation and Its Implementation Issues.
Proceedings of the 2009 International Conference on Asian Language Processing, 2009
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009
2008
Optimizing the Performance of Spoken Language Recognition With Discriminative Training.
IEEE Trans. Speech Audio Process., 2008
Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Target-oriented phone selection from universal phone set for spoken language recognition.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Robust speaker verification using short-time frequency with long-time window and fusion of multi-resolutions.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Proceedings of the 19th International Conference on Pattern Recognition (ICPR 2008), 2008
Unsupervised pronunciation grammar growing using knowledge-based and data-driven approaches.
Proceedings of the 2008 IEEE International Conference on Multimedia and Expo, 2008
Discriminative learning for optimizing detection performance in spoken language recognition.
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the Blizzard Challenge 2008, 2008
2007
IEEE Trans. Speech Audio Process., 2007
IEEE Trans. Speech Audio Process., 2007
Using direction of arrival estimate and acoustic feature information in speaker diarization.
Proceedings of the 8th Annual Conference of the International Speech Communication Association, 2007
A Generalized Feature Transformation Approach for Channel Robust Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the IEEE International Conference on Acoustics, 2007
Effects of Device Mismatch, Language Mismatch and Environmental Mismatch on Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2007
Speaker Diarization Using Direction of Arrival Estimate and Acoustic Feature Information: The I2R-NTU Submission for the NIST RT 2007 Evaluation.
Proceedings of the Multimodal Technologies for Perception of Humans, 2007
2006
Int. J. Comput. Linguistics Chin. Lang. Process., 2006
Language Recognition Based on Score Distribution Feature Vectors and Discriminative Classifier Fusion.
Proceedings of the Odyssey 2006: The Speaker and Language Recognition Workshop, 2006
Minimum Classification Error Based Optimal Linear Combination for Spoken Language Identification.
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Integrating Acoustic, Prosodic and Phonotactic Features for Spoken Language Identification.
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
Proceedings of the 2006 IEEE International Conference on Acoustics Speech and Signal Processing, 2006
2005
Proceedings of the SIGIR 2005: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
Proceedings of the ACL 2005, 2005
2004
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Proceedings of the 2004 IEEE International Conference on Acoustics, 2004
2002
A comparative study of several incremental adaptation algorithms for speaker adaptation.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002
Likelihood probability mismatch analysis and normalization in multilingual speech applications.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002
2001
Online adaptive learning of continuous-density hidden Markov models based on multiple-stream prior evolution and posterior pooling.
IEEE Trans. Speech Audio Process., 2001
2000
PhD thesis, 2000
Benchmark Results of Triphone-based Acoustic Modeling on HKU96 and HKU99 Putonghua Corpora.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000
Robust speech recognition based on off-line elicitation of multiple priors and on-line adaptive prior fusion.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000
Efficient ML training of CDHMM parameters based on prior evolution, posterior intervention and feedback.
Proceedings of the IEEE International Conference on Acoustics, 2000
1999
On-line adaptive learning of CDHMM parameters based on multiple-stream prior evolution and posterior pooling.
Proceedings of the Sixth European Conference on Speech Communication and Technology, 1999
Irrelevant variability normalization in learning HMM state tying from data based on phonetic decision-tree.
Proceedings of the 1999 IEEE International Conference on Acoustics, 1999