Xiaolou Li

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Visualizing Data Augmentation in Deep Speaker Recognition.

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Spot Keywords From Very Noisy and Mixed Speech.

[DOI]

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Adversarial Data Augmentation for Robust Speaker Verification.

[DOI]

Proceedings of the 9th International Conference on Communication and Information Processing, 2023

CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis.

[DOI]

Chen Chen

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

A Principle Solution for Enroll-Test Mismatch in Speaker Recognition.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2022

CN-Celeb: Multi-genre speaker recognition.

[DOI]

Speech Commun., 2022

Pay Attention to Hard Trials.

[DOI]

Di Wang

CoRR, 2022

Enhanced exemplar autoencoder with cycle consistency loss in any-to-one voice conversion.

[DOI]

CoRR, 2022

Cycleflow: Purify Information Factors by Cycle Loss.

[DOI]

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

C-P Map: A Novel Evaluation Toolkit for Speaker Verification.

[DOI]

Proceedings of the Odyssey 2022: The Speaker and Language Recognition Workshop, 28 June, 2022

Oriental Language Recognition (OLR) 2021: Summary and Analysis.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Reliable Visualization for Deep Speaker Recognition.

[DOI]

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Real Additive Margin Softmax for Speaker Verification.

[DOI]

Ruiqian Nai

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Deep Normalization for Speaker Vectors.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2021

Can We Trust Deep Speech Prior?

[DOI]

Proceedings of the IEEE Spoken Language Technology Workshop, 2021

M2ASR-MONGO: A Free Mongolian Speech Database and Accompanied Baselines.

[DOI]

Proceedings of the 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques, 2021

KeSpeech: An Open Source Speech Dataset of Mandarin and Its Eight Subdialects.

[DOI]

Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks 1, 2021

Oriental Language Recognition (OLR) 2020: Summary and Analysis.

[DOI]

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Squeezing Value of Cross-Domain Labels: A Decoupled Scoring Approach for Speaker Verification.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2021

A Study on Decoupled Probabilistic Linear Discriminant Analysis.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

OLR 2021 Challenge: Datasets, Rules and Baselines.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

How Speech is Recognized to Be Emotional - A Study Based on Information Decomposition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

An MAP Estimation for Between-Class Variance.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021

2020

Deep Speaker Vector Normalization with Maximum Gaussianality Training.

[DOI]

CoRR, 2020

Deep generative LDA.

[DOI]

Yunqi Cai

CoRR, 2020

Deep generative factorization for speech signal.

[DOI]

CoRR, 2020

Deep Normalization for Speaker Vectors.

[DOI]

CoRR, 2020

Neural Discriminant Analysis for Deep Speaker Embedding.

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Domain-Invariant Speaker Vector Projection by Model-Agnostic Meta-Learning.

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

ASR-Free Pronunciation Assessment.

[DOI]

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

A Robust Audio-Visual Speech Enhancement Model.

[DOI]

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

CN-Celeb: A Challenging Chinese Speaker Recognition Dataset.

[DOI]

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

AP20-OLR Challenge: Three Tasks and Their Baselines.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

On Investigation of Unsupervised Speech Factorization Based on Normalization Flow.

[DOI]

CoRR, 2019

VAE-Based Regularization for Deep Speaker Embedding.

[DOI]

Yang Zhang

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Gaussian-constrained Training for Speaker Verification.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2019

Structure Growth for Small-Footprint Speech Recognition.

[DOI]

Jiayao Wu

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

VAE-based Domain Adaptation for Speaker Verification.

[DOI]

Xueyi Wang

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

AP19-OLR Challenge: Three Tasks and Their Baselines.

[DOI]

Liming Song

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Phonetic-Attention Scoring for Deep Speaker Features in Speaker Verification.

[DOI]

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

Question Mark Prediction By Bert.

[DOI]

Yunqi Cai

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

Phonetic Temporal Neural Model for Language Identification.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2018

Chinese Poetry Generation with Flexible Styles.

[DOI]

Jiyuan Zhang

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

Human and Machine Speaker Recognition Based on Short Trivial Events.

[DOI]

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Deep Factorization for Speech Signal.

[DOI]

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Full-Info Training for Deep Speaker Feature Learning.

[DOI]

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

VV-Couplet: An open source Chinese couplet generation system.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

RACORN-K: Risk-Aversion Pattern Matching-based Portfolio Selection.

[DOI]

Yang Wang

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

AP18-OLR Challenge: Three Tasks and Their Baselines.

[DOI]

Qing Chen

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

Map and Relabel: Towards Almost-Zero Resource Speech Recognition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018

2017

Collaborative Joint Training With Multitask Recurrent Model for Speech and Speaker Recognition.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2017

Medical Diagnosis From Laboratory Tests by Combining Generative and Discriminative Learning.

[DOI]

CoRR, 2017

Full-info Training for Deep Speaker Feature Learning.

[DOI]

CoRR, 2017

Deep Factorization for Speech Signal.

[DOI]

CoRR, 2017

M2ASR: Ambitions and first year progress.

[DOI]

Proceedings of the 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, 2017

Phone-aware neural language identification.

[DOI]

Proceedings of the 20th Conference of the Oriental Chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment, 2017

A Study on Replay Attack and Anti-Spoofing for Automatic Speaker Verification.

[DOI]

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Deep Speaker Feature Learning for Text-Independent Speaker Verification.

[DOI]

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

Memory visualization for gated recurrent neural networks in speech recognition.

[DOI]

Proceedings of the 2017 IEEE International Conference on Acoustics, 2017

Memory-augmented Neural Machine Translation.

[DOI]

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

Memory-augmented Chinese-Uyghur neural machine translation.

[DOI]

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Speaker recognition with cough, laugh and "Wei".

[DOI]

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

AP17-OLR challenge: Data, plan, and baseline.

[DOI]

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

A free Kazakh speech database and a speech recognition baseline.

[DOI]

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Enhanced neural machine translation by learning from draft.

[DOI]

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Cross-lingual speaker verification with deep feature learning.

[DOI]

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Deep speaker verification: Do we need end to end?

[DOI]

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Flexible and Creative Chinese Poetry Generation Using Neural Memory.

[DOI]

Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017

2016

Similar Word Model for Unfrequent Word Enhancement in Speech Recognition.

[DOI]

Xi Ma

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Improving Short Utterance Speaker Recognition by Modeling Speech Unit Classes.

[DOI]

IEEE ACM Trans. Audio Speech Lang. Process., 2016

Local Training for PLDA in Speaker Verification.

[DOI]

CoRR, 2016

OC16-CE80: A Chinese-English Mixlingual Database and A Speech Recognition Baseline.

[DOI]

CoRR, 2016

System Combination for Short Utterance Speaker Recognition.

[DOI]

CoRR, 2016

Collaborative Learning for Language and Speaker Recognition.

[DOI]

CoRR, 2016

Weakly Supervised PLDA Training.

[DOI]

CoRR, 2016

Relation Classification: CNN or RNN?

[DOI]

Dongxu Zhang

Proceedings of the Natural Language Understanding and Intelligent Applications, 2016

Learning from LDA Using Deep Neural Networks.

[DOI]

Dongxu Zhang

Tianyi Luo

Proceedings of the Natural Language Understanding and Intelligent Applications, 2016

Binary speaker embedding.

[DOI]

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Max-margin metric learning for speaker recognition.

[DOI]

Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016

Chinese Song Iambics Generation with Neural Attention-Based Model.

[DOI]

Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Recurrent neural network training with dark knowledge transfer.

[DOI]

Zhiyong Zhang

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Can Machine Generate Traditional Chinese Poetry? A Feigenbaum Test.

[DOI]

Qixin Wang

Tianyi Luo

Proceedings of the Advances in Brain Inspired Cognitive Systems, 2016

AP16-OL7: A multilingual database for oriental languages and a language recognition baseline.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Multi-task recurrent model for true multilingual speech recognition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Multi-task recurrent model for speech and speaker recognition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Feature transformation for speaker verification under speaking rate mismatch condition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

Learning ordered word representations with γ-decay dropout.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

System combination for short utterance speaker recognition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2015

Detection and reconstruction of clipped speech for speaker recognition.

[DOI]

Speech Commun., 2015

Noisy training for deep neural networks in speech recognition.

[DOI]

EURASIP J. Audio Speech Music. Process., 2015

Relation Classification via Recurrent Neural Network.

[DOI]

Dongxu Zhang

CoRR, 2015

Learning from LDA using Deep Neural Networks.

[DOI]

CoRR, 2015

Recurrent Neural Network Training with Dark Knowledge Transfer.

[DOI]

CoRR, 2015

Knowledge Transfer Pre-training.

[DOI]

CoRR, 2015

Deep Speaker Vectors for Semi Text-independent Speaker Verification.

[DOI]

CoRR, 2015

An open/free database and Benchmark for Uyghur speaker recognition.

[DOI]

Proceedings of the 2015 International Conference Oriental COCOSDA held jointly with 2015 Conference on Asian Spoken Language Research and Evaluation (O-COCOSDA/CASLRE), 2015

Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation.

[DOI]

Proceedings of the NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31, 2015

Learning speech rate in speech recognition.

[DOI]

Xiangyu Zeng

Shi Yin

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Recognize foreign low-frequency words with similar pairs.

[DOI]

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Lasso-based reverberation suppression in automatic speech recognition.

[DOI]

Xuewei Zhang

Yiye Lin

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Stochastic Top-k ListNet.

[DOI]

Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 2015

Cross-lingual speaker verification based on linear transform.

[DOI]

Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015

Music removal by convolutional denoising autoencoder in speech recognition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Transfer learning for speech and language processing.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Document classification with spherical word vectors.

[DOI]

Yiqiao Pan

Chao Xing

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Improved deep speaker feature learning for text-dependent speaker recognition.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Joint Semantic Relevance Learning with Text Data and Graph Knowledge.

[DOI]

Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality, 2015

2014

Feature analysis for discriminative confidence estimation in spoken term detection.

[DOI]

Comput. Speech Lang., 2014

Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification.

[DOI]

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Document classification based on c.

[DOI]

Rong Liu

Chao Xing

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Research on truncated speech in speaker verification.

[DOI]

Fanhu Bie

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Pruning deep neural networks by optimal brain damage.

[DOI]

Chao Liu

Zhiyong Zhang

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

ATVS-CSLT-HCTLab System for NIST 2013 Open Keyword Search Evaluation.

[DOI]

Doroteo T. Toledano

Proceedings of the Advances in Speech and Language Technologies for Iberian Languages, 2014

Noisy training for deep neural networks.

[DOI]

Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Block-wise training for i-vector.

[DOI]

Proceedings of the IEEE China Summit & International Conference on Signal and Information Processing, 2014

Document classification with distributions of word vectors.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Discriminative scoring for speaker recognition based on I-vectors.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

Online Non-Negative Convolutive Pattern Learning for Speech Signals.

[DOI]

IEEE Trans. Signal Process., 2013

Evolutionary discriminative confidence estimation for spoken term detection.

[DOI]

Multim. Tools Appl., 2013

Auditory features based on Gammatone filters for robust speech recognition.

[DOI]

Proceedings of the 2013 IEEE International Symposium on Circuits and Systems (ISCAS2013), 2013

Sequential model adaptation for speaker verification.

[DOI]

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Bottleneck features based on gammatone frequency cepstral coefficients.

[DOI]

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Subspace models for bottleneck features.

[DOI]

Jun Qi

Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013

Sequential UBM adaptation for speaker verification.

[DOI]

Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

Emotional speaker verification with linear adaptation.

[DOI]

Proceedings of the 2013 IEEE China Summit and International Conference on Signal and Information Processing, 2013

Emotional adaptive training for speaker verification.

[DOI]

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2013

2012

Direct posterior confidence for out-of-vocabulary spoken term detection.

[DOI]

ACM Trans. Inf. Syst., 2012

A Comparative Study of Bottom-Up and Top-Down Approaches to Speaker Diarization.

[DOI]

IEEE Trans. Speech Audio Process., 2012

Term-Dependent Confidence Normalisation for Out-of-Vocabulary Spoken Term Detection.

[DOI]

J. Comput. Sci. Technol., 2012

Heterogeneous Convolutive Non-Negative Sparse Coding.

[DOI]

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

N-gram FST Indexing for Spoken Term Detection.

[DOI]

Chao Liu

Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012

Speech overlap detection and attribution using convolutive non-negative sparse coding.

[DOI]

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

2011

Stochastic Pronunciation Modeling for Out-of-Vocabulary Spoken Term Detection.

[DOI]

Joe Frankel

IEEE Trans. Speech Audio Process., 2011

Letter-to-Sound Pronunciation Prediction Using Conditional Random Fields.

[DOI]

IEEE Signal Process. Lett., 2011

Parallel and Hierarchical Decision Making for Sparse Coding in Speech Recognition.

[DOI]

Ravichander Vipperla

Nicholas W. D. Evans

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Online Pattern Learning for Non-Negative Convolutive Sparse Coding.

[DOI]

Ravichander Vipperla

Nicholas W. D. Evans

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Handling overlaps in spoken term detection.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2011

Linguistic influences on bottom-up and top-down clustering for speaker diarization.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2011

An evolutionary confidence measurement for spoken term detection.

[DOI]

Alejandro Echeverría

Proceedings of the 9th International Workshop on Content-Based Multimedia Indexing, 2011

2010

An Evolutionary Confidence Measure for Spotting Words in Speech Recognition.

[DOI]

Alejandro Echeverría

Proceedings of the Trends in Practical Applications of Agents and Multiagent Systems, 2010

Direct posterior confidence for out-of-vocabulary spoken term detection.

[DOI]

Nicholas W. D. Evans

Joe Frankel

Raphaël Troncy

Proceedings of the 2010 International Workshop on Searching Spontaneous Conversational Speech, 2010

CRF-based stochastic pronunciation modeling for out-of-vocabulary spoken term detection.

[DOI]

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Augmented set of features for confidence estimation in spoken term detection.

[DOI]

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

An integrated top-down/bottom-up approach to speaker diarization.

[DOI]

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Stochastic pronunciation modelling and soft match for out-of-vocabulary spoken term detection.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2010

2009

Term-dependent confidence for out-of-vocabulary term detection.

[DOI]

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Stochastic pronunciation modelling for spoken term detection.

[DOI]

Joe Frankel

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

A posterior probability-based system hybridisation and combination for spoken term detection.

[DOI]

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Posterior-based confidence measures for spoken term detection.

[DOI]

Proceedings of the IEEE International Conference on Acoustics, 2009

2008

A comparison of grapheme and phoneme-based units for Spanish spoken term detection.

[DOI]

Speech Commun., 2008

A posterior approach for microphone array based speech recognition.

[DOI]

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

Growing bottleneck features for tandem ASR.

[DOI]

Joe Frankel