Yanmin Qian
Orcid: 0000-0002-0314-3790
According to our database, Yanmin Qian authored at least 254 papers between 2009 and 2024.
Bibliography
2024
Module-Based End-to-End Distant Speech Processing: A case study of far-field automatic speech recognition [Special Issue On Model-Based and Data-Driven Audio Signal Processing].
IEEE Signal Process. Mag., November, 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Overview of Speaker Modeling and Its Applications: From the Lens of Deep Speaker Representation Learning.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Self-Supervised Learning With Cluster-Aware-DINO for High-Performance Robust Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Attention-Based Encoder-Decoder End-to-End Neural Diarization With Embedding Enhancer.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Speech Commun., 2024
Scale This, Not That: Investigating Key Dataset Attributes for Efficient Speech Enhancement Scaling.
CoRR, 2024
Memory-Efficient Training for Deep Speaker Embedding Learning in Speaker Verification.
CoRR, 2024
Data-Efficient Low-Complexity Acoustic Scene Classification via Distilling and Progressive Pruning.
CoRR, 2024
Prototype and Instance Contrastive Learning for Unsupervised Domain Adaptation in Speaker Verification.
CoRR, 2024
WeSep: A Scalable and Flexible Toolkit Towards Generalizable Target Speaker Extraction.
CoRR, 2024
Improving Anomalous Sound Detection via Low-Rank Adaptation Fine-Tuning of Pre-Trained Audio Models.
CoRR, 2024
Disentangling the Prosody and Semantic Information with Pre-trained Model for In-Context Learning based Zero-Shot Voice Conversion.
CoRR, 2024
CoRR, 2024
Diffusion-based Generative Modeling with Discriminative Guidance for Streamable Speech Enhancement.
CoRR, 2024
CoRR, 2024
Generating Speakers by Prompting Listener Impressions for Pre-trained Multi-Speaker Text-to-Speech Systems.
CoRR, 2024
URGENT Challenge: Universality, Robustness, and Generalizability For Speech Enhancement.
CoRR, 2024
Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enhancement.
CoRR, 2024
CoRR, 2024
GSTalker: Real-time Audio-Driven Talking Face Generation via Deformable Gaussian Splatting.
CoRR, 2024
CoVoMix: Advancing Zero-Shot Speech Generation for Human-like Multi-talker Conversations.
CoRR, 2024
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024
Improving Acoustic Scene Classification via Self-Supervised and Semi-Supervised Learning with Efficient Audio Transformer.
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the IEEE International Conference on Multimedia and Expo, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Exploring Large Scale Pre-Trained Models for Robust Machine Anomalous Sound Detection.
Proceedings of the IEEE International Conference on Acoustics, 2024
2023
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing.
J. Open Source Softw., November, 2023
Software Design and User Interface of ESPnet-SE++: Speech Enhancement for Robust Speech Processing (espnet-v.202310).
Dataset, October, 2023
Depth-First Neural Architecture With Attentive Feature Fusion for Efficient Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Diffusion Conditional Expectation Model for Efficient and Robust Target Speech Extraction.
CoRR, 2023
InstructME: An Instruction Guided Music Edit And Remix Framework with Latent Diffusion Models.
CoRR, 2023
Whisper-KDQ: A Lightweight Whisper via Guided Knowledge Distillation and Quantization for Efficient ASR.
CoRR, 2023
Exploring the Integration of Speech Separation and Recognition with Self-Supervised Learning Representation.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Fast and Efficient Multilingual Self-Supervised Pre-training for Low-Resource Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Text Only Domain Adaptation with Phoneme Guided Data Splicing for End-to-End Speech Recognition.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Extremely Low Bit Quantization for Mobile Speaker Verification Systems Under 1MB Memory.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Attention-based Encoder-Decoder Network for End-to-End Neural Speaker Diarization with Target Speaker Attractor.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Light-Weight Visualvoice: Neural Network Quantization On Audio Visual Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023
HuBERT-AGG: Aggregated Representation Distillation of Hidden-Unit Bert for Robust Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Joint Discriminator and Transfer Based Fast Domain Adaptation For End-To-End Speech Recognition.
Proceedings of the IEEE International Conference on Acoustics, 2023
Multi-Speaker End-to-End Multi-Modal Speaker Diarization System for the MISP 2022 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2023
Predictive Skim: Contrastive Predictive Coding for Low-Latency Online Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Improving Dino-Based Self-Supervised Speaker Verification with Progressive Cluster-Aware Training.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Exploring Time-Frequency Domain Target Speaker Extraction For Causal and Non-Causal Processing.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
FAT-HuBERT: Front-End Adaptive Training of Hidden-Unit BERT For Distortion-Invariant Robust Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Improving Speech Enhancement Using Audio Tagging Knowledge From Pre-Trained Representations and Multi-Task Learning.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
The Second Multi-Channel Multi-Party Meeting Transcription Challenge (M2MeT 2.0): A Benchmark for Speaker-Attributed ASR.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023
2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE ACM Trans. Audio Speech Lang. Process., 2022
IEEE J. Sel. Top. Signal Process., 2022
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition.
CoRR, 2022
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
A Comprehensive Study on Self-Supervised Distillation for Speaker Representation Learning.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Improving Speech Separation with Knowledge Distilled from Self-supervised Pre-trained Models.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
The X-Lance Speaker Diarization System for the Conversational Short-phrase Speaker Diarization Challenge 2022.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
ESPnet-SE++: Speech Enhancement for Robust Speech Recognition, Translation, and Understanding.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Knowledge Transfer and Distillation from Autoregressive to Non-Autoregressive Speech Recognition.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2022
2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
CoRR, 2021
Closing the Gap Between Time-Domain Multi-Channel Speech Enhancement on Real and Simulation Conditions.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Knowledge Distillation from Multi-Modality to Single-Modality for Person Verification.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
End-to-End Dereverberation, Beamforming, and Speech Recognition with Improved Numerical Stability and Advanced Frontend.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
The Accented English Speech Recognition Challenge 2020: Open Datasets, Tracks, Baselines, Results and Methods.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Unit Selection Synthesis Based Data Augmentation for Fixed Phrase Speaker Verification.
Proceedings of the IEEE International Conference on Acoustics, 2021
AISpeech-SJTU Accent Identification System for the Accented English Speech Recognition Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Convolutive Transfer Function Invariant SDR Training Criteria for Multi-Channel Reverberant Speech Separation.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
2020
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Data Augmentation Using Deep Generative Models for Embedding Based Speaker Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Convolutive Transfer Function Invariant SDR training criteria for Multi-Channel Reverberant Speech Separation.
CoRR, 2020
End-to-End Far-Field Speech Recognition with Unified Dereverberation and Beamforming.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Learning Contextual Language Embeddings for Monaural Multi-Talker Speech Recognition.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Bi-Encoder Transformer Network for Mandarin-English Code-Switching Speech Recognition Using Mixture of Experts.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Listen, Watch and Understand at the Cocktail Party: Audio-Visual-Contextual Speech Separation.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Adversarial Domain Adaptation for Speaker Verification Using Partially Shared Network.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Channel Invariant Speaker Embedding Learning with Joint Multi-Task and Adversarial Training.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
2019
Discriminative Neural Embedding Learning for Short-Duration Text-Independent Speaker Verification.
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Data augmentation using generative adversarial networks for robust speech recognition.
Speech Commun., 2019
Frontiers Inf. Technol. Electron. Eng., 2019
Erratum to: Past review, current progress, and challenges ahead on the cocktail party problem.
Frontiers Inf. Technol. Electron. Eng., 2019
Robust DOA Estimation Based on Convolutional Neural Network and Time-Frequency Masking.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Data Augmentation Using Variational Autoencoder for Embedding Based Speaker Verification.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
On the Usage of Phonetic Information for Text-Independent Speaker Embedding Extraction.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Prosody Usage Optimization for Children Speech Recognition with Zero Resource Children Speech.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
GANs for Children: A Generative Data Augmentation Strategy for Children Speech Recognition.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019
Margin Matters: Towards More Discriminative Deep Neural Network Embeddings for Speaker Recognition.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
2018
Adaptive Very Deep Convolutional Residual Network for Noise Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Investigating Raw Wave Deep Neural Networks for End-to-End Speaker Spoofing Detection.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Speech Commun., 2018
Speech Commun., 2018
Frontiers Inf. Technol. Electron. Eng., 2018
Generative Adversarial Networks based X-vector Augmentation for Robust Probabilistic Linear Discriminant Analysis in Speaker Verification.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Data Augmentation using Conditional Generative Adversarial Networks for Robust Speech Recognition.
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Proceedings of the Intelligence Science and Big Data Engineering, 2018
Deep Extractor Network for Target Speaker Recovery from Single Channel Speech Mixtures.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Permutation Invariant Training of Generative Adversarial Network for Monaural Speech Separation.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Monaural Multi-Talker Speech Recognition with Attention Mechanism and Gated Convolutional Networks.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Robust Mask Estimation By Integrating Neural Network-Based and Clustering-Based Approaches for Adaptive Acoustic Beamforming.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Focal Kl-Divergence Based Dilated Convolutional Neural Networks for Co-Channel Speaker Identification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Joint I-Vector with End-to-End System for Short Duration Text-Independent Speaker Verification.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Generative Adversarial Networks Based Data Augmentation for Noise Robust Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Adaptive Permutation Invariant Training with Auxiliary Information for Monaural Multi-Talker Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Knowledge Transfer in Permutation Invariant Training for Single-Channel Multi-Talker Speech Recognition.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
IEEE ACM Trans. Audio Speech Lang. Process., 2017
Proceedings of the Intelligence Science and Big Data Engineering, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Adaptation of Deep Neural Network Acoustic Models for Robust Automatic Speech Recognition.
Proceedings of the New Era for Robust Speech Recognition, Exploiting Deep Learning, 2017
2016
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Neural Network Based Multi-Factor Aware Joint Training for Robust Speech Recognition.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
IEEE ACM Trans. Audio Speech Lang. Process., 2016
Proceedings of the 2016 IEEE Spoken Language Technology Workshop, 2016
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Integrated adaptation with multi-factor joint-learning for far-field speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Joint acoustic factor learning for robust deep neural network based automatic speech recognition.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 8th IEEE International Conference on Biometrics Theory, 2016
2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Robust deep feature for spoofing detection - the SJTU system for ASVspoof 2015 challenge.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Automatic model redundancy reduction for fast back-propagation for deep neural networks in speech recognition.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Recurrent neural network language model with structured word embeddings for speech recognition.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Local trajectory based speech enhancement for robust speech recognition with deep neural network.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015
An investigation on DNN-derived bottleneck features for GMM-HMM based robust speech recognition.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
The development of the cambridge university alignment systems for the multi-genre broadcast challenge.
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
Proceedings of the 2015 IEEE Workshop on Automatic Speech Recognition and Understanding, 2015
2014
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the IEEE International Conference on Acoustics, 2014
2013
MLP-HMM two-stage unsupervised training for low-resource languages on conversational telephone speech recognition.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, 2013
2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012
2011
Time-Frequency Cepstral Features and Combining Discriminative Training for Phonotactic Language Recognition.
J. Comput., 2011
Language Recognition Based on Acoustic Diversified Phone Recognizers and Phonotactic Feature Fusion.
IEICE Trans. Inf. Syst., 2011
State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 2011 IEEE Workshop on Automatic Speech Recognition & Understanding, 2011
2010
Mandarin-English bilingual phone modeling and combining MPE based Discriminative training for cross-language speech recognition.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the Information Computing and Applications - First International Conference, 2010
Phone modeling and combining discriminative training for Mandarin-English bilingual speech recognition.
Proceedings of the IEEE International Conference on Acoustics, 2010
2009
Efficient embedded speech recognition for very large vocabulary Mandarin car-navigation systems.
IEEE Trans. Consumer Electron., 2009