Zhen-Hua Ling
Orcid: 0000-0001-7853-5273
According to our database, Zhen-Hua Ling authored at least 304 papers between 2002 and 2024.
Bibliography
2024
Dynamic facial expression recognition with pseudo-label guided multi-modal pre-training.
IET Comput. Vis., February, 2024
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Syntax-Augmented Hierarchical Interactive Encoder for Zero-Shot Cross-Lingual Information Extraction.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
PE-Wav2vec: A Prosody-Enhanced Speech Model for Self-Supervised Prosody Learning in TTS.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
Low-Latency Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses for Speech Generation Tasks.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
APCodec: A Neural Audio Codec With Parallel Amplitude and Phase Spectrum Encoding and Decoding.
IEEE ACM Trans. Audio Speech Lang. Process., 2024
MDCTCodec: A Lightweight MDCT-based Neural Audio Codec towards High Sampling Rate and Low Bitrate Scenarios.
CoRR, 2024
APCodec+: A Spectrum-Coding-Based High-Fidelity and High-Compression-Rate Neural Audio Codec with Staged Training Paradigm.
CoRR, 2024
CoRR, 2024
CoRR, 2024
Asynchronous Voice Anonymization Using Adversarial Perturbation On Speaker Embedding.
CoRR, 2024
CoRR, 2024
BiVocoder: A Bidirectional Neural Vocoder Integrating Feature Extraction and Waveform Generation.
CoRR, 2024
Towards High-Quality and Efficient Speech Bandwidth Extension with Parallel Amplitude and Phase Prediction.
CoRR, 2024
Speech Reconstruction from Silent Lip and Tongue Articulation by Diffusion Models and Text-Guided Pseudo Target Generation.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024
Proceedings of the Forty-first International Conference on Machine Learning, 2024
An End-to-End EEG Channel Selection Method with Residual Gumbel Softmax for Brain-Assisted Speech Enhancement.
Proceedings of the IEEE International Conference on Acoustics, 2024
Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval.
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Proceedings of the IEEE International Conference on Acoustics, 2024
Retrieving, Rethinking and Revising: The Chain-of-Verification Can Improve Retrieval Augmented Generation.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Proceedings of the Findings of the Association for Computational Linguistics, 2024
2023
Pronunciation Dictionary-Free Multilingual Speech Synthesis Using Learned Phonetic Representations.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
APNet: An All-Frame-Level Neural Vocoder Incorporating Direct Prediction of Amplitude and Phase Spectra.
IEEE ACM Trans. Audio Speech Lang. Process., 2023
Emotion-Regularized Conditional Variational Autoencoder for Emotional Response Generation.
IEEE Trans. Affect. Comput., 2023
Exploring the Topics of Audio Words for Detecting Alzheimer's Disease From Spontaneous Speech.
IEEE Signal Process. Lett., 2023
Long-Frame-Shift Neural Speech Phase Prediction With Spectral Continuity Enhancement and Interpolation Error Compensation.
IEEE Signal Process. Lett., 2023
MoVQA: A Benchmark of Versatile Question-Answering for Long-Form Movie Understanding.
CoRR, 2023
Explicit Estimation of Magnitude and Phase Spectra in Parallel for High-Quality Speech Enhancement.
CoRR, 2023
MADNet: Maximizing Addressee Deduction Expectation for Multi-Party Conversation Generation.
CoRR, 2023
SHINE: Syntax-augmented Hierarchical Interactive Encoder for Zero-shot Cross-lingual Information Extraction.
CoRR, 2023
CoRR, 2023
Source-Filter-Based Generative Adversarial Neural Vocoder for High Fidelity Speech Synthesis.
CoRR, 2023
USTC-NELSLIP at SemEval-2023 Task 2: Statistical Construction and Dual Adaptation of Gazetteer for Multilingual Complex NER.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023
Proceedings of the 31st ACM International Conference on Multimedia, 2023
Incorporating Ultrasound Tongue Images for Audio-Visual Speech Enhancement through Knowledge Distillation.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
MP-SENet: A Speech Enhancement Model with Parallel Denoising of Magnitude and Phase Spectra.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
BASEN: Time-Domain Brain-Assisted Speech Enhancement Network with Convolutional Cross Attention in Multi-talker Conditions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023
Speech Reconstruction from Silent Tongue and Lip Articulation by Pseudo Target Generation and Domain Adversarial Training.
Proceedings of the IEEE International Conference on Acoustics, 2023
Self-Supervised Audio-Visual Speech Representations Learning by Multimodal Self-Distillation.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the IEEE International Conference on Acoustics, 2023
Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses.
Proceedings of the IEEE International Conference on Acoustics, 2023
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
MADNet: Maximizing Addressee Deduction Expectation for Multi-Party Conversation Generation.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
Symbolization, Prompt, and Classification: A Framework for Implicit Speaker Identification in Novels.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023
The USTC-NERCSLIP System for the Track 1.2 of Audio Deepfake Detection (ADD 2023) Challenge.
Proceedings of the Workshop on Deepfake Audio Detection and Analysis co-located with 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023), 2023
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2022
Cognitive Diagnosis with Explicit Student Vector Estimation and Unsupervised Question Matrix Learning.
CoRR, 2022
USTC-NELSLIP at SemEval-2022 Task 11: Gazetteer-Adapted Integration Network for Multilingual Complex Named Entity Recognition.
Proceedings of the 16th International Workshop on Semantic Evaluation, SemEval@NAACL 2022, 2022
Decoupled Pronunciation and Prosody Modeling in Meta-Learning-based Multilingual Speech Synthesis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Integrating Discrete Word-Level Style Variations into Non-Autoregressive Acoustic Models for Speech Synthesis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Pronunciation Dictionary-Free Multilingual Speech Synthesis by Combining Unsupervised and Supervised Phonetic Representations.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022
Proceedings of the Tenth International Conference on Learning Representations, 2022
Speaker Adaption with Intuitive Prosodic Features for Statistical Parametric Speech Synthesis.
Proceedings of the ICDSP 2022: 6th International Conference on Digital Signal Processing, Chengdu, China, February 25, 2022
Discourse-Level Prosody Modeling with a Variational Autoencoder for Non-Autoregressive Expressive Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Using Multiple Reference Audios and Style Embedding Constraints for Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022
Proceedings of the IEEE International Conference on Acoustics, 2022
Improving Recognition-Synthesis Based any-to-one Voice Conversion with Cyclic Training.
Proceedings of the IEEE International Conference on Acoustics, 2022
Wider & Closer: Mixture of Short-channel Distillers for Zero-shot Cross-lingual Named Entity Recognition.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022
Detecting Alzheimer's Disease Based on Acoustic Features Extracted from Pre-trained Models.
Proceedings of the Artificial Intelligence - Second CAAI International Conference, 2022
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022
HeterMPC: A Heterogeneous Graph Neural Network for Response Generation in Multi-Party Conversations.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022
Proceedings of the Second DialDoc Workshop on Document-grounded Dialogue and Conversational Question Answering, 2022
2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Deep Contextualized Utterance Representations for Response Selection and Dialogue Analysis.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Robustness of Speech Spoofing Detectors Against Adversarial Post-Processing of Voice Conversion.
IEEE ACM Trans. Audio Speech Lang. Process., 2021
Proceedings of the IEEE Wireless Communications and Networking Conference, 2021
Denoising-and-Dereverberation Hierarchical Neural Vocoder for Robust Waveform Generation.
Proceedings of the IEEE Spoken Language Technology Workshop, 2021
Partner Matters! An Empirical Study on Fusing Personas for Personalized Response Selection in Retrieval-Based Chatbots.
Proceedings of the SIGIR '21: The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2021
Proceedings of the 15th International Workshop on Semantic Evaluation, 2021
Phase Spectrum Recovery for Enhancing Low-Quality Speech Captured by Laser Microphones.
Proceedings of the 12th International Symposium on Chinese Spoken Language Processing, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
Learning Deep and Wide Contextual Representations Using BERT for Statistical Parametric Speech Synthesis.
Proceedings of the ICDSP 2021: 5th International Conference on Digital Signal Processing, 2021
Proceedings of the ICCSE '21: 5th International Conference on Crowd Science and Engineering, Jinan, China, October 16, 2021
Graph Attention and Interaction Network With Multi-Task Learning for Fact Verification.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the IEEE International Conference on Acoustics, 2021
Detecting Alzheimer's Disease from Speech Using Neural Networks with Bottleneck Features and Data Augmentation.
Proceedings of the IEEE International Conference on Acoustics, 2021
Have You Made a Decision? Where? A Pilot Study on Interpretability of Polarity Analysis Based on Advising Problem.
Proceedings of the IEEE International Conference on Acoustics, 2021
Improving Naturalness and Controllability of Sequence-to-Sequence Speech Synthesis by Learning Local Prosody Representations.
Proceedings of the IEEE International Conference on Acoustics, 2021
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021
Selecting and Analyzing Speech Features for the Screening of Mild Cognitive Impairment.
Proceedings of the 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2021
Proceedings of the Blizzard Challenge 2021, virtual, October 23, 2021, 2021
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2021
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
TaLNet: Voice Reconstruction from Tongue and Lip Articulation with Transfer Learning from Text-to-Speech Synthesis.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Non-Parallel Sequence-to-Sequence Voice Conversion With Disentangled Linguistic and Speaker Representations.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Utterance-to-Utterance Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2020
Learning and Modeling Unit Embeddings Using Deep Neural Networks for Unit-Selection-Based Mandarin Speech Synthesis.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2020
Condition-Transforming Variational Autoencoder for Generating Diverse Short Text Conversations.
ACM Trans. Asian Low Resour. Lang. Inf. Process., 2020
ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech.
Comput. Speech Lang., 2020
Generating diverse conversation responses by creating and ranking multiple candidates.
Comput. Speech Lang., 2020
Learning to Retrieve Entity-Aware Knowledge and Generate Responses with Copy Mechanism for Task-Oriented Dialogue Systems.
CoRR, 2020
Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots.
CoRR, 2020
CoRR, 2020
Pre-Trained and Attention-Based Neural Networks for Building Noetic Task-Oriented Dialogue Systems.
CoRR, 2020
Encrypted Network Traffic Classification Using Deep and Parallel Network-in-Network Models.
IEEE Access, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Knowledge-and-Data-Driven Amplitude Spectrum Prediction for Hierarchical Neural Vocoders.
Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020
Proceedings of the 36th IEEE International Conference on Data Engineering, 2020
Extracting Unit Embeddings Using Sequence-To-Sequence Acoustic Models for Unit Selection Speech Synthesis.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020
Filtering before Iteratively Referring for Knowledge-Grounded Response Selection in Retrieval-Based Chatbots.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020
Text Classification by Contrastive Learning and Cross-lingual Data Augmentation for Alzheimer's Disease Detection.
Proceedings of the 28th International Conference on Computational Linguistics, 2020
Proceedings of the CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, 2020
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Non-Parallel Voice Conversion with Autoregressive Conversion Model and Duration Adjustment.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Predictions of Subjective Ratings and Spoofing Assessments of Voice Conversion Challenge 2020 Submissions.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Voice Conversion Challenge 2020 -- Intra-lingual semi-parallel and cross-lingual voice conversion --.
Proceedings of the Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020
2019
IEEE ACM Trans. Audio Speech Lang. Process., 2019
Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models.
CoRR, 2019
Exploring Unsupervised Pretraining and Sentence Structure Modelling for Winograd Schema Challenge.
CoRR, 2019
Knowledge Base Question Answering With Attentive Pooling for Question Representation.
IEEE Access, 2019
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019
Singing Voice Synthesis Using Deep Autoregressive Neural Networks for Acoustic Modeling.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019
Proceedings of the International Conference on Multimodal Interaction, 2019
Learning Latent Representations for Style Control and Transfer in End-to-end Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
Proceedings of the IEEE International Conference on Acoustics, 2019
DNN-based Spectral Enhancement for Neural Waveform Generators with Low-bit Quantization.
Proceedings of the IEEE International Conference on Acoustics, 2019
Dually Interactive Matching Network for Personalized Response Selection in Retrieval-Based Chatbots.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019
Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots.
Proceedings of the 28th ACM International Conference on Information and Knowledge Management, 2019
Proceedings of the Blizzard Challenge 2019, Vienna, Austria, September 23, 2019, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
End-to-End Emotional Speech Synthesis Using Style Tokens and Semi-Supervised Training.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019
2018
Improving the Decoding Efficiency of Deep Neural Network Acoustic Models by Cluster-Based Senone Selection.
J. Signal Process. Syst., 2018
Unit Selection Speech Synthesis Using Frame-Sized Speech Segments and Neural Network Based Acoustic Models.
J. Signal Process. Syst., 2018
A Sequential Neural Encoder With Latent Structured Description for Modeling Sentences.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Waveform Modeling and Generation Using Hierarchical Recurrent Neural Networks for Speech Bandwidth Extension.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
Extracting Spectral Features Using Deep Autoencoders With Binary Distributed Hidden Units for Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2018
IEEE Signal Process. Lett., 2018
Articulatory-to-acoustic conversion using BLSTM-RNNs with augmented input representation.
Speech Commun., 2018
CoRR, 2018
CoRR, 2018
The Voice Conversion Challenge 2018: Promoting Development of Parallel and Nonparallel Methods.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
A Spoofing Benchmark for the 2018 Voice Conversion Challenge: Leveraging from Spoofing Countermeasures for Speech Artifact Assessment.
Proceedings of the Odyssey 2018: The Speaker and Language Recognition Workshop, 2018
Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018
Learning and Modeling Unit Embeddings for Improving HMM-based Unit Selection Speech Synthesis.
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018
Proceedings of the 27th International Conference on Computational Linguistics, 2018
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2018
Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2018
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, 2018
2017
Recurrent Neural Network-Based Sentence Encoder with Gated Attention for Natural Language Inference.
Proceedings of the 2nd Workshop on Evaluating Vector Space Representations for NLP, 2017
Waveform Modeling Using Stacked Dilated Convolutional Neural Networks for Speech Bandwidth Extension.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017
Cause-Effect Knowledge Acquisition and Neural Association Model for Solving A Set of Winograd Schema Problems.
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017
Extracting structural spectral features using what-where auto-encoders for statistical parametric speech synthesis.
Proceedings of the 2017 IEEE International Conference on Acoustics, 2017
Question Answering with Character-Level LSTM Encoders and Model-Based Data Augmentation.
Proceedings of the Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, 2017
Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop, 2017
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017
Combing Context and Commonsense Knowledge Through Neural Networks for Solving Winograd Schema Problems.
Proceedings of the 2017 AAAI Spring Symposia, 2017
2016
Anti-Spoofing for Text-Independent Speaker Verification: An Initial Database, Comparison of Countermeasures, and Human Performance.
IEEE ACM Trans. Audio Speech Lang. Process., 2016
DBN-based Spectral Feature Representation for Statistical Parametric Speech Synthesis.
IEEE Signal Process. Lett., 2016
Speech Commun., 2016
Concept-to-Speech generation with knowledge sharing for acoustic modelling and utterance filtering.
Comput. Speech Lang., 2016
CoRR, 2016
Intra-Topic Variability Normalization based on Linear Projection for Topic Classification.
Proceedings of the NAACL HLT 2016, 2016
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Cluster-based senone selection for the efficient calculation of deep neural network acoustic models.
Proceedings of the 10th International Symposium on Chinese Spoken Language Processing, 2016
Articulatory-to-Acoustic Conversion with Cascaded Prediction of Spectral and Excitation Features Using Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Speech Bandwidth Extension Using Bottleneck Features and Deep Recurrent Neural Networks.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
The USTC System for Voice Conversion Challenge 2016: Neural Network Based Approaches for Spectrum, Aperiodicity and F0 Conversion.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016
Modeling spectral envelopes using deep conditional restricted Boltzmann machines for statistical parametric speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
A full training framework of cross-stream dependence modelling for HMM-based singing voice synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Modulation spectrum compensation for HMM-based speech synthesis using line spectral pairs.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Deep belief network-based post-filtering for statistical parametric speech synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016
Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016, 2016
2015
A Deep Generative Architecture for Postfiltering in Statistical Parametric Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2015
Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends.
IEEE Signal Process. Mag., 2015
Speech Commun., 2015
Integrate Document Ranking Information into Confidence Measure Calculation for Spoken Term Detection.
CoRR, 2015
Automatic phrase boundary labeling of speech synthesis database using context-dependent HMMs and n-gram prior distributions.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015
Restoring high frequency spectral envelopes using neural networks for speech bandwidth extension.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015
Lip movement generation using restricted Boltzmann machines for visual speech synthesis.
Proceedings of the IEEE China Summit and International Conference on Signal and Information Processing, 2015
Proceedings of the Blizzard Challenge 2015, 2015
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, 2015
2014
IEEE ACM Trans. Audio Speech Lang. Process., 2014
HMM-based unit selection speech synthesis using log likelihood ratios derived from perceptual data.
Speech Commun., 2014
Unsupervised Prosodic Labeling of Speech Synthesis Databases Using Context-Dependent HMMs.
IEICE Trans. Inf. Syst., 2014
Integrating global variance of log power spectrum derived from LSPs into MGE training for HMM-based parametric speech synthesis.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Improving F0 prediction using bidirectional associative memories and syllable-level F0 features for HMM-based Mandarin speech synthesis.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014
Modeling DCT parameterized F0 trajectory at intonation phrase level with DNN or decision tree.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Concept-to-speech generation by integrating syntagmatic features into HMM-based speech synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014
Spectral modeling using neural autoregressive distribution estimators for statistical parametric speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2014
Using bidirectional associative memories for joint spectral envelope modeling in voice conversion.
Proceedings of the IEEE International Conference on Acoustics, 2014
Proceedings of the Blizzard Challenge 2014, Singapore, Singapore, September 19, 2014, 2014
2013
Articulatory Control of HMM-Based Parametric Speech Synthesis Using Feature-Space-Switched Multiple Regression.
IEEE Trans. Speech Audio Process., 2013
Modeling Spectral Envelopes Using Restricted Boltzmann Machines and Deep Belief Networks for Statistical Parametric Speech Synthesis.
IEEE Trans. Speech Audio Process., 2013
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013
Mage - reactive articulatory feature control of HMM-based parametric speech synthesis.
Proceedings of the Eighth ISCA Tutorial and Research Workshop on Speech Synthesis, 2013
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Joint spectral distribution modeling using restricted Boltzmann machines for voice conversion.
Proceedings of the 14th Annual Conference of the International Speech Communication Association, 2013
Unsupervised prosodic phrase boundary labeling of Mandarin speech synthesis database using context-dependent HMM.
Proceedings of the IEEE International Conference on Acoustics, 2013
Modeling spectral envelopes using restricted Boltzmann machines for statistical parametric speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2013
Proceedings of the Blizzard Challenge 2013, 2013
2012
Minimum Kullback-Leibler Divergence Parameter Generation for HMM-Based Speech Synthesis.
IEEE Trans. Speech Audio Process., 2012
Improved unit selection speech synthesis method utilizing subjective evaluation results on synthetic speech.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Cross-stream dependency modeling using continuous F0 model for HMM-based speech synthesis.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012
Considering Global Variance of the Log Power Spectrum Derived from Mel-Cepstrum in HMM-based Parametric Speech Synthesis.
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the 13th Annual Conference of the International Speech Communication Association, 2012
Proceedings of the Blizzard Challenge 2012, Portland, OR, USA, September 14, 2012, 2012
2011
Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-Based Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech Synthesis.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011
Building HMM based unit-selection speech synthesis system using synthetic speech naturalness evaluation score.
Proceedings of the IEEE International Conference on Acoustics, 2011
Preserve ordering property of generated LSPs for minimum generation error training in HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the IEEE International Conference on Acoustics, 2011
Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011, 2011
2010
Cross-Validation and Minimum Generation Error based Decision Tree Pruning for HMM-based Speech Synthesis.
Int. J. Comput. Linguistics Chin. Lang. Process., 2010
Minimum generation error training for HMM-based prediction of articulatory movements.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Automatic phrase boundary labeling for Mandarin TTS corpus using context-dependent HMM.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Statistical modeling of syllable-level F0 features for HMM-based unit selection speech synthesis.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010
Automatic error detection for unit selection speech synthesis using log likelihood ratio based SVM classifier.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
HMM-based text-to-articulatory-movement prediction and analysis of critical articulators.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Global variance modeling on the log power spectrum of LSPs for HMM-based speech synthesis.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010
Minimum generation error training with weighted Euclidean distance on LSP for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2010
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010
2009
IEEE Trans. Speech Audio Process., 2009
IEEE Trans. Speech Audio Process., 2009
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009
2008
Model Adaptation for HMM-Based Speech Synthesis under Minimum Generation Error Criterion.
Proceedings of the Tenth IEEE International Symposium on Multimedia (ISM2008), 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Articulatory control of HMM-based parametric speech synthesis driven by phonetic knowledge.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008
Minimum generation error criterion considering global/local variance for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008
Minimum generation error linear regression based model adaptation for HMM-based speech synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2008
Minimum unit selection error training for HMM-based unit selection speech synthesis system.
Proceedings of the IEEE International Conference on Acoustics, 2008
Proceedings of the Blizzard Challenge 2008, 2008
2007
HMM-Based Hierarchical Unit Selection Combining Kullback-Leibler Divergence with Likelihood Criterion.
Proceedings of the IEEE International Conference on Acoustics, 2007
Proceedings of the Evaluation of text-to-speech systems: Blizzard Challenge 2007, 2007
2006
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
Proceedings of the 5th International Symposium on Chinese Spoken Language Processing, 2006
Improving the performance of HMM-based voice conversion using context clustering decision tree and appropriate regression matrix format.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006
USTC System for Blizzard Challenge 2006: an Improved HMM-based Speech Synthesis Method.
Proceedings of the Blizzard Challenge 2006, Pittsburgh, PA, USA, September 16, 2006, 2006
2005
An Improved Spectral and Prosodic Transformation Method in STRAIGHT-based Voice Conversion.
Proceedings of the 2005 IEEE International Conference on Acoustics, 2005
Proceedings of the Affective Computing and Intelligent Interaction, 2005
A Novel Source Analysis Method by Matching Spectral Characters of LF Model with STRAIGHT Spectrum.
Proceedings of the Affective Computing and Intelligent Interaction, 2005
2004
Modeling glottal effect on the spectral envelop of STRAIGHT using mixture of Gaussians.
Proceedings of the 2004 International Symposium on Chinese Spoken Language Processing, 2004
A novel voice conversion system based on codebook mapping with phoneme-tied weighting.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
Compression of speech database by feature separation and pattern clustering using STRAIGHT.
Proceedings of the 8th International Conference on Spoken Language Processing, 2004
2002
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002