Minghui Dong

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

A Dual Target Neural Network Method for Speech Enhancement.

Changhuai You

Lei Wang

Proceedings of the International Conference on Asian Language Processing, 2023

2021

End-to-End Detection-Segmentation System for Face Labeling.

IEEE Trans. Emerg. Top. Comput. Intell., 2021

2019

CLU-CNNs: Object detection for medical images.

Neurocomputing, 2019

Sparse fully convolutional network for face labeling.

Neurocomputing, 2019

Towards Good Practices for Video Object Segmentation.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Temporal Feature Augmented Network for Video Instance Segmentation.

Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Implementing Prosodic Phrasing in Chinese End-to-end Speech Synthesis.

Yanfeng Lu

Ying Chen

Proceedings of the IEEE International Conference on Acoustics, 2019

On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion.

Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

SINGAN: Singing Voice Conversion with Generative Adversarial Networks.

Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018

The I2R-NWPU-NUS Text-to-Speech System for Blizzard Challenge 2018.

Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018

The TL-NTU Text-to-speech System for the Blizzard Challenge 2018.

Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018

2017

Node-level parallelization for deep neural networks with conditional independent graph.

Neurocomputing, 2017

Multimodal Prediction of Affective Dimensions via Fusing Multiple Regression Techniques.

Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A light-weight method of building an LSTM-RNN-based bilingual TTS system.

Proceedings of the 2017 International Conference on Asian Language Processing, 2017

The I2R-NWPU Text-to-Speech System for Blizzard Challenge 2017.

Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017

A dual alignment scheme for improved speech-to-singing voice conversion.

Karthika Vijayan

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Word level prosody prediction using large audiobook dataset.

Yanfeng Lu

Chenyu Yang

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Representing raw linguistic information in Chinese text-to-speech system.

Zhengchen Zhang

Huaiping Ming

Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016

Guest Editorial: Advances in Machine Learning for Speech Processing.

Jianhua Tao

Man-Wai Mak

J. Signal Process. Syst., 2016

High quality voice conversion using prosodic and high-resolution spectral features.

Multim. Tools Appl., 2016

Transition-based Parsing with Context Enhancement and Future Reward Reranking.

CoRR, 2016

Mandarin Prosodic Phrase Prediction based on Syntactic Trees.

Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity.

Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

I2RNTU at SemEval-2016 Task 4: Classifier Fusion for Polarity Classification in Twitter.

Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Audio and face video emotion recognition in the wild using deep neural networks and small datasets.

Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

A full training framework of cross-stream dependence modelling for HMM-based singing voice synthesis.

Xin Wang

Zhen-Hua Ling

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Exemplar-based sparse representation of timbre and prosody for voice conversion.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Combining multiple kernel models for automatic intelligibility detection of pathological speech.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A Stack LSTM Transition-Based Dependency Parser with Context Enhancement and K-best Decoding.

Proceedings of the Chinese Lexical Semantics - 17th Workshop, 2016

The I2R-NWPU-NTU Text-to-Speech System at Blizzard Challenge 2016.

Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016, 2016

2015

Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

System fusion for high-performance voice conversion.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

An alternating optimization approach for phase retrieval.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A real-time variable-q non-stationary Gabor transform for pitch shifting.

Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Formant excursion in singing synthesis.

Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

Sparse representation for frequency warping based voice conversion.

Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Mandarin prosodic word prediction using dependency relationships.

Proceedings of the 2015 International Conference on Asian Language Processing, 2015

Performance scoring of singing voice.

Proceedings of the 2015 International Conference on Asian Language Processing, 2015

The expression of singing emotion - contradicting the constraints of song.

Proceedings of the 2015 International Conference on Asian Language Processing, 2015

Non-negative matrix factorization using stable alternating direction method of multipliers for source separation.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A waveform representation framework for high-quality statistical parametric speech synthesis.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Mapping frames with DNN-HMM recognizer for non-parallel voice conversion.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Fundamental frequency modeling using wavelets for emotional voice conversion.

Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014

Soft constrained leading voice separation with music score guidance.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

The power of special characters in prosodic word prediction for Chinese TTS.

Zhengchen Zhang

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Acoustic emotion recognition based on fusion of multiple feature-dependent deep Boltzmann machines.

Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A comparative study of spectral transformation techniques for singing voice synthesis.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

I²R speech2singing perfects everyone's singing.

Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Intelligibility detection of pathological speech using asymmetric sparse kernel partial least squares classifier.

Proceedings of the IEEE International Conference on Acoustics, 2014

Emotion analysis of children's stories with context information.

Zhengchen Zhang

Shuzhi Sam Ge

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Emotional facial expression transfer based on temporal restricted Boltzmann machines.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Ensemble Nyström method for predicting conflict level from speech.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013

A dynamic Gaussian process for voice conversion.

Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

I2R Text-to-Speech System for Blizzard Challenge 2013.

Proceedings of the Blizzard Challenge 2013, 2013

2012

A study of F0 modelling and generation with lyrics and shape characterization for singing voice synthesis.

Siu Wa Lee

Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Generalized F0 modelling with absolute and relative pitch features for singing voice synthesis.

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Template-based personalized singing voice synthesis.

Ling Cen

Paul Y. Chan

Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

I2R Text-to-Speech System for Blizzard Challenge 2012.

Proceedings of the Blizzard Challenge 2012, Portland, OR, USA, September 14, 2012, 2012

2011

Singing Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral Envelope.

Siu Wa Lee

Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Solo to a capella conversion - Synthesizing vocal harmony from lead vocals.

Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Analyzing the Relationship between Formants and Pitch for Singing Voice.

Hwee Teng Tan

Proceedings of the International Conference on Asian Language Processing, 2011

Linear Regression for Prosody Prediction via Convex Optimization.

Ling Cen

Paul Y. Chan

Proceedings of the International Conference on Asian Language Processing, 2011

I2R Text-to-Speech System for Blizzard Challenge 2011.

Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011, 2011

Speech Emotion Recognition System Based on L1 Regularized Linear Regression and Decision Fusion.

Ling Cen

Zhu Liang Yu

Proceedings of the Affective Computing and Intelligent Interaction, 2011

2010

Feature Integration and Dimension Reduction in Unit Selection TTS.

Int. J. Asian Lang. Process., 2010

Considering readability in text-to-speech recording script design.

Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Aligning singing voice with MIDI melody using synthesized audio signal.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

The psychoacoustic approach towards enhancing speech intelligibility in noise.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Generating emotional speech from neutral speech.

Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Phonetic segmentation of singing voice using MIDI and parallel speech.

Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Voice conversion: From spoken vowels to singing vowels.

Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

I2R Text-to-Speech System for Blizzard Challenge 2010.

Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010

2009

Readability Consideration in Speech Synthesis Recording Script Selection.

Int. J. Asian Lang. Process., 2009

Unit selection based speech synthesis for poor channel condition.

Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Refining Unit Boundaries for Mandarin Text-to-Speech Database.

Proceedings of the 2009 International Conference on Asian Language Processing, 2009

I2R Text-to-Speech System for Blizzard Challenge 2009.

Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009

2008

Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis.

Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Multi-speaker meeting audio segmentation.

Tin Lay Nwe

Swe Zin Kalayar Khine

Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

I2R's Submission to Blizzard Challenge 2008.

Proceedings of the Blizzard Challenge 2008, 2008

2007

Evaluating Prosody of Mandarin Speech for Language Learning.

Tin Lay Nwe

J. Chin. Lang. Comput., 2007

Semantic Transliteration of Personal Names.

Proceedings of the ACL 2007, 2007

2006

A Unit Selection-based Speech Synthesis Approach for Mandarin Chinese.

J. Chin. Lang. Comput., 2006

Fusion of Acoustic and Tokenization Features for Speaker Recognition.

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

The IIR Submission to CSLP 2006 Speaker Recognition Evaluation.

Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Analysis and detection of speech under sleep deprivation.

Tin Lay Nwe

Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005

A probabilistic approach to prosodic word prediction for Mandarin Chinese TTS.

Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004

Selecting Prosody Parameters for Unit Selection Based Chinese TTS.

Jun Xu

Proceedings of the Natural Language Processing, 2004

2003

On unit analysis for Cantonese corpus-based TTS.

Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002

Prosodic phrase detection for Chinese TTS using CART and statistical model.

Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Automatic prosodic break labeling for Mandarin Chinese speech data.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Pitch contour model for Chinese text-to-speech using CART and statistical model.

Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2000

An Example-based Approach for Prosody Generation in Chinese Speech Synthesis.

Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Using prosody database in Chinese speech synthesis.
