Minghui Dong

Orcid: 0000-0001-6543-2929

According to our database, Minghui Dong authored at least 98 papers between 2000 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline

[Chart: publications per year; year axis 2000 to 2020, count axis 0 to 15. Legend: Book, In proceedings, Article, PhD thesis, Dataset, Other.]

Bibliography

2024
The ISCSLP 2024 Conversational Voice Clone (CoVoC) Challenge: Tasks, Results and Findings.
CoRR, 2024

A Study on Combining Non-Parallel and Parallel Methodologies for Mandarin-English Cross-Lingual Voice Conversion.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
A Dual Target Neural Network Method for Speech Enhancement.
Proceedings of the International Conference on Asian Language Processing, 2023

2021
End-to-End Detection-Segmentation System for Face Labeling.
IEEE Trans. Emerg. Top. Comput. Intell., 2021

2019
CLU-CNNs: Object detection for medical images.
Neurocomputing, 2019

Sparse fully convolutional network for face labeling.
Neurocomputing, 2019

Towards Good Practices for Video Object Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Temporal Feature Augmented Network for Video Instance Segmentation.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision Workshops, 2019

Implementing Prosodic Phrasing in Chinese End-to-end Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2019

On the Study of Generative Adversarial Networks for Cross-Lingual Voice Conversion.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2019

SINGAN: Singing Voice Conversion with Generative Adversarial Networks.
Proceedings of the 2019 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2019

2018
The I2R-NWPU-NUS Text-to-Speech System for Blizzard Challenge 2018.
Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018

The TL-NTU Text-to-speech System for the Blizzard Challenge 2018.
Proceedings of the Blizzard Challenge 2018, Hyderabad, India, September 8, 2018, 2018

2017
Node-level parallelization for deep neural networks with conditional independent graph.
Neurocomputing, 2017

Multimodal Prediction of Affective Dimensions via Fusing Multiple Regression Techniques.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

A light-weight method of building an LSTM-RNN-based bilingual TTS system.
Proceedings of the 2017 International Conference on Asian Language Processing, 2017

The I2R-NWPU Text-to-Speech System for Blizzard Challenge 2017.
Proceedings of the Blizzard Challenge 2017, Stockholm, Sweden, August 25, 2017, 2017

A dual alignment scheme for improved speech-to-singing voice conversion.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Word level prosody prediction using large audiobook dataset.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

Representing raw linguistic information in Chinese text-to-speech system.
Proceedings of the 2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2017

2016
Guest Editorial: Advances in Machine Learning for Speech Processing.
J. Signal Process. Syst., 2016

High quality voice conversion using prosodic and high-resolution spectral features.
Multim. Tools Appl., 2016

Transition-based Parsing with Context Enhancement and Future Reward Reranking.
CoRR, 2016

Mandarin Prosodic Phrase Prediction based on Syntactic Trees.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

An Automatic Voice Conversion Evaluation Strategy Based on Perceptual Background Noise Distortion and Speaker Similarity.
Proceedings of the 9th ISCA Speech Synthesis Workshop, 2016

I2RNTU at SemEval-2016 Task 4: Classifier Fusion for Polarity Classification in Twitter.
Proceedings of the 10th International Workshop on Semantic Evaluation, 2016

Deep Bidirectional LSTM Modeling of Timbre and Prosody for Emotional Voice Conversion.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

SERAPHIM Live! - Singing Synthesis for the Performer, the Composer, and the 3D Game Developer.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

SERAPHIM: A Wavetable Synthesis System with 3D Lip Animation for Real-Time Speech and Singing Applications on Mobile Platforms.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Audio and face video emotion recognition in the wild using deep neural networks and small datasets.
Proceedings of the 18th ACM International Conference on Multimodal Interaction, 2016

A full training framework of cross-stream dependence modelling for HMM-based singing voice synthesis.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Exemplar-based sparse representation of timbre and prosody for voice conversion.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Combining multiple kernel models for automatic intelligibility detection of pathological speech.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

A Stack LSTM Transition-Based Dependency Parser with Context Enhancement and K-best Decoding.
Proceedings of the Chinese Lexical Semantics - 17th Workshop, 2016

The I2R-NWPU-NTU Text-to-Speech System at Blizzard Challenge 2016.
Proceedings of the Blizzard Challenge 2016, Cupertino, CA, USA, September 16, 2016, 2016

2015
Regularized non-negative matrix factorization using alternating direction method of multipliers and its application to source separation.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

System fusion for high-performance voice conversion.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

An alternating optimization approach for phase retrieval.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

A real-time variable-q non-stationary Gabor transform for pitch shifting.
Proceedings of the 16th Annual Conference of the International Speech Communication Association, 2015

Formant excursion in singing synthesis.
Proceedings of the 2015 IEEE International Conference on Digital Signal Processing, 2015

Sparse representation for frequency warping based voice conversion.
Proceedings of the 2015 IEEE International Conference on Acoustics, 2015

Mandarin prosodic word prediction using dependency relationships.
Proceedings of the 2015 International Conference on Asian Language Processing, 2015

Performance scoring of singing voice.
Proceedings of the 2015 International Conference on Asian Language Processing, 2015

The expression of singing emotion - contradicting the constraints of song.
Proceedings of the 2015 International Conference on Asian Language Processing, 2015

Non-negative matrix factorization using stable alternating direction method of multipliers for source separation.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

A waveform representation framework for high-quality statistical parametric speech synthesis.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Mapping frames with DNN-HMM recognizer for non-parallel voice conversion.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2015

Fundamental frequency modeling using wavelets for emotional voice conversion.
Proceedings of the 2015 International Conference on Affective Computing and Intelligent Interaction, 2015

2014
Soft constrained leading voice separation with music score guidance.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

The power of special characters in prosodic word prediction for Chinese TTS.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

Acoustic emotion recognition based on fusion of multiple feature-dependent deep Boltzmann machines.
Proceedings of the 9th International Symposium on Chinese Spoken Language Processing, 2014

A comparative study of spectral transformation techniques for singing voice synthesis.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

I2R speech2singing perfects everyone's singing.
Proceedings of the 15th Annual Conference of the International Speech Communication Association, 2014

Intelligibility detection of pathological speech using asymmetric sparse kernel partial least squares classifier.
Proceedings of the IEEE International Conference on Acoustics, 2014

Emotion analysis of children's stories with context information.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Emotional facial expression transfer based on temporal restricted Boltzmann machines.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

Ensemble Nyström method for predicting conflict level from speech.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2014

2013
A dynamic Gaussian process for voice conversion.
Proceedings of the 2013 IEEE International Conference on Multimedia and Expo Workshops, 2013

I2R Text-to-Speech System for Blizzard Challenge 2013.
Proceedings of the Blizzard Challenge 2013, 2013

2012
A study of F0 modelling and generation with lyrics and shape characterization for singing voice synthesis.
Proceedings of the 8th International Symposium on Chinese Spoken Language Processing, 2012

Generalized F0 modelling with absolute and relative pitch features for singing voice synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

Template-based personalized singing voice synthesis.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

I2R Text-to-Speech System for Blizzard Challenge 2012.
Proceedings of the Blizzard Challenge 2012, Portland, OR, USA, September 14, 2012, 2012

2011
Singing Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral Envelope.
Proceedings of the 12th Annual Conference of the International Speech Communication Association, 2011

Solo to a capella conversion - Synthesizing vocal harmony from lead vocals.
Proceedings of the 2011 IEEE International Conference on Multimedia and Expo, 2011

Analyzing the Relationship between Formants and Pitch for Singing Voice.
Proceedings of the International Conference on Asian Language Processing, 2011

Linear Regression for Prosody Prediction via Convex Optimization.
Proceedings of the International Conference on Asian Language Processing, 2011

I2R Text-to-Speech System for Blizzard Challenge 2011.
Proceedings of the Blizzard Challenge 2011, Turin, Italy, September 2, 2011, 2011

Speech Emotion Recognition System Based on L1 Regularized Linear Regression and Decision Fusion.
Proceedings of the Affective Computing and Intelligent Interaction, 2011

2010
Feature Integration and Dimension Reduction in Unit Selection TTS.
Int. J. Asian Lang. Process., 2010

Considering readability in text-to-speech recording script design.
Proceedings of the Seventh ISCA Tutorial and Research Workshop on Speech Synthesis, 2010

Aligning singing voice with MIDI melody using synthesized audio signal.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

The psychoacoustic approach towards enhancing speech intelligibility in noise.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Generating emotional speech from neutral speech.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

Phonetic segmentation of singing voice using MIDI and parallel speech.
Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010

Voice conversion: From spoken vowels to singing vowels.
Proceedings of the 2010 IEEE International Conference on Multimedia and Expo, 2010

I2R Text-to-Speech System for Blizzard Challenge 2010.
Proceedings of the Blizzard Challenge 2010, Kansai Science City, Japan, September 25, 2010, 2010

2009
Readability Consideration in Speech Synthesis Recording Script Selection.
Int. J. Asian Lang. Process., 2009

Unit selection based speech synthesis for poor channel condition.
Proceedings of the 10th Annual Conference of the International Speech Communication Association, 2009

Refining Unit Boundaries for Mandarin Text-to-Speech Database.
Proceedings of the 2009 International Conference on Asian Language Processing, 2009

I2R Text-to-Speech System for Blizzard Challenge 2009.
Proceedings of the Blizzard Challenge 2009, Edinburgh, Scotland, UK, September 4, 2009, 2009

2008
Predicting Spectral and Prosodic Parameters for Unit Selection in Speech Synthesis.
Proceedings of the 6th International Symposium on Chinese Spoken Language Processing, 2008

Multi-speaker meeting audio segmentation.
Proceedings of the 9th Annual Conference of the International Speech Communication Association, 2008

I2R's Submission to Blizzard Challenge 2008.
Proceedings of the Blizzard Challenge 2008, 2008

2007
Evaluating Prosody of Mandarin Speech for Language Learning.
J. Chin. Lang. Comput., 2007

Semantic Transliteration of Personal Names.
Proceedings of the ACL 2007, 2007

2006
A Unit Selection-based Speech Synthesis Approach for Mandarin Chinese.
J. Chin. Lang. Comput., 2006

Fusion of Acoustic and Tokenization Features for Speaker Recognition.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

The IIR Submission to CSLP 2006 Speaker Recognition Evaluation.
Proceedings of the Chinese Spoken Language Processing, 5th International Symposium, 2006

Analysis and detection of speech under sleep deprivation.
Proceedings of the Ninth International Conference on Spoken Language Processing, 2006

2005
A probabilistic approach to prosodic word prediction for Mandarin Chinese TTS.
Proceedings of the 9th European Conference on Speech Communication and Technology, 2005

2004
Selecting Prosody Parameters for Unit Selection Based Chinese TTS.
Proceedings of the Natural Language Processing, 2004

2003
On unit analysis for Cantonese corpus-based TTS.
Proceedings of the 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003, 2003

2002
Prosodic phrase detection for Chinese TTS using CART and statistical model.
Proceedings of the 2002 International Symposium on Chinese Spoken Language Processing, 2002

Automatic prosodic break labeling for Mandarin Chinese speech data.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

Pitch contour model for Chinese text-to-speech using CART and statistical model.
Proceedings of the 7th International Conference on Spoken Language Processing, ICSLP2002, 2002

2000
An Example-based Approach for Prosody Generation in Chinese Speech synthesis.
Proceedings of the 2000 International Symposium on Chinese Spoken Language Processing, 2000

Using prosody database in Chinese speech synthesis.
Proceedings of the Sixth International Conference on Spoken Language Processing, 2000

