Xi Wang

Orcid: 0000-0002-0434-7939

Affiliations:

Microsoft Cloud and AI, Beijing, China

According to our database¹, Xi Wang authored at least 20 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

ZSVC: Zero-shot Style Voice Conversion with Disentangled Latent Diffusion Models and Adversarial Training.

CoRR, January, 2025

2024

NaturalSpeech: End-to-End Text-to-Speech Synthesis With Human-Level Quality.

IEEE Trans. Pattern Anal. Mach. Intell., June, 2024

UniStyle: Unified Style Modeling for Speaking Style Captioning and Stylistic Speech Synthesis.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Contrastive Context-Speech Pretraining for Expressive Text-to-Speech Synthesis.

Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

Stylespeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech Synthesis.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

ContextSpeech: Expressive and Efficient Text-to-Speech for Paragraph Reading.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Large-Scale Automatic Audiobook Creation.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MuLanTTS: The Microsoft Speech Synthesis System for Blizzard Challenge 2023.

Proceedings of the 18th Blizzard Challenge Workshop, Grenoble, France, August 29, 2023, 2023

2022

SoftSpeech: Unsupervised Duration Model in FastSpeech 2.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Self-supervised Context-aware Style Representation for Expressive Speech Synthesis.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Prosodyspeech: Towards Advanced Prosody Model for Neural Text-to-Speech.

Proceedings of the IEEE International Conference on Acoustics, 2022

Improving Fastspeech TTS with Efficient Self-Attention and Compact Feed-Forward Network.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

Speech Bert Embedding for Improving Prosody in Neural TTS.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

s-Transformer: Segment-Transformer for Robust Neural Speech Synthesis.

CoRR, 2020

An Efficient Subband Linear Prediction for LPCNet-Based Neural Synthesis.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis.

Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2020

2019

Forward-Backward Decoding for Regularizing End-to-End TTS.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

2018

LP-WaveNet: Linear Prediction-based WaveNet Speech Synthesis.

CoRR, 2018

Frame Selection in SI-DNN Phonetic Space with WaveNet Vocoder for Voice Conversion without Parallel Training Data.

Proceedings of the 11th International Symposium on Chinese Spoken Language Processing, 2018

A New Glottal Neural Vocoder for Speech Synthesis.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018
