Yongmao Zhang

ORCID: 0009-0000-0526-5778

According to our database, Yongmao Zhang authored at least 16 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
SStackGNN: Graph Data Augmentation Simplified Stacking Graph Neural Network for Twitter Bot Detection.
Int. J. Comput. Intell. Syst., December, 2024

METTS: Multilingual Emotional Text-to-Speech by Cross-Speaker and Cross-Lingual Emotion Transfer.
IEEE ACM Trans. Audio Speech Lang. Process., 2024

2023
Accent-VITS: accent transfer for end-to-end TTS.
CoRR, 2023

The NPU-MSXF Speech-to-Speech Translation System for IWSLT 2023 Speech-to-Speech Translation Task.
Proceedings of the 20th International Conference on Spoken Language Translation, 2023

VISinger2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

PromptStyle: Controllable Style Transfer for Text-to-Speech with Natural Language Descriptions.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

DSPGAN: A GAN-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

PromptSpeaker: Speaker Generation Based on Text Descriptions.
Proceedings of the IEEE Automatic Speech Recognition and Understanding Workshop, 2023

2022
VISinger 2: High-Fidelity End-to-End Singing Voice Synthesis Enhanced by Digital Signal Processing Synthesizer.
CoRR, 2022

AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS.
Proceedings of the 13th International Symposium on Chinese Spoken Language Processing, 2022

Learn2Sing 2.0: Diffusion and Mutual Information-Based Target Speaker SVS by Learning from Singing Teacher.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Opencpop: A High-Quality Open Source Chinese Popular Song Corpus for Singing Voice Synthesis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

