Yuping Wang

Affiliations:

ByteDance, Shanghai, China

According to our database, Yuping Wang authored at least 28 papers between 2020 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Multi-Level Temporal-Channel Speaker Retrieval for Zero-Shot Voice Conversion.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

AudioLDM 2: Learning Holistic Audio Generation With Self-Supervised Pretraining.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

U-Style: Cascading U-Nets With Multi-Level Speaker and Style Modeling for Zero-Shot Voice Cloning.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

Joint Multiscale Cross-Lingual Speaking Style Transfer With Bidirectional Attention Mechanism for Automatic Dubbing.

IEEE ACM Trans. Audio Speech Lang. Process., 2024

StreamVoice+: Evolving Into End-to-End Streaming Zero-Shot Voice Conversion.

IEEE Signal Process. Lett., 2024

Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition.

CoRR, 2024

Improving Audio Generation with Visual Enhanced Caption.

CoRR, 2024

VoiceShop: A Unified Speech-to-Speech Framework for Identity-Preserving Zero-Shot Voice Editing.

CoRR, 2024

PolyVoice: Language Models for Speech to Speech Translation.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

MSM-VC: High-Fidelity Source Style Transfer for Non-Parallel Voice Conversion by Multi-Scale Style Modeling.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

DiCLET-TTS: Diffusion Model Based Cross-Lingual Emotion Transfer for Text-to-Speech - A Study Between English and Mandarin.

IEEE ACM Trans. Audio Speech Lang. Process., 2023

LM-VC: Zero-Shot Voice Conversion via Speech Generation Based on Language Models.

IEEE Signal Process. Lett., 2023

PolyVoice: Language Models for Speech to Speech Translation.

CoRR, 2023

A Unified Front-End Framework for English Text-to-Speech Synthesis.

CoRR, 2023

Multi-level Temporal-channel Speaker Retrieval for Robust Zero-shot Voice Conversion.

CoRR, 2023

Joint Multi-scale Cross-lingual Speaking Style Transfer with Bidirectional Attention Mechanism for Automatic Dubbing.

CoRR, 2023

Efficient Neural Music Generation.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Delivering Speaking Style in Low-Resource Voice Conversion with Multi-Factor Constraints.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Streaming Voice Conversion via Intermediate Bottleneck Features and Non-Streaming Teacher Guidance.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

2022

Controllable and Lossless Non-Autoregressive End-to-End Text-to-Speech.

CoRR, 2022

Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks.

Proceedings of the MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10, 2022

Neufa: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

Cloning One's Voice Using Very Limited Data in the Wild.

Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2022

2021

Neural Dubber: Dubbing for Silent Videos According to Scripts.

CoRR, 2021

Neural Dubber: Dubbing for Videos According to Scripts.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

2020

Noise Robust TTS for Low Resource Speakers using Pre-trained Model and Speech Enhancement.

CoRR, 2020

Xiaomingbot: A Multilingual Robot News Reporter.

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2020
