Shun Lei

Orcid: 0000-0003-3597-3913

According to our database, Shun Lei authored at least 21 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline

[Chart: publications per year, 2021 to 2024, by type (book, in proceedings, article, PhD thesis, dataset, other).]

Bibliography

2024
TouchTTS: An Embarrassingly Simple TTS Framework that Everyone Can Touch.
CoRR, 2024

The Codec Language Model-based Zero-Shot Spontaneous Style TTS System for CoVoC Challenge 2024.
CoRR, 2024

MuCodec: Ultra Low-Bitrate Music Codec.
CoRR, 2024

An End-to-End Approach for Chord-Conditioned Song Generation.
CoRR, 2024

SongCreator: Lyrics-based Universal Song Generation.
CoRR, 2024

Foundation Models for Music: A Survey.
CoRR, 2024

VoxInstruct: Expressive Human Instruction-to-Speech Generation with Unified Multilingual Codec Language Modelling.
Proceedings of the 32nd ACM International Conference on Multimedia, MM 2024, Melbourne, VIC, Australia, 28 October 2024, 2024

NRAdapt: Noise-Robust Adaptive Text to Speech Using Untranscribed Data.
Proceedings of the International Joint Conference on Neural Networks, 2024

The THU-HCSI Multi-Speaker Multi-Lingual Few-Shot Voice Cloning System for LIMMITS'24 Challenge.
Proceedings of the IEEE International Conference on Acoustics, 2024

Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts.
Proceedings of the IEEE International Conference on Acoustics, 2024

SimCalib: Graph Neural Network Calibration Based on Similarity between Nodes.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
MSStyleTTS: Multi-Scale Style Modeling With Hierarchical Context Information for Expressive Speech Synthesis.
IEEE ACM Trans. Audio Speech Lang. Process., 2023

AdaMesh: Personalized Facial Expressions and Head Poses for Speech-Driven 3D Facial Animation.
CoRR, 2023

Towards Spontaneous Style Modeling with Semi-supervised Pre-training for Conversational Text-to-Speech Synthesis.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation Based on Pre-Trained Genre Token Network.
Proceedings of the IEEE International Conference on Acoustics, 2023

Context-Aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Towards Improving the Expressiveness of Singing Voice Synthesis with BERT Derived Semantic Information.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Towards Multi-Scale Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis.
Proceedings of the IEEE International Conference on Acoustics, 2022

Unsupervised Multi-scale Expressive Speaking Style Modeling with Hierarchical Context Information for Audiobook Speech Synthesis.
Proceedings of the 29th International Conference on Computational Linguistics, 2022

2021
MRC-LSTM: A Hybrid Approach of Multi-scale Residual CNN and LSTM to Predict Bitcoin Price.
Proceedings of the International Joint Conference on Neural Networks, 2021

