Yinghao Aaron Li

According to our database, Yinghao Aaron Li authored at least 15 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Timeline

[Figure: number of publications per year, 2021 to 2024, broken down by type (book, in proceedings, article, PhD thesis, dataset, other).]

Bibliography

2024
StyleTTS-ZS: Efficient High-Quality Zero-Shot Text-to-Speech Synthesis with Distilled Time-Varying Style Diffusion.
CoRR, 2024

Style-Talker: Finetuning Audio Language Model and Style-Based Text-to-Speech Model for Fast Spoken Dialogue Generation.
CoRR, 2024

Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis.
CoRR, 2024

Listen, Chat, and Edit: Text-Guided Soundscape Modification for Enhanced Auditory Experience.
CoRR, 2024

Contextual Feature Extraction Hierarchies Converge in Large Language Models and the Brain.
CoRR, 2024

Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2024

2023
HiFTNet: A Fast High-Quality Neural Vocoder with Harmonic-plus-Noise Filter and Inverse Short Time Fourier Transform.
CoRR, 2023

SLMGAN: Exploiting Speech Language Model Representations for Unsupervised Zero-Shot Voice Conversion in GANs.
Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, 2023

StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

DeCoR: Defy Knowledge Forgetting by Predicting Earlier Audio Codes.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Phoneme-Level BERT for Enhanced Prosody of Text-To-Speech with Grapheme Predictions.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2023

Improved Decoding of Attentional Selection in Multi-Talker Environments with Self-Supervised Learned Speech Representation.
Proceedings of the 45th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2023

2022
StyleTTS: A Style-Based Generative Model for Natural and Diverse Text-to-Speech Synthesis.
CoRR, 2022

StyleTTS-VC: One-Shot Voice Conversion by Knowledge Transfer From Style-Based TTS Models.
Proceedings of the IEEE Spoken Language Technology Workshop, 2022

2021
StarGANv2-VC: A Diverse, Unsupervised, Non-Parallel Framework for Natural-Sounding Voice Conversion.
Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021
