Xu Li

ORCID: 0000-0003-2954-3271

Affiliations:

ARC Lab, Tencent PCG
Chinese University of Hong Kong, Department of Systems Engineering and Engineering Management, Hong Kong

According to our database, Xu Li authored at least 29 papers between 2016 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

EA-VTR: Event-Aware Video-Text Retrieval.

CoRR, 2024

Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion.

Proceedings of the IEEE International Conference on Acoustics, 2024

Humtrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond.

Proceedings of the IEEE International Conference on Acoustics, 2024

2023

AFL-Net: Integrating Audio, Facial, and Lip Modalities with Cross-Attention for Robust Speaker Diarization in the Wild.

CoRR, 2023

Prosody Modeling with 3D Visual Information for Expressive Video Dubbing.

Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Enhancing the Vocal Range of Single-Speaker Singing Voice Synthesis with Melody-Unsupervised Pre-Training.

Proceedings of the IEEE International Conference on Acoustics, 2023

2022

Improving the Adversarial Robustness for Speaker Verification by Self-Supervised Learning.

IEEE/ACM Trans. Audio Speech Lang. Process., 2022

Spoofing-Aware Speaker Verification by Multi-Level Fusion.

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

A Hierarchical Speaker Representation Framework for One-shot Singing Voice Conversion.

Xu Li, Shansong Liu, Ying Shan

Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Characterizing the Adversarial Vulnerability of Speech Self-Supervised Learning.

Proceedings of the IEEE International Conference on Acoustics, 2022

2021

VAENAR-TTS: Variational Auto-Encoder Based Non-AutoRegressive Text-to-Speech Synthesis.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Channel-Wise Gated Res2Net: Towards Robust Detection of Synthetic Speech Attacks.

Proceedings of the 22nd Annual Conference of the International Speech Communication Association, Interspeech 2021, Brno, Czechia, August 30, 2021

Adversarial Defense for Automatic Speaker Verification by Cascaded Self-Supervised Learning Models.

Proceedings of the IEEE International Conference on Acoustics, 2021

Replay and Synthetic Speech Detection with Res2Net Architecture.

Proceedings of the IEEE International Conference on Acoustics, 2021

2020

Deep segmental phonetic posterior-grams based discovery of non-categories in L2 English speech.

CoRR, 2020

Bayesian x-vector: Bayesian Neural Network based x-vector System for Speaker Verification.

Proceedings of the Odyssey 2020: The Speaker and Language Recognition Workshop, 2020

Investigating Robustness of Adversarial Samples Detection for Automatic Speaker Verification.

Proceedings of the 21st Annual Conference of the International Speech Communication Association, 2020

Adversarial Attacks on GMM I-Vector Based Speaker Verification Systems.

Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019

Comparative Study of Parametric and Representation Uncertainty Modeling for Recurrent Neural Network Language Models.

Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Speech Emotion Recognition Using Capsule Networks.

Proceedings of the IEEE International Conference on Acoustics, 2019

End-to-end Code-switched TTS with Mix of Monolingual Recordings.

Proceedings of the IEEE International Conference on Acoustics, 2019

2018

Automatic lexical stress and pitch accent detection for L2 English speech using multi-distribution deep neural networks.

Speech Commun., 2018

Unsupervised Discovery of Non-native Phonetic Patterns in L2 English Speech for Mispronunciation Detection and Diagnosis.

Proceedings of the 19th Annual Conference of the International Speech Communication Association, 2018

Integrating Articulatory Features into Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech.

Proceedings of the 2018 IEEE International Conference on Multimedia and Expo, 2018

Applying Multitask Learning to Acoustic-Phonemic Model for Mispronunciation Detection and Diagnosis in L2 English Speech.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

Unsupervised Discovery of an Extended Phoneme Set in L2 English Speech for Mispronunciation Detection and Diagnosis.

Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2016

Expressive Speech Driven Talking Avatar Synthesis with DBLSTM Using Limited Amount of Emotional Bimodal Data.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Phoneme Embedding and its Application to Speech Driven Talking Avatar Synthesis.

Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Low level descriptors based DBLSTM bottleneck feature for speech driven talking avatar.

Proceedings of the 2016 IEEE International Conference on Acoustics, 2016
