Wei Li

ORCID: 0000-0001-7824-4839

Affiliations:
  • ByteDance AI-Lab
  • Georgia Institute of Technology, School of Electrical and Computer Engineering, Atlanta, GA, USA
  • Beijing Language and Culture University, College of Information Science, Beijing, China


According to our database, Wei Li authored at least 27 papers between 2010 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
A Comprehensive Solution to Connect Speech Encoder and Large Language Model for ASR.
CoRR, 2024

Can Large Language Models Understand Spatial Audio?
CoRR, 2024

video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SALMONN: Towards Generic Hearing Abilities for Large Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Connecting Speech Encoder and Large Language Model for ASR.
Proceedings of the IEEE International Conference on Acoustics, 2024

Extending Large Language Models for Speech and Audio Captioning.
Proceedings of the IEEE International Conference on Acoustics, 2024

2023
Fine-grained Audio-Visual Joint Representations for Multimodal Large Language Models.
CoRR, 2023

Disentangling the Contribution of Non-native Speech in Automated Pronunciation Assessment.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

Phonetic and Prosody-aware Self-supervised Learning Approach for Non-native Fluency Scoring.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

An ASR-Free Fluency Scoring Approach with Self-Supervised Learning.
Proceedings of the IEEE International Conference on Acoustics, 2023

Leveraging Phone-Level Linguistic-Acoustic Similarity For Utterance-Level Pronunciation Scoring.
Proceedings of the IEEE International Conference on Acoustics, 2023

2022
Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information.
CoRR, 2022

A Transfer and Multi-Task Learning based Approach for MOS Prediction.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Using Fluency Representation Learned from Sequential Raw Features for Improving Non-native Fluency Scoring.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

2020
Improving Mispronunciation Detection and Enriching Diagnostic Feedback for Non-Native Learners of Mandarin.
PhD thesis, 2020

Improving Accent Conversion with Reference Encoder and End-To-End Text-To-Speech.
CoRR, 2020

A Cross-Task Transfer Learning Approach to Adapting Deep Speech Enhancement Models to Unseen Background Noise Using Paired Senone Classifiers.
Proceedings of the 2020 IEEE International Conference on Acoustics, 2020

2019
Improving Mispronunciation Detection of Mandarin Tones for Non-Native Learners With Soft-Target Tone Labels and BLSTM-Based Deep Tone Models.
IEEE ACM Trans. Audio Speech Lang. Process., 2019

Improving Audio-visual Speech Recognition Performance with Cross-modal Student-teacher Training.
Proceedings of the IEEE International Conference on Acoustics, 2019

2018
Improving Mandarin Tone Recognition Based on DNN by Combining Acoustic and Articulatory Features Using Extended Recognition Networks.
J. Signal Process. Syst., 2018

Improving Mandarin Tone Mispronunciation Detection for Non-Native Learners with Soft-Target Tone Labels and BLSTM-Based Deep Models.
Proceedings of the 2018 IEEE International Conference on Acoustics, 2018

2017
Improving Mispronunciation Detection for Non-Native Learners with Multisource Information and LSTM-Based Deep Models.
Proceedings of the 18th Annual Conference of the International Speech Communication Association, 2017

2016
Detecting Mispronunciations of L2 Learners and Providing Corrective Feedback Using Knowledge-Guided and Data-Driven Decision Trees.
Proceedings of the 17th Annual Conference of the International Speech Communication Association, 2016

Improving non-native mispronunciation detection and enriching diagnostic feedback with DNN-based speech attribute modeling.
Proceedings of the 2016 IEEE International Conference on Acoustics, 2016

Using tone-based extended recognition network to detect non-native Mandarin tone mispronunciations.
Proceedings of the Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, 2016

2013
Using Mutual Information Criterion to Design an Effective Lexicon for Chinese Pinyin-to-Character Conversion.
Proceedings of the 2013 International Conference on Asian Language Processing, 2013

2010
A Study on Functional Loads of Phonetic Contrasts under Context Based on Mutual Information of Chinese Text and Phonemes.
Proceedings of the 7th International Symposium on Chinese Spoken Language Processing, 2010

