2025
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia.
CoRR, March, 2025

2024
Multilingual Large Language Models: A Systematic Survey.
CoRR, 2024

FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024, 2024

An Empirical Study on the Robustness of Massively Multilingual Neural Machine Translation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), 2024

2023
Evaluating Large Language Models: A Comprehensive Survey.
CoRR, 2023

Is Robustness Transferable across Languages in Multilingual Neural Machine Translation?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023