2024
Can LLM Generate Culturally Relevant Commonsense QA Data? Case Study in Indonesian and Sundanese.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
ICON: A Linguistically-Motivated Large-Scale Benchmark Indonesian Constituency Treebank.
ACM Trans. Asian Low Resour. Lang. Inf. Process., August, 2023

NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023