Colin Leong

According to our database, Colin Leong authored at least 12 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Enhancing Multi-Domain Automatic Short Answer Grading through an Explainable Neuro-Symbolic Pipeline.
CoRR, 2024

2023
The eBible Corpus: Data and Model Benchmarks for Bible Translation for Low-Resource Languages.
CoRR, 2023

Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages.
CoRR, 2023

JWSign: A Highly Multilingual Corpus of Bible Translations for more Diversity in Sign Language Processing.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages.
Proceedings of the 4th Workshop on African Natural Language Processing, 2023

2022
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets.
Trans. Assoc. Comput. Linguistics, 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources.
CoRR, 2022


BibleTTS: a large, high-fidelity, multilingual, and uniquely African speech corpus.
Proceedings of the 23rd Annual Conference of the International Speech Communication Association, 2022

Bloom Library: Multimodal Datasets in 300+ Languages for a Variety of Downstream Tasks.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Phone-ing it in: Towards Flexible Multi-Modal Language Model Training by Phonetic Representations of Data.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

