Atnafu Lambebo Tonja

Orcid: 0000-0002-3501-5136

According to our database, Atnafu Lambebo Tonja authored at least 38 papers between 2021 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Overview of HOPE at IberLEF 2024: Approaching Hope Speech Detection in Social Media from Two Perspectives, for Equality, Diversity and Inclusion and as Expectations.
Proces. del Leng. Natural, 2024

InkubaLM: A small language model for low-resource African languages.
CoRR, 2024

CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark.
CoRR, 2024

EthioMT: Parallel Corpus for Low-resource Ethiopian Languages.
CoRR, 2024

EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation.
CoRR, 2024

NLP Progress in Indigenous Latin American Languages.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

The Zeno's Paradox of 'Low-Resource' Languages.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation.
Proceedings of the 2024 Joint International Conference on Computational Linguistics, 2024

2023
AfriSpeech-200: Pan-African Accented Speech Dataset for Clinical and General Domain ASR.
Trans. Assoc. Comput. Linguistics, 2023

First Attempt at Building Parallel Corpora for Machine Translation of Northeast India's Very Low-Resource Languages.
CoRR, 2023

Adapting Pretrained ASR Models to Low-resource Clinical Speech using Epistemic Uncertainty-based Data Selection.
CoRR, 2023

Automatic Translation of Hate Speech to Non-hate Speech in Social Media Texts.
CoRR, 2023

Enhancing Translation for Indigenous Languages: Experiments with Multilingual Models.
CoRR, 2023

Parallel Corpus for Indigenous Language Translation: Spanish-Mazatec and Spanish-Mixtec.
CoRR, 2023

AfriQA: Cross-lingual Open-Retrieval Question Answering for African Languages.
CoRR, 2023

The African Stopwords project: curating stopwords for African languages.
CoRR, 2023

MasakhaNEWS: News Topic Classification for African languages.
CoRR, 2023

Masakhane-Afrisenti at SemEval-2023 Task 12: Sentiment Analysis using Afro-centric Language Models and Adapters for Low-resource African Languages.
CoRR, 2023

Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages.
CoRR, 2023

Masakhane-Afrisenti at SemEval-2023 Task 12: Sentiment Analysis using Afro-centric Language Models and Adapters for Low-resource African Languages.
Proceedings of the 17th International Workshop on Semantic Evaluation, 2023

Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities.
Proceedings of the Fourth workshop on Resources for African Indigenous Languages (RAIL 2023), 2023

AfriNames: Most ASR Models "Butcher" African Names.
Proceedings of the 24th Annual Conference of the International Speech Communication Association, 2023

MasakhaNEWS: News Topic Classification for African languages.
Proceedings of the 13th International Joint Conference on Natural Language Processing and the 3rd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics, 2023


The Less the Merrier? Investigating Language Representation in Multilingual Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2023, 2023

Adapting to the Low-Resource Double-Bind: Investigating Low-Compute Methods on Low-Resource African Languages.
Proceedings of the 4th Workshop on African Natural Language Processing, 2023


2022
Transformer-based Model for Word Level Language Identification in Code-mixed Kannada-English Texts.
CoRR, 2022

AfroLM: A Self-Active Learning-based Multilingual Pretrained Language Model for 23 African Languages.
CoRR, 2022

Detection of Aggressive and Violent Incidents from Social Media in Spanish using Pre-trained Language Model.
Proceedings of the Iberian Languages Evaluation Forum (IberLEF 2022) co-located with the Conference of the Spanish Society for Natural Language Processing (SEPLN 2022), 2022

Improving Neural Machine Translation for Low Resource Languages Using Mixed Training: The Case of Ethiopian Languages.
Proceedings of the Advances in Computational Intelligence, 2022

The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation.
Proceedings of the International Conference on Information and Communication Technology for Development for Africa, 2022

CIC NLP at SMM4H 2022: a BERT-based approach for classification of social media forum posts.
Proceedings of The Seventh Workshop on Social Media Mining for Health Applications, 2022

CIC at CheckThat!-2022: Multi-class and Cross-lingual Fake News Detection.
Proceedings of the Working Notes of CLEF 2022 - Conference and Labs of the Evaluation Forum, Bologna, Italy, September 5th - to, 2022

2021
Multilingual Neural Machine Translation for Low Resourced Languages: Ometo-English.
Proceedings of the International Conference on Information and Communication Technology for Development for Africa, 2021

A Parallel Corpora for bi-directional Neural Machine Translation for Low Resourced Ethiopian Languages.
Proceedings of the International Conference on Information and Communication Technology for Development for Africa, 2021

Early Ginger Disease Detection Using Deep Learning Approach.
Proceedings of the Advances of Science and Technology - 9th EAI International Conference, 2021

