Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training.
CoRR, May, 2025
OLMoE: Open Mixture-of-Experts Language Models.
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
et al.
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
CaLMQA: Exploring culturally specific long-form question answering across 23 languages.
CoRR, 2024
OLMo: Accelerating the Science of Language Models.
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
,
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024