2025
Training LLMs on HPC Systems: Best Practices from the OpenGPT-X Project.
CoRR, April, 2025

Memory and Bandwidth are All You Need for Fully Sharded Data Parallel.
CoRR, April, 2025

The Artificial Scientist - in-transit Machine Learning of Plasma Simulations.
CoRR, January, 2025

2024
Time Transfer: On Optimal Learning Rate and Batch Size In The Infinite Data Limit.
CoRR, 2024

Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs.
CoRR, 2024

Tokenizer Choice For LLM Training: Negligible or Crucial?
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
StarCoder: may the source be with you!
Trans. Mach. Learn. Res., 2023

Toward the Production of Spatiotemporally Consistent Annual Land Cover Maps Using Sentinel-2 Time Series.
IEEE Geosci. Remote. Sens. Lett., 2023

Physics informed Neural Networks applied to the description of wave-particle resonance in kinetic simulations of fusion plasmas.
CoRR, 2023

Enhancing Training Set Through Multi-Temporal Attention Analysis in Transformers for Multi-Year Land Cover Mapping.
Proceedings of the IEEE International Geoscience and Remote Sensing Symposium, 2023

2022
Hearts Gym: Learning Reinforcement Learning as a Team Event.
Proceedings of the Third Teaching Machine Learning and Artificial Intelligence Workshop, 2022

2021
JUWELS Booster - A Supercomputer for Large-Scale AI Research.
Proceedings of the High Performance Computing - ISC High Performance Digital 2021 International Workshops, Frankfurt am Main, Germany, June 24, 2021