Guilherme Penedo

According to our database, Guilherme Penedo authored at least 5 papers between 2023 and 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale.
CoRR, 2024

2023
The Falcon Series of Open Language Models.
CoRR, 2023

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only.
CoRR, 2023

AlGhafa Evaluation Benchmark for Arabic Language Models.
Proceedings of ArabicNLP 2023, Singapore (Hybrid), December 7, 2023

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data Only.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

