Niklas Muennighoff

According to our database¹, Niklas Muennighoff authored at least 52 papers between 2020 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Bibliography

A Survey on Data Selection for Language Models.

Trans. Mach. Learn. Res., 2024

A large-scale audit of dataset licensing and attribution in AI.

Nat. Mach. Intell., 2024

SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models.

OLMoE: Open Mixture-of-Experts Language Models.

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents.

Consent in Crisis: The Rapid Decline of the AI Data Commons.

Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies.

BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval.

RegMix: Data Mixture as Regression for Language Model Pre-training.

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions.

DataComp-LM: In search of the next generation of training sets for language models.

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.

The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding.

Lessons from the Trenches on Reproducible Evaluation of Language Models.

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence.

Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order.

Language models scale reliably with over-training and on downstream tasks.

StarCoder 2 and The Stack v2: The Next Generation.

KMMLU: Measuring Massive Multitask Language Understanding in Korean.

Generative Representational Instruction Tuning.

Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning.

KTO: Model Alignment as Prospect Theoretic Optimization.

OLMo: Accelerating the Science of Language Models.

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models.

C-Pack: Packed Resources For General Chinese Embeddings.

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2024

Model Alignment as Prospect Theoretic Optimization.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

OctoPack: Instruction Tuning Code Large Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

OLMo: Accelerating the Science of Language Models.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.

Trans. Mach. Learn. Res., 2023

StarCoder: may the source be with you!

Trans. Mach. Learn. Res., 2023

The Data Provenance Initiative: A Large Scale Audit of Dataset Licensing & Attribution in AI.

C-Pack: Packaged Resources To Advance General Chinese Embedding.

SantaCoder: don't reach for the stars!

Scaling Data-Constrained Language Models.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

FinGPT: Large Generative Models for a Small Language.

Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

MTEB: Massive Text Embedding Benchmark.

Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

Crosslingual Generalization through Multitask Finetuning.

Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.

What Language Model to Train if You Have One Million GPU Hours?

SGPT: GPT Sentence Embeddings for Semantic Search.

What Language Model to Train if You Have One Million GPU Hours?

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation.

Diagnosing the Impact of AI on Radiology in China.

Vilio: State-of-the-art Visio-Linguistic Models applied to Hateful Memes.

The Hateful Memes Challenge: Competition Report.

Proceedings of the NeurIPS 2020 Competition and Demonstration Track, 2020
