Stella Biderman

ORCID: 0000-0001-8228-1042

According to our database, Stella Biderman authored at least 60 papers between 2020 and 2024.

Bibliography

2024
Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion.
Trans. Mach. Learn. Res., 2024

A Walsh Hadamard Derived Linear Vector Symbolic Architecture.
CoRR, 2024

Consent in Crisis: The Rapid Decline of the AI Data Commons.
CoRR, 2024

LLM Circuit Analyses Are Consistent Across Training and Scale.
CoRR, 2024

Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon.
CoRR, 2024

The Responsible Foundation Model Development Cheatsheet: A Review of Tools & Resources.
CoRR, 2024

Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?
CoRR, 2024

Lessons from the Trenches on Reproducible Evaluation of Language Models.
CoRR, 2024

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence.
CoRR, 2024

On the Societal Impact of Open Foundation Models.
CoRR, 2024

KMMLU: Measuring Massive Multitask Language Understanding in Korean.
CoRR, 2024

Suppressing Pink Elephants with Direct Principle Feedback.
CoRR, 2024

The Case for Co-Designing Model Architectures with Hardware.
Proceedings of the 53rd International Conference on Parallel Processing, 2024

Grokking Group Multiplication with Cosets.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Stay on Topic with Classifier-Free Guidance.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Llemma: An Open Language Model for Mathematics.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Re-Evaluating Evaluation for Multilingual Summarization.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Holographic Global Convolutional Networks for Long-Range Prediction Tasks in Malware Detection.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models.
Trans. Mach. Learn. Res., 2023

Stay on topic with Classifier-Free Guidance.
CoRR, 2023

Can Transformers Learn to Solve Problems Recursively?
CoRR, 2023

Eliciting Latent Predictions from Transformers with the Tuned Lens.
CoRR, 2023

The Goldilocks of Pragmatic Understanding: Fine-Tuning Strategy Matters for Implicature Resolution by LLMs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Emergent and Predictable Memorization in Large Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

LEACE: Perfect linear concept erasure in closed form.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling.
Proceedings of the International Conference on Machine Learning, 2023

Recasting Self-Attention with Holographic Reduced Representations.
Proceedings of the International Conference on Machine Learning, 2023

trlX: A Framework for Large Scale Reinforcement Learning from Human Feedback.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

GAIA Search: Hugging Face and Pyserini Interoperability for NLP Training Data Exploration.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics: System Demonstrations, 2023

Crosslingual Generalization through Multitask Finetuning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023

2022
Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets.
Trans. Assoc. Comput. Linguistics, 2022

MP-NeRF: A massively parallel method for accelerating protein structure reconstruction from internal coordinates.
J. Comput. Chem., 2022

BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting.
CoRR, 2022

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model.
CoRR, 2022

What Language Model to Train if You Have One Million GPU Hours?
CoRR, 2022

Large language models are not zero-shot communicators.
CoRR, 2022

EleutherAI: Going Beyond "Open Science" to "Science in the Open".
CoRR, 2022

BigBIO: A Framework for Data-Centric Biomedical Natural Language Processing.
CoRR, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
CoRR, 2022

GPT-NeoX-20B: An Open-Source Autoregressive Language Model.
CoRR, 2022

Documenting Geographically and Contextually Diverse Data Sources: The BigScience Catalogue of Language Data and Resources.
CoRR, 2022

Neural Language Models are Effective Plagiarists.
CoRR, 2022

Datasheet for the Pile.
CoRR, 2022

Data Governance in the Age of Large-Scale Data-Driven Language Technology.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, 2022

What Language Model to Train if You Have One Million GPU Hours?
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2022, 2022

VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance.
Proceedings of the Computer Vision - ECCV 2022, 2022

Fooling MOSS Detection with Pretrained Language Models.
Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022

2021
Multitask Prompted Training Enables Zero-Shot Task Generalization.
CoRR, 2021

Cut the CARP: Fishing for zero-shot story evaluation.
CoRR, 2021

Towards a Formal Model of Narratives.
CoRR, 2021

The Pile: An 800GB Dataset of Diverse Text for Language Modeling.
CoRR, 2021

Magic: The Gathering Is Turing Complete.
Proceedings of the 10th International Conference on Fun with Algorithms, 2021

2020
Magic: the Gathering is as Hard as Arithmetic.
CoRR, 2020

Pitfalls in Machine Learning Research: Reexamining the Development Cycle.
Proceedings of the "I Can't Believe It's Not Better!" Workshop at NeurIPS, 2020
