Suchin Gururangan

According to our database, Suchin Gururangan authored at least 30 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of three.

Bibliography

2025

BTS: Harmonizing Specialized Experts into a Generalist LLM.

…, Prajjwal Bhargava, …, Punit Singh Koura, …, Suchin Gururangan, …

CoRR, February, 2025

2024

Data-Centric Methods for Decentralizing Large Language Models

Suchin Gururangan

PhD thesis, 2024

Self-Generated Critiques Boost Reward Modeling for Language Models.

…, Richard Yuanzhe Pang, …, Suchin Gururangan, …, Melanie Kambadur, …

CoRR, 2024

DataComp-LM: In search of the next generation of training sets for language models.

…, Georgios Smyrnis, …, Samir Yitzhak Gadre, …, Etash Kumar Guha, …, Niklas Muennighoff, Reinhard Heckel, …, Suchin Gururangan, Mitchell Wortsman, …, Marianna Nezhurina, …, Gabriel Ilharco, …, Kalyani Marathe, …, Khyathi Raghavi Chandu, …, Igor Vasiljevic, …, Luke Zettlemoyer, …, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, …, Dirk Groeneveld, …, Alexandros G. Dimakis, …, Vaishaal Shankar

CoRR, 2024

Language models scale reliably with over-training and on downstream tasks.

CoRR, 2024

Information Flow Control in Machine Learning through Modular Model Architecture.

Trishita Tiwari, Suchin Gururangan, …, Sanjay Kariyappa, …, Hsien-Hsin S. Lee, …

Proceedings of the 33rd USENIX Security Symposium, 2024

DataComp-LM: In search of the next generation of training sets for language models.

…, Georgios Smyrnis, …, Samir Yitzhak Gadre, …, Sedrick Scott Keh, …, Niklas Muennighoff, Reinhard Heckel, …, Suchin Gururangan, Mitchell Wortsman, …, Marianna Nezhurina, …, Gabriel Ilharco, …, Kalyani Marathe, …, Khyathi Raghavi Chandu, …, Igor Vasiljevic, …, Luke Zettlemoyer, …, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, …, Dirk Groeneveld, …, Vaishaal Shankar

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

LESS: Selecting Influential Data for Targeted Instruction Tuning.

…, Sadhika Malladi, Suchin Gururangan, …

Proceedings of the Forty-first International Conference on Machine Learning, 2024

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.

…, Suchin Gururangan, …, Hannaneh Hajishirzi, …, Luke Zettlemoyer

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models.

…, Tomasz Limisiewicz, Suchin Gururangan, …, Luke Zettlemoyer

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Time is Encoded in the Weights of Finetuned Language Models.

…, Suchin Gururangan, …

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters.

…, Suchin Gururangan, …

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

lo-fi: distributed fine-tuning without communication.

Mitchell Wortsman, Suchin Gururangan, …, Michael G. Rabbat, …

Trans. Mach. Learn. Res., 2023

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.

…, Suchin Gururangan, …, Hannaneh Hajishirzi, …, Luke Zettlemoyer

CoRR, 2023

Scaling Expert Language Models with Unsupervised Domain Discovery.

Suchin Gururangan, …, Luke Zettlemoyer

CoRR, 2023

2022

Editing Models with Task Arithmetic.

Gabriel Ilharco, Marco Túlio Ribeiro, Mitchell Wortsman, Suchin Gururangan, …, Hannaneh Hajishirzi, …

CoRR, 2022

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models.

…, Suchin Gururangan, …, Luke Zettlemoyer

CoRR, 2022

Time Waits for No One! Analysis and Challenges of Temporal Misalignment.

…, Daniel Khashabi, Suchin Gururangan, Karishma Mandyam, …

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

DEMix Layers: Disentangling Domains for Modular Language Modeling.

Suchin Gururangan, …, Luke Zettlemoyer

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Nearest Neighbor Zero-Shot Inference.

…, Suchin Gururangan, Luke Zettlemoyer

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

M2D2: A Massively Multi-Domain Language Modeling Dataset.

…, Suchin Gururangan, Luke Zettlemoyer

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection.

Suchin Gururangan, …, Sarah K. Dreier, …, Luke Zettlemoyer, …

Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021

Detoxifying Language Models Risks Marginalizing Minority Voices.

…, Suchin Gururangan, …

Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Expected Validation Performance and Estimation of a Random Variable's Maximum.

…, Suchin Gururangan, …

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text.

Elizabeth Clark, …, Suchin Gururangan, …

Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020

RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models.

…, Suchin Gururangan, …

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks.

Suchin Gururangan, …, Swabha Swayamdipta, …

Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019

Show Your Work: Improved Reporting of Experimental Results.

…, Suchin Gururangan, …

Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Variational Pretraining for Semi-supervised Text Classification.

Suchin Gururangan, …

Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018

Annotation Artifacts in Natural Language Inference Data.

Suchin Gururangan, Swabha Swayamdipta, …, Samuel R. Bowman, …

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018
