Suchin Gururangan

According to our database, Suchin Gururangan authored at least 26 papers between 2018 and 2024.

Bibliography

2024
DataComp-LM: In search of the next generation of training sets for language models.
CoRR, 2024

Language models scale reliably with over-training and on downstream tasks.
CoRR, 2024

Information Flow Control in Machine Learning through Modular Model Architecture.
Proceedings of the 33rd USENIX Security Symposium, 2024

LESS: Selecting Influential Data for Targeted Instruction Tuning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Time is Encoded in the Weights of Finetuned Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
lo-fi: distributed fine-tuning without communication.
Trans. Mach. Learn. Res., 2023

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.
CoRR, 2023

Scaling Expert Language Models with Unsupervised Domain Discovery.
CoRR, 2023

2022
Editing Models with Task Arithmetic.
CoRR, 2022

Branch-Train-Merge: Embarrassingly Parallel Training of Expert Language Models.
CoRR, 2022

Time Waits for No One! Analysis and Challenges of Temporal Misalignment.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

DEMix Layers: Disentangling Domains for Modular Language Modeling.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Nearest Neighbor Zero-Shot Inference.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

M2D2: A Massively Multi-Domain Language Modeling Dataset.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

Whose Language Counts as High Quality? Measuring Language Ideologies in Text Data Selection.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 2022

2021
Detoxifying Language Models Risks Marginalizing Minority Voices.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Expected Validation Performance and Estimation of a Random Variable's Maximum.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Don't Stop Pretraining: Adapt Language Models to Domains and Tasks.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Show Your Work: Improved Reporting of Experimental Results.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Variational Pretraining for Semi-supervised Text Classification.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Annotation Artifacts in Natural Language Inference Data.
Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

