Eric Wallace

According to our database, Eric Wallace authored at least 49 papers between 2018 and 2024.

Bibliography

2024
The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions.
CoRR, 2024

Unfamiliar Finetuning Examples Control How Language Models Hallucinate.
CoRR, 2024

Privacy Side Channels in Machine Learning Systems.
Proceedings of the 33rd USENIX Security Symposium, 2024

Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Stealing Part of a Production Language Model.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

The False Promise of Imitating Proprietary Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

What Evidence Do Language Models Find Convincing?
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Scalable Extraction of Training Data from (Production) Language Models.
CoRR, 2023

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore.
CoRR, 2023

The False Promise of Imitating Proprietary LLMs.
CoRR, 2023

Extracting Training Data from Diffusion Models.
Proceedings of the 32nd USENIX Security Symposium, 2023

Poisoning Language Models During Instruction Tuning.
Proceedings of the International Conference on Machine Learning, 2023

Large Language Models Struggle to Learn Long-Tail Knowledge.
Proceedings of the International Conference on Machine Learning, 2023

Measuring Forgetting of Memorized Training Examples.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

InCoder: A Generative Model for Code Infilling and Synthesis.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
Deduplicating Training Data Mitigates Privacy Risks in Language Models.
Proceedings of the International Conference on Machine Learning, 2022

Analyzing Dynamic Adversarial Training Data in the Limit.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

Automated Crossword Solving.
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2022

Cutting Down on Prompts and Parameters: Simple Few-Shot Learning with Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2022, 2022

2021
Calibrate Before Use: Improving Few-Shot Performance of Language Models.
CoRR, 2021

Extracting Training Data from Large Language Models.
Proceedings of the 30th USENIX Security Symposium, 2021

Detoxifying Language Models Risks Marginalizing Minority Voices.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Concealed Data Poisoning Attacks on NLP Models.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

Calibrate Before Use: Improving Few-shot Performance of Language Models.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Customizing Triggers with Concealed Data Poisoning.
CoRR, 2020

Trustworthy AI Inference Systems: An Industry Research View.
CoRR, 2020

Evaluating NLP Models via Contrast Sets.
CoRR, 2020

Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
CoRR, 2020

Train Big, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers.
Proceedings of the 37th International Conference on Machine Learning, 2020

Gradient-based Analysis of NLP Models is Manipulable.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, 2020

Imitation Attacks and Defenses for Black-box Machine Translation Systems.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020

Interpreting Predictions of NLP Models.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts, 2020

AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts.
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, 2020


Pretrained Transformers Improve Out-of-Distribution Robustness.
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020

2019
Trick Me If You Can: Human-in-the-loop Generation of Adversarial Question Answering Examples.
Trans. Assoc. Comput. Linguistics, 2019

Universal Adversarial Triggers for NLP.
CoRR, 2019

Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation.
Proceedings of the 36th International Conference on Machine Learning, 2019

Do NLP Models Know Numbers? Probing Numeracy in Embeddings.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

AllenNLP Interpret: A Framework for Explaining Predictions of NLP Models.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Universal Adversarial Triggers for Attacking and Analyzing NLP.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

Compositional Questions Do Not Necessitate Multi-hop Reasoning.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

Misleading Failures of Partial-input Baselines.
Proceedings of the 57th Conference of the Association for Computational Linguistics, 2019

2018
Trick Me If You Can: Adversarial Writing of Trivia Challenge Questions.
CoRR, 2018

Right Answer for the Wrong Reason: Discovery and Mitigation.
CoRR, 2018

Interpreting Neural Networks with Nearest Neighbors.
Proceedings of the Workshop: Analyzing and Interpreting Neural Networks for NLP, 2018

Pathologies of Neural Models Make Interpretation Difficult.
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, October 31, 2018

Trick Me If You Can: Adversarial Writing of Trivia Challenge Questions.
Proceedings of ACL 2018, Melbourne, Australia, July 15-20, 2018, Student Research Workshop, 2018

