Atticus Geiger

According to our database, Atticus Geiger authored at least 30 papers between 2018 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Evaluating Open-Source Sparse Autoencoders on Disentangling Factual Knowledge in GPT-2 Small.
CoRR, 2024

Recurrent Neural Networks Learn to Store and Generate Sequences using Non-Linear Representations.
CoRR, 2024

Updating CLIP to Prefer Descriptions Over Captions.
CoRR, 2024

ReFT: Representation Finetuning for Language Models.
CoRR, 2024

A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments.
CoRR, 2024

pyvene: A Library for Understanding and Improving PyTorch Models via Interventions.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: System Demonstrations, 2024

Is This the Subspace You Are Looking for? An Interpretability Illusion for Subspace Activation Patching.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations.
Proceedings of the Conference on Causal Learning and Reasoning, 2024

RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Linear Representations of Sentiment in Large Language Models.
CoRR, 2023

Interpretability at Scale: Identifying Causal Mechanisms in Alpaca.
CoRR, 2023

Causal Abstraction for Faithful Model Interpretation.
CoRR, 2023

Interpretability at Scale: Identifying Causal Mechanisms in Alpaca.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Causal Proxy Models for Concept-based Model Explanations.
Proceedings of the International Conference on Machine Learning, 2023

A Semantics for Causing, Enabling, and Preventing Verbs Using Structural Causal Models.
Proceedings of the 45th Annual Meeting of the Cognitive Science Society, 2023

Causal Abstraction with Soft Interventions.
Proceedings of the Conference on Causal Learning and Reasoning, 2023

Rigorously Assessing Natural Language Explanations of Neurons.
Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, 2023

ScoNe: Benchmarking Negation Reasoning in Language Models With Fine-Tuning and In-Context Learning.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2023

2022
CEBaB: Estimating the Causal Effects of Real-World Concepts on NLP Model Behavior.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Causal Distillation for Language Models.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

Inducing Causal Structure for Interpretable Neural Networks.
Proceedings of the International Conference on Machine Learning, 2022

2021
Causal Abstractions of Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Dynabench: Rethinking Benchmarking in NLP.
Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021

DynaSent: A Dynamic Benchmark for Sentiment Analysis.
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021

2020
Modular Representation Underlies Systematic Generalization in Neural Natural Language Inference Models.
CoRR, 2020

Relational reasoning and generalization using non-symbolic neural networks.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Neural Natural Language Inference Models Partially Embed Theories of Lexical Entailment and Negation.
Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2020

2019
Recursive Routing Networks: Learning to Compose Modules for Language Understanding.
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019

Posing Fair Generalization Tasks for Natural Language Inference.
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing, 2019

2018
Stress-Testing Neural Models of Natural Language Inference with Multiply-Quantified Sentences.
CoRR, 2018

