Stefan Heimersheim

According to our database¹, Stefan Heimersheim authored at least 11 papers between 2023 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Open Problems in Mechanistic Interpretability.

CoRR, January, 2025

Interpretability in Parameter Space: Minimizing Mechanistic Description Length with Attribution-based Parameter Decomposition.

CoRR, January, 2025

2024

Investigating Sensitive Directions in GPT-2: An Improved Baseline and Comparative Analysis of SAEs.

Daniel J. Lee

Stefan Heimersheim

CoRR, 2024

Evolution of SAE Features Across Layers in LLMs.

CoRR, 2024

Characterizing stable regions in the residual stream of LLMs.

CoRR, 2024

Evaluating Synthetic Activations composed of SAE Latents in GPT-2.

CoRR, 2024

You can remove GPT2's LayerNorm by fine-tuning.

Stefan Heimersheim

CoRR, 2024

The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks.

Lucius Bushnaq

Stefan Heimersheim

Nicholas Goldowsky-Dill

CoRR, 2024

Using Degeneracy in the Loss Landscape for Mechanistic Interpretability.

Nicholas Goldowsky-Dill

Kaarel Hänni

Cindy Wu

Marius Hobbhahn

CoRR, 2024

How to use and interpret activation patching.

Stefan Heimersheim

Neel Nanda

CoRR, 2024

2023

Towards Automated Circuit Discovery for Mechanistic Interpretability.

Arthur Conmy

Augustine N. Mavor-Parker

Aengus Lynch

Stefan Heimersheim

Adrià Garriga-Alonso

Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
