Aleksandar Makelov

According to our database, Aleksandar Makelov authored at least 6 papers between 2018 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control.

Aleksandar Makelov, Georg Lange, Neel Nanda

CoRR, 2024

Is This the Subspace You Are Looking for? An Interpretability Illusion for Subspace Activation Patching.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Is This the Subspace You Are Looking for? An Interpretability Illusion for Subspace Activation Patching.

Aleksandar Makelov, Georg Lange, Neel Nanda

CoRR, 2023

Rethinking Backdoor Attacks.

Proceedings of the International Conference on Machine Learning, 2023

2022

Towards machine learning models robust to adversarial examples and backdoor attacks.

Aleksandar Makelov

PhD thesis, 2022

2018

Towards Deep Learning Models Resistant to Adversarial Attacks.

Proceedings of the 6th International Conference on Learning Representations, 2018
