Aidan Ewart

According to our database, Aidan Ewart authored at least 4 papers in 2024.

Collaborative distances:
  • Dijkstra number of five.
  • Erdős number of four.

Bibliography

2024
Mechanistic Unlearning: Robust Knowledge Unlearning and Editing via Mechanistic Localization.
CoRR, 2024

Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs.
CoRR, 2024

Eight Methods to Evaluate Robust Unlearning in LLMs.
CoRR, 2024

Sparse Autoencoders Find Highly Interpretable Features in Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

