Aengus Lynch

According to our database, Aengus Lynch authored at least 8 papers between 2022 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline

[Chart: publications per year, 2022 to 2024, by type (book, in proceedings, article, PhD thesis, dataset, other).]

Bibliography

2024
Best-of-N Jailbreaking.
CoRR, 2024

Targeted Latent Adversarial Training Improves Robustness to Persistent Harmful Behaviors in LLMs.
CoRR, 2024

Analyzing the Generalization and Reliability of Steering Vectors.
CoRR, 2024

Eight Methods to Evaluate Robust Unlearning in LLMs.
CoRR, 2024

Analysing the Generalisation and Reliability of Steering Vectors.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
Spawrious: A Benchmark for Fine Control of Spurious Correlation Biases.
CoRR, 2023

Towards Automated Circuit Discovery for Mechanistic Interpretability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
Causal Machine Learning: A Survey and Open Problems.
CoRR, 2022

