Bilal Chughtai

According to our database, Bilal Chughtai authored at least 8 papers between 2018 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2025
Open Problems in Mechanistic Interpretability.
CoRR, January, 2025

2024
Towards evaluations-based safety cases for AI scheming.
CoRR, 2024

Transformer Circuit Faithfulness Metrics are not Robust.
CoRR, 2024

Can Language Models Explain Their Own Classification Behavior?
CoRR, 2024

Summing Up the Facts: Additive Mechanisms Behind Factual Recall in LLMs.
CoRR, 2024

Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023
A Toy Model of Universality: Reverse Engineering how Networks Learn Group Operations.
Proceedings of the International Conference on Machine Learning, 2023

2018
Variable Selection for Chronic Disease Outcome Prediction Using a Causal Inference Technique: A Preliminary Study.
Proceedings of the IEEE International Conference on Healthcare Informatics, 2018

