Bilal Chughtai

According to our database¹, Bilal Chughtai authored at least 10 papers between 2018 and 2025.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2025

Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities.

Dylan Hadfield-Menell

CoRR, February, 2025

Detecting Strategic Deception Using Linear Probes.

Nicholas Goldowsky-Dill

Bilal Chughtai

Stefan Heimersheim

Marius Hobbhahn

CoRR, February, 2025

Open Problems in Mechanistic Interpretability.

CoRR, January, 2025

2024

Towards evaluations-based safety cases for AI scheming.

Nicholas Goldowsky-Dill

CoRR, 2024

Transformer Circuit Faithfulness Metrics are not Robust.

Joseph Miller

Bilal Chughtai

William Saunders

CoRR, 2024

Can Language Models Explain Their Own Classification Behavior?

Dane Sherburn

Bilal Chughtai

Owain Evans

CoRR, 2024

Summing Up the Facts: Additive Mechanisms Behind Factual Recall in LLMs.

Bilal Chughtai

Alan Cooney

Neel Nanda

CoRR, 2024

Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

2023

A Toy Model of Universality: Reverse Engineering how Networks Learn Group Operations.

Bilal Chughtai

Lawrence Chan

Neel Nanda

Proceedings of the International Conference on Machine Learning, 2023

2018

Variable Selection for Chronic Disease Outcome Prediction Using a Causal Inference Technique: A Preliminary Study.

John Richard Lee

Bilal Chughtai

Rema Padman

Proceedings of the IEEE International Conference on Healthcare Informatics, 2018
