Jacob Pfau

According to our database, Jacob Pfau authored at least 10 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Taking AI Welfare Seriously.

CoRR, 2024

Steering Without Side Effects: Improving Post-Deployment Control of Language Models.

CoRR, 2024

Let's Think Dot by Dot: Hidden Computation in Transformer Language Models.

Jacob Pfau

William Merrill

Samuel R. Bowman

CoRR, 2024

2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.

Trans. Mach. Learn. Res., 2023

Self-Consistency of Large Language Models under Ambiguity.

Henning Bartsch

Ole Jorgensen

Domenic Rosati

Jason Hoelscher-Obermaier

Jacob Pfau

Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, 2023

2022

Goal Misgeneralization in Deep Reinforcement Learning.

Lauro Langosco di Langosco

Proceedings of the International Conference on Machine Learning, 2022

2021

Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models.

npj Digit. Medicine, 2021

Objective Robustness in Deep Reinforcement Learning.

CoRR, 2021

Robust Semantic Interpretability: Revisiting Concept Activation Vectors.

CoRR, 2021

2019

Global Saliency: Aggregating Saliency Maps to Assess Dataset Artefact Bias.

CoRR, 2019
