Jacob Pfau

According to our database, Jacob Pfau authored at least 10 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Taking AI Welfare Seriously.
CoRR, 2024

Steering Without Side Effects: Improving Post-Deployment Control of Language Models.
CoRR, 2024

Let's Think Dot by Dot: Hidden Computation in Transformer Language Models.
CoRR, 2024

2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.
Trans. Mach. Learn. Res., 2023

Self-Consistency of Large Language Models under Ambiguity.
Proceedings of the 6th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, 2023

2022
Goal Misgeneralization in Deep Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

2021
Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models.
npj Digit. Medicine, 2021

Objective Robustness in Deep Reinforcement Learning.
CoRR, 2021

Robust Semantic Interpretability: Revisiting Concept Activation Vectors.
CoRR, 2021

2019
Global Saliency: Aggregating Saliency Maps to Assess Dataset Artefact Bias.
CoRR, 2019

