Adam Gleave

Orcid: 0000-0002-3467-528X

According to our database, Adam Gleave authored at least 28 papers between 2016 and 2024.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2024

Scaling Laws for Data Poisoning in LLMs.

CoRR, 2024

Exploring Scaling Trends in LLM Robustness.

CoRR, 2024

Planning behavior in a recurrent neural network that plays Sokoban.

Adrià Garriga-Alonso

Mohammad Taufeeque

Adam Gleave

CoRR, 2024

Can Go AIs be adversarially robust?

CoRR, 2024

Uncovering Latent Human Wellbeing in Language Model Embeddings.

CoRR, 2024

STARC: A General Framework For Quantifying Differences Between Reward Functions.

Joar Max Viktor Skalse

Lucy Farnik

Sumeet Ramesh Motwani

Erik Jenner

Adam Gleave

Alessandro Abate

Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023

Exploiting Novel GPT-4 APIs.

CoRR, 2023

On The Fragility of Learned Reward Functions.

CoRR, 2023

Adversarial Policies Beat Superhuman Go AIs.

Proceedings of the International Conference on Machine Learning, 2023

Invariance in Policy Optimisation and Partial Identifiability in Reward Learning.

Joar Max Viktor Skalse

Matthew Farrugia-Roberts

Stuart Russell

Alessandro Abate

Adam Gleave

Proceedings of the International Conference on Machine Learning, 2023

2022

Towards Trustworthy Machine Learning.

Adam Gleave

PhD thesis, 2022

imitation: Clean Imitation Learning Implementations.

CoRR, 2022

Adversarial Policies Beat Professional-Level Go AIs.

CoRR, 2022

Calculus on MDPs: Potential Shaping as a Gradient.

Erik Jenner

Herke van Hoof

Adam Gleave

CoRR, 2022

Reducing Exploitability with Population Based Training.

Pavel Czempin

Adam Gleave

CoRR, 2022

Preprocessing Reward Functions for Interpretability.

Erik Jenner

Adam Gleave

CoRR, 2022

A Primer on Maximum Causal Entropy Inverse Reinforcement Learning.

Adam Gleave

Sam Toyer

CoRR, 2022

Uncertainty Estimation for Language Reward Models.

Adam Gleave

Geoffrey Irving

CoRR, 2022

2021

Stable-Baselines3: Reliable Reinforcement Learning Implementations.

J. Mach. Learn. Res., 2021

Quantifying Differences in Reward Functions.

Proceedings of the 9th International Conference on Learning Representations, 2021

2020

Understanding Learned Reward Functions.

Eric J. Michaud

Adam Gleave

Stuart Russell

CoRR, 2020

DERAIL: Diagnostic Environments for Reward And Imitation Learning.

CoRR, 2020

Adversarial Policies: Attacking Deep Reinforcement Learning.

Proceedings of the 8th International Conference on Learning Representations, 2020

2018

Inverse reinforcement learning for video games.

Aaron Tucker

Adam Gleave

Stuart Russell

CoRR, 2018

Active Inverse Reward Design.

Sören Mindermann

Rohin Shah

Adam Gleave

Dylan Hadfield-Menell

CoRR, 2018

Multi-task Maximum Entropy Inverse Reinforcement Learning.

Adam Gleave

Oliver Habryka

CoRR, 2018

2017

Making Compression Algorithms for Unicode Text.

Adam Gleave

Christian Steinruecken

Proceedings of the 2017 Data Compression Conference, 2017

2016

Firmament: Fast, Centralized Cluster Scheduling at Scale.

Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation, 2016
