Jonathan Uesato

According to our database, Jonathan Uesato authored at least 28 papers between 2017 and 2022.

Bibliography

2022
Solving math word problems with process- and outcome-based feedback.
CoRR, 2022

Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals.
CoRR, 2022

Improving alignment of dialogue agents via targeted human judgements.
CoRR, 2022

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Taxonomy of Risks posed by Language Models.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

2021
Scaling Language Models: Methods, Analysis & Insights from Training Gopher.
CoRR, 2021

Ethical and social risks of harm from Language Models.
CoRR, 2021

An Empirical Investigation of Learning from Biased Toxicity Labels.
CoRR, 2021

Verifying Probabilistic Specifications with Functional Lagrangians.
CoRR, 2021

Make Sure You're Unsure: A Framework for Verifying Probabilistic Specifications.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Challenges in Detoxifying Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2021, 2021

2020
Avoiding Tampering Incentives in Deep RL via Decoupled Approval.
CoRR, 2020

REALab: An Embedded Perspective on Tampering.
CoRR, 2020

Uncovering the Limits of Adversarial Training against Norm-Bounded Adversarial Examples.
CoRR, 2020

Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Toward Evaluating Robustness of Deep Reinforcement Learning with Continuous Control.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
An Alternative Surrogate Loss for PGD-based Adversarial Testing.
CoRR, 2019

Are Labels Required for Improving Adversarial Robustness?
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures.
Proceedings of the 7th International Conference on Learning Representations, 2019

Verification of Non-Linear Specifications for Neural Networks.
Proceedings of the 7th International Conference on Learning Representations, 2019

Scalable Verified Training for Provably Robust Image Classification.
Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision, 2019

Robustness via Curvature Regularization, and Vice Versa.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019

2018
Strength in Numbers: Trading-off Robustness and Computation via Adversarially-Trained Ensembles.
CoRR, 2018

On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models.
CoRR, 2018

Training verified learners with learned verifiers.
CoRR, 2018

Adversarial Risk and the Dangers of Evaluating Against Weak Attacks.
Proceedings of the 35th International Conference on Machine Learning, 2018

2017
Semantic Code Repair using Neuro-Symbolic Transformation Networks.
CoRR, 2017

RobustFill: Neural Program Learning under Noisy I/O.
Proceedings of the 34th International Conference on Machine Learning, 2017

