Francis Rhys Ward

According to our database, Francis Rhys Ward authored at least 13 papers between 2020 and 2025.

Collaborative distances:

Dijkstra number of four.
Erdős number of four.

Bibliography

2025

The Elicitation Game: Evaluating Capability Elicitation Techniques.

Felix Hofstätter, Teun van der Weij, [...], Henning Bartsch, Francis Rhys Ward

CoRR, February, 2025

Towards a Theory of AI Personhood.

Francis Rhys Ward

CoRR, January, 2025

2024

AI Sandbagging: Language Models can Strategically Underperform on Evaluations.

Teun van der Weij, Felix Hofstätter, [...], Samuel F. Brown, Francis Rhys Ward

CoRR, 2024

Evaluating Language Model Character Traits.

Francis Rhys Ward, [...], [...], [...], [...], [...], [...], Raymond Douglas, [...], [...]

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

The Reasons that Agents Act: Intention and Instrumental Goals.

Francis Rhys Ward, Matt MacDermott, Francesco Belardinelli, [...], [...]

Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

2023

Experiments with Detecting and Mitigating AI Deception.

[...], Francis Rhys Ward, C. Henrik Åslund

CoRR, 2023

Honesty Is the Best Policy: Defining and Mitigating AI Deception.

[...], [...], Francesco Belardinelli, [...]

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Defining Deception in Structural Causal Games.

Francis Rhys Ward, [...], Francesco Belardinelli

Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022

Argumentative Reward Learning: Reasoning About Human Preferences.

Francis Rhys Ward, Francesco Belardinelli, [...]

CoRR, 2022

A Causal Perspective on AI Deception in Games.

Francis Rhys Ward, [...], Francesco Belardinelli

Proceedings of the International Conference on Logic Programming 2022 Workshops co-located with the 38th International Conference on Logic Programming (ICLP 2022), Haifa, Israel, July 31st, 2022

On Agent Incentives to Manipulate Human Feedback in Multi-Agent Reward Learning Scenarios.

Francis Rhys Ward, [...], Francesco Belardinelli

Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, 2022

2020

An Assurance Case Pattern for the Interpretability of Machine Learning in Safety-Critical Systems.

Francis Rhys Ward, [...]

Proceedings of the Computer Safety, Reliability, and Security. SAFECOMP 2020 Workshops, 2020

Geometric Deep Learning for Post-Menstrual Age Prediction Based on the Neonatal White Matter Cortical Surface.

Vitalis Vosylius, [...], [...], Alexey Zakharov, [...], Loïc Le Folgoc, [...], Antonios Makropoulos, [...], Daniel Rueckert, [...]

Proceedings of the Uncertainty for Safe Utilization of Machine Learning in Medical Imaging, and Graphs in Biomedical Image Analysis, 2020
