2024

How will advanced AI systems impact democracy?

CoRR, 2024

Generative AI Misuse: A Taxonomy of Tactics and Insights from Real-World Data.

CoRR, 2024

The Ethics of Advanced AI Assistants.

…, Arianna Manzini, …, Lisa Anne Hendricks, …, Mikel Rodriguez, Seliem El-Sayed, …, A. Stevie Bergman, …, Juan Mateos-Garcia, Laura Weidinger, …, Victoria Krakovna, John Oliver Siy, Zeb Kurth-Nelson, Amanda McCroskery, …, Murray Shanahan, …, Yetunde Ibitoye, …, Sébastien Krier, Alexander Reese, Sims Witherspoon, …, Matija Franklin, Josh A. Goldstein, …, Meredith Ringel Morris, …, Blaise Agüera y Arcas, …

CoRR, 2024

Holistic Safety and Responsibility Evaluations of Advanced AI Models.

Laura Weidinger, Joslyn Barnhart, …, Christina Butterfield, …, Lisa Anne Hendricks, Ramona Comanescu, …, Mikel Rodriguez, Jennifer Beroshi, …, Sebastian Farquhar, …

CoRR, 2024

Should Users Trust Advanced AI Assistants? Justified Trust As a Function of Competence and Alignment.

Arianna Manzini, …

Proceedings of the 2024 ACM Conference on Fairness, Accountability, and Transparency, 2024

Gaps in the Safety Evaluation of Generative AI.

…, Arianna Manzini, Lisa Anne Hendricks, Ramona Comanescu, …, Juan Mateos-Garcia, A. Stevie Bergman, …, Laura Weidinger

Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024

The Code That Binds Us: Navigating the Appropriateness of Human-AI Assistant Relationships.

Arianna Manzini, …, Meredith Ringel Morris, …

Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024

All Too Human? Mapping and Mitigating the Risk from Anthropomorphic AI.

…, Laura Weidinger, Arianna Manzini, …

Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024

2023

Sociotechnical Safety Evaluation of Generative AI Systems.

Laura Weidinger, …, Arianna Manzini, Lisa Anne Hendricks, Juan Mateos-Garcia, A. Stevie Bergman, …

CoRR, 2023

Model evaluation for extreme risks.

CoRR, 2023

Representation in AI Evaluations.

A. Stevie Bergman, Lisa Anne Hendricks, …

Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, 2023

2022

Manifestations of Xenophobia in AI Systems.

…, Jonathan Leader Maynard, …

CoRR, 2022

A Human Rights-Based Approach to Responsible AI.

Vinodkumar Prabhakaran, Margaret Mitchell, …

CoRR, 2022

Improving alignment of dialogue agents via targeted human judgements.

CoRR, 2022

In conversation with Artificial Intelligence: aligning language models with human values.

Atoosa Kasirzadeh, …

CoRR, 2022

Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models.

…, Jonathan Uesato, …, Laura Weidinger, Sumanth Dathathri, …, Geoffrey Irving, …, Lisa Anne Hendricks

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Taxonomy of Risks posed by Language Models.

Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022

Power to the People? Opportunities and Challenges for Participatory AI.

…, Vinodkumar Prabhakaran, …, Madeleine Clare Elish, …

Proceedings of the Equity and Access in Algorithms, Mechanisms, and Optimization, 2022

2021

Scaling Language Models: Methods, Analysis & Insights from Training Gopher.

…, Sebastian Borgeaud, …, Jordan Hoffmann, H. Francis Song, …, Sarah Henderson, …, Eliza Rutherford, …, George van den Driessche, Lisa Anne Hendricks, …, Sumanth Dathathri, …, Jonathan Uesato, …, Antonia Creswell, …, Siddhant M. Jayakumar, Elena Buchatskaya, …, Esme Sutherland, …, Michela Paganini, …, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, …, Angeliki Lazaridou, …, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, …, Thibault Sottiaux, Mantas Pajarskas, …, Cyprien de Masson d'Autume, …, Vladimir Mikulik, Igor Babuschkin, …, Diego de Las Casas, …, Matthew J. Johnson, Blake A. Hechtman, Laura Weidinger, …, Edward Lockhart, …, Lorrayne Bennett, …, Koray Kavukcuoglu, Geoffrey Irving

CoRR, 2021

Ethical and social risks of harm from Language Models.

CoRR, 2021

Towards a Theory of Justice for Artificial Intelligence.

CoRR, 2021

Alignment of Language Agents.

…, Laura Weidinger, …, Vladimir Mikulik, Geoffrey Irving

CoRR, 2021

The Challenge of Value Alignment: from Fairer Algorithms to AI Safety.

CoRR, 2021

Modelling Cooperation in Network Games with Spatio-Temporal Complexity.

Michiel A. Bakker, Richard Everett, Laura Weidinger, …, William S. Isaac, …

Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020

Artificial Intelligence, Values, and Alignment.

Minds Mach., 2020