Victoria Krakovna

According to our database¹, Victoria Krakovna authored at least 18 papers between 2010 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of three.

Timeline: number of papers per year, 2010 to 2024, broken down by publication type (book, in proceedings, article, PhD thesis, dataset, other).

Bibliography

2024

The Ethics of Advanced AI Assistants.

…, Arianna Manzini, …, Lisa Anne Hendricks, …, Mikel Rodriguez, Seliem El-Sayed, …, A. Stevie Bergman, …, Juan Mateos-Garcia, Laura Weidinger, …, Victoria Krakovna, John Oliver Siy, Zeb Kurth-Nelson, Amanda McCroskery, …, Murray Shanahan, …, Yetunde Ibitoye, …, Sébastien Krier, Alexander Reese, Sims Witherspoon, …, Matija Franklin, Josh A. Goldstein, …, Meredith Ringel Morris, …, Blaise Agüera y Arcas, …

CoRR, 2024

Evaluating Frontier Models for Dangerous Capabilities.

CoRR, 2024

Limitations of Agents Simulated by Predictive Models.

Raymond Douglas, Jacek Karwowski, …, Victoria Krakovna

CoRR, 2024

Quantifying stability of non-power-seeking in artificial agents.

Evan Ryan Gunter, Yevgeny Liokumovich, Victoria Krakovna

CoRR, 2024

2023

Power-seeking can be probable and predictive for trained agents.

Victoria Krakovna, …

CoRR, 2023

2022

Goal Misgeneralization: Why Correct Specifications Aren't Enough For Correct Goals.

…, Victoria Krakovna, Jonathan Uesato, …

CoRR, 2022

2021

Reward tampering problems and solutions in reinforcement learning: a causal influence diagram perspective.

…, Victoria Krakovna

Synth., 2021

2020

Avoiding Tampering Incentives in Deep RL via Decoupled Approval.

Jonathan Uesato, …, Victoria Krakovna, …

CoRR, 2020

REALab: An Embedded Perspective on Tampering.

…, Jonathan Uesato, …, Victoria Krakovna, …

CoRR, 2020

Avoiding Side Effects By Considering Future Tasks.

Victoria Krakovna, …

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

2019

Penalizing Side Effects using Stepwise Relative Reachability.

Victoria Krakovna, …

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Modeling AGI Safety Frameworks with Causal Influence Diagrams.

…, Victoria Krakovna, …

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

2018

Measuring and avoiding side effects using relative reachability.

Victoria Krakovna, …

CoRR, 2018

2017

AI Safety Gridworlds.

…, Victoria Krakovna, Pedro A. Ortega, …, Andrew Lefrancq, …

CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.

…, Victoria Krakovna, …

CoRR, 2017

Reinforcement Learning with a Corrupted Reward Channel.

…, Victoria Krakovna, …

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

2016

Memory-Bounded Left-Corner Unsupervised Grammar Induction on Child-Directed Input.

…, Victoria Krakovna, Finale Doshi-Velez, Timothy A. Miller, William Schuler, …

Proceedings of the COLING 2016, 2016

2010

A Generalized-Zero-Preserving Method for Compact Encoding of Concept Lattices.

…, Victoria Krakovna, …

Proceedings of the ACL 2010, 2010
