David Lindner

Orcid: 0000-0001-7051-7433

According to our database¹, David Lindner authored at least 25 papers between 2019 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2024

MISR: Measuring Instrumental Self-Reasoning in Frontier Models.

CoRR, 2024

ViSTa Dataset: Do vision-language models understand sequential tasks?

Co-authors include Evan Ryan Gunter and Mikhail Seleznyov.

CoRR, 2024

Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework.

Co-authors include Mennatallah El-Assady.

CoRR, 2024

Towards evaluations-based safety cases for AI scheming.

Co-authors include Marius Hobbhahn, Alexander Meinke, Jérémy Scheurer, Nicholas Goldowsky-Dill, and Daniel Kokotajlo.

CoRR, 2024

On scalable oversight with weak LLMs judging strong LLMs.

Co-authors include Jonah Brown-Cohen, Rishabh Agarwal, and Noah D. Goodman.

CoRR, 2024

Evaluating Frontier Models for Dangerous Capabilities.

CoRR, 2024

Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning.

Co-authors include Victoriano Montesinos.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Learning Safety Constraints from Demonstrations with Unknown Rewards.

Co-authors include Sebastian Tschiatschek.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023

GoSafeOpt: Scalable safe exploration for global optimization of dynamical systems.

Co-authors include Matteo Turchetta, Sebastian Trimpe, and Dominik Baumann.

Artif. Intell., July, 2023

Algorithmic Foundations for Safe and Efficient Reinforcement Learning from Human Feedback.

PhD thesis, 2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.

Trans. Mach. Learn. Res., 2023

RLHF-Blender: A Configurable Interactive Interface for Learning from Diverse Human Feedback.

Co-authors include Mennatallah El-Assady.

CoRR, 2023

Tracr: Compiled Transformers as a Laboratory for Interpretability.

Co-authors include Vladimir Mikulik.

CoRR, 2023

Tracr: Compiled Transformers as a Laboratory for Interpretability.

Co-authors include Sebastian Farquhar and Vladimir Mikulik.

Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022

Red-Teaming the Stable Diffusion Safety Filter.

Co-authors include Florian Tramèr.

CoRR, 2022

Humans are not Boltzmann Distributions: Challenges and Opportunities for Modelling Human Feedback and Interaction in Reinforcement Learning.

Co-authors include Mennatallah El-Assady.

CoRR, 2022

Scalable Safe Exploration for Global Optimization of Dynamical Systems.

Co-authors include Matteo Turchetta, Sebastian Trimpe, and Dominik Baumann.

CoRR, 2022

Active Exploration for Inverse Reinforcement Learning.

Co-authors include Giorgia Ramponi.

Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Interactively Learning Preference Constraints in Linear Bandits.

Co-authors include Sebastian Tschiatschek.

Proceedings of the International Conference on Machine Learning, 2022

2021

Information Directed Reward Learning for Reinforcement Learning.

Co-authors include Matteo Turchetta and Sebastian Tschiatschek.

Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Addressing the Long-term Impact of ML Decisions via Policy Regret.

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Learning What To Do by Simulating the Past.

Proceedings of the 9th International Conference on Learning Representations, 2021

Challenges for Using Impact Regularizers to Avoid Negative Side Effects.

Co-authors include Alexander Meulemans.

Proceedings of the Workshop on Artificial Intelligence Safety 2021 (SafeAI 2021) co-located with the Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI 2021), 2021

2019

Sensing Social Media Signals for Cryptocurrency News.

Co-authors include Nino Antulov-Fantulin.

Proceedings of the Companion of The 2019 World Wide Web Conference, 2019

Detecting Spiky Corruption in Markov Decision Processes.

Co-authors include Tomasz Kisielewski.

Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019
