Dmitrii Krasheninnikov

Orcid: 0009-0009-4387-8407

According to our database, Dmitrii Krasheninnikov authored at least 9 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Stress-Testing Capability Elicitation With Password-Locked Models.
CoRR, 2024

Implicit meta-learning may lead language models to trust more reliable sources.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.
Trans. Mach. Learn. Res., 2023

Meta- (out-of-context) learning in neural networks.
CoRR, 2023


2022
Defining and Characterizing Reward Hacking.
CoRR, 2022

Defining and Characterizing Reward Gaming.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Combining Reward Information from Multiple Sources.
CoRR, 2021

2019
Preferences Implicit in the State of the World.
Proceedings of the 7th International Conference on Learning Representations, 2019

