Andrea Michi

Orcid: 0009-0001-4797-3593

According to our database, Andrea Michi authored at least 9 papers between 2020 and 2024.

Bibliography

2024
Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning.
CoRR, 2024

BOND: Aligning LLMs with Best-of-N Distillation.
CoRR, 2024


Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023
Faster sorting algorithms discovered using deep reinforcement learning.
Nat., 2023

Nash Learning from Human Feedback.
CoRR, 2023

Towards practical reinforcement learning for tokamak magnetic control.
CoRR, 2023

2020
A Generic Human-Machine Annotation Framework Based on Dynamic Cooperative Learning.
IEEE Trans. Cybern., 2020

Hyperparameter Selection for Offline Reinforcement Learning.
CoRR, 2020

