Andrea Michi

Orcid: 0009-0001-4797-3593

According to our database, Andrea Michi authored at least 9 papers between 2020 and 2024.

Collaborative distances:

Dijkstra number of three.
Erdős number of four.

Bibliography

2024

Conditioned Language Policy: A General Framework for Steerable Multi-Objective Finetuning.

CoRR, 2024

BOND: Aligning LLMs with Best-of-N Distillation.

CoRR, 2024

Nash Learning from Human Feedback.

Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Conditional Language Policy: A General Framework For Steerable Multi-Objective Finetuning.

Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

2023

Faster sorting algorithms discovered using deep reinforcement learning.

Nat., 2023

Nash Learning from Human Feedback.

Rémi Munos, Michal Valko, Daniele Calandriello, Mohammad Gheshlaghi Azar

CoRR, 2023

Towards practical reinforcement learning for tokamak magnetic control.

CoRR, 2023

2020

A Generic Human-Machine Annotation Framework Based on Dynamic Cooperative Learning.

IEEE Trans. Cybern., 2020

Hyperparameter Selection for Offline Reinforcement Learning.

CoRR, 2020
