2025
Prompt-to-Leaderboard.
[DOI]
Evan Frick, Connor Chen, Joseph Tennyson, Tianle Li, Wei-Lin Chiang, Anastasios N. Angelopoulos, Ion Stoica
CoRR, February, 2025
How to Evaluate Reward Models for RLHF.
[DOI]
Evan Frick, Tianle Li, Connor Chen, Wei-Lin Chiang, Anastasios Nikolas Angelopoulos, Jiantao Jiao, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica
Proceedings of the Thirteenth International Conference on Learning Representations, 2025
2024
From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline.
[DOI]
Tianle Li, Wei-Lin Chiang, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica
CoRR, 2024