2025

Humanity's Last Exam.

Long Phan

Alice Gatti

Mohinder Maheshbhai Naiya

Francesco Fournier-Facio

Christian Schröder de Witt

Emily de Oliveira Santos

Andrey Pupasov Maksimov

CoRR, January, 2025

Planning in Natural Language Improves LLM Search for Code Generation.

Proceedings of the Thirteenth International Conference on Learning Representations, 2025

2024

Planning In Natural Language Improves LLM Search For Code Generation.

CoRR, 2024

LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet.

CoRR, 2024

NATURAL PLAN: Benchmarking LLMs on Natural Language Planning.

CoRR, 2024

A Careful Examination of Large Language Model Performance on Grade School Arithmetic.

CoRR, 2024

Easy as ABCs: Unifying Boltzmann Q-Learning and Counterfactual Regret Minimization.

CoRR, 2024

A Careful Examination of Large Language Model Performance on Grade School Arithmetic.

Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Learning Goal-Conditioned Representations for Language Reward Models.

Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

Chain-of-Thought Reasoning is a Policy Improvement Operator.

Hugh Zhang

David C. Parkes

CoRR, 2023

No-regret Learning Dynamics for Sequential Correlated Equilibria.

Hugh Zhang

Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022

A Simple Adaptive Procedure Converging to Forgiving Correlated Equilibria.

Hugh Zhang

CoRR, 2022

Equilibrium Finding in Normal-Form Games via Greedy Regret Minimization.

Hugh Zhang

Adam Lerer

Noam Brown

Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2020

Trading Off Diversity and Quality in Natural Language Generation.

CoRR, 2020

2019

Unifying Human and Statistical Evaluation for Natural Language Generation.

Tatsunori B. Hashimoto

Hugh Zhang

Percy Liang

Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2019