2024

Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs.

Chris Yuhao Liu

Liang Zeng

CoRR, 2024

Skywork-Math: Data Scaling Laws for Mathematical Reasoning in Large Language Models - The Story Goes On.

CoRR, 2024

Q*: Improving Multi-step Reasoning for LLMs with Deliberative Planning.

CoRR, 2024

2021

SCC: an efficient deep reinforcement learning agent mastering the game of StarCraft II.

Proceedings of the 38th International Conference on Machine Learning, 2021