Shengyi Huang

Orcid: 0000-0003-4986-1365

According to our database, Shengyi Huang authored at least 15 papers between 2019 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
The N+ Implementation Details of RLHF with PPO: A Case Study on TL;DR Summarization.
CoRR, 2024

Open RL Benchmark: Comprehensive Tracked Experiments for Reinforcement Learning.
CoRR, 2024

Cleanba: A Reproducible and Efficient Distributed Reinforcement Learning Platform.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Zephyr: Direct Distillation of LM Alignment.
CoRR, 2023

Reward Scale Robustness for Proximal Policy Optimization via DreamerV3 Tricks.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2022
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms.
J. Mach. Learn. Res., 2022

A2C is a special case of PPO.
CoRR, 2022

EnvPool: A Highly Parallel Reinforcement Learning Environment Execution Engine.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

A Closer Look at Invalid Action Masking in Policy Gradient Algorithms.
Proceedings of the Thirty-Fifth International Florida Artificial Intelligence Research Society Conference, 2022

2021
CleanRL: High-quality Single-file Implementations of Deep Reinforcement Learning Algorithms.
CoRR, 2021

An Empirical Investigation of Early Stopping Optimizations in Proximal Policy Optimization.
IEEE Access, 2021

Gym-µRTS: Toward Affordable Full Game Real-time Strategy Games Research with Deep Reinforcement Learning.
Proceedings of the 2021 IEEE Conference on Games (CoG), 2021

2020
Griddly: A platform for AI research in games.
CoRR, 2020

Action Guidance: Getting the Best of Sparse Rewards and Shaped Rewards for Real-time Strategy Games.
CoRR, 2020

2019
Comparing Observation and Action Representations for Deep Reinforcement Learning in MicroRTS.
CoRR, 2019
