John Schulman

According to our database, John Schulman authored at least 54 papers between 2011 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline

[Chart: number of publications per year, 2011 to 2024, broken down by type: book, in proceedings, article, PhD thesis, dataset, other.]

Bibliography

2024
Measuring short-form factuality in large language models.
CoRR, 2024

Rule Based Rewards for Language Model Safety.
CoRR, 2024

Let's Verify Step by Step.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

2023
Scaling laws for single-agent reinforcement learning.
CoRR, 2023

Scaling Laws for Reward Model Overoptimization.
Proceedings of the International Conference on Machine Learning, 2023

2022
Efficient Training of Language Models to Fill in the Middle.
CoRR, 2022

Training language models to follow instructions with human feedback.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Batch size-invariance for policy optimization.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
WebGPT: Browser-assisted question-answering with human feedback.
CoRR, 2021

Training Verifiers to Solve Math Word Problems.
CoRR, 2021

Unsolved Problems in ML Safety.
CoRR, 2021

The MineRL 2020 Competition on Sample Efficient Reinforcement Learning using Human Priors.
CoRR, 2021

Phasic Policy Gradient.
Proceedings of the 38th International Conference on Machine Learning, 2021

2020
Teacher-Student Curriculum Learning.
IEEE Trans. Neural Networks Learn. Syst., 2020

Scaling Laws for Autoregressive Generative Modeling.
CoRR, 2020

Measuring Sample Efficiency and Generalization in Reinforcement Learning Benchmarks: NeurIPS 2020 Procgen Benchmark.
Proceedings of the NeurIPS 2020 Competition and Demonstration Track, 2020

Distribution Augmentation for Generative Modeling.
Proceedings of the 37th International Conference on Machine Learning, 2020

Leveraging Procedural Generation to Benchmark Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Policy Gradient Search: Online Planning and Expert Iteration without Search Trees.
CoRR, 2019

Semi-Supervised Learning by Label Gradient Alignment.
CoRR, 2019

Quantifying Generalization in Reinforcement Learning.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Gotta Learn Fast: A New Benchmark for Generalization in RL.
CoRR, 2018

On First-Order Meta-Learning Algorithms.
CoRR, 2018

Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations.
Proceedings of the Robotics: Science and Systems XIV, 2018

Meta Learning Shared Hierarchies.
Proceedings of the 6th International Conference on Learning Representations, 2018

Model-Based Reinforcement Learning via Meta-Policy Optimization.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

2017
Learning Complex Dexterous Manipulation with Deep Reinforcement Learning and Demonstrations.
CoRR, 2017

Proximal Policy Optimization Algorithms.
CoRR, 2017

Equivalence Between Policy Gradients and Soft Q-Learning.
CoRR, 2017

UCB and InfoGain Exploration via $\boldsymbol{Q}$-Ensembles.
CoRR, 2017

#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Variational Lossy Autoencoder.
Proceedings of the 5th International Conference on Learning Representations, 2017

2016
Optimizing Expectations: From Deep Reinforcement Learning to Stochastic Computation Graphs.
PhD thesis, 2016

High-Dimensional Continuous Control Using Generalized Advantage Estimation.
Proceedings of the 4th International Conference on Learning Representations, 2016

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks.
CoRR, 2016

RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning.
CoRR, 2016

OpenAI Gym.
CoRR, 2016

Concrete Problems in AI Safety.
CoRR, 2016

Theano: A Python framework for fast computation of mathematical expressions.
CoRR, 2016

VIME: Variational Information Maximizing Exploration.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Benchmarking Deep Reinforcement Learning for Continuous Control.
Proceedings of the 33rd International Conference on Machine Learning, 2016

2015
Gradient Estimation Using Stochastic Computation Graphs.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Trust Region Policy Optimization.
Proceedings of the 32nd International Conference on Machine Learning, 2015

2014
Motion planning with sequential convex optimization and convex collision checking.
Int. J. Robotics Res., 2014

Scaling up Gaussian Belief Space Planning Through Covariance-Free Trajectory Optimization and Automatic Differentiation.
Proceedings of the Algorithmic Foundations of Robotics XI, 2014

Gaussian belief space planning with discontinuities in sensing domains.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

Planning locally optimal, curvature-constrained trajectories in 3D using sequential convex optimization.
Proceedings of the 2014 IEEE International Conference on Robotics and Automation, 2014

2013
Finding Locally Optimal, Collision-Free Trajectories with Sequential Convex Optimization.
Proceedings of the Robotics: Science and Systems IX, Technische Universität Berlin, Berlin, Germany, June 24, 2013

Learning from Demonstrations Through the Use of Non-rigid Registration.
Proceedings of the Robotics Research, 2013

A case study of trajectory transfer through non-rigid registration for a simplified suturing scenario.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Sigma hulls for Gaussian belief space planning for imprecise articulated robots amid obstacles.
Proceedings of the 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2013

Tracking deformable objects with point clouds.
Proceedings of the 2013 IEEE International Conference on Robotics and Automation, 2013

2011
Grasping and Fixturing as Submodular Coverage Problems.
Proceedings of the Robotics Research, 2011

