Zhaohan Guo

Orcid: 0000-0002-2497-6441

According to our database, Zhaohan Guo authored at least 24 papers between 2014 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning.
CoRR, 2024

Understanding the performance gap between online and offline alignment algorithms.
CoRR, 2024

Generalized Preference Optimization: A Unified Approach to Offline Alignment.
Proceedings of the Forty-first International Conference on Machine Learning, 2024


Human Alignment of Large Language Models through Online Preference Optimisation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

A General Theoretical Paradigm to Understand Learning from Human Preferences.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Nash Learning from Human Feedback.
CoRR, 2023

Understanding Self-Predictive Learning for Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Representations and Exploration for Deep Reinforcement Learning using Singular Value Decomposition.
Proceedings of the International Conference on Machine Learning, 2023

2022
BYOL-Explore: Exploration by Bootstrapped Prediction.
Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Geometric Entropic Exploration.
CoRR, 2021

2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning.
Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Agent57: Outperforming the Atari Human Benchmark.
Proceedings of the 37th International Conference on Machine Learning, 2020

Never Give Up: Learning Directed Exploration Strategies.
Proceedings of the 8th International Conference on Learning Representations, 2020

2019
Directed Exploration for Reinforcement Learning.
CoRR, 2019

2018
Neural Predictive Belief Representations.
CoRR, 2018

2017
Using Options for Long-Horizon Off-Policy Evaluation.
CoRR, 2017

Sample Efficient Feature Selection for Factored MDPs.
CoRR, 2017

Using Options and Covariance Testing for Long Horizon Off-Policy Policy Evaluation.
Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

2016
PAC Continuous State Online Multitask Reinforcement Learning with Identification.
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

A PAC RL Algorithm for Episodic POMDPs.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
Concurrent PAC RL.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Joint semantic utterance classification and slot filling with recursive neural networks.
Proceedings of the 2014 IEEE Spoken Language Technology Workshop, 2014

