Bernardo Ávila Pires

According to our database, Bernardo Ávila Pires authored at least 26 papers between 2012 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning.
CoRR, 2024

Offline Regularised Reinforcement Learning for Large Language Models Alignment.
CoRR, 2024

Understanding the performance gap between online and offline alignment algorithms.
CoRR, 2024

Human Alignment of Large Language Models through Online Preference Optimisation.
CoRR, 2024

Off-policy Distributional Q(λ): Distributional RL without Importance Sampling.
CoRR, 2024

Generalized Preference Optimization: A Unified Approach to Offline Alignment.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Human Alignment of Large Language Models through Online Preference Optimisation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023
Hierarchical Reinforcement Learning in Complex 3D Environments.
CoRR, 2023

DoMo-AC: Doubly Multi-step Off-policy Actor-Critic Algorithm.
Proceedings of the International Conference on Machine Learning, 2023

Understanding Self-Predictive Learning for Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Understanding Plasticity in Neural Networks.
Proceedings of the International Conference on Machine Learning, 2023

2022
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

BYOL-Explore: Exploration by Bootstrapped Prediction.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

2021
Neural Recursive Belief States in Multi-Agent Reinforcement Learning.
CoRR, 2021

Geometric Entropic Exploration.
CoRR, 2021

2020
Bootstrap Your Own Latent - A New Approach to Self-Supervised Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
World Discovery Models.
CoRR, 2019

2018
Neural Predictive Belief Representations.
CoRR, 2018

2016
Multiclass Classification Calibration Functions.
CoRR, 2016

Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models.
CoRR, 2016

Policy Error Bounds for Model-Based Reinforcement Learning with Factored Linear Models.
Proceedings of the 29th Conference on Learning Theory, 2016

2015
Pathological Effects of Variance on Classification-Based Policy Iteration.
Proceedings of the Learning for General Competency in Video Games, 2015

2014
Pseudo-MDPs and factored linear action models.
Proceedings of the 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2014

2013
Cost-sensitive Multiclass Classification Risk Bounds.
Proceedings of the 30th International Conference on Machine Learning, 2013

2012
Statistical linear estimation with penalized estimators: an application to reinforcement learning.
Proceedings of the 29th International Conference on Machine Learning, 2012

