Yinlam Chow

Proceedings of the 9th International Conference on Learning Representations, 2021

Non-Stationary Off-Policy Optimization.

Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

2020

Non-Stationary Latent Bandits.

CoRR, 2020

Piecewise-Stationary Off-Policy Optimization.

CoRR, 2020

Latent Bandits Revisited.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

CoinDICE: Off-Policy Confidence Interval Estimation.

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

BRPO: Batch Residual Policy Optimization.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Predictive Coding for Locally-Linear Control.

Proceedings of the 37th International Conference on Machine Learning, 2020

CAQL: Continuous Action Q-Learning.

Moonkyung Ryu

Ross Anderson

Christian Tjandraatmadja

Craig Boutilier

Proceedings of the 8th International Conference on Learning Representations, 2020

Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control.

Proceedings of the 8th International Conference on Learning Representations, 2020

Safe Policy Learning for Continuous Control.

Ofir Nachum

Aleksandra Faust

Edgar A. Duéñez-Guzmán

Proceedings of the 4th Conference on Robot Learning, 2020

2019

A Framework for Time-Consistent, Risk-Sensitive Model Predictive Control: Theory and Algorithms.

IEEE Trans. Autom. Control, 2019

AlgaeDICE: Policy Gradient from Arbitrary Experience.

CoRR, 2019

Lyapunov-based Safe Policy Optimization for Continuous Control.

Edgar A. Duéñez-Guzmán

CoRR, 2019

DualDICE: Behavior-Agnostic Estimation of Discounted Stationary Distribution Corrections.

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Risk-Sensitive Generative Adversarial Imitation Learning.

Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

2018

A Block Coordinate Ascent Algorithm for Mean-Variance Optimization.

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

A Lyapunov-based Approach to Safe Reinforcement Learning.

Ofir Nachum

Edgar A. Duéñez-Guzmán

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

More Robust Doubly Robust Off-policy Evaluation.

Mehrdad Farajtabar

Proceedings of the 35th International Conference on Machine Learning, 2018

Path Consistency Learning in Tsallis Entropy Regularized MDPs.

Ofir Nachum

Proceedings of the 35th International Conference on Machine Learning, 2018

Imitation Learning from Visual Data with Multiple Intentions.

Aviv Tamar

Khashayar Rohanimanesh

Proceedings of the 6th International Conference on Learning Representations, 2018

2017

Sequential Decision Making With Coherent Risk.

IEEE Trans. Autom. Control, 2017

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.

J. Mach. Learn. Res., 2017

Risk-Constrained Reinforcement Learning with Percentile Risk Criteria.

J. Mach. Learn. Res., 2017

A Framework for Time-Consistent, Risk-Averse Model Predictive Control: Theory and Algorithms.

CoRR, 2017

Sequential Multiple Hypothesis Testing with Type I Error Control.

Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, 2017

2016

Distributed Online Modified Greedy Algorithm for Networked Storage Operation Under Uncertainty.

IEEE Trans. Smart Grid, 2016

Weighted SGD for $\ell_p$ Regression with Randomized Preconditioning.

Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, 2016

Safe Policy Improvement by Minimizing Robust Baseline Regret.

Marek Petrik

Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Risk aversion in finite Markov Decision Processes using total cost criteria and average value at risk.

Stefano Carpin

Proceedings of the 2016 IEEE International Conference on Robotics and Automation, 2016

2015

Trading Safety Versus Performance: Rapid Deployment of Robotic Swarms with Robust Performance Constraints.

CoRR, 2015

Policy Gradient for Coherent Risk Measures.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach.

Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Real-time Bidding based Vehicle Sharing.

Jia Yuan Yu

Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015

2014

Algorithms for CVaR Optimization in MDPs.

Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Modeling and online control of generalized energy storage networks.

Proceedings of the Fifth International Conference on Future Energy Systems, 2014

Weighted difference approximation of value functions for slow-discounting Markov Decision Processes.

Junjie Qin

Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

A framework for time-consistent, risk-averse model predictive control: Theory and algorithms.

Yin-Lam Chow

Proceedings of the American Control Conference, 2014

2013

A uniform-grid discretization algorithm for stochastic optimal control with risk constraints.

Yin-Lam Chow

Proceedings of the 52nd IEEE Conference on Decision and Control, 2013

Stochastic optimal control with dynamic, time-consistent risk constraints.

Yin-Lam Chow