A. Rupam Mahmood

Orcid: 0000-0001-6266-162X

Affiliations:
  • University of Alberta, Reinforcement Learning & Artificial Intelligence Lab, Edmonton, AB, Canada
  • Alberta Machine Intelligence Institute (Amii), Edmonton, AB, Canada
  • Kindred AI, Toronto, ON, Canada


According to our database, A. Rupam Mahmood authored at least 49 papers between 2012 and 2024.

Bibliography

2024
Loss of plasticity in deep continual learning.
Nat., August, 2024

Deep Policy Gradient Methods Without Batch Updates, Target Networks, or Replay Buffers.
CoRR, 2024

Streaming Deep Reinforcement Learning Finally Works.
CoRR, 2024

Revisiting Sparse Rewards for Goal-Reaching Reinforcement Learning.
RLJ, 2024

Learning to Optimize for Reinforcement Learning.
RLJ, 2024

More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling.
RLJ, 2024

Weight Clipping for Deep Continual and Reinforcement Learning.
RLJ, 2024

Target Networks and Over-parameterization Stabilize Off-policy Bootstrapping with Function Approximation.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Revisiting Scalable Hessian Diagonal Approximations for Applications in Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

MaDi: Learning to Mask Distractions for Generalization in Visual Deep Reinforcement Learning.
Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, 2024

2023
Memory-efficient Reinforcement Learning with Value-based Knowledge Consolidation.
Trans. Mach. Learn. Res., 2023

Elephant Neural Networks: Born to Be a Continual Learner.
CoRR, 2023

Maintaining Plasticity in Deep Continual Learning.
CoRR, 2023

Utility-based Perturbed Gradient Descent: An Optimizer for Continual Learning.
CoRR, 2023

Loosely consistent emphatic temporal-difference learning.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Dynamic Decision Frequency with Continuous Options.
IROS, 2023

Reducing the Cost of Cycle-Time Tuning for Real-World Policy Optimization.
Proceedings of the International Joint Conference on Neural Networks, 2023

Real-Time Reinforcement Learning for Vision-Based Robotics Utilizing Local and Remote Computers.
Proceedings of the IEEE International Conference on Robotics and Automation, 2023

Correcting discount-factor mismatch in on-policy policy gradient methods.
Proceedings of the International Conference on Machine Learning, 2023

2022
Greedification Operators for Policy Optimization: Investigating Forward and Reverse KL Divergences.
J. Mach. Learn. Res., 2022

Variable-Decision Frequency Option Critic.
CoRR, 2022

HesScale: Scalable Computation of Hessian Diagonals.
CoRR, 2022

Memory-efficient Reinforcement Learning with Knowledge Consolidation.
CoRR, 2022

Asynchronous Reinforcement Learning for Real-Time Control of Physical Robots.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022

A Temporal-Difference Approach to Policy Gradient Estimation.
Proceedings of the International Conference on Machine Learning, 2022

Model-free Policy Learning with Reward Gradients.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

An Alternate Policy Gradient Estimator for Softmax Policies.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

2021
Continual Backprop: Stochastic Gradient Descent with Persistent Randomness.
CoRR, 2021

Model-free Policy Learning with Reward Gradients.
CoRR, 2021

Analyzing Neural Jacobian Methods in Applications of Visual Servoing and Kinematic Control.
Proceedings of the IEEE International Conference on Robotics and Automation, 2021

2020
Heteroscedastic Uncertainty for Robust Generative Latent Dynamics.
IEEE Robotics Autom. Lett., 2020

2019
Autoregressive Policies for Continuous Control Deep Reinforcement Learning.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

2018
On Generalized Bellman Equations and Temporal-Difference Learning.
J. Mach. Learn. Res., 2018

Setting up a Reinforcement Learning Task with a Real-World Robot.
Proceedings of the 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems, 2018

Benchmarking Reinforcement Learning Algorithms on Real-World Robots.
Proceedings of the 2nd Annual Conference on Robot Learning, 2018

2017
Multi-step Off-policy Learning Without Importance Sampling Ratios.
CoRR, 2017

2016
An Emphatic Approach to the Problem of Off-policy Temporal-Difference Learning.
J. Mach. Learn. Res., 2016

True Online Temporal-Difference Learning.
J. Mach. Learn. Res., 2016

2015
An Empirical Evaluation of True Online TD(λ).
CoRR, 2015

Emphatic Temporal-Difference Learning.
CoRR, 2015

Off-policy learning based on weighted importance sampling with linear computational complexity.
Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

2014
Off-policy TD(λ) with a true online equivalence.
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Weighted importance sampling for off-policy learning with linear function approximation.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

A new Q(lambda) with interim forward view and Monte Carlo equivalence.
Proceedings of the 31st International Conference on Machine Learning, 2014

2013
Position Paper: Representation Search through Generate and Test.
Proceedings of the Tenth Symposium on Abstraction, Reformulation, and Approximation, 2013

Representation Search through Generate and Test.
Proceedings of the Learning Rich Representations from Low-Level Sensors, 2013

2012
Tuning-free step-size adaptation.
Proceedings of the 2012 IEEE International Conference on Acoustics, 2012

