Prashanth L. A.

Orcid: 0000-0003-0362-6730

Affiliations:
  • University of Maryland
  • INRIA Lille - Nord Europe
  • Indian Institute of Science, Department of Computer Science and Automation


According to our database, Prashanth L. A. authored at least 66 papers between 2008 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Bibliography

2024
Finite Time Analysis of Temporal Difference Learning for Mean-Variance in a Discounted MDP.
CoRR, 2024

Concentration Bounds for Optimized Certainty Equivalent Risk Estimation.
CoRR, 2024

Truncated Cauchy random perturbations for smoothed functional-based stochastic optimization.
Autom., 2024

Risk Estimation in a Markov Cost Process: Lower and Upper Bounds.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Policy Evaluation for Variance in Average Reward Reinforcement Learning.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Nonasymptotic Bounds for Stochastic Optimization With Biased Noisy Gradient Oracles.
IEEE Trans. Autom. Control., March, 2023

Optimization of utility-based shortfall risk: A non-asymptotic viewpoint.
CoRR, 2023

VaR and CVaR Estimation in a Markov Cost Process: Lower and Upper Bounds.
CoRR, 2023

A policy gradient approach for optimization of smooth risk measures.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias.
Proceedings of the 57th Annual Conference on Information Sciences and Systems, 2023

Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022
A Wasserstein Distance Approach for Concentration of Empirical Risk Estimates.
J. Mach. Learn. Res., 2022

Risk-Sensitive Reinforcement Learning via Policy Gradient Search.
Found. Trends Mach. Learn., 2022

A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random Perturbations for Stochastic Optimization.
CoRR, 2022

Adaptive Estimation of Random Vectors with Bandit Feedback.
CoRR, 2022

Approximate gradient ascent methods for distortion risk measures.
CoRR, 2022

A Survey of Risk-Aware Multi-Armed Bandits.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

2021
Smoothed functional-based gradient algorithms for off-policy reinforcement learning: A non-asymptotic viewpoint.
Syst. Control. Lett., 2021

Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling.
Mach. Learn., 2021

Online Estimation and Optimization of Utility-Based Shortfall Risk.
CoRR, 2021

Likelihood ratio-based policy gradient methods for distorted risk measures: A non-asymptotic analysis.
CoRR, 2021

Smoothed functional-based gradient algorithms for off-policy reinforcement learning.
CoRR, 2021

Estimation of Spectral Risk Measures.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Random Directions Stochastic Approximation With Deterministic Perturbations.
IEEE Trans. Autom. Control., 2020

Non-Asymptotic Bounds for Zeroth-Order Stochastic Optimization.
CoRR, 2020

Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions.
Proceedings of the 37th International Conference on Machine Learning, 2020

2019
Concentration bounds for empirical conditional value-at-risk: The unbounded case.
Oper. Res. Lett., 2019

Improved Concentration Bounds for Conditional Value-at-Risk and Cumulative Prospect Theory using Wasserstein distance.
CoRR, 2019

Risk-aware Multi-armed Bandits Using Conditional Value-at-Risk.
CoRR, 2019

Concentration of risk measures: A Wasserstein distance approach.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems, 2019

Correlated bandits or: How to minimize mean-squared error online.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Stochastic Optimization in a Cumulative Prospect Theory Framework.
IEEE Trans. Autom. Control., 2018

Risk-Sensitive Reinforcement Learning: A Constrained Optimization Viewpoint.
CoRR, 2018

2017
Adaptive System Optimization Using Random Directions Stochastic Approximation.
IEEE Trans. Autom. Control., 2017

Weighted Bandits or: How Bandits Learn Distorted Values That Are Not Expected.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
A constrained optimization perspective on actor-critic algorithms and application to network routing.
Syst. Control. Lett., 2016

Variance-constrained actor-critic algorithms for discounted and average reward MDPs.
Mach. Learn., 2016

Cumulative Prospect Theory Meets Reinforcement Learning: Prediction and Control.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Improved Hessian estimation for adaptive random directions stochastic approximation.
Proceedings of the 55th IEEE Conference on Decision and Control, 2016

(Bandit) Convex Optimization with Biased Noisy Gradient Oracles.
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, 2016

2015
Simultaneous perturbation methods for adaptive labor staffing in service systems.
Simul., 2015

Simultaneous Perturbation Newton Algorithms for Simulation Optimization.
J. Optim. Theory Appl., 2015

Cumulative Prospect Theory Meets Reinforcement Learning: Estimation and Control.
CoRR, 2015

Adaptive system optimization using (simultaneous) random directions stochastic approximation.
CoRR, 2015

On TD(0) with function approximation: Concentration bounds and a centered variant with exponential convergence.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015

Fast Gradient Descent for Drifting Least Squares Regression, with Application to Bandits.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks.
Wirel. Networks, 2014

Algorithms for Nash Equilibria in General-Sum Stochastic Games.
CoRR, 2014

Actor-Critic Algorithms for Risk-Sensitive Reinforcement Learning.
CoRR, 2014

Fast LSTD Using Stochastic Approximation: Finite Time Analysis and Application to Traffic Control.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2014

Adaptive sleep-wake control using reinforcement learning in sensor networks.
Proceedings of the Sixth International Conference on Communication Systems and Networks, 2014

Simultaneous perturbation algorithms for batch off-policy search.
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

Policy Gradients for CVaR-Constrained MDPs.
Proceedings of the Algorithmic Learning Theory - 25th International Conference, 2014

2013
Analysis of stochastic approximation for efficient least squares regression and LSTD.
CoRR, 2013

Online gradient descent for least squares regression: Non-asymptotic bounds and application to bandits.
CoRR, 2013

Reinforcement Learning for Sleep-Wake Scheduling in Sensor Networks.
CoRR, 2013

Actor-Critic Algorithms for Risk-Sensitive MDPs.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems, 2013

Mechanisms for hostile agents with capacity constraints.
Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2013

2012
Threshold Tuning Using Stochastic Optimization for Graded Signal Control.
IEEE Trans. Veh. Technol., 2012

2011
Reinforcement Learning With Function Approximation for Traffic Signal Control.
IEEE Trans. Intell. Transp. Syst., 2011

Reinforcement learning with average cost for adaptive control of traffic lights at intersections.
Proceedings of the 14th International IEEE Conference on Intelligent Transportation Systems, 2011

Stochastic Optimization for Adaptive Labor Staffing in Service Systems.
Proceedings of the Service-Oriented Computing - 9th International Conference, 2011

2008
OFDM-MAC algorithms and their impact on TCP performance in next generation mobile networks.
Proceedings of the Third International Conference on COMmunication System softWAre and MiddlewaRE (COMSWARE 2008), 2008

MAC Design for Heterogeneous Application Support in OFDM Based Wireless Systems.
Proceedings of the 5th IEEE Consumer Communications and Networking Conference, 2008

