Shalabh Bhatnagar
Orcid: 0000-0001-7644-3914
According to our database, Shalabh Bhatnagar authored at least 221 papers between 1995 and 2025.
Approximate linear programming for decentralized policy iteration in cooperative multi-agent Markov decision processes.
Syst. Control. Lett., 2025
IEEE Trans. Artif. Intell., July, 2024
IEEE Commun. Lett., January, 2024
Gradient-Weighted Feature Back-Projection: A Fast Alternative to Feature Distillation in 3D Gaussian Splatting.
CoRR, 2024
Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks.
CoRR, 2024
Critic-Actor for Average Reward MDPs with Function Approximation: A Finite-Time Analysis.
CoRR, 2024
Truncated Cauchy random perturbations for smoothed functional-based stochastic optimization.
Autom., 2024
Proceedings of the SIGGRAPH Asia 2024 Posters, SA 2024, Tokyo, Japan, December 3-6, 2024, 2024
Proceedings of the IEEE Power & Energy Society Innovative Smart Grid Technologies Conference, 2024
Proceedings of the Pattern Recognition - 27th International Conference, 2024
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024
IEEE Control. Syst. Lett., 2023
Approximate Linear Programming and Decentralized Policy Improvement in Cooperative Multi-agent Markov Decision Processes.
CoRR, 2023
Finite Time Analysis of Constrained Actor Critic and Constrained Natural Actor Critic Algorithms.
CoRR, 2023
A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks.
CoRR, 2023
Proceedings of the 32nd IEEE International Conference on Robot and Human Interactive Communication, 2023
Proceedings of the International Conference on Machine Learning, 2023
Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias.
Proceedings of the 57th Annual Conference on Information Sciences and Systems, 2023
Proceedings of the 62nd IEEE Conference on Decision and Control, 2023
IEEE Trans. Autom. Control., 2022
IEEE Trans. Autom. Control., 2022
A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random Perturbations for Stochastic Optimization.
CoRR, 2022
Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics, 2022
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Proceedings of the International Joint Conference on Neural Networks, 2022
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning.
Proceedings of the 2022 International Conference on Robotics and Automation, 2022
Co-operative Multi-agent Twin Delayed DDPG for Robust Phase Duration Optimization of Large Road Networks.
Proceedings of the Agents and Artificial Intelligence - 14th International Conference, 2022
Robust Traffic Signal Timing Control using Multiagent Twin Delayed Deep Deterministic Policy Gradients.
Proceedings of the 14th International Conference on Agents and Artificial Intelligence, 2022
Proceedings of the 58th Annual Allerton Conference on Communication, 2022
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022
Memory-Based Deep Reinforcement Learning for Obstacle Avoidance in UAV With Limited Environment Knowledge.
IEEE Trans. Intell. Transp. Syst., 2021
Asynchronous Stochastic Approximations With Asymptotically Biased Errors and Deep Multiagent Learning.
IEEE Trans. Autom. Control., 2021
Stochastic Approximation With Iterate-Dependent Markov Noise Under Verifiable Conditions in Compact State Space With the Stability of Iterates Not Ensured.
IEEE Trans. Autom. Control., 2021
On tight bounds for function approximation error in risk-sensitive reinforcement learning.
Syst. Control. Lett., 2021
Novel First Order Bayesian Optimization with an Application to Reinforcement Learning.
Appl. Intell., 2021
Attention Actor-Critic Algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021
Analysis of Stochastic Approximation Schemes With Set-Valued Maps in the Absence of a Stability Guarantee and Their Stabilization.
IEEE Trans. Autom. Control., 2020
IEEE Trans. Autom. Control., 2020
Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-Dependent Markov Noise.
Math. Oper. Res., 2020
CoRR, 2020
CoRR, 2020
Appl. Intell., 2020
Proceedings of the 29th IEEE International Conference on Robot and Human Interactive Communication, 2020
Proceedings of the 31st IEEE Annual International Symposium on Personal, 2020
Proceedings of the IEEE PES Innovative Smart Grid Technologies Europe, 2020
Deep Reinforcement Learning with Successive Over-Relaxation and its Application in Autoscaling Cloud Resources.
Proceedings of the 2020 International Joint Conference on Neural Networks, 2020
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020
Proceedings of the 4th Conference on Robot Learning, 2020
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020
Stability of Stochastic Approximations With "Controlled Markov" Noise and Temporal Difference Learning.
IEEE Trans. Autom. Control., 2019
An Online Sample-Based Method for Mode Estimation Using ODE Analysis of Stochastic Approximation Algorithms.
IEEE Control. Syst. Lett., 2019
Design, Development and Experimental Realization of a Quadrupedal Research Platform: Stoch.
CoRR, 2019
Efficient Adaptive Resource Provisioning for Cloud Applications using Reinforcement Learning.
Proceedings of the IEEE 4th International Workshops on Foundations and Applications of Self* Systems, 2019
Proceedings of the 28th IEEE International Conference on Robot and Human Interactive Communication, 2019
Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots.
Proceedings of the 28th IEEE International Conference on Robot and Human Interactive Communication, 2019
Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives.
Proceedings of the International Conference on Robotics and Automation, 2019
Predictive and Prescriptive Analytics for Performance Optimization: Framework and a Case Study on a Large-Scale Enterprise System.
Proceedings of the 18th IEEE International Conference On Machine Learning And Applications, 2019
Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, 2019
Proceedings of the 58th IEEE Conference on Decision and Control, 2019
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019
Proceedings of the 57th Annual Allerton Conference on Communication, 2019
Novel Sensor Scheduling Scheme for Intruder Tracking in Energy Efficient Sensor Networks.
IEEE Wirel. Commun. Lett., 2018
Telecommun. Syst., 2018
IEEE Trans. Autom. Control., 2018
IEEE Trans. Autom. Control., 2018
Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning.
Math. Oper. Res., 2018
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method.
Mach. Learn., 2018
An incremental off-policy search in a model-free Markov decision process using a single sample path.
Mach. Learn., 2018
Gradient-Based Adaptive Stochastic Search for Simulation Optimization Over Continuous Space.
INFORMS J. Comput., 2018
CoRR, 2018
A unified decision making framework for supply and demand management in microgrid networks.
Proceedings of the 2018 IEEE International Conference on Communications, 2018
Proceedings of the 57th IEEE Conference on Decision and Control, 2018
Adaptive mean queue size and its rate of change: queue management with random dropping.
Telecommun. Syst., 2017
IEEE Trans. Autom. Control., 2017
Math. Oper. Res., 2017
A unified decision making framework for supply and demand management in microgrid networks.
CoRR, 2017
Conditions for Stability and Convergence of Set-Valued Stochastic Approximations: Applications to Approximate Value and Fixed point Iterations with Noise.
CoRR, 2017
CoRR, 2017
Deterministic Perturbations For Simultaneous Perturbation Methods Using Circulant Matrices.
CoRR, 2017
Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization.
Comput. Optim. Appl., 2017
Autom., 2017
Proceedings of the Pattern Recognition and Machine Intelligence, 2017
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017
Proceedings of the 2017 IEEE 10th International Conference on Cloud Computing (CLOUD), 2017
ACM Trans. Model. Comput. Simul., 2016
A constrained optimization perspective on actor-critic algorithms and application to network routing.
Syst. Control. Lett., 2016
Discret. Event Dyn. Syst., 2016
Stochastic Recursive Inclusions in two timescales with non-additive iterate dependent Markov noise.
CoRR, 2016
CoRR, 2016
Gradient-based learning algorithms with constant-error estimators: stability and convergence.
CoRR, 2016
On a convergent off-policy temporal difference learning algorithm in on-line learning environment.
CoRR, 2016
A note on the function approximation error bound for risk-sensitive reinforcement learning.
CoRR, 2016
A Cross Entropy based Stochastic Approximation Algorithm for Reinforcement Learning with Linear Function Approximation.
CoRR, 2016
Proceedings of the Winter Simulation Conference, 2016
Proceedings of the 2016 International Joint Conference on Neural Networks, 2016
Proceedings of the ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, 2016
Revisiting the Cross Entropy Method with Applications in Stochastic Global Optimization and Reinforcement Learning.
Proceedings of the ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, 2016
Proceedings of the 55th IEEE Conference on Decision and Control, 2016
IEEE Trans. Commun., 2015
Simul., 2015
Necessary and sufficient conditions for optimality in constrained general sum stochastic games.
Syst. Control. Lett., 2015
J. Optim. Theory Appl., 2015
A bi-convex optimization problem to compute Nash equilibrium in n-player games and an algorithm.
CoRR, 2015
Adaptive system optimization using (simultaneous) random directions stochastic approximation.
CoRR, 2015
Proceedings of the Neural Information Processing - 22nd International Conference, 2015
Proceedings of the 7th International Conference on Communication Systems and Networks, 2015
Two-Timescale Algorithms for Learning Nash Equilibria in General-Sum Stochastic Games.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015
Two timescale convergent Q-learning for sleep-scheduling in wireless sensor networks.
Wirel. Networks, 2014
Smoothed Functional Algorithms for Stochastic Optimization Using q-Gaussian Distributions.
ACM Trans. Model. Comput. Simul., 2014
Int. Trans. Oper. Res., 2014
CoRR, 2014
Newton-based stochastic optimization using q-Gaussian smoothed functional algorithms.
Autom., 2014
Proceedings of the 2014 Winter Simulation Conference, 2014
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014
Proceedings of the 17th International IEEE Conference on Intelligent Transportation Systems, 2014
A Markov Decision Process Framework for Predictable Job Completion Times on Crowdsourcing Platforms.
Proceedings of the Second AAAI Conference on Human Computation and Crowdsourcing, 2014
Proceedings of the Sixth International Conference on Communication Systems and Networks, 2014
Approximate Dynamic Programming with (min; +) linear function approximation for Markov decision processes.
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014
Q-Learning Based Energy Management Policies for a Single Sensor Node with Finite Buffer.
IEEE Wirel. Commun. Lett., 2013
IEEE J. Sel. Top. Signal Process., 2013
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013
IEEE Trans. Veh. Technol., 2012
An Online Actor-Critic Algorithm with Function Approximation for Constrained Markov Decision Processes.
J. Optim. Theory Appl., 2012
Smoothed Functional Algorithms for Stochastic Optimization using q-Gaussian Distributions.
CoRR, 2012
CoRR, 2012
Comput. Networks, 2012
Autom., 2012
Proceedings of the 2012 IEEE International Symposium on Information Theory, 2012
A novel Q-learning algorithm with function approximation for constrained Markov decision processes.
Proceedings of the 50th Annual Allerton Conference on Communication, 2012
ACM Trans. Model. Comput. Simul., 2011
IEEE Trans. Intell. Transp. Syst., 2011
IEEE Trans Autom. Sci. Eng., 2011
Syst. Control. Lett., 2011
Reinforcement learning with average cost for adaptive control of traffic lights at intersections.
Proceedings of the 14th International IEEE Conference on Intelligent Transportation Systems, 2011
Proceedings of the Service-Oriented Computing - 9th International Conference, 2011
Smoothed Functional and Quasi-Newton Algorithms for Routing in Multi-stage Queueing Network with Constraints.
Proceedings of the Distributed Computing and Internet Technology, 2011
Wirel. Networks, 2010
Simul., 2010
An actor-critic algorithm with function approximation for discounted cost constrained Markov decision processes.
Syst. Control. Lett., 2010
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010
Proceedings of the Encyclopedia of Data Warehousing and Mining, Second Edition (4 Volumes), 2009
Optimal parameter trajectory estimation in parameterized SDEs: An algorithmic procedure.
ACM Trans. Model. Comput. Simul., 2009
A probabilistic constrained nonlinear optimization framework to optimize RED parameters.
Perform. Evaluation, 2009
IEEE Commun. Lett., 2009
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009
Fast gradient-descent methods for temporal-difference learning with linear function approximation.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009
Proceedings of the 48th IEEE Conference on Decision and Control, 2009
Simulation-Based Optimization Algorithms for Finite-Horizon Markov Decision Processes.
Simul., 2008
Proceedings of the Network Control and Optimization, Second Euro-NF Workshop, 2008
Proceedings of the International Workshop on Multimedia Signal Processing, 2008
Adaptive Newton-based multivariate smoothed functional algorithms for simulation optimization.
ACM Trans. Model. Comput. Simul., 2007
Inf. Sci., 2007
Discret. Event Dyn. Syst., 2007
Proceedings of the Advances in Neural Information Processing Systems 20, 2007
Proceedings of the Distributed Computing and Internet Technology, 2007
Proceedings of the Distributed Computing and Internet Technology, 2007
Proceedings of the Interactive TV: a Shared Experience, 5th European Conference, 2007
Proceedings of the 46th IEEE Conference on Decision and Control, 2007
Discrete parameter simulation optimization algorithms with applications to admission control with dependent service times.
Proceedings of the 46th IEEE Conference on Decision and Control, 2007
Proceedings of the 46th IEEE Conference on Decision and Control, 2007
Proceedings of the American Control Conference, 2007
Proceedings of the American Control Conference, 2007
Partition based pattern synthesis technique with efficient algorithms for nearest neighbor classification.
Pattern Recognit. Lett., 2006
A Simulation-Based Algorithm for Ergodic Control of Markov Chains Conditioned on Rare Events.
J. Mach. Learn. Res., 2006
On Measure Theoretic definitions of Generalized Information Measures and Maximum Entropy Prescriptions.
CoRR, 2006
Proceedings of the Winter Simulation Conference WSC 2006, 2006
Proceedings of the 45th IEEE Conference on Decision and Control, 2006
A Reinforcement Learning Based Algorithm for Finite Horizon Markov Decision Processes.
Proceedings of the 45th IEEE Conference on Decision and Control, 2006
Adaptive multivariate three-timescale stochastic approximation algorithms for simulation based optimization.
ACM Trans. Model. Comput. Simul., 2005
Optimal Threshold Policies for Admission Control in Communication Networks via Discrete Parameter Stochastic Approximation.
Telecommun. Syst., 2005
Simul., 2005
Pattern Recognit., 2005
Proceedings of the 2005 IEEE International Symposium on Information Theory, 2005
Proceedings of the Artificial Intelligence Applications and Innovations - IFIP TC12 WG12.5, 2005
Information theoretic justification of Boltzmann selection and its generalization to Tsallis case.
Proceedings of the IEEE Congress on Evolutionary Computation, 2005
A simultaneous perturbation stochastic approximation-based actor-critic algorithm for Markov decision processes.
IEEE Trans. Autom. Control., 2004
Fusion of multiple approximate nearest neighbor classifiers for fast and efficient classification.
Inf. Fusion, 2004
A Pattern Synthesis Technique with an Efficient Nearest Neighbor Classifier for Binary Pattern Recognition.
Proceedings of the 17th International Conference on Pattern Recognition, 2004
Cauchy annealing schedule: an annealing schedule for Boltzmann selection scheme in evolutionary algorithms.
Proceedings of the IEEE Congress on Evolutionary Computation, 2004
Hierarchical decision making in semiconductor fabs using multi-time scale Markov decision processes.
Proceedings of the 43rd IEEE Conference on Decision and Control, 2004
Two-timescale simultaneous perturbation stochastic approximation using deterministic perturbation sequences.
ACM Trans. Model. Comput. Simul., 2003
Multiscale Chaotic SPSA and Smoothed Functional Algorithms for Simulation Optimization.
Simul., 2003
Quotient evolutionary space: abstraction of evolutionary process w.r.t macroscopic properties.
Proceedings of the IEEE Congress on Evolutionary Computation, 2003
IEEE/ACM Trans. Netw., 2001
Math. Oper. Res., 1995