Rahul Jain

Orcid: 0000-0003-3786-8682

  • University of Southern California, Los Angeles, CA, USA
  • University of California, Department of Electrical Engineering and Computer Science, Berkeley, CA, USA (PhD 2004)
  • Rice University, Department of Electrical and Computer Engineering, Houston, TX, USA (MS 1999)

According to our database, Rahul Jain authored at least 148 papers between 1999 and 2025.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.





Distributionally Robust Direct Preference Optimization.
CoRR, February, 2025

Optimal Control of Logically Constrained Partially Observable and Multiagent Markov Decision Processes.
IEEE Trans. Autom. Control., January, 2025

Best Policy Learning from Trajectory Preference Feedback.
CoRR, January, 2025

Optimal Communication and Control Strategies in a Cooperative Multiagent MDP Problem.
IEEE Trans. Autom. Control., October, 2024

Probabilistic Contraction Analysis of Iterated Random Operators.
IEEE Trans. Autom. Control., September, 2024

Markov Balance Satisfaction Improves Performance in Strictly Batch Offline Imitation Learning.
CoRR, 2024

Online Bandit Learning with Offline Preference Data.
CoRR, 2024

Pure Exploration for Constrained Best Mixed Arm Identification with a Fixed Budget.
CoRR, 2024

e-COP: Episodic Constrained Optimization of Policies.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

ACPO: A Policy Optimization Algorithm for Average MDPs with Constraints.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Compositional Planning for Logically Constrained Multi-Agent Markov Decision Processes.
Proceedings of the 63rd IEEE Conference on Decision and Control, 2024

A Bayesian Learning Algorithm for Unknown Zero-sum Stochastic Games with an Arbitrary Opponent.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale.
Trans. Mach. Learn. Res., 2023

Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach.
CoRR, 2023

Regret Analysis of the Posterior Sampling-based Learning Algorithm for Episodic POMDPs.
CoRR, 2023

Conditional Kernel Imitation Learning for Continuous State Environments.
CoRR, 2023

Optimal Control of Logically Constrained Partially Observable and Multi-Agent Markov Decision Processes.
CoRR, 2023

Average-Constrained Policy Optimization.
CoRR, 2023

Safe Posterior Sampling for Constrained MDPs with Bounded Constraint Violation.
CoRR, 2023

Posterior sampling-based online learning for the stochastic shortest path model.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Leveraging Demonstrations to Improve Online Learning: Quality Matters.
Proceedings of the International Conference on Machine Learning, 2023

A Novel Point-Based Algorithm for Multi-Agent Control Using the Common Information Approach.
Proceedings of the 62nd IEEE Conference on Decision and Control, 2023

Exact and Cost-Effective Automated Transformation of Neural Network Controllers to Decision Tree Controllers.
Proceedings of the 62nd IEEE Conference on Decision and Control, 2023

Two-Stage Electricity Markets With Renewable Energy Integration: Market Mechanisms and Equilibrium Analysis.
IEEE Trans. Control. Netw. Syst., 2022

Scheduling Flexible Nonpreemptive Loads in Smart-Grid Networks.
IEEE Trans. Control. Netw. Syst., 2022

New directions in learning and control of stochastic networks.
Queueing Syst. Theory Appl., 2022

Learning Infinite-Horizon Average-Reward Markov Decision Processes with Constraints.
CoRR, 2022

Optimal control of partially observable Markov decision processes with finite linear temporal logic constraints.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Learning Infinite-horizon Average-reward Markov Decision Process with Constraints.
Proceedings of the International Conference on Machine Learning, 2022

Improved No-Regret Algorithms for Stochastic Shortest Path with Linear MDP.
Proceedings of the International Conference on Machine Learning, 2022

Optimal Communication and Control Strategies for a Multi-Agent System in the Presence of an Adversary.
Proceedings of the 61st IEEE Conference on Decision and Control, 2022

Online Learning for Cooperative Multi-Player Multi-Armed Bandits.
Proceedings of the 61st IEEE Conference on Decision and Control, 2022

Practical Control Design for the Deep Learning Age: Distillation of Deep RL-Based Controllers.
Proceedings of the 58th Annual Allerton Conference on Communication, 2022

Online Learning for Unknown Partially Observable MDPs.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Model-Free Reinforcement Learning for Optimal Control of Markov Decision Processes Under Signal Temporal Logic Specifications.
CoRR, 2021

Learning Zero-sum Stochastic Games with Posterior Sampling.
CoRR, 2021

Online Learning for Stochastic Shortest Path Model via Posterior Sampling.
CoRR, 2021

Optimal communication and control strategies in a multi-agent MDP problem.
CoRR, 2021

Implicit Finite-Horizon Approximation and Efficient Optimal Algorithms for Stochastic Shortest Path.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Model-Free Reinforcement Learning for Optimal Control of Markov Decision Processes Under Signal Temporal Logic Specifications.
Proceedings of the 2021 60th IEEE Conference on Decision and Control (CDC), 2021

Optimal Control of Discounted-Reward Markov Decision Processes Under Linear Temporal Logic Specifications.
Proceedings of the 2021 American Control Conference, 2021

Learning Infinite-horizon Average-reward MDPs with Linear Function Approximation.
Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 2021

A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Posterior Sampling-Based Reinforcement Learning for Control of Unknown Linear Systems.
IEEE Trans. Autom. Control., 2020

A Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs.
IEEE Trans. Autom. Control., 2020

Synthesis of Discounted-Reward Optimal Policies for Markov Decision Processes Under Linear Temporal Logic Specifications.
CoRR, 2020

Designing Interpretable Approximations to Deep Reinforcement Learning with Soft Decision Trees.
CoRR, 2020

Randomized Policy Learning for Continuous State and Action MDPs.
CoRR, 2020

Non-indexability of the stochastic appointment scheduling problem.
Autom., 2020

Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes.
Proceedings of the 37th International Conference on Machine Learning, 2020

Finite Time Guarantees for Continuous State MDPs with Generative Model.
Proceedings of the 59th IEEE Conference on Decision and Control, 2020

Scheduling of Flexible Non-Preemptive Loads.
Proceedings of the 59th IEEE Conference on Decision and Control, 2020

A Risk Aware Two-Stage Market Mechanism for Electricity with Renewable Generation.
Proceedings of the 2020 American Control Conference, 2020

Approximate Relative Value Learning for Average-reward Continuous State MDPs.
Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

An Empirical Relative Value Learning Algorithm for Non-parametric MDPs with Continuous State Space.
Proceedings of the 17th European Control Conference, 2019

Empirical Algorithms for General Stochastic Systems with Continuous States and Actions.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

A Two-Stage Market Mechanism for Electricity with Renewable Generation.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

A Two Stage Stochastic Mechanism for Selling Random Power.
Proceedings of the 2019 American Control Conference, 2019

An Approximately Optimal Relative Value Learning Algorithm for Averaged MDPs with Continuous States and Actions.
Proceedings of the 57th Annual Allerton Conference on Communication, 2019

Aggregating Correlated Wind Power With Full Surplus Extraction.
IEEE Trans. Smart Grid, 2018

Optimal Decentralized Control With Asymmetric One-Step Delayed Information Sharing.
IEEE Trans. Control. Netw. Syst., 2018

On Regret-Optimal Learning in Decentralized Multiplayer Multiarmed Bandits.
IEEE Trans. Control. Netw. Syst., 2018

A Two Stage Mechanism For Selling Random Power.
CoRR, 2018

A Fixed Point Theorem for Iterative Random Contraction Operators over Banach Spaces.
CoRR, 2018

Dynamic Economic Dispatch and Price Evolution under Ramping Constraints and Uncertain Demand.
Proceedings of the 56th Annual Allerton Conference on Communication, 2018

Approachability in Stackelberg Stochastic Games with Vector Costs.
Dyn. Games Appl., 2017

Learning-based Control of Unknown Linear Systems with Thompson Sampling.
CoRR, 2017

Learning Unknown Markov Decision Processes: A Thompson Sampling Approach.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Randomized function fitting-based empirical value iteration.
Proceedings of the 56th IEEE Annual Conference on Decision and Control, 2017

A random monotone operator framework for strongly convex stochastic optimization.
Proceedings of the 56th IEEE Annual Conference on Decision and Control, 2017

Control of unknown linear systems with Thompson sampling.
Proceedings of the 55th Annual Allerton Conference on Communication, 2017

Inexact iteration of averaged operators for non-strongly convex stochastic optimization.
Proceedings of the 55th Annual Allerton Conference on Communication, 2017

Dynamic Economic Dispatch Game: The Value of Storage.
IEEE Trans. Smart Grid, 2016

Empirical Dynamic Programming.
Math. Oper. Res., 2016

A dynamical systems framework for stochastic iterative optimization.
Proceedings of the 55th IEEE Conference on Decision and Control, 2016

On the existence of near-optimal fixed time control of traffic intersection signals.
Proceedings of the 54th Annual Allerton Conference on Communication, 2016

A Convex Analytic Approach to Risk-Aware Markov Decision Processes.
SIAM J. Control. Optim., 2015

A queueing model with independent arrivals, and its fluid and diffusion limits.
Queueing Syst. Theory Appl., 2015

Strategic Arrivals into Queueing Networks: The Network Concert Queueing Game.
Oper. Res., 2015

On Regret-Optimal Learning in Decentralized Multi-player Multi-armed Bandits.
CoRR, 2015

Pricing sequential forward power contracts.
Proceedings of the 2015 IEEE International Conference on Smart Grid Communications, 2015

Equilibria in two-stage electricity markets.
Proceedings of the 54th IEEE Conference on Decision and Control, 2015

An empirical algorithm for relative value iteration for average-cost MDPs.
Proceedings of the 54th IEEE Conference on Decision and Control, 2015

Scheduling, pricing, and efficiency of non-preemptive flexible loads under direct load control.
Proceedings of the 53rd Annual Allerton Conference on Communication, 2015

Coalitional Games for Transmitter Cooperation in MIMO Multiple Access Channels.
IEEE Trans. Signal Process., 2014

Decentralized Learning for Multiplayer Multiarmed Bandits.
IEEE Trans. Inf. Theory, 2014

Risk-Constrained Markov Decision Processes.
IEEE Trans. Autom. Control., 2014

Empirical Q-Value Iteration.
CoRR, 2014

A Learning Scheme for Approachability in MDPs and Stackelberg Stochastic Games.
CoRR, 2014

On Transitory Queueing.
CoRR, 2014

Bertrand equilibria and efficiency in markets for congestible network services.
Autom., 2014

Buying random yet correlated wind power.
Proceedings of the 2014 IEEE International Conference on Smart Grid Communications, 2014

Stochastic dynamic pricing: Utilizing demand response in an adaptive manner.
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

Blackwell's approachability in stackelberg stochastic games: A learning version.
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

Mean field limits by population acceleration.
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

Empirical policy iteration for approximate dynamic programming.
Proceedings of the 53rd IEEE Conference on Decision and Control, 2014

Optimal decentralized control in unidirectional one-step delayed sharing pattern with partial output feedback.
Proceedings of the American Control Conference, 2014

Empirical Value Iteration for approximate dynamic programming.
Proceedings of the American Control Conference, 2014

Dynamic economic dispatch among strategic generators with storage systems.
Proceedings of the 52nd Annual Allerton Conference on Communication, 2014

Spectrum Sharing through Contracts for Cognitive Radios.
IEEE Trans. Mob. Comput., 2013

Stochastic Dominance-Constrained Markov Decision Processes.
SIAM J. Control. Optim., 2013

Broadcast Channel Games: Equilibrium Characterization and a MIMO MAC-BC Game Duality.
CoRR, 2013

A Nash Equilibrium Need Not Exist in the Locational Marginal Pricing Mechanism.
CoRR, 2013

Game-theoretic analysis of the nodal pricing mechanism for electricity markets.
Proceedings of the 52nd IEEE Conference on Decision and Control, 2013

Optimal decentralized control in unidirectional one-step delayed sharing pattern.
Proceedings of the 51st Annual Allerton Conference on Communication, 2013

Coalition formation for uplink device to device coordination with cooperation costs.
Proceedings of the 2013 Asilomar Conference on Signals, 2013

Combinatorial Network Optimization With Unknown Variables: Multi-Armed Bandits With Linear Rewards and Individual Observations.
IEEE/ACM Trans. Netw., 2012

Hierarchical Auction Mechanisms for Network Resource Allocation.
IEEE J. Sel. Areas Commun., 2012

Coalitional Games for Transmitter Cooperation in Wireless Networks.
CoRR, 2012

Mechanism Designs for Stochastic Resources for Renewable Energy Integration.
CoRR, 2012

Δ(i)/GI/1: A New Queueing Model For Transitory Queueing Systems.
CoRR, 2012

Network market design part I: bandwidth markets.
IEEE Commun. Mag., 2012

A new transitory queueing model and its process limits.
Proceedings of the 6th International ICST Conference on Performance Evaluation Methodologies and Tools, 2012

A game theoretic model for the Gaussian broadcast channel.
Proceedings of the 2012 IEEE International Symposium on Information Theory, 2012

Incentives for cooperative relaying in a simple information-theoretic model.
Proceedings of the 2012 IEEE International Symposium on Information Theory, 2012

Dynamic pricing of power in Smart-Grid networks.
Proceedings of the 51st IEEE Conference on Decision and Control, 2012

Decentralized learning for multi-player multi-armed bandits.
Proceedings of the 51st IEEE Conference on Decision and Control, 2012

Dominance-constrained Markov decision processes.
Proceedings of the 51st IEEE Conference on Decision and Control, 2012

Characterization of equilibria for the degraded Gaussian broadcast and sum power MAC channels.
Proceedings of the 50th Annual Allerton Conference on Communication, 2012

Multi-player multi-armed bandits: Decentralized learning with IID rewards.
Proceedings of the 50th Annual Allerton Conference on Communication, 2012

The Δ(i)/GI/1 Queue: A new model of transitory queueing.
Proceedings of the 50th Annual Allerton Conference on Communication, 2012

The concert queueing game: to wait or to be late.
Discret. Event Dyn. Syst., 2011

Coalition games for transmitter cooperation in wireless networks.
Proceedings of the 2011 IEEE International Symposium on Information Theory Proceedings, 2011

Hierarchical Auctions for Network Resource Allocation.
Proceedings of the Game Theory for Networks - 2nd International ICST Conference, 2011

Stochastic resource auctions for renewable energy integration.
Proceedings of the 49th Annual Allerton Conference on Communication, 2011

Combinatorial Network Optimization with Unknown Variables: Multi-Armed Bandits with Linear Rewards.
CoRR, 2010

An efficient Nash-implementation mechanism for network resource allocation.
Autom., 2010

Simulation-based optimization of Markov decision processes: An empirical process theory approach.
Autom., 2010

A contracts-based approach for spectrum sharing among cognitive radios.
Proceedings of the 8th International Symposium on Modeling and Optimization in Mobile, Ad-Hoc and Wireless Networks (WiOpt 2010), May 31, 2010

DiffServ pricing games in multi-class queueing network models.
Proceedings of the 22nd International Teletraffic Congress, 2010

Communication games on the generalized Gaussian relay channel.
Proceedings of the 48th Annual Allerton Conference on Communication, 2010

Strategic arrivals into queueing networks.
Proceedings of the 48th Annual Allerton Conference on Communication, 2010

The concert/cafeteria queueing problem: a game of arrivals.
Proceedings of the 4th International Conference on Performance Evaluation Methodologies and Tools, 2009

Queueing game models for differentiated services.
Proceedings of the 1st International Conference on Game Theory for Networks, 2009

Bertrand games between multi-class queues.
Proceedings of the 48th IEEE Conference on Decision and Control, 2009

N-player Bertrand-Cournot games in queues: Existence of equilibrium.
Proceedings of the 46th Annual Allerton Conference on Communication, 2008

Designing a strategic bipartite matching market.
Proceedings of the 46th IEEE Conference on Decision and Control, 2007

Simulation-based Uniform Value Function Estimates of Markov Decision Processes.
SIAM J. Control. Optim., 2006

Mechanisms for Efficient Allocation in Divisible Capacity Networks.
Proceedings of the 45th IEEE Conference on Decision and Control, 2006

Efficient Market Mechanisms for Network Resource Allocation.
Proceedings of the 44th IEEE Conference on Decision and Control and 8th European Control Conference, 2005

Scalar estimation and control with noisy binary observations.
IEEE Trans. Autom. Control., 2004

PAC learning for Markov decision processes and dynamic games.
Proceedings of the 2004 IEEE International Symposium on Information Theory, 2004

Simulation-based uniform value function estimates of discounted and average-reward MDPs.
Proceedings of the 43rd IEEE Conference on Decision and Control, 2004

Combination Exchange Mechanisms for Efficient Bandwidth Allocation.
Commun. Inf. Syst., 2003

Control under communication constraints.
Proceedings of the 41st IEEE Conference on Decision and Control, 2002

Geographical routing using partial information for wireless ad hoc networks.
IEEE Wirel. Commun., 2001

Towards Coarse-Grained Mobile QoS.
Proceedings of Second ACM International Workshop on Wireless Mobile Multimedia, 1999

A Framework for Design & Evaluation of Admission Control Algorithms in Multi-Service Mobile Networks.
Proceedings of IEEE INFOCOM '99, 1999
