Michael L. Littman

ORCID: 0000-0002-5596-1840

Affiliations:
  • Brown University, Department of Computer Science, Providence, RI, USA
  • National Science Foundation (NSF), Information and Intelligent Systems Division, Alexandria, VA, USA
  • Rutgers University, Department of Computer Science, Piscataway, NJ, USA (former)
  • AT&T Labs Research, Florham Park, NJ, USA (former)
  • Duke University, Department of Computer Science, Durham, NC, USA (former)
  • Bellcore, Morristown, NJ, USA (former)


According to our database, Michael L. Littman authored at least 270 papers between 1989 and 2024.

Awards

ACM Fellow (2018), "For contributions to the design and analysis of sequential decision making algorithms in artificial intelligence".

Bibliography

2024
Mitigating Partial Observability in Sequential Decision Processes via the Lambda Discrepancy.
CoRR, 2024

Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages.
CoRR, 2024

Tiered Reward: Designing Rewards for Specification and Fast Learning of Desired Behavior.
Proceedings of the 1st Reinforcement Learning Conference, 2024

On Welfare-Centric Fair Reinforcement Learning.
Proceedings of the 1st Reinforcement Learning Conference, 2024

2023
NSF on Chien's Grand Challenge for Sustainability.
Commun. ACM, May, 2023

A domain-agnostic approach for characterization of lifelong learning systems.
Neural Networks, March, 2023

Software Engineering of Machine Learning Systems.
Commun. ACM, February, 2023

Meta-learning Parameterized Skills.
Proceedings of the International Conference on Machine Learning, 2023

Coarse-Grained Smoothness for Reinforcement Learning in Metric Spaces.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Computably Continuous Reinforcement-Learning Objectives Are PAC-Learnable.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Helping Users Debug Trigger-Action Programs.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2022

Specifying Behavior Preference with Tiered Reward Functions.
CoRR, 2022

Evaluation Beyond Task Performance: Analyzing Concepts in AlphaZero in Hex.
CoRR, 2022

Reward-Predictive Clustering.
CoRR, 2022

Gathering Strength, Gathering Storms: The One Hundred Year Study on Artificial Intelligence (AI100) 2021 Study Panel Report.
CoRR, 2022

Meta-Learning Transferable Parameterized Skills.
CoRR, 2022

Designing Rewards for Fast Learning.
CoRR, 2022

Does DQN really learn? Exploring adversarial training schemes in Pong.
CoRR, 2022

Evaluation beyond Task Performance: Analyzing Concepts in AlphaZero in Hex.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Model-based Lifelong Reinforcement Learning with Bayesian Exploration.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Faster Deep Reinforcement Learning with Slower Online Network.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Explaining Why: How Instructions and User Interfaces Impact Annotator Rationales When Labeling Text Data.
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2022

On the (In)Tractability of Reinforcement Learning for LTL Objectives.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

On the Expressivity of Markov Reward (Extended Abstract).
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

2021
Deep Q-Network with Proximal Iteration.
CoRR, 2021

Learning Generalizable Behavior via Visual Rewrite Rules.
CoRR, 2021

Reinforcement Learning for General LTL Objectives Is Intractable.
CoRR, 2021

Learning Finite Linear Temporal Logic Specifications with a Specialized Neural Operator.
CoRR, 2021

Coarse-Grained Smoothness for RL in Metric Spaces.
CoRR, 2021

Bad-Policy Density: A Measure of Reinforcement Learning Hardness.
CoRR, 2021

Convergence of a Human-in-the-Loop Policy-Gradient Algorithm With Eligibility Trace Under Reward, Policy, and Advantage Feedback.
CoRR, 2021

Brittle AI, Causal Confusion, and Bad Mental Models: Challenges and Successes in the XAI Program.
CoRR, 2021

Control of mental representations in human planning.
CoRR, 2021

Model Selection's Disparate Impact in Real-World Deep Learning Applications.
CoRR, 2021

Collusion rings threaten the integrity of computer science research.
Commun. ACM, 2021

On the Expressivity of Markov Reward.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Understanding Trigger-Action Programs Through Novel Visualizations of Program Differences.
Proceedings of the CHI '21: CHI Conference on Human Factors in Computing Systems, 2021

Towards Sample Efficient Agents through Algorithmic Alignment (Student Abstract).
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Lipschitz Lifelong Reinforcement Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Deep Radial-Basis Value Functions for Continuous Control.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Reward-predictive representations generalize across tasks in reinforcement learning.
PLoS Comput. Biol., 2020

Successor Features Combine Elements of Model-Free and Model-based Reinforcement Learning.
J. Mach. Learn. Res., 2020

Trace2TAP: Synthesizing Trigger-Action Programs from Traces of Behavior.
Proc. ACM Interact. Mob. Wearable Ubiquitous Technol., 2020

Task Scoping: Building Goal-Specific Abstractions for Planning in Complex Domains.
CoRR, 2020

Towards Sample Efficient Agents through Algorithmic Alignment.
CoRR, 2020

The Efficiency of Human Cognition Reflects Planned Information Processing.
CoRR, 2020

Learning State Abstractions for Transfer in Continuous Control.
CoRR, 2020

Deep RBF Value Functions for Continuous Control.
CoRR, 2020

Applying prerequisite structure inference to adaptive testing.
Proceedings of the LAK '20: 10th International Conference on Learning Analytics and Knowledge, 2020

Teaching a Robot Tasks of Arbitrary Complexity via Human Feedback.
Proceedings of the HRI '20: ACM/IEEE International Conference on Human-Robot Interaction, 2020

Value Preserving State-Action Abstractions.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Context-Driven Satirical News Generation.
Proceedings of the Second Workshop on Figurative Language Processing, 2020

Task Scoping for Efficient Planning in Open Worlds (Student Abstract).
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

People Do Not Just Plan, They Plan to Plan.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Decision trees.
Inroads, 2019

Individual predictions matter: Assessing the effect of data ordering in training fine-tuned CNNs for medical imaging.
CoRR, 2019

Interactive Learning of Environment Dynamics for Sequential Tasks.
CoRR, 2019

Combating the Compounding-Error Problem with a Multi-step Model.
CoRR, 2019

Teaching with IMPACT.
CoRR, 2019

Deep Reinforcement Learning from Policy-Dependent Human Feedback.
CoRR, 2019

Successor Features Support Model-based and Model-free Reinforcement Learning.
CoRR, 2019

ReNeg and Backseat Driver: Learning from Demonstration with Continuous Human Feedback.
CoRR, 2019

Stackelberg Punishment and Bully-Proofing Autonomous Vehicles.
Proceedings of the Social Robotics - 11th International Conference, 2019

Evidence Humans Provide When Explaining Data-Labeling Decisions.
Proceedings of the Human-Computer Interaction - INTERACT 2019, 2019

DeepMellow: Removing the Need for a Target Network in Deep Q-Learning.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

The Expected-Length Model of Options.
Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, 2019

Finding Options that Minimize Planning Time.
Proceedings of the 36th International Conference on Machine Learning, 2019

How Users Interpret Bugs in Trigger-Action Programming.
Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, 2019

Removing the Target Network from Deep Q-Networks with the Mellowmax Operator.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Theory of Minds: Understanding Behavior in Groups through Inverse Planning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

State Abstraction as Compression in Apprenticeship Learning.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018
Curriculum Design for Machine Learners in Sequential Decision Tasks.
IEEE Trans. Emerg. Top. Comput. Intell., 2018

Evolutionary Huffman encoding.
Inroads, 2018

Measuring and Characterizing Generalization in Deep Reinforcement Learning.
CoRR, 2018

Mitigating Planner Overfitting in Model-Based Reinforcement Learning.
CoRR, 2018

Towards a Simple Approach to Multi-step Model-based Reinforcement Learning.
CoRR, 2018

Finding Options that Minimize Planning Time.
CoRR, 2018

Personalized Education at Scale.
CoRR, 2018

Transfer with Model Features in Reinforcement Learning.
CoRR, 2018

Equivalence Between Wasserstein and Value-Aware Model-based Reinforcement Learning.
CoRR, 2018

Lipschitz Continuity in Model-based Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Policy and Value Transfer in Lifelong Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

State Abstractions for Lifelong Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Effectively Learning from Pedagogical Demonstrations.
Proceedings of the 40th Annual Meeting of the Cognitive Science Society, 2018

Bandit-Based Solar Panel Control.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Learning Approximate Stochastic Transition Models.
CoRR, 2017

Summable Reparameterizations of Wasserstein Critics in the One-Dimensional Setting.
CoRR, 2017

Mean Actor Critic.
CoRR, 2017

Advantages and Limitations of using Successor Features for Transfer in Reinforcement Learning.
CoRR, 2017

Interactive Learning from Policy-Dependent Human Feedback.
CoRR, 2017

Environment-Independent Task Specifications via GLTL.
CoRR, 2017

Latent Attention Networks.
CoRR, 2017

Ask Me Anything about MOOCs.
AI Mag., 2017

Interactive Learning from Policy-Dependent Human Feedback.
Proceedings of the 34th International Conference on Machine Learning, 2017

An Alternative Softmax Operator for Reinforcement Learning.
Proceedings of the 34th International Conference on Machine Learning, 2017

Teaching by Intervention: Working Backwards, Undoing Mistakes, or Correcting Mistakes?
Proceedings of the 39th Annual Meeting of the Cognitive Science Society, 2017

Planning with Abstract Markov Decision Processes.
Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, 2017

2016
A New Softmax Operator for Reinforcement Learning.
CoRR, 2016

Learning behaviors via human-delivered discrete feedback: modeling implicit feedback strategies to speed up learning.
Auton. Agents Multi Agent Syst., 2016

Learning User's Preferred Household Organization via Collaborative Filtering Methods.
Proceedings of the Joint Workshop on Interfaces and Human Decision Making for Recommender Systems co-located with ACM Conference on Recommender Systems (RecSys 2016), 2016

Showing versus doing: Teaching by demonstration.
Proceedings of the Advances in Neural Information Processing Systems 29: Annual Conference on Neural Information Processing Systems 2016, 2016

Peer Reviewing Short Answers using Comparative Judgement.
Proceedings of the Third ACM Conference on Learning @ Scale, 2016

Near Optimal Behavior via Approximate State Abstraction.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Coordinate to cooperate or compete: Abstract goals and joint intentions in social interaction.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

Feature-based Joint Planning and Norm Learning in Collaborative Games.
Proceedings of the 38th Annual Meeting of the Cognitive Science Society, 2016

Trigger-Action Programming in the Wild: An Analysis of 200,000 IFTTT Recipes.
Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, 2016

A Need for Speed: Adapting Agent Action Speed to Improve Task Learning from Non-Expert Humans.
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016

Towards Behavior-Aware Model Learning from Human-Generated Trajectories.
Proceedings of the 2016 AAAI Fall Symposia, Arlington, Virginia, USA, November 17-19, 2016

Reinforcement Learning as a Framework for Ethical Decision Making.
Proceedings of the AAAI Workshop on AI, Ethics, and Society, 2016

2015
Reinforcement learning improves behaviour from evaluative feedback.
Nature, 2015

Who speaks for AI?
AI Matters, 2015

Grounding English Commands to Reward Functions.
Proceedings of the Robotics: Science and Systems XI, Sapienza University of Rome, 2015

Between Imitation and Intention Learning.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Teaching with Rewards and Punishments: Reinforcement or Communication?
Proceedings of the 37th Annual Meeting of the Cognitive Science Society, 2015

2014
Learning something from nothing: Leveraging implicit human feedback strategies.
Proceedings of the 23rd IEEE International Symposium on Robot and Human Interactive Communication, 2014

Flexible theft and resolute punishment: Evolutionary dynamics of social behavior among reinforcement-learning agents.
Proceedings of the 36th Annual Meeting of the Cognitive Science Society, 2014

Practical trigger-action programming in the smart home.
Proceedings of the CHI Conference on Human Factors in Computing Systems, 2014

Quantifying Uncertainty in Batch Personalized Sequential Decision Making.
Proceedings of the Modern Artificial Intelligence for Health Analytics, 2014

A Strategy-Aware Technique for Learning Behaviors from Discrete Human Feedback.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Coco-Q: Learning in Stochastic Games with Side Payments.
Proceedings of the 30th International Conference on Machine Learning, 2013

The Cross-Entropy Method Optimizes for Quantiles.
Proceedings of the 30th International Conference on Machine Learning, 2013

Open-Loop Planning in Large-Scale Stochastic Domains.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

An Ensemble of Linearly Combined Reinforcement-Learning Agents.
Proceedings of the Late-Breaking Developments in the Field of Artificial Intelligence, 2013

AAAI-13 Preface.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012
Learning web-service task descriptions from traces.
Web Intell. Agent Syst., 2012

On the Computational Complexity of Stochastic Controller Optimization in POMDPs.
ACM Trans. Comput. Theory, 2012

Inducing Partially Observable Markov Decision Processes.
Proceedings of the Eleventh International Conference on Grammatical Inference, 2012

A new way to search game trees: technical perspective.
Commun. ACM, 2012

Rollout-based Game-tree Search Outprunes Traditional Alpha-beta.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Planning in Reward-Rich Domains via PAC Bandits.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Bandit-Based Planning and Learning in Continuous-Action Markov Decision Processes.
Proceedings of the Twenty-Second International Conference on Automated Planning and Scheduling, 2012

A framework for modeling population strategies by depth of reasoning.
Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

Covering Number as a Complexity Measure for POMDP Planning and Learning.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Puzzle: baffling raffling.
SIGecom Exch., 2011

Introduction to the special issue on empirical evaluations in reinforcement learning.
Mach. Learn., 2011

Knows what it knows: a framework for self-aware learning.
Mach. Learn., 2011

Integrating machine learning in ad hoc routing: A wireless adaptive routing protocol.
Int. J. Commun. Syst., 2011

Most Relevant Explanation: computational complexity and approximation methods.
Ann. Math. Artif. Intell., 2011

Democratic approximation of lexicographic preference models.
Artif. Intell., 2011

Learning is planning: near Bayes-optimal reinforcement learning via Monte-Carlo tree search.
Proceedings of the UAI 2011, 2011

Apprenticeship Learning About Multiple Intentions.
Proceedings of the 28th International Conference on Machine Learning, 2011

Scratchable Devices: User-Friendly Programming for Household Appliances.
Proceedings of the Human-Computer Interaction. Towards Mobile and Intelligent Interaction Environments, 2011

The effects of selection on noisy fitness optimization.
Proceedings of the 13th Annual Genetic and Evolutionary Computation Conference, 2011

Using iterated reasoning to predict opponent strategies.
Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Sample-Based Planning for Continuous Action Markov Decision Processes.
Proceedings of the 21st International Conference on Automated Planning and Scheduling, 2011

2010
Dimension reduction and its application to model-based exploration in continuous spaces.
Mach. Learn., 2010

Reducing reinforcement learning to KWIK online regression.
Ann. Math. Artif. Intell., 2010

Broadening student enthusiasm for computer science with a great insights course.
Proceedings of the 41st ACM technical symposium on Computer science education, 2010

Classes of Multiagent Q-learning Dynamics with epsilon-greedy Exploration.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

Generalizing Apprenticeship Learning across Hypothesis Classes.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

A Cognitive Hierarchy Model Applied to the Lemonade Game.
Proceedings of the Interactive Decision Theory and Game Theory, 2010

Integrating Sample-Based Planning and Model-Based Reinforcement Learning.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

Efficient Apprenticeship Learning with Smart Humans.
Proceedings of the Enabling Intelligence through Middleware, 2010

Learning Lexicographic Preference Models.
Proceedings of the Preference Learning, 2010

2009
Hierarchical Reinforcement Learning.
Proceedings of the Encyclopedia of Artificial Intelligence (3 Volumes), 2009

Reinforcement Learning in Finite MDPs: PAC Analysis.
J. Mach. Learn. Res., 2009

Provably Efficient Learning with Typed Parametric Models.
J. Mach. Learn. Res., 2009

Learning and planning in environments with delayed feedback.
Auton. Agents Multi Agent Syst., 2009

Exploring compact reinforcement-learning representations with linear regression.
Proceedings of the UAI 2009, 2009

A Bayesian Sampling Approach to Exploration in Reinforcement Learning.
Proceedings of the UAI 2009, 2009

Online exploration in least-squares policy iteration.
Proceedings of the 8th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2009), 2009

2008
An analysis of model-based Interval Estimation for Markov Decision Processes.
J. Comput. Syst. Sci., 2008

Optimization problems involving collections of dependent objects.
Ann. Oper. Res., 2008

A Polynomial-time Nash Equilibrium Algorithm for Repeated Stochastic Games.
Proceedings of the UAI 2008, 2008

CORL: A Continuous-state Offset-dynamics Reinforcement Learner.
Proceedings of the UAI 2008, 2008

Autonomous Model Learning for Reinforcement Learning.
Proceedings of the Fifth International Conference on the Quantitative Evaluation of Systems (QEST 2008), 2008

Multi-resolution Exploration in Continuous Spaces.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Efficient Value-Function Approximation via Online Linear Regression.
Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning.
Proceedings of the Machine Learning, 2008

Knows what it knows: a framework for self-aware learning.
Proceedings of the Machine Learning, 2008

An object-oriented representation for efficient reinforcement learning.
Proceedings of the Machine Learning, 2008

Social reward shaping in the prisoner's dilemma.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Efficient Learning of Action Schemas and Web-Service Descriptions.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

Potential-based Shaping in Model-based Reinforcement Learning.
Proceedings of the Twenty-Third AAAI Conference on Artificial Intelligence, 2008

2007
Introduction to the special issue on learning and computational game theory.
Mach. Learn., 2007

A hierarchy of prescriptive goals for multiagent learning.
Artif. Intell., 2007

Online Linear Regression and Its Application to Model-Based Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Analyzing feature generation for value-function approximation.
Proceedings of the Machine Learning, 2007

Planning and Learning in Environments with Delayed Feedback.
Proceedings of the Machine Learning: ECML 2007, 2007

A Multiple Representation Approach to Learning Dynamical Systems.
Proceedings of the Computational Approaches to Representation Change during Learning and Development, 2007

Efficient Structure Learning in Factored-State MDPs.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

Efficient Reinforcement Learning with Relocatable Action Models.
Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence, 2007

2006
Incremental Model-based Learners With Formal Learning-Time Guarantees.
Proceedings of the UAI '06, 2006

An Efficient Optimal-Equilibrium Algorithm for Two-player Game Trees.
Proceedings of the UAI '06, 2006

Towards a Unified Theory of State Abstraction for MDPs.
Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2006

Experience-efficient learning in associative bandit problems.
Proceedings of the Machine Learning, 2006

PAC model-free reinforcement learning.
Proceedings of the Machine Learning, 2006

A hierarchical approach to efficient reinforcement learning in deterministic domains.
Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), 2006

A Change Detection Model for Non-Stationary k-Armed Bandit Problems.
Proceedings of the Between a Rock and a Hard Place: Cognitive Science Principles Meet AI-Hard Problems, 2006

Targeting Specific Distributions of Trajectories in MDPs.
Proceedings of the Twenty-First National Conference on Artificial Intelligence, 2006

2005
Corpus-based Learning of Analogies and Semantic Relations.
Mach. Learn., 2005

The First Probabilistic Track of the International Planning Competition.
J. Artif. Intell. Res., 2005

A polynomial-time Nash equilibrium algorithm for repeated games.
Decis. Support Syst., 2005

Reports on the 2004 AAAI Fall Symposia.
AI Mag., 2005

Efficient Exploration With Latent Structure.
Proceedings of the Robotics: Science and Systems I, 2005

Cyclic Equilibria in Markov Games.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

A theoretical analysis of Model-Based Interval Estimation.
Proceedings of the Machine Learning, 2005

Activity Recognition from Accelerometer Data.
Proceedings of the Twentieth National Conference on Artificial Intelligence, 2005

Lazy Approximation for Solving Continuous Finite-Horizon MDPs.
Proceedings of the Twentieth National Conference on Artificial Intelligence, 2005

2004
An Empirical Evaluation of Interval Estimation for Markov Decision Processes.
Proceedings of the 16th IEEE International Conference on Tools with Artificial Intelligence (ICTAI 2004), 2004

Planning with predictive state representations.
Proceedings of the 2004 International Conference on Machine Learning and Applications, 2004

Reinforcement Learning for Autonomic Network Repair.
Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), 2004

An Instance-Based State Representation for Network Repair.
Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003
Measuring praise and criticism: Inference of semantic orientation from association.
ACM Trans. Inf. Syst., 2003

Decision-Theoretic Bidding Based on Learned Density Models in Simultaneous, Interacting Auctions.
J. Artif. Intell. Res., 2003

Learning Analogies and Semantic Relations.
CoRR, 2003

Combining Independent Modules to Solve Multiple-choice Synonym and Analogy Problems.
CoRR, 2003

AAAI-2002 Fall Symposium Series.
AI Mag., 2003

Contingent planning under uncertainty via stochastic satisfiability.
Artif. Intell., 2003

Combining independent modules in lexical multiple-choice problems.
Proceedings of the Recent Advances in Natural Language Processing III, 2003

Learning Predictive State Representations.
Proceedings of the Machine Learning, 2003

Tutorial: Learning Topics in Game-Theoretic Decision Making.
Proceedings of the Computational Learning Theory and Kernel Machines, 2003

2002
Unsupervised Learning of Semantic Orientation from a Hundred-Billion-Word Corpus.
CoRR, 2002

A probabilistic approach to solving crossword puzzles.
Artif. Intell., 2002

Least-Squares Methods in Reinforcement Learning for Control.
Proceedings of the Methods and Applications of Artificial Intelligence, 2002

Modeling Auction Price Uncertainty Using Boosting-based Conditional Density Estimation.
Proceedings of the Machine Learning, 2002

Randomized strategic demand reduction: getting more by asking for less.
Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

ATTac-2001: A Learning, Autonomous Bidding Agent.
Proceedings of the Agent-Mediated Electronic Commerce IV, 2002

Self-Enforcing Strategic Demand Reduction.
Proceedings of the Agent-Mediated Electronic Commerce IV, 2002

2001
Stochastic Boolean Satisfiability.
J. Autom. Reason., 2001

ATTac-2000: An Adaptive Autonomous Bidding Agent.
J. Artif. Intell. Res., 2001

Learning to Select Branching Rules in the DPLL Procedure for Satisfiability.
Electron. Notes Discret. Math., 2001

Value-function reinforcement learning in Markov games.
Cogn. Syst. Res., 2001

FAucS: An FCC Spectrum Auction Simulator for Autonomous Bidding Agents.
Proceedings of the Electronic Commerce, Second International Workshop, 2001

Graphical Models for Game Theory.
Proceedings of the UAI '01: Proceedings of the 17th Conference on Uncertainty in Artificial Intelligence, 2001

Approximate Dimension Reduction at NTCIR.
Proceedings of the Second NTCIR Workshop Meeting on Evaluation of Chinese & Japanese Text Retrieval and Text Summarization, 2001

Predictive Representations of State.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

An Efficient, Exact Algorithm for Solving Tree-Structured Graphical Games.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

PAC Generalization Bounds for Co-training.
Proceedings of the Advances in Neural Information Processing Systems 14, 2001

Friend-or-Foe Q-learning in General-Sum Games.
Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), 2001

Implicit Negotiation in Repeated Games.
Proceedings of the Intelligent Agents VIII, 8th International Workshop, 2001

2000
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms.
Mach. Learn., 2000

A Review of Reinforcement Learning.
AI Mag., 2000

Exact Solutions to Time-Dependent MDPs.
Proceedings of the Advances in Neural Information Processing Systems 13, 2000

Algorithm Selection using Reinforcement Learning.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), 2000

Approximate Dimension Equalization in Vector-based Information Retrieval.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), 2000

Abstraction Methods for Game Theoretic Poker.
Proceedings of the Computers and Games, Second International Conference, 2000

Review: Computer Language Games.
Proceedings of the Computers and Games, Second International Conference, 2000

Towards Approximately Optimal Poker.
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, 2000

Reinforcement Learning for Algorithm Selection.
Proceedings of the Seventeenth National Conference on Artificial Intelligence and Twelfth Conference on Innovative Applications of Artificial Intelligence, 2000

1999
A Unified Analysis of Value-Function-Based Reinforcement Learning Algorithms.
Neural Comput., 1999

The AAAI Fall Symposia.
AI Mag., 1999

Solving Crossword Puzzles as Probabilistic Constraint Satisfaction.
Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

Solving Crosswords with PROVERB.
Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

Initial Experiments in Stochastic Satisfiability.
Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

PROVERB: The Probabilistic Cruciverbalist.
Proceedings of the Sixteenth National Conference on Artificial Intelligence and Eleventh Conference on Innovative Applications of Artificial Intelligence, 1999

1998
The Computational Complexity of Probabilistic Planning.
J. Artif. Intell. Res., 1998

Planning and Acting in Partially Observable Stochastic Domains.
Artif. Intell., 1998

Learning a Language-Independent Representation for Terms from a Partially Aligned Corpus.
Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

MAXPLAN: A New Approach to Probabilistic Planning.
Proceedings of the Fourth International Conference on Artificial Intelligence Planning Systems, 1998

Using Caching to Solve Larger Probabilistic Planning Problems.
Proceedings of the Fifteenth National Conference on Artificial Intelligence and Tenth Innovative Applications of Artificial Intelligence Conference, 1998

1997
The Complexity of Plan Existence and Evaluation in Probabilistic Domains.
Proceedings of the UAI '97: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, 1997

Incremental Pruning: A Simple, Fast, Exact Method for Partially Observable Markov Decision Processes.
Proceedings of the UAI '97: Proceedings of the Thirteenth Conference on Uncertainty in Artificial Intelligence, 1997

Automatic 3-Language Cross-Language Information Retrieval with Latent Semantic Indexing.
Proceedings of The Sixth Text REtrieval Conference, 1997

Probabilistic Propositional Planning: Representations and Complexity.
Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, 1997

Speeding Safely: Multi-Criteria Optimization in Probabilistic Planning.
Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Innovative Applications of Artificial Intelligence Conference, 1997

1996
Reinforcement Learning: A Survey.
J. Artif. Intell. Res., 1996

Taggers for Parsers.
Artif. Intell., 1996

A Generalized Reinforcement-Learning Model: Convergence and Applications.
Proceedings of the Machine Learning, 1996

1995
On the Complexity of Solving Markov Decision Problems.
Proceedings of the UAI '95: Proceedings of the Eleventh Annual Conference on Uncertainty in Artificial Intelligence, 1995

Partially Observable Markov Decision Processes for Artificial Intelligence.
Proceedings of the KI-95: Advances in Artificial Intelligence, 1995

Learning Policies for Partially Observable Environments: Scaling Up.
Proceedings of the Machine Learning, 1995

1994
Markov Games as a Framework for Multi-Agent Reinforcement Learning.
Proceedings of the Machine Learning, 1994

Acting Optimally in Partially Observable Stochastic Domains.
Proceedings of the 12th National Conference on Artificial Intelligence, Seattle, WA, USA, 1994

1993
Packet Routing in Dynamically Changing Networks: A Reinforcement Learning Approach.
Proceedings of the Advances in Neural Information Processing Systems 6, 1993

An interface for navigating clustered document sets returned by queries.
Proceedings of the Conference on Organizational Computing Systems, 1993

1992
Supporting Informal Communication via Ephemeral Interest Groups.
Proceedings of the Conference on Computer Supported Cooperative Work (CSCW '92), Toronto, Canada, 1992

1991
Adaptation in Constant Utility Non-Stationary Environments.
Proceedings of the 4th International Conference on Genetic Algorithms, 1991

Hypertext for the Electronic Library? CORE Sample Results.
Proceedings of Hypertext '91, San Antonio, Texas, USA, 1991

1989
Generalization and Scaling in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 2, 1989

