Gerald Tesauro

Jian Sun

Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Learning Hierarchical Teaching Policies for Cooperative Agents.

Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

On the Role of Weight Sharing During Deep Option Learning.

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Learning Hierarchical Teaching in Cooperative Multiagent Reinforcement Learning.

CoRR, 2019

Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference.

Proceedings of the 7th International Conference on Learning Representations, 2019

Learning to Teach in Cooperative Multiagent Reinforcement Learning.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Hybrid Reinforcement Learning with Expert State Sequences.

Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

2018

Introduction to the special issue on deep reinforcement learning: An editorial.

Neural Networks, 2018

Learning Abstract Options.

Matthew Riemer

Miao Liu

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Dialog-based Interactive Image Retrieval.

Rogério Schmidt Feris

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Diverse Few-Shot Text Classification with Multiple Metrics.

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2018

Evidence Aggregation for Answer Re-Ranking in Open-Domain Question Answering.

Proceedings of the 6th International Conference on Learning Representations, 2018

Eigenoption Discovery through the Deep Successor Representation.

Proceedings of the 6th International Conference on Learning Representations, 2018

R³: Reinforced Ranker-Reader for Open-Domain Question Answering.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Cognitive Computing.

Mohan Sridharan

James A. Hendler

IEEE Intell. Syst., 2017

The Eigenoption-Critic Framework.

CoRR, 2017

R³: Reinforced Reader-Ranker for Open-Domain Question Answering.

CoRR, 2017

Robust Task Clustering for Deep Many-Task Learning.

CoRR, 2017

Learning to Query, Reason, and Answer Questions On Ambiguous Texts.

Proceedings of the 5th International Conference on Learning Representations, 2017

Optimal Sequential Drilling for Hydrocarbon Field Development Planning.

Ruben Rodriguez Torrado

Jesus Rios

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Multiresolution Recurrent Neural Networks: An Application to Dialogue Response Generation.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Hierarchical Memory Networks.

CoRR, 2016

Selecting Near-Optimal Learners via Incremental Data Allocation.

Ashish Sabharwal

Horst Samulowitz

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Reports of the AAAI 2014 Conference Workshops.

AI Mag., 2015

Towards Cognitive Automation of Data Science.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

Budgeted Prediction with Expert Advice.

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2013

Analysis of Watson's Strategies for Playing Jeopardy!

J. Artif. Intell. Res., 2013

2012

Simulation, learning, and optimization techniques in Watson's game strategies.

IBM J. Res. Dev., 2012

Applying a framework for healthcare incentives simulation.

Proceedings of the Winter Simulation Conference, 2012

Playing repeated Stackelberg games with unknown opponents.

Janusz Marecki

Richard B. Segal

Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, 2012

2010

Bayesian Inference in Monte-Carlo Tree Search.

V. T. Rajan

Richard B. Segal

Proceedings of the UAI 2010, 2010

2009

Monte-Carlo simulation balancing.

David Silver

Proceedings of the 26th Annual International Conference on Machine Learning, 2009

2008

Active Collaborative Prediction with Maximum Margin Matrix Factorization.

Irina Rish

Proceedings of the International Symposium on Artificial Intelligence and Mathematics, 2008

Autonomic multi-agent management of power and performance in data centers.

Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

2007

Metric Learning for Kernel Regression.

Kilian Q. Weinberger

Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, 2007

Reinforcement Learning in Autonomic Computing: A Manifesto and Case Studies.

IEEE Internet Comput., 2007

On the use of hybrid reinforcement learning for autonomic resource allocation.

Clust. Comput., 2007

Managing Power Consumption and Performance of Computing Systems Using Reinforcement Learning.

Freeman L. Rawson III

Charles Lefurgy

Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Estimating End-to-End Performance by Collaborative Prediction with Active Sampling.

Irina Rish

Proceedings of the Integrated Network Management, 2007

Coordinating Multiple Autonomic Managers to Achieve Specified Power-Performance Tradeoffs.

Freeman L. Rawson III

Charles Lefurgy

Proceedings of the Fourth International Conference on Autonomic Computing (ICAC'07), 2007

2006

A Hybrid Reinforcement Learning Approach to Autonomic Resource Allocation.

Proceedings of the 3rd International Conference on Autonomic Computing, 2006

Improvement of Systems Management Policies Using Hybrid Reinforcement Learning.

Proceedings of the Machine Learning: ECML 2006, 2006

2005

Utility-Function-Driven Resource Allocation in Autonomic Systems.

Proceedings of the Second International Conference on Autonomic Computing (ICAC 2005), 2005

Online Resource Allocation Using Decompositional Reinforcement Learning.

Proceedings, 2005

New Approaches to Optimization and Utility Elicitation in Autonomic Computing.

Proceedings, 2005

2004

Utility Functions in Autonomic Systems.

Proceedings of the 1st International Conference on Autonomic Computing (ICAC 2004), 2004

A Multi-Agent Systems Approach to Autonomic Computing.

Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004

2003

Cooperative Negotiation in Autonomic Systems using Incremental Utility Elicitation.

Proceedings of the UAI '03, 2003

A strategic decision model for multi-attribute bilateral negotiation with alternating.

Cuihong Li

Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

Multi-agent implementation of asymmetric protocol for bilateral negotiations.

Proceedings of the 4th ACM Conference on Electronic Commerce (EC-2003), 2003

Extending Q-Learning to General Adaptive Multi-Agent Systems.

Proceedings of the Advances in Neural Information Processing Systems 16, 2003

2002

Programming backgammon using self-teaching neural nets.

Artif. Intell., 2002

Pricing in Agent Economies Using Multi-Agent Q-Learning.

Auton. Agents Multi Agent Syst., 2002

Strategic sequential bidding in auctions using dynamic programming.

Jonathan Bredin

Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

2001

High-performance bidding agents for the continuous double auction.

Rajarshi Das

Proceedings of the 3rd ACM Conference on Electronic Commerce (EC-2001), 2001

Pricing in Agent Economies Using Neural Networks and Multi-agent Q-Learning.

Proceedings of the Sequence Learning - Paradigms, Algorithms, and Applications, 2001

Agent-Human Interactions in the Continuous Double Auction.

Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, 2001

2000

Foresight-based pricing algorithms in agent economies.

Decis. Support Syst., 2000

Multi-agent Q-learning and Regression Trees for Automated Pricing Decisions.

Manu Sridharan

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Pseudo-convergent Q-Learning by Competitive Pricebots.

Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

1999

Strategic pricebot dynamics.

Amy Greenwald

Proceedings of the First ACM Conference on Electronic Commerce (EC-99), 1999

1998

Comments on "Co-Evolution in the Successful Learning of Backgammon Strategy".

Mach. Learn., 1998

Foresight-based pricing algorithms in an economy of software agents.

Proceedings of the First International Conference on Information and Computation Economies, 1998

1996

On-line Policy Improvement using Monte-Carlo Search.

Gregory R. Galperin

Proceedings of the Advances in Neural Information Processing Systems 9, 1996

1995

Temporal Difference Learning and TD-Gammon.

J. Int. Comput. Games Assoc., 1995

Biologically Inspired Defenses Against Computer Viruses.

Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995

1994

TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play.

Neural Comput., 1994

1992

How Tight Are the Vapnik-Chervonenkis Bounds?

David A. Cohn

Neural Comput., 1992

Practical Issues in Temporal Difference Learning.

Mach. Learn., 1992

Temporal Difference Learning of Backgammon Strategy.

Proceedings of the Ninth International Workshop on Machine Learning (ML 1992), 1992

1991

Visualizing processes in neural networks.

Jakub Wejchert

IBM J. Res. Dev., 1991

1990

Can Neural Networks Do Better Than the Vapnik-Chervonenkis Bounds?

David A. Cohn

Proceedings of the Advances in Neural Information Processing Systems 3, 1990

Neurogammon: a neural-network backgammon program.

Proceedings of the IJCNN 1990, 1990

1989

Asymptotic Convergence of Backpropagation.

Yu He

Neural Comput., 1989

Neurogammon Wins Computer Olympiad.

Neural Comput., 1989

A Parallel Network that Learns to Play Backgammon.

Terrence J. Sejnowski

Artif. Intell., 1989

Neural Network Visualization.

Jakub Wejchert

Proceedings of the Advances in Neural Information Processing Systems 2, 1989

Asymptotic Convergence of Backpropagation: Numerical Experiments.

Yu He

Proceedings of the Advances in Neural Information Processing Systems 2, 1989

1988

A study of scaling and generalization in neural networks.

Neural Networks, 1988

Scaling Relationships in Back-propagation Learning.

Bob Janssens

Complex Syst., 1988

Connectionist Learning of Expert Preferences by Comparison Training.

Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Scaling and Generalization in Neural Networks: A Case Study.

Proceedings of the Advances in Neural Information Processing Systems 1, 1988

Connectionist Learning of Expert Backgammon Evaluations.

Proceedings of the Machine Learning, 1988

1987

Scaling Relationships in Back-Propagation Learning: Dependence on Training Set Size.

Complex Syst., 1987

A 'Neural' Network that Learns to Play Backgammon.
