Marcello Restelli

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions.

Giorgia Ramponi

Amarildo Likmeta

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration.

Andrea Battistello

Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies.

Mirco Mutti

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Gradient-Aware Model-Based Policy Search.

Pierluca D'Oro

Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019

Dealing with Interdependencies and Uncertainty in Multi-Channel Advertising Campaigns Optimization.

Alessandro Nuara

Nicola Sosio

Francesco Trovò

Maria Chiara Zaccardi

Proceedings of the World Wide Web Conference, 2019

Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters.

Amarildo Likmeta

Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Exploration Driven by an Optimistic Bellman Equation.

Proceedings of the International Joint Conference on Neural Networks, 2019

Exploiting Action-Value Uncertainty to Drive Exploration in Reinforcement Learning.

Carlo D'Eramo

Andrea Cini

Proceedings of the International Joint Conference on Neural Networks, 2019

Feature Selection via Mutual Information: New Theoretical Insights.

Mario Beraha

Proceedings of the International Joint Conference on Neural Networks, 2019

Transfer of Samples in Policy Search via Multiple Importance Sampling.

Mattia Salvini

Proceedings of the 36th International Conference on Machine Learning, 2019

Optimistic Policy Optimization via Multiple Importance Sampling.

Lorenzo Lupo

Proceedings of the 36th International Conference on Machine Learning, 2019

Reinforcement Learning in Configurable Continuous Environments.

Emanuele Ghelfi

Proceedings of the 36th International Conference on Machine Learning, 2019

2018

Improving multi-armed bandit algorithms in online pricing settings.

Int. J. Approx. Reason., 2018

Transfer of Value Functions via Variational Methods.

Rafael Rodríguez-Sánchez

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Policy Optimization via Importance Sampling.

Francesco Faccio

Alessandra Laura Giulia Pedrocchi

Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Does Reinforcement Learning outperform PID in the control of FES-induced elbow flex-extension?

Simona Ferrante

Proceedings of the 2018 IEEE International Symposium on Medical Measurements and Applications, 2018

Targeting Optimization for Internet Advertising by Learning from Logged Bandit Feedback.

Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Importance Weighted Transfer of Samples in Reinforcement Learning.

Proceedings of the 35th International Conference on Machine Learning, 2018

Stochastic Variance-Reduced Policy Gradient.

Proceedings of the 35th International Conference on Machine Learning, 2018

Configurable Markov Decision Processes.

Mirco Mutti

Proceedings of the 35th International Conference on Machine Learning, 2018

A Combinatorial-Bandit Algorithm for the Online Joint Bid/Budget Optimization of Pay-per-Click Advertising Campaigns.

Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017

Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent.

CoRR, 2017

Regret Minimization Algorithms for the Followers Behaviour Identification in Leadership Games.

Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Gradient-based minimization for multi-expert Inverse Reinforcement Learning.

Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Exploiting structure and uncertainty of Bellman updates in Markov decision processes.

Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Adaptive Batch Size for Safe Policy Gradients.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Compatible Reward Inverse Reinforcement Learning.

Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

User context estimation for public travel assistance and intelligent service scheduling.

Proceedings of the 20th IEEE International Conference on Intelligent Transportation Systems, 2017

Risk-averse trees for learning from logged bandit feedback.

Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Boosted Fitted Q-Iteration.

Proceedings of the 34th International Conference on Machine Learning, 2017

Designing Learning Algorithms over the Sequence Form of an Extensive-Form Game.

Edoardo Manino

Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 2017

Unimodal Thompson Sampling for Graph-Structured Arms.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Estimating the Maximum Expected Value in Continuous Reinforcement Learning Problems.

Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016

Policy Search for the Optimal Control of Markov Decision Processes: A Novel Particle-Based Iterative Scheme.

IEEE Trans. Cybern., 2016

Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation.

Simone Parisi

J. Artif. Intell. Res., 2016

Extensive-form games with heterogeneous populations: solution concepts, equilibria characterization, learning dynamics.

Intelligenza Artificiale, 2016

Reconstruction of public transport state.

Proceedings of the 19th IEEE International Conference on Intelligent Transportation Systems, 2016

Estimating Maximum Expected Value through Gaussian Approximation.

Carlo D'Eramo

Alessandro Nuara

Proceedings of the 33rd International Conference on Machine Learning, 2016

Budgeted Multi-Armed Bandit in Continuous Action Space.

Proceedings of the ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, 2016

Inverse Reinforcement Learning through Policy Gradient Minimization.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Sequence-Form and Evolutionary Dynamics: Realization Equivalence to Agent Form and Logit Dynamics.

Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015

Policy gradient in Lipschitz Markov Decision Processes.

Luca Bascetta

Mach. Learn., 2015

Sparse multi-task reinforcement learning.

Daniele Calandriello

Intelligenza Artificiale, 2015

Following Newton direction in Policy Gradient with parameter exploration.

Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Estimating a Mean-Path from a set of 2-D curves.

Proceedings of the IEEE International Conference on Robotics and Automation, 2015

Multi-Objective Reinforcement Learning with Continuous Pareto Frontier Approximation.

Simone Parisi

Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014

Policy gradient approaches for multi-objective sequential decision making.

Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Policy gradient approaches for multi-objective sequential decision making: A comparison.

Proceedings of the 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2014

Evolutionary Dynamics of Q-Learning over the Sequence Form.

Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013

Adaptive Step-Size for Policy Gradient Methods.

Luca Bascetta

Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Safe Policy Iteration.

Proceedings of the 30th International Conference on Machine Learning, 2013

Extensive-form games with heterogeneous populations.

Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

Efficient Evolutionary Dynamics with Extensive-Form Games.

Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012

Data-driven dynamic emulation modelling for the optimal management of environmental systems.

Stefano Galelli

Rodolfo Soncini-Sessa

Environ. Model. Softw., 2012

Tree-based Fitted Q-iteration for Multi-Objective Markov Decision problems.

Francesca Pianosi

Proceedings of the 2012 International Joint Conference on Neural Networks (IJCNN), 2012

Computing Equilibria with Two-Player Zero-Sum Continuous Stochastic Games with Switching Controller.

Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011

Transfer from Multiple MDPs.

Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Multi-objective fitted Q-iteration: Pareto frontier approximation in one single run.

Francesca Pianosi

Proceedings of the IEEE International Conference on Networking, Sensing and Control, 2011

Equilibrium approximation in simulation-based extensive-form games.

Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Fitted policy search.

Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011

Tree-based variable selection for dimensionality reduction of large-scale control systems.

Stefano Galelli

Rodolfo Soncini-Sessa

Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011

2009

Reinforcement distribution in fuzzy Q-learning.

Fuzzy Sets Syst., 2009

Batch Reinforcement Learning - An Application to a Controllable Semi-active Suspension System.

Proceedings of the ICINCO 2009, 2009

Batch Reinforcement Learning for semi-active suspension control.

Proceedings of the IEEE International Conference on Control Applications, 2009

2008

Improving Batch Reinforcement Learning Performance through Transfer of Samples.

Proceedings of the STAIRS 2008, 2008

Batch Reinforcement Learning for Controlling a Mobile Wheeled Pendulum Robot.

Proceedings of the Artificial Intelligence in Theory and Practice II, 2008

Transfer of samples in batch reinforcement learning.

Proceedings of the Machine Learning, 2008

On the usefulness of opponent modeling: the Kuhn Poker case study.

Mario Quaresimale

Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Transfer of task representation in reinforcement learning using policy-based proto-value functions.

Eliseo Ferrante

Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Towards Automated Bargaining in Electronic Markets: A Partially Two-Sided Competition Model.

Proceedings of the Agent-Mediated Electronic Commerce and Trading Agent Design and Analysis, 2008

2007

Problems and solutions for anchoring in multi-robot applications.

J. Intell. Fuzzy Syst., 2007

Learning Fuzzy Classifier Systems: Architecture and Exploration Issues.

Int. J. Artif. Intell. Tools, 2007

Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods.

Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Piecewise constant reinforcement learning for robotic applications.

Proceedings of the ICINCO 2007, 2007

Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions.

Proceedings of the AI*IA 2007: Artificial Intelligence and Human-Oriented Computing, 2007

Bifurcation Analysis of Reinforcement Learning Agents in the Selten's Horse Game.

Enrique Munoz de Cote

Fabio Dercole

Proceedings of the Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, 2007

2006

Concepts and fuzzy models for behavior-based robotics.

Int. J. Approx. Reason., 2006

Incremental Skill Acquisition for Self-motivated Learning Animats.

Proceedings of the From Animals to Animats 9, 2006

Learning to cooperate in multi-agent social dilemmas.

Enrique Munoz de Cote

Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), 2006

2005

Reinforcement Distribution in Continuous State Action Space Fuzzy Q-Learning: A Novel Approach.

Francesco Montrone

Proceedings of the Fuzzy Logic and Applications, 6th International Workshop, 2005

MRT: Robotics Off-the-Shelf with the Modular Robotic Toolkit.

Proceedings of the Software Engineering for Experimental Robotics, 2005

Automatic Error Detection and Reduction for an Odometric Sensor based on Two Optical Mice.

Proceedings of the 2005 IEEE International Conference on Robotics and Automation, 2005

2004

A multi-agent system for multi-agent learning.

PhD thesis, 2004

A kinematic-independent dead-reckoning sensor for indoor mobile robotics.

Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, September 28, 2004

Dead Reckoning for Mobile Robots Using Two Optical Mice.

Proceedings of the ICINCO 2004, 2004

2003

A Probabilistic Framework for Weighting Different Sensor Data in MUREA.

Domenico G. Sorrenti

Fabio M. Marchese

Proceedings of the RoboCup 2003: Robot Soccer World Cup VII, 2003

Filling the Gap among Coordination, Planning, and Reaction Using a Fuzzy Cognitive Model.

Proceedings of the RoboCup 2003: Robot Soccer World Cup VII, 2003

2002

MUREA: A MUlti-Resolution Evidence Accumulation Method for Robot Localization in Known Environments.

Domenico G. Sorrenti

Fabio M. Marchese

Proceedings of the RoboCup 2002: Robot Soccer World Cup VI, 2002

A robot localization method based on evidence accumulation and multi-resolution.

Domenico G. Sorrenti

Fabio M. Marchese

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, Switzerland, September 30, 2002

An architecture to implement agents co-operating in dynamic environments.

Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

2001

A Framework for Robust Sensing in Multi-agent Systems.

Proceedings of the RoboCup 2001: Robot Soccer World Cup V, 2001

Fun2Mas: The Milan Robocup Team.

Proceedings of the RoboCup 2001: Robot Soccer World Cup V, 2001

Concepts for Anchoring in Robotics.
