Marcello Restelli

ORCID: 0000-0002-6322-1076

According to our database, Marcello Restelli authored at least 208 papers between 2001 and 2024.

Bibliography

2024
Sample complexity of variance-reduced policy gradient: weaker assumptions and lower bounds.
Mach. Learn., September, 2024

Interpretable linear dimensionality reduction based on bias-variance analysis.
Data Min. Knowl. Discov., July, 2024

Optimizing Empty Container Repositioning and Fleet Deployment via Configurable Semi-POMDPs.
IEEE Trans. Intell. Transp. Syst., May, 2024

Bridging Rested and Restless Bandits with Graph-Triggering: Rising and Rotting.
CoRR, 2024

State and Action Factorization in Power Grids.
CoRR, 2024

A Provably Efficient Option-Based Algorithm for both High-Level and Low-Level Learning.
CoRR, 2024

The Limits of Pure Exploration in POMDPs: When the Observation Entropy is Enough.
CoRR, 2024

Optimal Multi-Fidelity Best-Arm Identification.
CoRR, 2024

Policy Gradient with Active Importance Sampling.
CoRR, 2024

Information Capacity Regret Bounds for Bandits with Mediator Feedback.
CoRR, 2024

Inverse Reinforcement Learning with Sub-optimal Experts.
CoRR, 2024

Interpretable Target-Feature Aggregation for Multi-task Learning Based on Bias-Variance Analysis.
Proceedings of the Machine Learning and Knowledge Discovery in Databases. Research Track, 2024

The Power of Hybrid Learning in Industrial Robotics: Efficient Grasping Strategies with Supervised-Driven Reinforcement Learning.
Proceedings of the International Joint Conference on Neural Networks, 2024

Causal Feature Selection via Transfer Entropy.
Proceedings of the International Joint Conference on Neural Networks, 2024

How to Explore with Belief: State Entropy Maximization in POMDPs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Best Arm Identification for Stochastic Rising Bandits.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Factored-Reward Bandits with Intermediate Observations.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

No-Regret Reinforcement Learning in Smooth MDPs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Graph-Triggered Rising Bandits.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Exploiting Causal Graph Priors with Posterior Sampling for Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Building Surrogate Models Using Trajectories of Agents Trained by Reinforcement Learning.
Proceedings of the Artificial Neural Networks and Machine Learning - ICANN 2024, 2024

Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs.
Proceedings of the Thirty Seventh Annual Conference on Learning Theory, June 30, 2024

Transfer Learning for Dynamical Systems Models via Autoencoders and GANs.
Proceedings of the American Control Conference, 2024

Autoregressive Bandits.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

Parameterized Projected Bellman Operator.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Online Markov Decision Processes Configuration with Continuous Decision Space.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023
IWDA: Importance Weighting for Drift Adaptation in Streaming Supervised Learning Problems.
IEEE Trans. Neural Networks Learn. Syst., October, 2023

ARLO: A framework for Automated Reinforcement Learning.
Expert Syst. Appl., August, 2023

Risk-averse optimization of reward-based coherent risk measures.
Artif. Intell., March, 2023

An Option-Dependent Analysis of Regret Minimization Algorithms in Finite-Horizon Semi-MDP.
Trans. Mach. Learn. Res., 2023

Convex Reinforcement Learning in Finite Trials.
J. Mach. Learn. Res., 2023

Pure Exploration under Mediators' Feedback.
CoRR, 2023

Nonlinear Feature Aggregation: Two Algorithms driven by Theory.
CoRR, 2023

An Option-Dependent Analysis of Regret Minimization Algorithms in Finite-Horizon Semi-Markov Decision Processes.
CoRR, 2023

On the Relation between Policy Improvement and Off-Policy Minimum-Variance Policy Evaluation.
Proceedings of the Uncertainty in Artificial Intelligence, 2023

Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes.
Proceedings of the Machine Learning and Knowledge Discovery in Databases: Research Track, 2023

Distributional Policy Evaluation: a Maximum Entropy approach to Representation Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Truncating Trajectories in Monte Carlo Policy Evaluation: an Adaptive Approach.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Information-Theoretic Regret Bounds for Bandits with Fixed Expert Advice.
Proceedings of the IEEE Information Theory Workshop, 2023

Truncating Trajectories in Monte Carlo Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Dynamical Linear Bandits.
Proceedings of the International Conference on Machine Learning, 2023

Towards Theoretical Understanding of Inverse Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2023

Towards an AI-Based Framework for Autonomous Design and Construction: Learning from Reinforcement Learning Success in RTS Games.
Proceedings of the Computer-Aided Architectural Design. INTERCONNECTIONS: Co-computing Beyond Boundaries, 2023

A Brief Guide to Multi-Objective Reinforcement Learning and Planning.
Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

A Tale of Sampling and Estimation in Discounted Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Simultaneously Updating All Persistence Values in Reinforcement Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Dynamic Pricing with Volume Discounts in Online Settings.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Tight Performance Guarantees of Imitator Policies with Continuous Actions.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

Wasserstein Actor-Critic: Directed Exploration via Optimism for Continuous-Actions Control.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Smoothing policies and safe policy gradients.
Mach. Learn., 2022

Policy space identification in configurable environments.
Mach. Learn., 2022

Analysis, Characterization, Prediction and Attribution of Extreme Atmospheric Events with Machine Learning: a Review.
CoRR, 2022

Online joint bid/daily budget optimization of Internet advertising campaigns.
Artif. Intell., 2022

Risk-averse policy optimization via risk-neutral policy optimization.
Artif. Intell., 2022

A practical guide to multi-objective reinforcement learning and planning.
Auton. Agents Multi Agent Syst., 2022

Learning in Markov games: Can we exploit a general-sum opponent?
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Multi-Fidelity Best-Arm Identification.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Challenging Common Assumptions in Convex Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Off-Policy Evaluation with Deficient Support Using Side Information.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Pricing the Long Tail by Explainable Product Aggregation and Monotonic Bandits.
Proceedings of the KDD '22: The 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Washington, DC, USA, August 14, 2022

Storehouse: a Reinforcement Learning Environment for Optimizing Warehouse Management.
Proceedings of the International Joint Conference on Neural Networks, 2022

Multi-Armed Bandit Problem with Temporally-Partitioned Rewards: When Partial Feedback Counts.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

The Importance of Non-Markovianity in Maximum State Entropy Exploration.
Proceedings of the International Conference on Machine Learning, 2022

Stochastic Rising Bandits.
Proceedings of the International Conference on Machine Learning, 2022

Delayed Reinforcement Learning by Imitation.
Proceedings of the International Conference on Machine Learning, 2022

Balancing Sample Efficiency and Suboptimality in Inverse Reinforcement Learning.
Proceedings of the International Conference on Machine Learning, 2022

Goal-Directed Planning via Hindsight Experience Replay.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Addressing Non-Stationarity in FX Trading with Online Model Selection of Offline RL Experts.
Proceedings of the 3rd ACM International Conference on AI in Finance, 2022

Dark-Pool Smart Order Routing: a Combinatorial Multi-armed Bandit Approach.
Proceedings of the 3rd ACM International Conference on AI in Finance, 2022

Trust Region Meta Learning for Policy Optimization.
Proceedings of the ECML/PKDD Workshop on Meta-Knowledge Transfer, 2022

Reward-Free Policy Space Compression for Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Finite Sample Analysis of Mean-Volatility Actor-Critic for Risk-Averse Reinforcement Learning.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2022

Unsupervised Reinforcement Learning in Multiple Environments.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

Lifelong Hyper-Policy Optimization with Multiple Importance Sampling Regularization.
Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence, 2022

2021
Dealing with multiple experts and non-stationarity in inverse reinforcement learning: an application to real-life problems.
Mach. Learn., 2021

Safe Policy Iteration: A Monotonically Improving Approximate Policy Iteration Approach.
J. Mach. Learn. Res., 2021

MushroomRL: Simplifying Reinforcement Learning Research.
J. Mach. Learn. Res., 2021

Gaussian Approximation for Bias Reduction in Q-Learning.
J. Mach. Learn. Res., 2021

Time-variant variational transfer for value functions.
Proceedings of the Thirty-Seventh Conference on Uncertainty in Artificial Intelligence, 2021

Exploiting History Data for Nonstationary Multi-armed Bandit.
Proceedings of the Machine Learning and Knowledge Discovery in Databases. Research Track, 2021

Conservative Online Convex Optimization.
Proceedings of the Machine Learning and Knowledge Discovery in Databases. Research Track, 2021

Learning in Non-Cooperative Configurable Markov Decision Processes.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Reinforcement Learning in Linear MDPs: Constant Regret and Representation Selection.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Subgaussian and Differentiable Importance Sampling for Off-Policy Evaluation and Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Learning a Belief Representation for Delayed Reinforcement Learning.
Proceedings of the International Joint Conference on Neural Networks, 2021

Meta-Reinforcement Learning by Tracking Task Non-stationarity.
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, 2021

Leveraging Good Representations in Linear Contextual Bandits.
Proceedings of the 38th International Conference on Machine Learning, 2021

Provably Efficient Learning of Transferable Rewards.
Proceedings of the 38th International Conference on Machine Learning, 2021

Monte carlo tree search for trading and hedging.
Proceedings of the ICAIF'21: 2nd ACM International Conference on AI in Finance, Virtual Event, November 3, 2021

Learning FX trading strategies with FQI and persistent actions.
Proceedings of the ICAIF'21: 2nd ACM International Conference on AI in Finance, Virtual Event, November 3, 2021

Newton Optimization on Helmholtz Decomposition for Continuous Games.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Task-Agnostic Exploration via Policy Gradient of a Non-Parametric State Entropy Estimate.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Policy Optimization as Online Learning with Mediator Feedback.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Combining reinforcement learning with rule-based controllers for transparent and general decision-making in autonomous driving.
Robotics Auton. Syst., 2020

Importance Sampling Techniques for Policy Optimization.
J. Mach. Learn. Res., 2020

Sliding-Window Thompson Sampling for Non-Stationary Settings.
J. Artif. Intell. Res., 2020

On the use of the policy gradient and Hessian in inverse reinforcement learning.
Intelligenza Artificiale, 2020

Newton-based Policy Optimization for Games.
CoRR, 2020

A Policy Gradient Method for Task-Agnostic Exploration.
CoRR, 2020

Time-Variant Variational Transfer for Value Functions.
CoRR, 2020

An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Inverse Reinforcement Learning from a Gradient-based Learner.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Risk-Averse Trust Region Optimization for Reward-Volatility Reduction.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

Inferring Functional Properties from Fluid Dynamics Features.
Proceedings of the 25th International Conference on Pattern Recognition, 2020

Sequential Transfer in Reinforcement Learning with a Generative Model.
Proceedings of the 37th International Conference on Machine Learning, 2020

Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Sharing Knowledge in Multi-Task Deep Reinforcement Learning.
Proceedings of the 8th International Conference on Learning Representations, 2020

Option hedging with risk averse reinforcement learning.
Proceedings of the ICAIF '20: The First ACM International Conference on AI in Finance, 2020

Dealing with transaction costs in portfolio optimization: online gradient descent with momentum.
Proceedings of the ICAIF '20: The First ACM International Conference on AI in Finance, 2020

Fast direct calibration of interest rate derivatives pricing models.
Proceedings of the ICAIF '20: The First ACM International Conference on AI in Finance, 2020

Foreign exchange trading: a risk-averse batch reinforcement learning approach.
Proceedings of the ICAIF '20: The First ACM International Conference on AI in Finance, 2020

Model-Free Non-Stationarity Detection and Adaptation in Reinforcement Learning.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, 29 August-8 September 2020, Santiago de Compostela, Spain, 2020

Driving Exploration by Maximum Distribution in Gaussian Process Bandits.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

A Novel Confidence-Based Algorithm for Structured Bandits.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Truly Batch Model-Free Inverse Reinforcement Learning about Multiple Intentions.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Balancing Learning Speed and Stability in Policy Gradient via Adaptive Exploration.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

An Intrinsically-Motivated Approach for Learning Highly Exploring and Fast Mixing Policies.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Gradient-Aware Model-Based Policy Search.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Dealing with Interdependencies and Uncertainty in Multi-Channel Advertising Campaigns Optimization.
Proceedings of the World Wide Web Conference, 2019

Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Exploration Driven by an Optimistic Bellman Equation.
Proceedings of the International Joint Conference on Neural Networks, 2019

Exploiting Action-Value Uncertainty to Drive Exploration in Reinforcement Learning.
Proceedings of the International Joint Conference on Neural Networks, 2019

Feature Selection via Mutual Information: New Theoretical Insights.
Proceedings of the International Joint Conference on Neural Networks, 2019

Transfer of Samples in Policy Search via Multiple Importance Sampling.
Proceedings of the 36th International Conference on Machine Learning, 2019

Optimistic Policy Optimization via Multiple Importance Sampling.
Proceedings of the 36th International Conference on Machine Learning, 2019

Reinforcement Learning in Configurable Continuous Environments.
Proceedings of the 36th International Conference on Machine Learning, 2019

2018
Improving multi-armed bandit algorithms in online pricing settings.
Int. J. Approx. Reason., 2018

Transfer of Value Functions via Variational Methods.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Policy Optimization via Importance Sampling.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Does Reinforcement Learning outperform PID in the control of FES-induced elbow flex-extension?
Proceedings of the 2018 IEEE International Symposium on Medical Measurements and Applications, 2018

Targeting Optimization for Internet Advertising by Learning from Logged Bandit Feedback.
Proceedings of the 2018 International Joint Conference on Neural Networks, 2018

Importance Weighted Transfer of Samples in Reinforcement Learning.
Proceedings of the 35th International Conference on Machine Learning, 2018

Stochastic Variance-Reduced Policy Gradient.
Proceedings of the 35th International Conference on Machine Learning, 2018

Configurable Markov Decision Processes.
Proceedings of the 35th International Conference on Machine Learning, 2018

A Combinatorial-Bandit Algorithm for the Online Joint Bid/Budget Optimization of Pay-per-Click Advertising Campaigns.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Cost-Sensitive Approach to Batch Size Adaptation for Gradient Descent.
CoRR, 2017

Regret Minimization Algorithms for the Followers Behaviour Identification in Leadership Games.
Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, 2017

Gradient-based minimization for multi-expert Inverse Reinforcement Learning.
Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Exploiting structure and uncertainty of Bellman updates in Markov decision processes.
Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Adaptive Batch Size for Safe Policy Gradients.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

Compatible Reward Inverse Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, 2017

User context estimation for public travel assistance and intelligent service scheduling.
Proceedings of the 20th IEEE International Conference on Intelligent Transportation Systems, 2017

Risk-averse trees for learning from logged bandit feedback.
Proceedings of the 2017 International Joint Conference on Neural Networks, 2017

Boosted Fitted Q-Iteration.
Proceedings of the 34th International Conference on Machine Learning, 2017

Designing Learning Algorithms over the Sequence Form of an Extensive-Form Game.
Proceedings of the 16th Conference on Autonomous Agents and MultiAgent Systems, 2017

Unimodal Thompson Sampling for Graph-Structured Arms.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

Estimating the Maximum Expected Value in Continuous Reinforcement Learning Problems.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Policy Search for the Optimal Control of Markov Decision Processes: A Novel Particle-Based Iterative Scheme.
IEEE Trans. Cybern., 2016

Multi-objective Reinforcement Learning through Continuous Pareto Manifold Approximation.
J. Artif. Intell. Res., 2016

Extensive-form games with heterogeneous populations: solution concepts, equilibria characterization, learning dynamics.
Intelligenza Artificiale, 2016

Reconstruction of public transport state.
Proceedings of the 19th IEEE International Conference on Intelligent Transportation Systems, 2016

Estimating Maximum Expected Value through Gaussian Approximation.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Budgeted Multi-Armed Bandit in Continuous Action Space.
Proceedings of the ECAI 2016 - 22nd European Conference on Artificial Intelligence, 29 August-2 September 2016, The Hague, The Netherlands, 2016

Inverse Reinforcement Learning through Policy Gradient Minimization.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

Sequence-Form and Evolutionary Dynamics: Realization Equivalence to Agent Form and Logit Dynamics.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
Policy gradient in Lipschitz Markov Decision Processes.
Mach. Learn., 2015

Sparse multi-task reinforcement learning.
Intelligenza Artificiale, 2015

Following Newton direction in Policy Gradient with parameter exploration.
Proceedings of the 2015 International Joint Conference on Neural Networks, 2015

Estimating a Mean-Path from a set of 2-D curves.
Proceedings of the IEEE International Conference on Robotics and Automation, 2015

Multi-Objective Reinforcement Learning with Continuous Pareto Frontier Approximation.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Policy gradient approaches for multi-objective sequential decision making.
Proceedings of the 2014 International Joint Conference on Neural Networks, 2014

Policy gradient approaches for multi-objective sequential decision making: A comparison.
Proceedings of the 2014 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning, 2014

Evolutionary Dynamics of Q-Learning over the Sequence Form.
Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence, 2014

2013
Adaptive Step-Size for Policy Gradient Methods.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Safe Policy Iteration.
Proceedings of the 30th International Conference on Machine Learning, 2013

Extensive-form games with heterogeneous populations.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013

Efficient Evolutionary Dynamics with Extensive-Form Games.
Proceedings of the Twenty-Seventh AAAI Conference on Artificial Intelligence, 2013

2012
Data-driven dynamic emulation modelling for the optimal management of environmental systems.
Environ. Model. Softw., 2012

Tree-based Fitted Q-iteration for Multi-Objective Markov Decision problems.
Proceedings of the 2012 International Joint Conference on Neural Networks (IJCNN), 2012

Computing Equilibria with Two-Player Zero-Sum Continuous Stochastic Games with Switching Controller.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Transfer from Multiple MDPs.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Multi-objective fitted Q-iteration: Pareto frontier approximation in one single run.
Proceedings of the IEEE International Conference on Networking, Sensing and Control, 2011

Equilibrium approximation in simulation-based extensive-form games.
Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Fitted policy search.
Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011

Tree-based variable selection for dimensionality reduction of large-scale control systems.
Proceedings of the 2011 IEEE Symposium on Adaptive Dynamic Programming And Reinforcement Learning, 2011

2009
Reinforcement distribution in fuzzy Q-learning.
Fuzzy Sets Syst., 2009

Batch Reinforcement Learning - An Application to a Controllable Semi-active Suspension System.
Proceedings of the ICINCO 2009, 2009

Batch Reinforcement Learning for semi-active suspension control.
Proceedings of the IEEE International Conference on Control Applications, 2009

2008
Improving Batch Reinforcement Learning Performance through Transfer of Samples.
Proceedings of the STAIRS 2008, 2008

Batch Reinforcement Learning for Controlling a Mobile Wheeled Pendulum Robot.
Proceedings of the Artificial Intelligence in Theory and Practice II, 2008

Transfer of samples in batch reinforcement learning.
Proceedings of the Machine Learning, 2008

On the usefulness of opponent modeling: the Kuhn Poker case study.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Transfer of task representation in reinforcement learning using policy-based proto-value functions.
Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008), 2008

Towards Automated Bargaining in Electronic Markets: A Partially Two-Sided Competition Model.
Proceedings of the Agent-Mediated Electronic Commerce and Trading Agent Design and Analysis, 2008

2007
Problems and solutions for anchoring in multi-robot applications.
J. Intell. Fuzzy Syst., 2007

Learning Fuzzy Classifier Systems: Architecture and Exploration Issues.
Int. J. Artif. Intell. Tools, 2007

Reinforcement Learning in Continuous Action Spaces through Sequential Monte Carlo Methods.
Proceedings of the Advances in Neural Information Processing Systems 20, 2007

Piecewise constant reinforcement learning for robotic applications.
Proceedings of the ICINCO 2007, 2007

Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions.
Proceedings of the AI*IA 2007: Artificial Intelligence and Human-Oriented Computing, 2007

Bifurcation Analysis of Reinforcement Learning Agents in the Selten's Horse Game.
Proceedings of the Adaptive Agents and Multi-Agent Systems III. Adaptation and Multi-Agent Learning, 2007

2006
Concepts and fuzzy models for behavior-based robotics.
Int. J. Approx. Reason., 2006

Incremental Skill Acquisition for Self-motivated Learning Animats.
Proceedings of the From Animals to Animats 9, 2006

Learning to cooperate in multi-agent social dilemmas.
Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2006), 2006

2005
Reinforcement Distribution in Continuous State Action Space Fuzzy Q-Learning: A Novel Approach.
Proceedings of the Fuzzy Logic and Applications, 6th International Workshop, 2005

MRT: Robotics Off-the-Shelf with the Modular Robotic Toolkit.
Proceedings of the Software Engineering for Experimental Robotics, 2005

Automatic Error Detection and Reduction for an Odometric Sensor based on Two Optical Mice.
Proceedings of the 2005 IEEE International Conference on Robotics and Automation, 2005

2004
A multi-agent system for multi-agent learning.
PhD thesis, 2004

A kinematic-independent dead-reckoning sensor for indoor mobile robotics.
Proceedings of the 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems, Sendai, Japan, September 28, 2004

Dead Reckoning for Mobile Robots Using Two Optical Mice.
Proceedings of the ICINCO 2004, 2004

2003
A Probabilistic Framework for Weighting Different Sensor Data in MUREA.
Proceedings of the RoboCup 2003: Robot Soccer World Cup VII, 2003

Filling the Gap among Coordination, Planning, and Reaction Using a Fuzzy Cognitive Model.
Proceedings of the RoboCup 2003: Robot Soccer World Cup VII, 2003

2002
MUREA: A MUlti-Resolution Evidence Accumulation Method for Robot Localization in Known Environments.
Proceedings of the RoboCup 2002: Robot Soccer World Cup VI, 2002

A robot localization method based on evidence accumulation and multi-resolution.
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, Lausanne, Switzerland, September 30, 2002

An architecture to implement agents co-operating in dynamic environments.
Proceedings of the First International Joint Conference on Autonomous Agents & Multiagent Systems, 2002

2001
A Framework for Robust Sensing in Multi-agent Systems.
Proceedings of the RoboCup 2001: Robot Soccer World Cup V, 2001

Fun2Mas: The Milan Robocup Team.
Proceedings of the RoboCup 2001: Robot Soccer World Cup V, 2001

Concepts for Anchoring in Robotics.
Proceedings of the AI*IA 2001: Advances in Artificial Intelligence, 2001

