Doina Precup

Affiliations:
  • McGill University, Montreal, Canada


According to our database, Doina Precup authored at least 363 papers between 1997 and 2024.

Bibliography

2024
Connecting weighted automata, tensor networks and recurrent neural networks through spectral learning.
Mach. Learn., May, 2024

Policy Gradient Methods in the Presence of Symmetries and State Abstractions.
J. Mach. Learn. Res., 2024

Soft Condorcet Optimization for Ranking of General Agents.
CoRR, 2024

Learning Successor Features the Simple Way.
CoRR, 2024

Identifying and Addressing Delusions for Target-Directed Decision-Making.
CoRR, 2024

Mitigating Downstream Model Risks via Model Provenance.
CoRR, 2024

EnzymeFlow: Generating Reaction-specific Enzyme Catalytic Pockets through Flow Matching and Co-Evolutionary Dynamics.
CoRR, 2024

Training Language Models to Self-Correct via Reinforcement Learning.
CoRR, 2024

Reactzyme: A Benchmark for Enzyme-Reaction Prediction.
CoRR, 2024

Functional Acceleration for Policy Mirror Descent.
CoRR, 2024

The Heterophilic Graph Learning Handbook: Benchmarks, Models, Theoretical Analysis, Applications and Challenges.
CoRR, 2024

On the Limits of Multi-modal Meta-Learning with Auxiliary Task Modulation Using Conditional Batch Normalization.
CoRR, 2024

Partial Models for Building Adaptive Model-Based Reinforcement Learning Agents.
CoRR, 2024

Adaptive Exploration for Data-Efficient General Value Function Evaluations.
CoRR, 2024

Generative Active Learning for the Search of Small-molecule Protein Binders.
CoRR, 2024

Offline Multitask Representation Learning for Reinforcement Learning.
CoRR, 2024

Discrete Probabilistic Inference as Control in Multi-path Environments.
CoRR, 2024

QGFN: Controllable Greediness with Action Values.
CoRR, 2024

Effective Protein-Protein Interaction Exploration with PPIretrieval.
CoRR, 2024

More Efficient Randomized Exploration for Reinforcement Learning via Approximate Sampling.
RLJ, 2024

Finding Increasingly Large Extremal Graphs with AlphaZero and Tabu Search.
Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence, 2024

Code as Reward: Empowering Reinforcement Learning with VLMs.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Mixtures of Experts Unlock Parameter Scaling for Deep RL.
Proceedings of the Forty-first International Conference on Machine Learning, 2024


Provable and Practical: Efficient Exploration in Reinforcement Learning via Langevin Monte Carlo.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Consciousness-Inspired Spatio-Temporal Abstractions for Better Generalization in Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds.
Proceedings of the IEEE International Conference on Acoustics, 2024

On learning history-based policies for controlling Markov decision processes.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

On the Privacy of Selection Mechanisms with Gaussian Noise.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

Conditions on Preference Relations that Guarantee the Existence of Optimal Policies.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Temporal Abstraction in Reinforcement Learning with the Successor Representation.
J. Mach. Learn. Res., 2023

Nash Learning from Human Feedback.
CoRR, 2023

DGFN: Double Generative Flow Networks.
CoRR, 2023

Forecaster: Towards Temporally Abstract Tree-Search Planning from Pixels.
CoRR, 2023

A cry for help: Early detection of brain injury in newborns.
CoRR, 2023

Combining Spatial and Temporal Abstraction in Planning for Better Generalization.
CoRR, 2023

Policy composition in reinforcement learning via multi-objective policy optimization.
CoRR, 2023

On the Convergence of Bounded Agents.
CoRR, 2023

An Empirical Study of the Effectiveness of Using a Replay Buffer on Mode Discovery in GFlowNets.
CoRR, 2023

Optimism and Adaptivity in Policy Optimization.
CoRR, 2023

Accelerating exploration and representation learning with offline pre-training.
CoRR, 2023

The Stable Entropy Hypothesis and Entropy-Aware Decoding: An Analysis and Algorithm for Robust Natural Language Generation.
CoRR, 2023

Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning.
IEEE Access, 2023

When Do Graph Neural Networks Help with Node Classification? Investigating the Homophily Principle on Node Distinguishability.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

For SALE: State-Action Representation Learning for Deep Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Prediction and Control in Continual Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Definition of Continual Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

MUDiff: Unified Diffusion for Complete Molecule Generation.
Proceedings of the Learning on Graphs Conference, 27-30 November 2023, Virtual Event, 2023

Multi-Environment Pretraining Enables Transfer to Action Limited Datasets.
Proceedings of the International Conference on Machine Learning, 2023

Training Matters: Unlocking Potentials of Deeper Graph Convolutional Neural Networks.
Proceedings of the Complex Networks & Their Applications XII, 2023

When Do We Need Graph Neural Networks for Node Classification?
Proceedings of the Complex Networks & Their Applications XII, 2023

Minimal Value-Equivalent Partial Models for Scalable and Robust Planning in Lifelong Reinforcement Learning.
Proceedings of the Conference on Lifelong Learning Agents, 2023

Hybrid Scattering Transform - Long Short-Term Memory Networks for Intrapartum Fetal Heart Rate Classification.
Proceedings of the Computing in Cardiology, 2023

Finite time analysis of temporal difference learning with linear function approximation: Tail averaging and regularisation.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

Towards Safe Mechanical Ventilation Treatment Using Deep Offline Reinforcement Learning.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

On the Challenges of Using Reinforcement Learning in Precision Drug Dosing: Delay and Prolongedness of Action Effects.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023

2022
Behind the Machine's Gaze: Neural Networks with Biologically-inspired Constraints Exhibit Human-like Visual Attention.
Trans. Mach. Learn. Res., 2022

Deep learning, reinforcement learning, and world models.
Neural Networks, 2022

Low-Rank Representation of Reinforcement Learning Policies.
J. Artif. Intell. Res., 2022

Towards Continual Reinforcement Learning: A Review and Perspectives.
J. Artif. Intell. Res., 2022

Offline Policy Optimization in RL with Variance Regularizaton.
CoRR, 2022

Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Neural Networks.
CoRR, 2022

Simulating Human Gaze with Neural Visual Attention.
CoRR, 2022

When Do We Need GNN for Node Classification?
CoRR, 2022

Bayesian Q-learning With Imperfect Expert Demonstrations.
CoRR, 2022

Understanding Decision-Time vs. Background Planning in Model-Based Reinforcement Learning.
CoRR, 2022

Learning how to Interact with a Complex Interface using Hierarchical Reinforcement Learning.
CoRR, 2022

Behind the Machine's Gaze: Biologically Constrained Neural Networks Exhibit Human-like Visual Attention.
CoRR, 2022

Towards Painless Policy Optimization for Constrained MDPs.
CoRR, 2022

Selective Credit Assignment.
CoRR, 2022

Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers.
CoRR, 2022

The Paradox of Choice: Using Attention in Hierarchical Reinforcement Learning.
CoRR, 2022

Attention Option-Critic.
CoRR, 2022

Towards painless policy optimization for constrained MDPs.
Proceedings of the Uncertainty in Artificial Intelligence, 2022

Continuous MDP Homomorphisms and Homomorphic Policy Gradient.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Revisiting Heterophily For Graph Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

On the Expressivity of Markov Reward (Extended Abstract).
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification.
Proceedings of the International Conference on Machine Learning, 2022

Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error.
Proceedings of the International Conference on Machine Learning, 2022

Proving Theorems using Incremental Learning and Hindsight Experience Replay.
Proceedings of the International Conference on Machine Learning, 2022

Policy Gradients Incorporating the Future.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates.
Proceedings of the Tenth International Conference on Learning Representations, 2022

COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Assessing Intrapartum Risk of Hypoxic Ischemic Encephalopathy Using Fetal Heart Rate With Long Short-Term Memory Networks.
Proceedings of the Computing in Cardiology, 2022

2021
Safe option-critic: learning safety in the option-critic architecture.
Knowl. Eng. Rev., 2021

Single-Shot Pruning for Offline Reinforcement Learning.
CoRR, 2021

Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning.
CoRR, 2021

Proving Theorems using Incremental Learning and Hindsight Experience Replay.
CoRR, 2021

Temporal Abstraction in Reinforcement Learning with the Successor Representation.
CoRR, 2021

Is Heterophily A Real Nightmare For Graph Neural Networks To Do Node Classification?
CoRR, 2021

Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning.
CoRR, 2021

A Survey of Exploration Methods in Reinforcement Learning.
CoRR, 2021

Correcting Momentum in Temporal Difference Learning.
CoRR, 2021

Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Offline RL.
CoRR, 2021

AndroidEnv: A Reinforcement Learning Platform for Android.
CoRR, 2021

What is Going on Inside Recurrent Meta Reinforcement Learning Agents?
CoRR, 2021

Training a First-Order Theorem Prover from Synthetic Data.
CoRR, 2021

Reward is enough.
Artif. Intell., 2021

A Consciousness-Inspired Planning Agent for Model-Based Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Gradient Starvation: A Learning Proclivity in Neural Networks.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Flexible Option Learning.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Temporally Abstract Partial Models.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Flow Network based Generative Models for Non-Iterative Diverse Candidate Generation.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

On the Expressivity of Markov Reward.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Randomized Exploration in Reinforcement Learning with General Value Function Approximation.
Proceedings of the 38th International Conference on Machine Learning, 2021

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation.
Proceedings of the 38th International Conference on Machine Learning, 2021

Preferential Temporal Difference Learning.
Proceedings of the 38th International Conference on Machine Learning, 2021

Locally Persistent Exploration in Continuous Control Tasks with Sparse Rewards.
Proceedings of the 38th International Conference on Machine Learning, 2021

Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata.
Proceedings of the 48th International Colloquium on Automata, Languages, and Programming, 2021

Self-Supervised Attention-Aware Reinforcement Learning.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

Variance Penalized On-Policy and Off-Policy Actor-Critic.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020
Multiple Kernel Learning-Based Transfer Regression for Electric Load Forecasting.
IEEE Trans. Smart Grid, 2020

Fast reinforcement learning with generalized policy updates.
Proc. Natl. Acad. Sci. USA, 2020

Exploring uncertainty measures in deep networks for Multiple sclerosis lesion detection and segmentation.
Medical Image Anal., 2020

Diversity-Enriched Option-Critic.
CoRR, 2020

A Study of Policy Gradient on a Class of Exactly Solvable Models.
CoRR, 2020

A Fully Tensorized Recurrent Neural Network.
CoRR, 2020

Complete the Missing Half: Augmenting Aggregation Filtering with Diversification for Graph Convolutional Networks.
CoRR, 2020

Learning to Prove from Synthetic Theorems.
CoRR, 2020

A Brief Look at Generalization in Visual Meta-Reinforcement Learning.
CoRR, 2020

Policy Evaluation Networks.
CoRR, 2020

oIRL: Robust Adversarial Inverse Reinforcement Learning with Temporally Extended Actions.
CoRR, 2020

Provably efficient reconstruction of policy networks.
CoRR, 2020

On Efficiency in Hierarchical Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Reward Propagation Using Graph Convolutional Networks.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Value-driven Hindsight Modelling.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Forethought and Hindsight in Credit Assignment.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

SVRG for Policy Evaluation with Fewer Gradient Evaluations.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

What can I do here? A Theory of Affordances in Reinforcement Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Interference and Generalization in Temporal Difference Learning.
Proceedings of the 37th International Conference on Machine Learning, 2020

Invariant Causal Prediction for Block MDPs.
Proceedings of the 37th International Conference on Machine Learning, 2020

Keynote Lecture - Building Knowledge For AI Agents With Reinforcement Learning.
Proceedings of the 16th IEEE International Conference on Intelligent Computer Communication and Processing, 2020

Learning to cooperate: Emergent communication in multi-agent navigation.
Proceedings of the 42nd Annual Meeting of the Cognitive Science Society, 2020

Phylogenetic Manifold Regularization: A semi-supervised approach to predict transcription factor binding sites.
Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine, 2020

META-Learning State-based Eligibility Traces for More Sample-Efficient Policy Evaluation.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Gifting in Multi-Agent Reinforcement Learning.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Option-Critic in Cooperative Multi-agent Systems.
Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems, 2020

Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Value Preserving State-Action Abstractions.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

Gifting in Multi-Agent Reinforcement Learning (Student Abstract).
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Options of Interest: Temporal Abstraction with Interest Functions.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

Algorithmic Improvements for Deep Reinforcement Learning Applied to Interactive Fiction.
Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence, 2020

2019
Singular value automata and approximate minimization.
Math. Struct. Comput. Sci., 2019

Shaping representations through communication: community size effect in artificial learning systems.
CoRR, 2019

Marginalized State Distribution Entropy Regularization in Policy Optimization.
CoRR, 2019

Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning.
CoRR, 2019

Entropy Regularization with Discounted Future State Distribution in Policy Gradient Methods.
CoRR, 2019

Actor Critic with Differentially Private Critic.
CoRR, 2019

Augmenting learning using symmetry in a biologically-inspired domain.
CoRR, 2019

Avoidance Learning Using Observational Reinforcement Learning.
CoRR, 2019

Revisit Policy Optimization in Matrix Form.
CoRR, 2019

An Empirical Study of Batch Normalization and Group Normalization in Conditional Computation.
CoRR, 2019

Self-supervised Learning of Distance Functions for Goal-Conditioned Reinforcement Learning.
CoRR, 2019

Recurrent Value Functions.
CoRR, 2019

Community size effect in artificial learning systems.
Proceedings of the Visually Grounded Interaction and Language (ViGIL), 2019

Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Hindsight Credit Assignment.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

The Option Keyboard: Combining Skills in Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

Prediction of Disease Progression in Multiple Sclerosis Patients using Deep Learning Analysis of MRI Data.
Proceedings of the International Conference on Medical Imaging with Deep Learning, 2019

Improving Pathological Structure Segmentation via Transfer Learning Across Diseases.
Proceedings of the Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data, 2019

Early Prediction of Alzheimer's Disease Progression Using Variational Autoencoders.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2019, 2019

Neural Transfer Learning for Cry-Based Diagnosis of Perinatal Asphyxia.
Proceedings of the 20th Annual Conference of the International Speech Communication Association, 2019

Learning Reliable Policies in the Bandit Setting with Application to Adaptive Clinical Trials.
Proceedings of the 4th International Workshop on Knowledge Discovery in Healthcare Data co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Learning Modular Safe Policies in the Bandit Setting with Application to Adaptive Clinical Trials.
Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the 28th International Joint Conference on Artificial Intelligence, 2019

Uncertainty Aware Learning from Demonstrations in Multiple Contexts using Bayesian Neural Networks.
Proceedings of the International Conference on Robotics and Automation, 2019

Per-Decision Option Discounting.
Proceedings of the 36th International Conference on Machine Learning, 2019

Off-Policy Deep Reinforcement Learning without Exploration.
Proceedings of the 36th International Conference on Machine Learning, 2019

Learning proposals for sequential importance samplers using reinforced variational inference.
Proceedings of the Deep Reinforcement Learning Meets Structured Prediction, 2019

Navigation Agents for the Visually Impaired: A Sidewalk Simulator and Experiments.
Proceedings of the 3rd Annual Conference on Robot Learning, 2019

Building Knowledge for AI Agents with Reinforcement Learning.
Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems, 2019

Connecting Weighted Automata and Recurrent Neural Networks through Spectral Learning.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

The Termination Critic.
Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics, 2019

Leveraging Observations in Bandits: Between Risks and Benefits.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Learning Options with Interest Functions.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Combined Reinforcement Learning via Abstract Representations.
Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence, 2019

Temporally Extended Metrics for Markov Decision Processes.
Proceedings of the Workshop on Artificial Intelligence Safety 2019 co-located with the Thirty-Third AAAI Conference on Artificial Intelligence 2019 (AAAI-19), 2019

2018
Clustering-Oriented Representation Learning with Attractive-Repulsive Loss.
CoRR, 2018

Environments for Lifelong Reinforcement Learning.
CoRR, 2018

The Barbados 2018 List of Open Issues in Continual Learning.
CoRR, 2018

Attend Before you Act: Leveraging human visual attention for continual learning.
CoRR, 2018

Dyna Planning using a Feature Based Generative Model.
CoRR, 2018

Disentangling the independently controllable factors of variation by interacting with the world.
CoRR, 2018

Constructing Temporal Abstractions Autonomously in Reinforcement Learning.
AI Mag., 2018

Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization.
Proceedings of the Seventh Joint Conference on Lexical and Computational Semantics, 2018

Temporal Regularization for Markov Decision Process.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Learning Safe Policies with Expert Guidance.
Proceedings of the Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, 2018

Convergent TREE BACKUP and RETRACE with Function Approximation.
Proceedings of the 35th International Conference on Machine Learning, 2018

Undersampling and Bagging of Decision Trees in the Analysis of Cardiorespiratory Behavior for the Prediction of Extubation Readiness in Extremely Preterm Infants.
Proceedings of the 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2018

Leveraging Observational Learning for Exploration in Bandits.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Eligibility Traces for Options.
Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, 2018

Nonlinear Weighted Finite Automata.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2018

Learning Robust Options.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Imitation Upper Confidence Bound for Bandits on a Graph.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning With Options That Terminate Off-Policy.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

When Waiting Is Not an Option: Learning Options With a Deliberation Cost.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Learning Predictive State Representations From Non-Uniform Sampling.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

Deep Reinforcement Learning That Matters.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

OptionGAN: Learning Joint Reward-Policy Options Using Generative Adversarial Inverse Reinforcement Learning.
Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018

2017
Learnings Options End-to-End for Continuous Action Tasks.
CoRR, 2017

Ubenwa: Cry-based Diagnosis of Birth Asphyxia.
CoRR, 2017

Neural Network Based Nonlinear Weighted Finite Automata.
CoRR, 2017

Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control.
CoRR, 2017

Independently Controllable Factors.
CoRR, 2017

Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options.
CoRR, 2017

Investigating Recurrence and Eligibility Traces in Deep Q-Networks.
CoRR, 2017

Independently Controllable Features.
CoRR, 2017

Predicting extubation readiness in extreme preterm infants based on patterns of breathing.
Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence, 2017

Boosting Based Multiple Kernel Learning and Transfer Regression for Electricity Load Forecasting.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2017

Learning-based interactive segmentation using the maximum mean cycle weight formalism.
Proceedings of the Medical Imaging 2017: Image Processing, 2017

Predicting Future Disease Activity and Treatment Responders for Multiple Sclerosis Patients Using a Bag-of-Lesions Brain Representation.
Proceedings of the Medical Image Computing and Computer Assisted Intervention - MICCAI 2017, 2017

Approximate Value Iteration with Temporally Extended Actions (Extended Abstract).
Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, 2017

Horizontal and Vertical Self-Adaptive Cloud Controller with Reward Optimization for Resource Allocation.
Proceedings of the 2017 International Conference on Cloud and Autonomic Computing, 2017

World Knowledge for Reading Comprehension: Rare Entity Prediction with Hierarchical LSTMs Using External Descriptions.
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 2017

A semi-Markov chain approach to modeling respiratory patterns prior to extubation in preterm infants.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

APEX_SCOPE: A graphical user interface for visualization of multi-modal data in inter-disciplinary studies.
Proceedings of the 2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), 2017

Real-Time Indoor Localization in Smart Homes Using Semi-Supervised Learning.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

The Option-Critic Architecture.
Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, 2017

2016
Hierarchical Spatio-Temporal Probabilistic Graphical Model with Multiple Feature Fusion for Binary Facial Attribute Classification in Real-World Face Videos.
IEEE Trans. Pattern Anal. Mach. Intell., 2016

Practical Kernel-Based Reinforcement Learning.
J. Mach. Learn. Res., 2016

Editorial on Special Issue on Probabilistic Models for Biomedical Image Analysis.
Comput. Vis. Image Underst., 2016

A Matrix Splitting Perspective on Planning with Options.
CoRR, 2016

Learning Multi-Step Predictive State Representations.
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016

Differentially Private Policy Evaluation.
Proceedings of the 33rd International Conference on Machine Learning, 2016

Verb Phrase Ellipsis Resolution Using Discriminative and Margin-Infused Algorithms.
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, 2016

Automated ongoing data validation and quality control of multi-institutional studies.
Proceedings of the 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2016

Prediction of Cell Type Specific Transcription Factor Binding Site Occupancy.
Proceedings of the 7th ACM International Conference on Bioinformatics, 2016

Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data.
Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, 2016

Incremental Stochastic Factorization for Online Reinforcement Learning.
Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence, 2016

2015
The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS).
IEEE Trans. Medical Imaging, 2015

Classification-Based Approximate Policy Iteration.
IEEE Trans. Autom. Control., 2015

Quantifying the determinants of outbreak detection performance through simulation and machine learning.
J. Biomed. Informatics, 2015

Approximate Value Iteration with Temporally Extended Actions.
J. Artif. Intell. Res., 2015

Hierarchical temporal graphical model for head pose estimation and subsequent attribute classification in real-world videos.
Comput. Vis. Image Underst., 2015

Policy Gradient Methods for Off-policy Control.
CoRR, 2015

Conditional Computation in Neural Networks for faster models.
CoRR, 2015

Testing Visual Attention in Dynamic Environments.
CoRR, 2015

Learning and Planning with Timing Information in Markov Decision Processes.
Proceedings of the Thirty-First Conference on Uncertainty in Artificial Intelligence, 2015

Basis refinement strategies for linear value function approximation in MDPs.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

Data Generation as Sequential Decision Making.
Proceedings of the Advances in Neural Information Processing Systems 28: Annual Conference on Neural Information Processing Systems 2015, 2015

A Canonical Form for Weighted Automata and Applications to Approximate Minimization.
Proceedings of the 30th Annual ACM/IEEE Symposium on Logic in Computer Science, 2015

IMaGe: Iterative Multilevel Probabilistic Graphical Model for Detection and Segmentation of Multiple Sclerosis Lesions in Brain MRI.
Proceedings of the Information Processing in Medical Imaging, 2015

An Expectation-Maximization Algorithm to Compute a Stochastic Factorization From Data.
Proceedings of the Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015

Variational Generative Stochastic Networks with Collaborative Shaping.
Proceedings of the 32nd International Conference on Machine Learning, 2015

Correlation of clinical parameters with cardiorespiratory behavior in successfully extubated extremely preterm infants.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Organizational principles of cloud storage to support collaborative biomedical research.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Feature selection and oversampling in analysis of clinical data for extubation readiness in extreme preterm infants.
Proceedings of the 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2015

Representation Discovery for MDPs Using Bisimulation Metrics.
Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015

2014
Policy Iteration Based on Stochastic Factorization.
J. Artif. Intell. Res., 2014

Algorithms for multi-armed bandit problems.
CoRR, 2014

Classification-based Approximate Policy Iteration: Experiments and Extended Discussions.
CoRR, 2014

Theoretical results on the effect of 'shortcut' actions in MDPs.
Connect. Sci., 2014

Bisimulation Metrics are Optimal Value Functions.
Proceedings of the Thirtieth Conference on Uncertainty in Artificial Intelligence, 2014

Optimizing Energy Production Using Policy Search and Predictive State Representations.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

Learning with Pseudo-Ensembles.
Proceedings of the Advances in Neural Information Processing Systems 27: Annual Conference on Neural Information Processing Systems 2014, 2014

A new Q(lambda) with interim forward view and Monte Carlo equivalence.
Proceedings of the 31st International Conference on Machine Learning, 2014

Sample-based approximate regularization.
Proceedings of the 31st International Conference on Machine Learning, 2014

Multi-layer temporal graphical model for head pose estimation in real-world videos.
Proceedings of the 2014 IEEE International Conference on Image Processing, 2014

Probabilistic Temporal Head Pose Estimation Using a Hierarchical Graphical Model.
Proceedings of the Computer Vision - ECCV 2014, 2014

Iterative Multilevel MRF Leveraging Context and Voxel Information for Brain Tumour Segmentation in MRI.
Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, 2014

Bisimulation for Markov Decision Processes through Families of Functional Expressions.
Proceedings of the Horizons of the Mind. A Tribute to Prakash Panangaden, 2014

Analyzing User Trajectories from Mobile Device Data with Hierarchical Dirichlet Processes.
Proceedings of the Advances in Artificial Intelligence, 2014

2013
Generating storylines from sensor data.
Pervasive Mob. Comput., 2013

Time Series Analysis Using Geometric Template Matching.
IEEE Trans. Pattern Anal. Mach. Intell., 2013

Greedy Confidence Pursuit: A Pragmatic Approach to Multi-bandit Optimization.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2013

Learning from Limited Demonstrations.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Bellman Error Based Feature Generation using Random Projections on Sparse Spaces.
Proceedings of the Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013

Hierarchical Probabilistic Gabor and MRF Segmentation of Brain Tumours in MRI Volumes.
Proceedings of the Medical Image Computing and Computer-Assisted Intervention - MICCAI 2013, 2013

Smart Classifier Selection for Activity Recognition on Wearable Devices.
Proceedings of the ICPRAM 2013, 2013

Average Reward Optimization Objective In Partially Observable Domains.
Proceedings of the 30th International Conference on Machine Learning, 2013

Assessing the Predictability of Hospital Readmission Using Machine Learning.
Proceedings of the Twenty-Fifth Innovative Applications of Artificial Intelligence Conference, 2013

Smart exploration in reinforcement learning using absolute temporal difference errors.
Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems, 2013

Using Hierarchical Mixture of Experts Model for Fusion of Outbreak Detection Methods.
Proceedings of the AMIA 2013, 2013

2012
An information-theoretic approach to curiosity-driven reinforcement learning.
Theory Biosci., 2012

On Average Reward Policy Evaluation in Infinite-State Partially Observable Systems.
Proceedings of the Fifteenth International Conference on Artificial Intelligence and Statistics, 2012

A Machine Learning Approach to the Detection of Fetal Hypoxia during Labor and Delivery.
AI Mag., 2012

Reports of the AAAI 2011 Conference Workshops.
AI Mag., 2012

On-the-Fly Algorithms for Bisimulation Metrics.
Proceedings of the Ninth International Conference on Quantitative Evaluation of Systems, 2012

Value Pursuit Iteration.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

On-line Reinforcement Learning Using Incremental Kernel-Based Stochastic Factorization.
Proceedings of the Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012. Proceedings of a meeting held December 3-6, 2012

Improved Estimation in Time Varying Models.
Proceedings of the 29th International Conference on Machine Learning, 2012

An Empirical Analysis of Off-policy Learning in Discrete MDPs.
Proceedings of the Tenth European Workshop on Reinforcement Learning, 2012

Prediction of extubation readiness in extreme preterm infants based on measures of cardiorespiratory variability.
Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2012

Soft biometric trait classification from real-world face videos conditioned on head pose estimation.
Proceedings of the 2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2012

Mining Administrative Data to Predict Falls in the Elderly Population.
Proceedings of the Advances in Artificial Intelligence, 2012

Compressed Least-Squares Regression on Sparse Spaces.
Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence, 2012

2011
Bisimulation Metrics for Continuous Markov Decision Processes.
SIAM J. Comput., 2011

The Duality of State and Observation in Probabilistic Transition Systems.
Proceedings of the Logic, Language, and Computation, 2011

Activity Recognition with Mobile Phones.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2011

Reinforcement Learning using Kernel-Based Stochastic Factorization.
Proceedings of the Advances in Neural Information Processing Systems 24: 25th Annual Conference on Neural Information Processing Systems 2011. Proceedings of a meeting held 12-14 December 2011, 2011

Adapted MRF Segmentation of Multiple Sclerosis Lesions Using Local Contextual Information.
Proceedings of the Medical Image Understanding and Analysis, 2011

A Framework for Computing Bounds for the Return of a Policy.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Automatic Construction of Temporally Extended Actions for MDPs Using Bisimulation Metrics.
Proceedings of the Recent Advances in Reinforcement Learning - 9th European Workshop, 2011

Horde: a scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction.
Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2011), 2011

Activity Recognition with Time-Delay Embeddings.
Proceedings of the Computational Physiology, 2011

Basis Function Discovery Using Spectral Clustering and Bisimulation Metrics.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

Learning Compact Representations of Time-Varying Processes.
Proceedings of the Twenty-Fifth AAAI Conference on Artificial Intelligence, 2011

2010
Classification of Normal and Hypoxic Fetuses From Systems Modeling of Intrapartum Cardiotocography.
IEEE Trans. Biomed. Eng., 2010

A Study of Approximate Inference in Probabilistic Relational Models.
Proceedings of the 2nd Asian Conference on Machine Learning, 2010

Smarter Sampling in Model-Based Bayesian Reinforcement Learning.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2010

Approximate Predictive Representations of Partially Observable Systems.
Proceedings of the 27th International Conference on Machine Learning (ICML-10), 2010

A novel similarity measure for time series data with applications to gait and activity recognition.
Proceedings of the UbiComp 2010: Ubiquitous Computing, 12th International Conference, 2010

An Algebraic Approach to Dynamic Epistemic Logic.
Proceedings of the 23rd International Workshop on Description Logics (DL 2010), 2010

Automatically suggesting topics for augmenting text documents.
Proceedings of the 19th ACM Conference on Information and Knowledge Management, 2010

Optimal policy switching algorithms for reinforcement learning.
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2010), 2010

Activity and Gait Recognition with Time-Delay Embeddings.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

Using Bisimulation for Policy Transfer in MDPs.
Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, 2010

2009
Identification of the Dynamic Relationship Between Intrapartum Uterine Pressure and Fetal Heart Rate for Normal and Hypoxic Fetuses.
IEEE Trans. Biomed. Eng., 2009

Learning the Difference between Partially Observable Dynamical Systems.
Proceedings of the Machine Learning and Knowledge Discovery in Databases, 2009

Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation.
Proceedings of the Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009. Proceedings of a meeting held 7-10 December 2009, 2009

Wikispeedia: An Online Game for Inferring Semantic Distances between Concepts.
Proceedings of the IJCAI 2009, 2009

Equivalence Relations in Fully and Partially Observable Markov Decision Processes.
Proceedings of the IJCAI 2009, 2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation.
Proceedings of the 26th Annual International Conference on Machine Learning, 2009

Completing wikipedia's hyperlink structure through dimensionality reduction.
Proceedings of the 18th ACM Conference on Information and Knowledge Management, 2009

2008
Anytime similarity measures for faster alignment.
Comput. Vis. Image Underst., 2008

Bounding Performance Loss in Approximate MDP Homomorphisms.
Proceedings of the Advances in Neural Information Processing Systems 21, 2008

Reinforcement learning in the presence of rare events.
Proceedings of the Machine Learning, 2008

Point-Based Planning for Predictive State Representations.
Proceedings of the Advances in Artificial Intelligence, 2008

2007
Apprentissage actif dans les processus décisionnels de Markov partiellement observables L'algorithme MEDUSA.
Rev. d'Intelligence Artif., 2007

Using Linear Programming for Bayesian Exploration in Markov Decision Processes.
Proceedings of the IJCAI 2007, 2007

Fast Image Alignment Using Anytime Algorithms.
Proceedings of the IJCAI 2007, 2007

Context-Driven Predictions.
Proceedings of the IJCAI 2007, 2007

A formal framework for robot learning and control under model uncertainty.
Proceedings of the 2007 IEEE International Conference on Robotics and Automation, 2007

2006
Methods for Computing State Similarity in Markov Decision Processes.
Proceedings of the UAI '06, 2006

Data Mining Using Relational Database Management Systems.
Proceedings of the Advances in Knowledge Discovery and Data Mining, 2006

Automatic basis function construction for approximate dynamic programming and reinforcement learning.
Proceedings of the Machine Learning, 2006

Linear models of intrapartum uterine pressure-fetal heart rate interaction for the normal and hypoxic fetus.
Proceedings of the 28th International Conference of the IEEE Engineering in Medicine and Biology Society, 2006

PAC-Learning of Markov Models with Hidden State.
Proceedings of the Machine Learning: ECML 2006, 2006

Belief Selection in Point-Based Planning Algorithms for POMDPs.
Proceedings of the Advances in Artificial Intelligence, 2006

Representing Systems with Hidden State.
Proceedings of the Proceedings, 2006

2005
The Workshop Program at the Nineteenth National Conference on Artificial Intelligence.
AI Mag., 2005

Metrics for Markov Decision Processes with Infinite State Spaces.
Proceedings of the UAI '05, 2005

An approximation algorithm for labelled Markov processes: towards realistic approximation.
Proceedings of the Second International Conference on the Quantitative Evaluation of Systems (QEST 2005), 2005

Off-policy Learning with Options and Recognizers.
Proceedings of the Advances in Neural Information Processing Systems 18, 2005

Using core beliefs for point-based value iteration.
Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Model minimization by linear PSR.
Proceedings of the IJCAI-05, Proceedings of the Nineteenth International Joint Conference on Artificial Intelligence, Edinburgh, Scotland, UK, July 30, 2005

Active Learning in Partially Observable Markov Decision Processes.
Proceedings of the Machine Learning: ECML 2005, 2005

Using Rewards for Belief State Updates in Partially Observable Markov Decision Processes.
Proceedings of the Machine Learning: ECML 2005, 2005

2004
Redagent: winner of TAC SCM 2003.
SIGecom Exch., 2004

Classification Using Phi-Machines and Constructive Function Approximation.
Mach. Learn., 2004

Sparse Distributed Memories for On-Line Value-Based Reinforcement Learning.
Proceedings of the Machine Learning: ECML 2004, 2004

RedAgent-2003: An Autonomous Market-Based Supply-Chain Management Agent.
Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS 2004), 2004

Metrics for Finite Markov Decision Processes.
Proceedings of the Nineteenth National Conference on Artificial Intelligence, 2004

2003
A Planning Algorithm for Predictive State Representations.
Proceedings of the IJCAI-03, 2003

Combining TD-learning with Cascade-correlation Networks.
Proceedings of the Machine Learning, 2003

Using MDP Characteristics to Guide Exploration in Reinforcement Learning.
Proceedings of the Machine Learning: ECML 2003, 2003

2002
Developing Collaborative Golog Agents by Reinforcement Learning.
Int. J. Artif. Intell. Tools, 2002

Learning Options in Reinforcement Learning.
Proceedings of the Abstraction, 2002

A Convergent Form of Approximate Policy Iteration.
Proceedings of the Advances in Neural Information Processing Systems 15, 2002

Combining and Adapting Software Quality Predictive Models by Genetic Algorithms.
Proceedings of the 17th IEEE International Conference on Automated Software Engineering (ASE 2002), 2002

Characterizing Markov Decision Processes.
Proceedings of the Machine Learning: ECML 2002, 2002

2001
Off-Policy Temporal Difference Learning with Function Approximation.
Proceedings of the Eighteenth International Conference on Machine Learning (ICML 2001), Williams College, Williamstown, MA, USA, June 28, 2001

2000
Eligibility Traces for Off-Policy Policy Evaluation.
Proceedings of the Seventeenth International Conference on Machine Learning (ICML 2000), Stanford University, Stanford, CA, USA, June 29, 2000

Using Finite Experiments to Study Asymptotic Performance.
Proceedings of the Experimental Algorithmics, 2000

1999
Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning.
Artif. Intell., 1999

1998
Improved Switching among Temporally Abstract Actions.
Proceedings of the Advances in Neural Information Processing Systems 11, [NIPS Conference, Denver, Colorado, USA, November 30, 1998

Intra-Option Learning about Temporally Abstract Actions.
Proceedings of the Fifteenth International Conference on Machine Learning (ICML 1998), 1998

Theoretical Results on Reinforcement Learning with Temporally Abstract Options.
Proceedings of the Machine Learning: ECML-98, 1998

1997
Multi-time Models for Temporally Abstract Planning.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

Learning to Schedule Straight-Line Code.
Proceedings of the Advances in Neural Information Processing Systems 10, 1997

How to Find Big-Oh in Your Data Set (and How Not to).
Proceedings of the Advances in Intelligent Data Analysis, 1997

Exponentiated Gradient Methods for Reinforcement Learning.
Proceedings of the Fourteenth International Conference on Machine Learning (ICML 1997), 1997

