Masatoshi Uehara

ORCID: 0000-0001-9017-3105

According to our database, Masatoshi Uehara authored at least 58 papers between 2018 and 2024.

Bibliography

2024
Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond.
J. Mach. Learn. Res., 2024

Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design.
CoRR, 2024

Derivative-Free Guidance in Continuous and Discrete Diffusion Models with Soft Value-Based Decoding.
CoRR, 2024

Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review.
CoRR, 2024

Adding Conditional Control to Diffusion Models with Reinforcement Learning.
CoRR, 2024

Bridging Model-Based Optimization and Generative Modeling via Conservative Fine-Tuning of Diffusion Models.
CoRR, 2024

Regularized DeepIV with Model Selection.
CoRR, 2024

Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control.
CoRR, 2024

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization.
CoRR, 2024

Feedback Efficient Online Fine-Tuning of Diffusion Models.
Proceedings of the Forty-first International Conference on Machine Learning, 2024

Provable Offline Preference-Based Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Provable Reward-Agnostic Preference-Based Reinforcement Learning.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Functional Graphical Models: Structure Enables Offline Data-Driven Optimization.
Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

2023
Source Condition Double Robust Inference on Functionals of Inverse Problems.
CoRR, 2023

How to Query Human Feedback Efficiently in RL?
CoRR, 2023

Provable Offline Reinforcement Learning with Human Feedback.
CoRR, 2023

Minimax Instrumental Variable Regression and $L_2$ Convergence Guarantees without Identification or Closedness.
CoRR, 2023

Refined Value-Based Offline RL under Realizability and Partial Coverage.
CoRR, 2023

Offline Minimax Soft-Q-learning Under Realizability and Partial Coverage.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Future-Dependent Value-Based Off-Policy Evaluation in POMDPs.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Off-Policy Evaluation of Ranking Policies under Diverse User Behavior.
Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

Distributional Offline Policy Evaluation with Predictive Error Guarantees.
Proceedings of the International Conference on Machine Learning, 2023

Computationally Efficient PAC RL in POMDPs with Latent Determinism and Conditional Embeddings.
Proceedings of the International Conference on Machine Learning, 2023

PAC Reinforcement Learning for Predictive State Representations.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Minimax Instrumental Variable Regression and $L_2$ Convergence Guarantees without Identification or Closedness.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

Inference on Strongly Identified Functionals of Weakly Identified Functions.
Proceedings of the Thirty Sixth Annual Conference on Learning Theory, 2023

2022
Efficiently Breaking the Curse of Horizon in Off-Policy Evaluation with Double Reinforcement Learning.
Oper. Res., November, 2022

A Review of Off-Policy Evaluation in Reinforcement Learning.
CoRR, 2022

Optimal Fixed-Budget Best Arm Identification using the Augmented Inverse Probability Weighting Estimator in Two-Armed Gaussian Bandits with Unknown Variances.
CoRR, 2022

Provably Efficient Reinforcement Learning in Partially Observable Dynamical Systems.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022

Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning approach.
Proceedings of the International Conference on Machine Learning, 2022

A Minimax Learning Approach to Off-Policy Evaluation in Confounded Partially Observable Markov Decision Processes.
Proceedings of the International Conference on Machine Learning, 2022

Representation Learning for Online and Offline RL in Low-rank MDPs.
Proceedings of the Tenth International Conference on Learning Representations, 2022

Pessimistic Model-based Offline Reinforcement Learning under Partial Coverage.
Proceedings of the Tenth International Conference on Learning Representations, 2022

2021
Information criteria for non-normalized models.
J. Mach. Learn. Res., 2021

A Minimax Learning Approach to Off-Policy Evaluation in Partially Observable Markov Decision Processes.
CoRR, 2021

Pessimistic Model-based Offline RL: PAC Bounds and Posterior Sampling under Partial Coverage.
CoRR, 2021

Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage.
CoRR, 2021

Causal Inference Under Unmeasured Confounding With Negative Controls: A Minimax Learning Approach.
CoRR, 2021

Finite Sample Analysis of Minimax Offline Reinforcement Learning: Completeness, Fast Rates and First-Order Efficiency.
CoRR, 2021

Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage.
Proceedings of the Advances in Neural Information Processing Systems 34: Annual Conference on Neural Information Processing Systems 2021, 2021

Optimal Off-Policy Evaluation from Multiple Logging Policies.
Proceedings of the 38th International Conference on Machine Learning, 2021

Fast Rates for the Regret of Offline Reinforcement Learning.
Proceedings of the Conference on Learning Theory, 2021

2020
Double Reinforcement Learning for Efficient Off-Policy Evaluation in Markov Decision Processes.
J. Mach. Learn. Res., 2020

Efficient Evaluation of Natural Stochastic Policies in Offline Reinforcement Learning.
CoRR, 2020

Off-Policy Evaluation and Learning for External Validity under a Covariate Shift.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Doubly Robust Off-Policy Value and Gradient Estimation for Deterministic Policies.
Proceedings of the Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, 2020

Minimax Weight and Q-Function Learning for Off-Policy Evaluation.
Proceedings of the 37th International Conference on Machine Learning, 2020

Statistically Efficient Off-Policy Policy Gradients.
Proceedings of the 37th International Conference on Machine Learning, 2020

Double Reinforcement Learning for Efficient and Robust Off-Policy Evaluation.
Proceedings of the 37th International Conference on Machine Learning, 2020

Imputation estimators for unnormalized models with missing data.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

A Unified Statistically Efficient Estimation Framework for Unnormalized Models.
Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2020

2019
Localized Debiased Machine Learning: Efficient Estimation of Quantile Treatment Effects, Conditional Value at Risk, and Beyond.
CoRR, 2019

Minimax Weight and Q-Function Learning for Off-Policy Evaluation.
CoRR, 2019

Efficiently Breaking the Curse of Horizon: Double Reinforcement Learning in Infinite-Horizon Processes.
CoRR, 2019

Unified estimation framework for unnormalized models with statistical efficiency.
CoRR, 2019

Intrinsically Efficient, Stable, and Bounded Off-Policy Evaluation for Reinforcement Learning.
Proceedings of the Advances in Neural Information Processing Systems 32: Annual Conference on Neural Information Processing Systems 2019, 2019

2018
Analysis of Noise Contrastive Estimation from the Perspective of Asymptotic Variance.
CoRR, 2018

