Kenshi Abe

Atsushi Iwasaki

CoRR, 2024

Time-Varyingness in Auction Breaks Revenue Equivalence.

CoRR, 2024

Last Iterate Convergence in Monotone Mean Field Games.

Noboru Isobe

CoRR, 2024

Boosting Perturbed Gradient Ascent for Last-Iterate Convergence in Games.

CoRR, 2024

Synchronization behind Learning in Periodic Zero-Sum Games Triggers Divergence from Nash equilibrium.

CoRR, 2024

Global Behavior of Learning Dynamics in Zero-Sum Games with Memory Asymmetry.

CoRR, 2024

Regularized Best-of-N Sampling to Mitigate Reward Hacking for Language Model Alignment.

CoRR, 2024

Nash Equilibrium and Learning Dynamics in Three-Player Matching m-Action Games.

CoRR, 2024

Return-Aligned Decision Transformer.

CoRR, 2024

Scalable and Provably Fair Exposure Control for Large-Scale Recommender Systems.

Riku Togashi

Yuta Saito

Proceedings of the ACM on Web Conference 2024, 2024

Policy Gradient Algorithms with Monte Carlo Tree Learning for Non-Markov Decision Processes.

RLJ, 2024

Model-Based Minimum Bayes Risk Decoding for Text Generation.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Adaptively Perturbed Mirror Descent for Learning in Games.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

Filtered Direct Preference Optimization.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

Learning Fair Division from Bandit Feedback.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2024

Memory Asymmetry Creates Heteroclinic Orbits to Nash Equilibrium in Learning in Zero-Sum Games.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

Model-Based Minimum Bayes Risk Decoding.

CoRR, 2023

Why Guided Dialog Policy Learning performs well? Understanding the role of adversarial learning and its alternative.

CoRR, 2023

A Slingshot Approach to Learning in Monotone Games.

CoRR, 2023

Memory Asymmetry: A Key to Convergence in Zero-Sum Games.

CoRR, 2023

Exploration of Unranked Items in Safe Online Learning to Re-Rank.

Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2023

Learning in Multi-Memory Games Triggers Complex Dynamics Diverging from Nash Equilibrium.

Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 2023

Last-Iterate Convergence with Full and Noisy Feedback in Two-Player Zero-Sum Games.

Proceedings of the International Conference on Artificial Intelligence and Statistics, 2023

2022

Fair Matrix Factorisation for Large-Scale Recommender Systems.

Riku Togashi

CoRR, 2022

Last-Iterate Convergence with Full- and Noisy-Information Feedback in Two-Player Zero-Sum Games.

CoRR, 2022

Policy Gradient Algorithms with Monte-Carlo Tree Search for Non-Markov Decision Processes.

CoRR, 2022

Mutation-driven follow the regularized leader for last-iterate convergence in zero-sum games.

Mitsuki Sakamoto

Atsushi Iwasaki

Proceedings of the Uncertainty in Artificial Intelligence, 2022

Anytime Capacity Expansion in Medical Residency Match by Monte Carlo Tree Search.

Junpei Komiyama

Atsushi Iwasaki

Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence, 2022

Thresholded Lasso Bandit.

Alexandre Proutière

Proceedings of the International Conference on Machine Learning, 2022

2021

Off-Policy Exploitability-Evaluation in Two-Player Zero-Sum Markov Games.

Yusuke Kaneko

Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021

2020

A Practical Guide of Off-Policy Evaluation for Bandit Problems.

CoRR, 2020

Off-Policy Exploitability-Evaluation and Equilibrium-Learning in Two-Player Zero-Sum Markov Games.

Yusuke Kaneko

CoRR, 2020

2019

A Simple Heuristic for Bayesian Optimization with A Low Budget.

Masahiro Nomura