LlamaRL: A Distributed Asynchronous Reinforcement Learning Framework for Efficient Large-scale LLM Training.
CoRR, May, 2025
Self-Generated Critiques Boost Reward Modeling for Language Models.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"
CoRR, 2023
MLGO: a Machine Learning Guided Compiler Optimizations Framework.
CoRR, 2021
Using Abstractions to Solve Opportunistic Crime Security Games at Scale.
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016
Restless Poachers: Handling Exploration-Exploitation Tradeoffs in Security Domains.
Proceedings of the 2016 International Conference on Autonomous Agents & Multiagent Systems, 2016
Robust Strategy against Unknown Risk-averse Attackers in Security Games.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015
To Handle, to Learn and to Manipulate the Attacker's (Uncertain) Payoffs in Security Games: Doctoral Consortium.
Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems, 2015
Online planning for optimal protector strategies in resource conservation games.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2014
Planning and learning in security games.
SIGecom Exch., 2013
Bayesian Security Games for Controlling Contagion.
Proceedings of the International Conference on Social Computing, SocialCom 2013, 2013
Defender (Mis)coordination in Security Games.
Proceedings of the IJCAI 2013, 2013
A fast table lookup based, statistical model driven non-uniform unit selection TTS.
Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, 2013
Security games with contagion: handling asymmetric information.
Proceedings of the International conference on Autonomous Agents and Multi-Agent Systems, 2013