Operationalizing Contextual Integrity in Privacy-Conscious Assistants.
CoRR, 2024
STAR: SocioTechnical Approach to Red Teaming Language Models.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Gaps in the Safety Evaluation of Generative AI.
Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024
All Too Human? Mapping and Mitigating the Risk from Anthropomorphic AI.
Proceedings of the Seventh AAAI/ACM Conference on AI, Ethics, and Society (AIES-24) - Full Archival Papers, October 21-23, 2024, San Jose, California, USA, 2024
Sociotechnical Safety Evaluation of Generative AI Systems.
CoRR, 2023
Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models.
Proceedings of the Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, 2022
Taxonomy of Risks posed by Language Models.
Proceedings of the FAccT '22: 2022 ACM Conference on Fairness, Accountability, and Transparency, Seoul, Republic of Korea, June 21, 2022
Alignment of Language Agents.
CoRR, 2021
Modelling Cooperation in Network Games with Spatio-Temporal Complexity.
Proceedings of the AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, 2021
Model-free conventions in multi-agent reinforcement learning with heterogeneous preferences.
CoRR, 2020