Position: Evaluating Generative AI Systems is a Social Science Measurement Challenge.
CoRR, February, 2025
A Shared Standard for Valid Measurement of Generative AI Systems' Capabilities, Risks, and Impacts.
CoRR, 2024
A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications.
CoRR, 2023
FairPrism: Evaluating Fairness-Related Harms in Text Generation.
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2023