Interpreting the Repeated Token Phenomenon in Large Language Models.
CoRR, March 2025
Measuring memorization in language models via probabilistic extraction.
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, 2025
Stealing User Prompts from Mixture of Experts.
CoRR, 2024
Measuring memorization through probabilistic discoverable extraction.
CoRR, 2024
Operationalizing Contextual Integrity in Privacy-Conscious Assistants.
CoRR, 2024
UnUnlearning: Unlearning is not sufficient for content regulation in advanced generative AI.
CoRR, 2024
Buffer Overflow in Mixture of Experts.
CoRR, 2024