LLM Misalignment via Adversarial RLHF Platforms.
CoRR, March, 2025
OverThink: Slowdown Attacks on Reasoning LLMs.
CoRR, February, 2025
Riddle Me This! Stealthy Membership Inference for Retrieval-Augmented Generation.
CoRR, February, 2025
Synthetic Data Can Mislead Evaluations: Membership Inference as Machine Text Detection.
CoRR, January, 2025
Diffence: Fencing Membership Privacy With Diffusion Models.
Proceedings of the 32nd Annual Network and Distributed System Security Symposium, 2025
Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors.
CoRR, 2024
Iteratively Prompting Multimodal LLMs to Reproduce Natural and AI-Generated Images.
CoRR, 2024
Understanding (Un)Intended Memorization in Text-to-Image Generative Models.
CoRR, 2023
Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication.
CoRR, 2023
On the Risks of Stealing the Decoding Algorithms of Language Models.
CoRR, 2023
Stealing the Decoding Algorithms of Language Models.
Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security, 2023