R1dacted: Investigating Local Censorship in DeepSeek's R1 Language Model.
CoRR, May, 2025
Multilingual and Multi-Accent Jailbreaking of Audio LLMs.
CoRR, April, 2025
OverThink: Slowdown Attacks on Reasoning LLMs.
CoRR, February, 2025
FameBias: Embedding Manipulation Bias Attack in Text-to-Image Models.
CoRR, 2024
Injecting Bias in Text-To-Image Models via Composite-Trigger Backdoors.
CoRR, 2024
OSLO: One-Shot Label-Only Membership Inference Attacks.
Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Understanding (Un)Intended Memorization in Text-to-Image Generative Models.
CoRR, 2023
Memory Triggers: Unveiling Memorization in Text-To-Image Generative Models through Word-Level Duplication.
CoRR, 2023
Robust Smart Home Face Recognition Under Starving Federated Data.
Proceedings of the 6th International Conference on Universal Village, 2022
MSDT: Masked Language Model Scoring Defense in Text Domain.
Proceedings of the 6th International Conference on Universal Village, 2022
Impact of Adversarial Training on the Robustness of Deep Neural Networks.
Proceedings of the 5th IEEE International Conference on Information Systems and Computer Aided Education, 2022