Safety Misalignment Against Large Language Models.
Proceedings of the 32nd Annual Network and Distributed System Security Symposium (NDSS 2025), 2025
FigStep: Jailbreaking Large Vision-Language Models via Typographic Visual Prompts.
Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-25), 2025
JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models.
CoRR, 2024
Have You Merged My Model? On the Robustness of Large Language Model IP Protection Methods Against Model Merging.
Proceedings of the 1st ACM Workshop on Large AI Systems and Models with Privacy and Safety Analysis, 2024