2024

Making Harmful Behaviors Unlearnable for Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2024