2024
AdvBDGen: Adversarially Fortified Prompt-Specific Fuzzy Backdoor Generator Against LLM Alignment.
CoRR, 2024

Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data?
CoRR, 2024

Is poisoning a real threat to LLM alignment? Maybe more so than you think.
CoRR, 2024