Defending Jailbreak Prompts via In-Context Adversarial Game.
CoRR, 2024
Attack-free Evaluating and Enhancing Adversarial Robustness on Categorical Data.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Defending Jailbreak Prompts via In-Context Adversarial Game.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024
Towards Efficient and Domain-Agnostic Evasion Attack with High-Dimensional Categorical Inputs.
Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence, 2023
Towards Understanding the Robustness Against Evasion Attack on Categorical Data.
Proceedings of the Tenth International Conference on Learning Representations, 2022
AdvCat: Domain-Agnostic Robustness Assessment for Cybersecurity-Critical Applications with Categorical Inputs.
Proceedings of the IEEE International Conference on Big Data, 2022
PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization.
Proceedings of the 38th International Conference on Machine Learning, 2021
Attackability Characterization of Adversarial Evasion Attack on Discrete Data.
Proceedings of KDD '20: The 26th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2020
Co-Embedding Attributed Networks.
Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, 2019