GuardDoor: Safeguarding Against Malicious Diffusion Editing via Protective Backdoors.
CoRR, March, 2025
TruthFlow: Truthful LLM Generation via Representation Flow Correction.
CoRR, February, 2025
Towards Robust Multimodal Large Language Models Against Jailbreak Attacks.
CoRR, February, 2025
WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025
Shadow-Activated Backdoor Attacks on Multimodal Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025
AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion models.
CoRR, 2024
Adversarially Robust Industrial Anomaly Detection Through Diffusion Model.
CoRR, 2024
Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization.
Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024
Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024
Tackling the Data Heterogeneity in Asynchronous Federated Learning with Cached Update Calibration.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024
Federated Learning with Projected Trajectory Regularization.
CoRR, 2023
CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Model Transfer.
Proceedings of the 40th IEEE Conference on Computer Communications, 2021
RLCard: A Platform for Reinforcement Learning in Card Games.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020
RLCard: A Toolkit for Reinforcement Learning in Card Games.
CoRR, 2019
CoFlux: robustly correlating KPIs by fluctuations for service troubleshooting.
Proceedings of the International Symposium on Quality of Service, 2019