2025
GuardDoor: Safeguarding Against Malicious Diffusion Editing via Protective Backdoors.
CoRR, March, 2025

TruthFlow: Truthful LLM Generation via Representation Flow Correction.
CoRR, February, 2025

Towards Robust Multimodal Large Language Models Against Jailbreak Attacks.
CoRR, February, 2025

WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Shadow-Activated Backdoor Attacks on Multimodal Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024
AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion Models.
CoRR, 2024

Adversarially Robust Industrial Anomaly Detection Through Diffusion Model.
CoRR, 2024

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization.
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024

Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections.
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Tackling the Data Heterogeneity in Asynchronous Federated Learning with Cached Update Calibration.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Federated Learning with Projected Trajectory Regularization.
CoRR, 2023

2021
CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Model Transfer.
Proceedings of the 40th IEEE Conference on Computer Communications, 2021

2020
RLCard: A Platform for Reinforcement Learning in Card Games.
Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

2019
RLCard: A Toolkit for Reinforcement Learning in Card Games.
CoRR, 2019

CoFlux: Robustly Correlating KPIs by Fluctuations for Service Troubleshooting.
Proceedings of the International Symposium on Quality of Service, 2019