2025

GuardDoor: Safeguarding Against Malicious Diffusion Editing via Protective Backdoors.

CoRR, March, 2025

TruthFlow: Truthful LLM Generation via Representation Flow Correction.

CoRR, February, 2025

Towards Robust Multimodal Large Language Models Against Jailbreak Attacks.

CoRR, February, 2025

WordGame: Efficient & Effective LLM Jailbreak via Simultaneous Obfuscation in Query and Response.

Prasenjit Mitra

Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2025, Albuquerque, New Mexico, USA, April 29, 2025

Shadow-Activated Backdoor Attacks on Multimodal Large Language Models.

Proceedings of the Findings of the Association for Computational Linguistics, 2025

2024

AdvI2I: Adversarial Image Attack on Image-to-Image Diffusion models.

CoRR, 2024

Adversarially Robust Industrial Anomaly Detection Through Diffusion Model.

CoRR, 2024

Personalized Steering of Large Language Models: Versatile Steering Vectors Through Bi-directional Preference Optimization.

Proceedings of the Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers), 2024

Tackling the Data Heterogeneity in Asynchronous Federated Learning with Cached Update Calibration.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM.

Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023

Federated Learning with Projected Trajectory Regularization.

CoRR, 2023

2021

CTF: Anomaly Detection in High-Dimensional Time Series with Coarse-to-Fine Model Transfer.

Proceedings of the 40th IEEE Conference on Computer Communications, 2021

2020

RLCard: A Platform for Reinforcement Learning in Card Games.

Keerthana Reddy

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, 2020

2019

RLCard: A Toolkit for Reinforcement Learning in Card Games.

CoRR, 2019

CoFlux: robustly correlating KPIs by fluctuations for service troubleshooting.

Proceedings of the International Symposium on Quality of Service, 2019