Jiongxiao Wang

According to our database, Jiongxiao Wang authored at least 17 papers between 2022 and 2024.

Bibliography

2024
FATH: Authentication-based Test-time Defense against Indirect Prompt Injection Attacks.
CoRR, 2024

Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness.
CoRR, 2024

Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors.
CoRR, 2024

Mitigating Fine-tuning Jailbreak Attack with Backdoor Enhanced Alignment.
CoRR, 2024

Preference Poisoning Attacks on Reward Model Learning.
CoRR, 2024

Conversational Drug Editing Using Retrieval and Domain Feedback.
Proceedings of the Twelfth International Conference on Learning Representations, 2024

RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations.
CoRR, 2023

On the Exploitability of Reinforcement Learning with Human Feedback for Large Language Models.
CoRR, 2023

ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback.
CoRR, 2023

Adversarial Demonstration Attacks on Large Language Models.
CoRR, 2023

On the Exploitability of Instruction Tuning.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification.
Proceedings of the International Conference on Machine Learning, 2023

DensePure: Understanding Diffusion Models for Adversarial Robustness.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

Defending against Adversarial Audio via Diffusion Model.
Proceedings of the Eleventh International Conference on Learning Representations, 2023

2022
DensePure: Understanding Diffusion Models towards Adversarial Robustness.
CoRR, 2022

Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack.
Proceedings of the International Conference on Machine Learning, 2022

