Junxiao Yang

According to our database, Junxiao Yang authored at least 5 papers between 2023 and 2024.

Collaborative distances:
  • Dijkstra number of four.
  • Erdős number of four.

Timeline

[Chart: number of publications per year, 2023 to 2024, grouped by type (book, in proceedings, article, PhD thesis, dataset, other).]

Bibliography

2024
Agent-SafetyBench: Evaluating the Safety of LLM Agents.
CoRR, 2024

Global Challenge for Safe and Secure LLMs Track 1.
CoRR, 2024

Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks.
CoRR, 2024

Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2024

2023
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization.
CoRR, 2023

