Tinghao Xie
According to our database, Tinghao Xie authored at least 11 papers between 2022 and 2024.
Bibliography
2024
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors.
CoRR, 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
Proceedings of the Twelfth International Conference on Learning Representations, 2024
2023
Proceedings of the 32nd USENIX Security Symposium, 2023
Proceedings of the Eleventh International Conference on Learning Representations, 2023
2022
Fight Poison with Poison: Detecting Backdoor Poison Samples via Decoupling Benign Correlations.
CoRR, 2022
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022