Javier Rando

Orcid: 0000-0002-2723-7660

According to our database¹, Javier Rando authored at least 16 papers between 2020 and 2024.

Collaborative distances:

Dijkstra number² of four.
Erdős number³ of four.

Bibliography

2024

Persistent Pre-Training Poisoning of LLMs.

CoRR, 2024

Gradient-based Jailbreak Images for Multimodal Fusion Models.

CoRR, 2024

Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI.

CoRR, 2024

Attributions toward Artificial Agents in a modified Moral Turing Test.

CoRR, 2024

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition.

CoRR, 2024

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs.

Maksym Andriushchenko

Nicolas Flammarion

Florian Tramèr

CoRR, 2024

Foundational Challenges in Assuring Alignment and Safety of Large Language Models.

CoRR, 2024

Universal Jailbreak Backdoors from Poisoned Human Feedback.

Javier Rando

Florian Tramèr

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Personas as a Way to Model Truthfulness in Language Models.

Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback.

Trans. Mach. Learn. Res., 2023

Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation.

Rusheb Shah

Quentin Feuillade-Montixi

CoRR, 2023

PassGPT: Password Modeling and (Guided) Generation with Large Language Models.

Javier Rando

Fernando Pérez-Cruz

Briland Hitaj

Proceedings of the Computer Security - ESORICS 2023, 2023

2022

Red-Teaming the Stable Diffusion Safety Filter.

CoRR, 2022

Exploring Adversarial Attacks and Defenses in Vision Transformers trained with DINO.

CoRR, 2022

2020

Uneven Coverage of Natural Disasters in Wikipedia: the Case of Flood.

CoRR, 2020

Uneven Coverage of Natural Disasters in Wikipedia: The Case of Floods.

Proceedings of the 17th International Conference on Information Systems for Crisis Response and Management, 2020
