2025

FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings.

CoRR, January, 2025

Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs.

Anton van den Hengel

CoRR, January, 2025

2024

A Survey on Transferability of Adversarial Examples Across Deep Neural Networks.

Trans. Mach. Learn. Res., 2024

Minimalism is King! High-Frequency Energy-Based Screening for Data-Efficient Backdoor Attacks.

IEEE Trans. Inf. Forensics Secur., 2024

Fast Propagation Is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks.

IEEE Trans. Inf. Forensics Secur., 2024

Revisiting and Exploring Efficient Fast Adversarial Training via LAW: Lipschitz Regularization and Auto Weight Averaging.

IEEE Trans. Inf. Forensics Secur., 2024

SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation.

CoRR, 2024

Uncovering Vision Modality Threats in Image-to-Image Tasks.

CoRR, 2024

Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models.

CoRR, 2024

UVCG: Leveraging Temporal Consistency for Universal Video Protection.

Xiao-Ping Zhang

CoRR, 2024

ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos.

Md Mohaiminul Islam, Gedas Bertasius

CoRR, 2024

FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models.

CoRR, 2024

Multimodal Pragmatic Jailbreak on Text-to-image Models.

CoRR, 2024

Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models.

CoRR, 2024

RT-Attack: Jailbreaking Text-to-Image Models via Random Token.

CoRR, 2024

Can Editing LLMs Inject Harm?

William Yang Wang

CoRR, 2024

MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?

CoRR, 2024

Localizing Events in Videos with Multimodal Queries.

Mang Ling Ada Fok

CoRR, 2024

Learning Visual Prompts for Guiding the Attention of Vision Transformers.

Masoud Jalili Sabet, Daniel Rueckert

CoRR, 2024

Improved Techniques for Optimization-Based Jailbreaking on Large Language Models.

CoRR, 2024

Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Large Language Models.

CoRR, 2024

When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models.

Dave Zhenyu Chen, Philip H. S. Torr, Matthias Nießner, Victor Adrian Prisacariu

CoRR, 2024

Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models.

CoRR, 2024

Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples.

CoRR, 2024

Responsible Generative AI: What to Generate and What Not.

CoRR, 2024

Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?

CoRR, 2024

Model-agnostic Origin Attribution of Generated Images with Few-shot Examples.

CoRR, 2024

As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?

Francesco Pinto, Konstantinos Kamnitsas

CoRR, 2024

An Image Is Worth 1000 Lies: Adversarial Transferability across Prompts on Vision-Language Models.

CoRR, 2024

Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Model.

CoRR, 2024

Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images.

CoRR, 2024

Can Large Language Model Agents Simulate Human Trust Behavior?

Advances in Neural Information Processing Systems 38: Annual Conference on Neural Information Processing Systems 2024, 2024

Provably Better Explanations with Optimized Aggregation of Feature Attributions.

Ananta R. Bhattarai, Florian Buettner

Proceedings of the Forty-first International Conference on Machine Learning, 2024

An Image Is Worth 1000 Lies: Transferability of Adversarial Images across Prompts on Vision-Language Models.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Influencer Backdoor Attack on Semantic Segmentation.

Hengshuang Zhao

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images.

Proceedings of the Twelfth International Conference on Learning Representations, 2024

Visual Question Decomposition on Multimodal Large Language Models.

Findings of the Association for Computational Linguistics: EMNLP 2024, 2024

Improving Adversarial Transferability via Model Alignment.

Amir-massoud Farahmand

Proceedings of the Computer Vision - ECCV 2024, 2024

MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Which Model Generated This Image? A Model-Agnostic Approach for Origin Attribution.

Proceedings of the Computer Vision - ECCV 2024, 2024

Latent Guard: A Safety Framework for Text-to-Image Generation.

Proceedings of the Computer Vision - ECCV 2024, 2024

Dataset Distillation by Automatic Training Trajectories.

Carsten Trinitis

Proceedings of the Computer Vision - ECCV 2024, 2024

Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models.

Proceedings of the Computer Vision - ECCV 2024, 2024

Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Initialization Matters for Adversarial Transfer Learning.

Nicholas Carlini

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

Discretization-Induced Dirichlet Posterior for Robust Uncertainty Quantification on Regression.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

Does Few-Shot Learning Suffer from Backdoor Attacks?

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning.

Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024

2023

XAI for In-hospital Mortality Prediction via Multimodal ICU Data.

CoRR, 2023

OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization.

CoRR, 2023

TranSegPGD: Improving Transferability of Adversarial Examples on Semantic Segmentation.

CoRR, 2023

Understanding and Improving In-Context Learning on Vision-language Models.

Philip H. S. Torr

CoRR, 2023

Benchmarking Robustness of Text-Image Composed Retrieval.

CoRR, 2023

SPOT! Revisiting Video-Language Models for Event Understanding.

CoRR, 2023

Boosting Fair Classifier Generalization through Adaptive Priority Reweighing.

CoRR, 2023

Revisiting and Exploring Efficient Fast Adversarial Training via LAW: Lipschitz Regularization and Auto Weight Averaging.

CoRR, 2023

FedPop: Federated Population-based Hyperparameter Tuning.

CoRR, 2023

A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.

Philip H. S. Torr

CoRR, 2023

Reliable Evaluation of Adversarial Transferability.

Philip H. S. Torr

CoRR, 2023

Towards Robust Prompts on Vision-Language Models.

Philip H. S. Torr

CoRR, 2023

Explainability and Robustness of Deep Visual Classification Models.

CoRR, 2023

Benchmarking Robustness of Adaptation Methods on Pre-trained Vision-Language Models.

Philip H. S. Torr

Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

Multi-event Video-Text Retrieval.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation.

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Do DALL-E and Flamingo Understand Each Other?

Sahand Sharifzadeh

Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023

Backdoor Defense via Adaptively Splitting Poisoned Dataset.

Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

Exploring Non-additive Randomness on ViT against Query-Based Black-Box Attacks.

Philip H. S. Torr

Proceedings of the 34th British Machine Vision Conference 2023, 2023

ECOLA: Enhancing Temporal Knowledge Embeddings with Contextualized Language Representations.

Hinrich Schütze

Findings of the Association for Computational Linguistics: ACL 2023, 2023

2022

Explainability and robustness of deep visual classification models.

PhD thesis, 2022

CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering.

CoRR, 2022

Towards Efficient Adversarial Training on Vision Transformers.

Proceedings of the Computer Vision - ECCV 2022, 2022

Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal.

Proceedings of the Computer Vision - ECCV 2022, 2022

SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness.

Hengshuang Zhao, Philip H. S. Torr

Proceedings of the Computer Vision - ECCV 2022, 2022

Are Vision Transformers Robust to Patch Perturbations?

Proceedings of the Computer Vision - ECCV 2022, 2022

2021

Adversarial Examples on Segmentation Models Can be Easy to Transfer.

Hengshuang Zhao, Philip H. S. Torr

CoRR, 2021

Simple Distillation Baselines for Improving Small Self-supervised Models.

CoRR, 2021

Attacking Adversarial Attacks as A Defense.

CoRR, 2021

Semantics for Global and Local Interpretation of Deep Convolutional Neural Networks.

Proceedings of the International Joint Conference on Neural Networks, 2021

Effective and Efficient Vote Attack on Capsule Networks.

Proceedings of the 9th International Conference on Learning Representations, 2021

Quantifying Predictive Uncertainty in Medical Image Analysis with Deep Kernel Learning.

Proceedings of the 9th IEEE International Conference on Healthcare Informatics, 2021

Capsule Network Is Not More Robust Than Convolutional Network.

Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021

Interpretable Graph Capsule Networks for Object Recognition.

Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021

2020

Interpretable Graph Capsule Networks for Object Recognition.

CoRR, 2020

Search for Better Students to Learn Distilled Knowledge.

Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, Santiago de Compostela, Spain, August 29 - September 8, 2020

Improving the Robustness of Capsule Networks to Image Affine Transformations.

Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020

Introspective Learning by Distilling Knowledge from Online Self-explanation.

Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020

2019

Neural Network Memorization Dissection.

CoRR, 2019

Contextual Prediction Difference Analysis.

CoRR, 2019

Semantics for Global and Local Interpretation of Deep Neural Networks.

CoRR, 2019

Understanding Bias in Machine Learning.

CoRR, 2019

Saliency Methods for Explaining Adversarial Attacks.

CoRR, 2019

2018

Understanding Individual Decisions of CNNs via Contrastive Backpropagation.

Proceedings of the Computer Vision - ACCV 2018, 2018