2025
FocalPO: Enhancing Preference Optimizing by Focusing on Correct Preference Rankings.
CoRR, January, 2025
Open Eyes, Then Reason: Fine-grained Visual Mathematical Understanding in MLLMs.
CoRR, January, 2025
2024
A Survey on Transferability of Adversarial Examples Across Deep Neural Networks.
Trans. Mach. Learn. Res., 2024
Minimalism is King! High-Frequency Energy-Based Screening for Data-Efficient Backdoor Attacks.
IEEE Trans. Inf. Forensics Secur., 2024
Fast Propagation Is Better: Accelerating Single-Step Adversarial Training via Sampling Subnetworks.
IEEE Trans. Inf. Forensics Secur., 2024
Revisiting and Exploring Efficient Fast Adversarial Training via LAW: Lipschitz Regularization and Auto Weight Averaging.
IEEE Trans. Inf. Forensics Secur., 2024
SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation.
CoRR, 2024
Uncovering Vision Modality Threats in Image-to-Image Tasks.
CoRR, 2024
Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models.
CoRR, 2024
UVCG: Leveraging Temporal Consistency for Universal Video Protection.
CoRR, 2024
ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos.
CoRR, 2024
FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models.
CoRR, 2024
Multimodal Pragmatic Jailbreak on Text-to-image Models.
CoRR, 2024
Manipulation Facing Threats: Evaluating Physical Vulnerabilities in End-to-End Vision Language Action Models.
CoRR, 2024
RT-Attack: Jailbreaking Text-to-Image Models via Random Token.
CoRR, 2024
Can Editing LLMs Inject Harm?
CoRR, 2024
MMRo: Are Multimodal LLMs Eligible as the Brain for In-Home Robotics?
CoRR, 2024
Localizing Events in Videos with Multimodal Queries.
CoRR, 2024
Learning Visual Prompts for Guiding the Attention of Vision Transformers.
CoRR, 2024
Improved Techniques for Optimization-Based Jailbreaking on Large Language Models.
CoRR, 2024
Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Large Language Models.
CoRR, 2024
When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models.
CoRR, 2024
Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models.
CoRR, 2024
Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples.
CoRR, 2024
Responsible Generative AI: What to Generate and What Not.
CoRR, 2024
Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks?
CoRR, 2024
Model-agnostic Origin Attribution of Generated Images with Few-shot Examples.
CoRR, 2024
As Firm As Their Foundations: Can open-sourced foundation models be used to create adversarial examples for downstream tasks?
CoRR, 2024
An Image Is Worth 1000 Lies: Adversarial Transferability across Prompts on Vision-Language Models.
CoRR, 2024
Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Model.
CoRR, 2024
Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images.
CoRR, 2024
Can Large Language Model Agents Simulate Human Trust Behavior?
Proceedings of the Advances in Neural Information Processing Systems 37: Annual Conference on Neural Information Processing Systems 2024, 2024
Provably Better Explanations with Optimized Aggregation of Feature Attributions.
Proceedings of the Forty-first International Conference on Machine Learning, 2024
An Image Is Worth 1000 Lies: Transferability of Adversarial Images across Prompts on Vision-Language Models.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Influencer Backdoor Attack on Semantic Segmentation.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images.
Proceedings of the Twelfth International Conference on Learning Representations, 2024
Visual Question Decomposition on Multimodal Large Language Models.
Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2024, 2024
Improving Adversarial Transferability via Model Alignment.
Proceedings of the Computer Vision - ECCV 2024, 2024
MM-SafetyBench: A Benchmark for Safety Evaluation of Multimodal Large Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024
Which Model Generated This Image? A Model-Agnostic Approach for Origin Attribution.
Proceedings of the Computer Vision - ECCV 2024, 2024
Latent Guard: A Safety Framework for Text-to-Image Generation.
Proceedings of the Computer Vision - ECCV 2024, 2024
Dataset Distillation by Automatic Training Trajectories.
Proceedings of the Computer Vision - ECCV 2024, 2024
Unveiling Typographic Deceptions: Insights of the Typographic Vulnerability in Large Vision-Language Models.
Proceedings of the Computer Vision - ECCV 2024, 2024
Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Initialization Matters for Adversarial Transfer Learning.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024
Discretization-Induced Dirichlet Posterior for Robust Uncertainty Quantification on Regression.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
Does Few-Shot Learning Suffer from Backdoor Attacks?
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
FedDAT: An Approach for Foundation Model Finetuning in Multi-Modal Heterogeneous Federated Learning.
Proceedings of the Thirty-Eighth AAAI Conference on Artificial Intelligence, 2024
2023
XAI for In-hospital Mortality Prediction via Multimodal ICU Data.
CoRR, 2023
OT-Attack: Enhancing Adversarial Transferability of Vision-Language Models via Optimal Transport Optimization.
CoRR, 2023
TranSegPGD: Improving Transferability of Adversarial Examples on Semantic Segmentation.
CoRR, 2023
Understanding and Improving In-Context Learning on Vision-language Models.
CoRR, 2023
Benchmarking Robustness of Text-Image Composed Retrieval.
CoRR, 2023
SPOT! Revisiting Video-Language Models for Event Understanding.
CoRR, 2023
Boosting Fair Classifier Generalization through Adaptive Priority Reweighing.
CoRR, 2023
Revisiting and Exploring Efficient Fast Adversarial Training via LAW: Lipschitz Regularization and Auto Weight Averaging.
CoRR, 2023
FedPop: Federated Population-based Hyperparameter Tuning.
CoRR, 2023
A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
CoRR, 2023
Reliable Evaluation of Adversarial Transferability.
CoRR, 2023
Towards Robust Prompts on Vision-Language Models.
CoRR, 2023
Explainability and Robustness of Deep Visual Classification Models.
CoRR, 2023
Benchmarking Robustness of Adaptation Methods on Pre-trained Vision-Language Models.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023
Multi-event Video-Text Retrieval.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
FRAug: Tackling Federated Learning with Non-IID Features via Representation Augmentation.
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Do DALL-E and Flamingo Understand Each Other?
Proceedings of the IEEE/CVF International Conference on Computer Vision, 2023
Backdoor Defense via Adaptively Splitting Poisoned Dataset.
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023
Exploring Non-additive Randomness on ViT against Query-Based Black-Box Attacks.
Proceedings of the 34th British Machine Vision Conference 2023, 2023
ECOLA: Enhancing Temporal Knowledge Embeddings with Contextualized Language Representations.
Proceedings of the Findings of the Association for Computational Linguistics: ACL 2023, 2023
2022
Explainability and robustness of deep visual classification models.
PhD thesis, 2022
CL-CrossVQA: A Continual Learning Benchmark for Cross-Domain Visual Question Answering.
CoRR, 2022
Towards Efficient Adversarial Training on Vision Transformers.
Proceedings of the Computer Vision - ECCV 2022, 2022
Watermark Vaccine: Adversarial Attacks to Prevent Watermark Removal.
Proceedings of the Computer Vision - ECCV 2022, 2022
SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness.
Proceedings of the Computer Vision - ECCV 2022, 2022
Are Vision Transformers Robust to Patch Perturbations?
Proceedings of the Computer Vision - ECCV 2022, 2022
2021
Adversarial Examples on Segmentation Models Can be Easy to Transfer.
CoRR, 2021
Simple Distillation Baselines for Improving Small Self-supervised Models.
CoRR, 2021
Attacking Adversarial Attacks as A Defense.
CoRR, 2021
Semantics for Global and Local Interpretation of Deep Convolutional Neural Networks.
Proceedings of the International Joint Conference on Neural Networks, 2021
Effective and Efficient Vote Attack on Capsule Networks.
Proceedings of the 9th International Conference on Learning Representations, 2021
Quantifying Predictive Uncertainty in Medical Image Analysis with Deep Kernel Learning.
Proceedings of the 9th IEEE International Conference on Healthcare Informatics, 2021
Capsule Network Is Not More Robust Than Convolutional Network.
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2021
Interpretable Graph Capsule Networks for Object Recognition.
Proceedings of the Thirty-Fifth AAAI Conference on Artificial Intelligence, 2021
2020
Interpretable Graph Capsule Networks for Object Recognition.
CoRR, 2020
Search for Better Students to Learn Distilled Knowledge.
Proceedings of the ECAI 2020 - 24th European Conference on Artificial Intelligence, Santiago de Compostela, Spain, August 29 - September 8, 2020
Improving the Robustness of Capsule Networks to Image Affine Transformations.
Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020
Introspective Learning by Distilling Knowledge from Online Self-explanation.
Proceedings of the Computer Vision - ACCV 2020 - 15th Asian Conference on Computer Vision, Kyoto, Japan, November 30, 2020
2019
Neural Network Memorization Dissection.
CoRR, 2019
Contextual Prediction Difference Analysis.
CoRR, 2019
Semantics for Global and Local Interpretation of Deep Neural Networks.
CoRR, 2019
Understanding Bias in Machine Learning.
CoRR, 2019
Saliency Methods for Explaining Adversarial Attacks.
CoRR, 2019
2018
Understanding Individual Decisions of CNNs via Contrastive Backpropagation.
Proceedings of the Computer Vision - ACCV 2018, 2018