2025

TAIJI: Textual Anchoring for Immunizing Jailbreak Images in Vision Language Models.

CoRR, March, 2025

CeTAD: Towards Certified Toxicity-Aware Distance in Vision Language Models.

CoRR, March, 2025

Position: Towards a Responsible LLM-empowered Multi-Agent Systems.

..., Guangliang Cheng, Sarvapali D. Ramchurn, ...

CoRR, February, 2025

FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model.

..., Guangliang Cheng, ...

CoRR, February, 2025

2024

A survey of safety and trustworthiness of large language models through the lens of verification and validation.

..., Saddek Bensalem, ..., Mustafa A. Mustafa

Artif. Intell. Rev., July, 2024

Privacy-Preserving Distributed Learning for Residential Short-Term Load Forecasting.

..., Mustafa A. Mustafa, Geert Deconinck, ...

IEEE Internet Things J., May, 2024

Reachability Verification Based Reliability Assessment for Deep Reinforcement Learning Controlled Robotics and Autonomous Systems.

IEEE Robotics Autom. Lett., 2024

Adaptive Guardrails For Large Language Models via Trust Modeling and In-Context Learning.

CoRR, 2024

Safeguarding Large Language Models: A Survey.

..., Saddek Bensalem, ...

CoRR, 2024

Building Guardrails for Large Language Models.

CoRR, 2024

Position: Building Guardrails for Large Language Models Requires Systematic Design.

Proceedings of the Forty-first International Conference on Machine Learning, 2024

2023

Reliability Assessment and Safety Arguments for Machine Learning Components in System Assurance.

ACM Trans. Embed. Comput. Syst., 2023

STPA for Learning-Enabled Systems: A Survey and A New Method.

CoRR, 2023

STPA for Learning-Enabled Systems: A Survey and A New Practice.

..., Siddartha Khastgir, Paul A. Jennings, ...

Proceedings of the 26th IEEE International Conference on Intelligent Transportation Systems, 2023

Short-term Load Forecasting with Distributed Long Short-Term Memory.

Proceedings of the IEEE Power & Energy Society Innovative Smart Grid Technologies Conference, 2023

Decentralised and Cooperative Control of Multi-Robot Systems through Distributed Optimisation.

Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, 2023

2022

Dependability Analysis of Deep Reinforcement Learning based Robotics and Autonomous Systems through Probabilistic Model Checking.

Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2022

EnnCore: End-to-End Conceptual Guarding of Neural Architectures.

..., Danilo S. Carvalho, ..., Mustafa A. Mustafa, ..., Lucas C. Cordeiro

Proceedings of the Workshop on Artificial Intelligence Safety 2022 (SafeAI 2022) co-located with the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI2022), 2022

2021

Reliability Assessment and Safety Arguments for Machine Learning Components in Assuring Learning-Enabled Autonomous Systems.

CoRR, 2021

Dependability Analysis of Deep Reinforcement Learning based Robotics and Autonomous Systems.

CoRR, 2021

Detecting Operational Adversarial Examples for Reliable Deep Learning.

Proceedings of the 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks, 2021