VLM-Defense

A collection of papers on safety and robustness defenses for vision-language models.

- Removing NSFW Concepts from Vision-and-Language Models for Text-to-Image Retrieval and Generation
- Safety Alignment for Vision Language Models
- AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
- Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
- MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance
- Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
- SAFEGEN: Mitigating Unsafe Content Generation in Text-to-Image Models
- Moderating Illicit Online Image Promotion for Unsafe User-Generated Content Games Using Large Vision-Language Models
- Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
- Understanding Zero-Shot Adversarial Robustness for Large-Scale Models
- A Mutation-Based Method for Multi-Modal Jailbreaking Attack Detection
- CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning
- Image Safeguarding: Reasoning with Conditional Vision Language Model and Obfuscating Unsafe Content
- Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
- Typographic Attacks in Large Multimodal Models Can be Alleviated by More Informative Prompts
- On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
- Partially Recentralization Softmax Loss for Vision-Language Models Robustness
- Adversarial Prompt Tuning for Vision-Language Models
- Defense-Prefix for Preventing Typographic Attacks on CLIP
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
- How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts
- EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models
- Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
- Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
- Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
- Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data