VLM-Defense
A collection of papers on safety and robustness defenses for vision-language models.

Removing NSFW Concepts from Vision-and-Language Models for Text-to-Image Retrieval and Generation
Safety Alignment for Vision Language Models
AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models
Moderating Illicit Online Image Promotion for Unsafe User-Generated Content Games Using Large Vision-Language Models
Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
Understanding Zero-Shot Adversarial Robustness for Large-Scale Models
A Mutation-Based Method for Multi-Modal Jailbreaking Attack Detection
CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning
Image Safeguarding: Reasoning with Conditional Vision Language Model and Obfuscating Unsafe Content
Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
Typographic Attacks in Large Multimodal Models Can be Alleviated by More Informative Prompts
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Partially Recentralization Softmax Loss for Vision-Language Models Robustness
Adversarial Prompt Tuning for Vision-Language Models
Defense-Prefix for Preventing Typographic Attacks on CLIP
RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts
EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models
Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks
HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data