VLM-Defense

- Removing NSFW Concepts from Vision-and-Language Models for Text-to-Image Retrieval and Generation
- Safety Alignment for Vision Language Models
- AdaShield: Safeguarding Multimodal Large Language Models from Structure-based Attack via Adaptive Shield Prompting
- Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation
- MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance
- Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
- SafeGen: Mitigating Unsafe Content Generation in Text-to-Image Models
- Moderating Illicit Online Image Promotion for Unsafe User-Generated Content Games Using Large Vision-Language Models
- Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
- Understanding Zero-Shot Adversarial Robustness for Large-Scale Models
- A Mutation-Based Method for Multi-Modal Jailbreaking Attack Detection
- CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning
- Image Safeguarding: Reasoning with Conditional Vision Language Model and Obfuscating Unsafe Content
- Towards Safe Self-Distillation of Internet-Scale Text-to-Image Diffusion Models
- Typographic Attacks in Large Multimodal Models Can be Alleviated by More Informative Prompts
- On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
- Partially Recentralization Softmax Loss for Vision-Language Models Robustness
- Adversarial Prompt Tuning for Vision-Language Models
- Defense-Prefix for Preventing Typographic Attacks on CLIP
- RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
- How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts
- EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models
- Aligning Modalities in Vision Large Language Models via Preference Fine-tuning
- Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
- Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
- Robust Contrastive Language-Image Pre-training against Data Poisoning and Backdoor Attacks
- HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data