VLM-Attack
Circumventing Concept Erasure Methods For Text-to-Image Generative Models
Efficient LLM-Jailbreaking by Introducing Visual Modality
From LLMs to MLLMs: Exploring the Landscape of Multimodal Jailbreaking
Adversarial Attacks on Multimodal Agents
Visual-RolePlay: Universal Jailbreak Attack on MultiModal Large Language Models via Role-playing Ima
Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models
Typography Leads Semantic Diversifying: Amplifying Adversarial Transferability across Multimodal Lar
White-box Multimodal Jailbreaks Against Large Vision-Language Models
Red Teaming Visual Language Models
Private Attribute Inference from Images with Vision-Language Models
Assessment of Multimodal Large Language Models in Alignment with Human Values
Privacy-Aware Visual Language Models
Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbre
Vision-LLMs Can Fool Themselves with Self-Generated Typographic Attacks
Adversarial Illusions in Multi-Modal Embeddings
Universal Prompt Optimizer for Safe Text-to-Image Generation
On the Proactive Generation of Unsafe Images From Text-To-Image Models Using Benign Prompts
Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images
INSTRUCTTA: Instruction-Tuned Targeted Attack for Large Vision-Language Models
On the Robustness of Large Multimodal Models Against Image Adversarial Attacks
Hijacking Context in Large Multi-modal Models
Transferable Multimodal Attack on Vision-Language Pre-training Models
Images are Achilles’ Heel of Alignment: Exploiting Visual Vulnerabilities for Jailbreaking Multimoda
AN IMAGE IS WORTH 1000 LIES: ADVERSARIAL TRANSFERABILITY ACROSS PROMPTS ON VISION-LANGUAGE MODELS
Test-Time Backdoor Attacks on Multimodal Large Language Models
JAILBREAK IN PIECES: COMPOSITIONAL ADVERSARIAL ATTACKS ON MULTI-MODAL LANGUAGE MODELS
Jailbreaking Attack against Multimodal Large Language Model
Jailbreaking GPT-4V via Self-Adversarial Attacks with System Prompts
IMAGE HIJACKS: ADVERSARIAL IMAGES CAN CONTROL GENERATIVE MODELS AT RUNTIME
VISUAL ADVERSARIAL EXAMPLES JAILBREAK ALIGNED LARGE LANGUAGE MODELS
Query-Relevant Images Jailbreak Large Multi-Modal Models
Towards Adversarial Attack on Vision-Language Pre-training Models
How Many Are Unicorns in This Image? A Safety Evaluation Benchmark for Vision LLMs
SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Au
MISUSING TOOLS IN LARGE LANGUAGE MODELS WITH VISUAL ADVERSARIAL EXAMPLES
VL-Trojan: Multimodal Instruction Backdoor Attacks against Autoregressive Visual Language Models
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Mod
Shadowcast: STEALTHY DATA POISONING ATTACKS AGAINST VISION-LANGUAGE MODELS
FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts
THE WOLF WITHIN: COVERT INJECTION OF MALICE INTO MLLM SOCIETIES VIA AN MLLM OPERATIVE
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
How Robust is Google’s Bard to Adversarial Image Attacks?
On Evaluating Adversarial Robustness of Large Vision-Language Models
On the Adversarial Robustness of Multi-Modal Foundation Models
Are aligned neural networks adversarially aligned?
READING ISN’T BELIEVING: ADVERSARIAL ATTACKS ON MULTI-MODAL NEURONS
Black Box Adversarial Prompting for Foundation Models
Evaluation and Analysis of Hallucination in Large Vision-Language Models
FOOL YOUR (VISION AND) LANGUAGE MODEL WITH EMBARRASSINGLY SIMPLE PERMUTATIONS
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning
AdvCLIP: Downstream-agnostic Adversarial Examples in Multimodal Contrastive Learning