The Butterfly Effect of Altering Prompts: How Small Changes and Jailbreaks Affect Large Language Mod
