Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models

