Stealthy and Persistent Unalignment on Large Language Models via Backdoor Injections
