Stealth edits for provably fixing or attacking large language models

Last updated