An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection

