RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedb





