Benchmark

HallusionBench: An Advanced Diagnostic Suite for Entangled Language Hallucination and Visual Illusion in Large Vision-Language Models
OpenEval: Benchmarking Chinese LLMs across Capability, Alignment and Safety
ToViLaG: Your Visual-Language Generative Model is Also An Evildoer
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models
UnsafeBench: Benchmarking Image Safety Classifiers on Real-World and AI-Generated Images
JailBreakV-28K: A Benchmark for Assessing the Robustness of MultiModal Large Language Models against Jailbreak Attacks
JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models
Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs
ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming
Benchmarking Llama2, Mistral, Gemma and GPT for Factuality, Toxicity, Bias and Propensity for Hallucinations
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents
AVIBench: Towards Evaluating the Robustness of Large Vision-Language Model on Adversarial Visual-Instructions
All Languages Matter: On the Multilingual Safety of Large Language Models
Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP
Red Teaming Visual Language Models
Unified Hallucination Detection for Multimodal Large Language Models
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning
Can Language Models be Instructed to Protect Personal Information?
Detecting and Preventing Hallucinations in Large Vision Language Models
DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback
SC-Safety: A Multi-round Open-ended Question Adversarial Safety Benchmark for Large Language Models
PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts
Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs