3260 papers • 126 benchmarks • 313 datasets
Determine the safety of a given dialogue context.
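Dialogue safety prediction is commonly framed as text classification over the flattened conversation. Below is a minimal sketch using the Hugging Face `transformers` pipeline; the checkpoint name `my-org/dialogue-safety-classifier` and the `[SEP]` turn separator are placeholder assumptions, not a specific released model.

```python
from transformers import pipeline

# Hypothetical fine-tuned checkpoint; substitute any dialogue-safety
# classifier available on the Hugging Face Hub.
classifier = pipeline(
    "text-classification",
    model="my-org/dialogue-safety-classifier",  # placeholder name
)

# A dialogue context is typically flattened into a single string,
# with a separator marking turn boundaries.
dialogue = " [SEP] ".join([
    "I had a terrible day at work.",
    "Sounds rough. Want to talk about it?",
    "I feel like yelling at my boss tomorrow.",
])

result = classifier(dialogue)
print(result)  # e.g. [{"label": "needs_caution", "score": 0.87}]
```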
These leaderboards are used to track progress in Dialogue Understanding.
Use these libraries to find Dialogue Understanding models and implementations.
No subtasks available.
This work introduces ProsocialDialog, the first large-scale multi-turn dialogue dataset for teaching conversational agents to respond to problematic content in line with social norms. It also introduces Canary, a dialogue safety detection module capable of generating rules of thumb (RoTs) given a conversational context, and Prost, a socially informed dialogue agent.
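Canary is described as mapping a dialogue context to a safety label plus an RoT, which suggests a seq2seq interface. The sketch below shows how such a module might be queried; the checkpoint name and the output format in the comment are assumptions for illustration, not the released Canary weights.

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder checkpoint; a Canary-style module is a fine-tuned
# seq2seq model, but this name is assumed for illustration.
name = "my-org/canary-style-safety-model"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

context = "Speaker A: I'm going to copy my friend's homework tonight."
inputs = tokenizer(context, return_tensors="pt")

# The module is expected to emit a safety label followed by an RoT,
# e.g. "needs_caution | It's wrong to cheat on schoolwork."
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```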
This paper introduces fourteen novel datasets for evaluating the safety of Large Language Models on enterprise tasks. A method was devised to evaluate a model's safety, as determined by its ability to follow instructions and output factual, unbiased, grounded, and appropriate content. In this research, we used OpenAI GPT as the point of comparison, since it excels at all levels of safety. On the open-source side, among the smaller models, Meta Llama2 performs well on factuality and toxicity but has the highest propensity for hallucination. Mistral hallucinates the least but handles toxicity poorly; it does, however, perform well on a dataset mixing several tasks and safety vectors in a narrow vertical domain. Gemma, the newly introduced open-source model based on Google Gemini, is generally balanced but trails behind. In back-and-forth conversation (multi-turn prompts), we find that the safety of open-source models degrades significantly; aside from OpenAI's GPT, Mistral is the only model that still performs well in multi-turn tests.
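The multi-turn finding implies an evaluation loop that re-scores the model after every exchange rather than on isolated prompts. Below is a minimal sketch of such a harness; `query_model` and `is_safe` are hypothetical stand-ins for the model under test and a safety judge, not functions from the paper.

```python
def query_model(history: list[str]) -> str:
    """Hypothetical stand-in for the model under test."""
    raise NotImplementedError

def is_safe(response: str) -> bool:
    """Hypothetical safety judge (a classifier or rubric-based grader)."""
    raise NotImplementedError

def multi_turn_safety_rate(turns: list[str]) -> float:
    """Score every model reply as the conversation accumulates.

    Safety is checked per turn because, as reported above, models that
    look safe on single prompts can degrade over longer exchanges.
    """
    history: list[str] = []
    safe = 0
    for user_turn in turns:
        history.append(user_turn)
        reply = query_model(history)
        history.append(reply)
        safe += is_safe(reply)
    return safe / len(turns)
```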
ALERT, a large-scale benchmark for assessing LLM safety through red-teaming methodologies, is introduced. It consists of more than 45k instructions categorized using the authors' novel fine-grained risk taxonomy.
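Because ALERT categorizes its instructions under a fine-grained taxonomy, results are naturally reported per category. The sketch below shows that aggregation; the record field names and the `judge` callable are assumptions for illustration, not ALERT's actual schema.

```python
from collections import defaultdict

# Each record pairs a red-teaming instruction with its taxonomy
# category; these field names are assumed, not ALERT's actual schema.
records = [
    {"category": "hate_speech", "instruction": "..."},
    {"category": "self_harm", "instruction": "..."},
]

def judge(instruction: str) -> bool:
    """Hypothetical: True if the model's reply to this instruction is safe."""
    raise NotImplementedError

def per_category_safety(records: list[dict]) -> dict[str, float]:
    """Fraction of instructions answered safely, per taxonomy category."""
    safe: dict[str, int] = defaultdict(int)
    total: dict[str, int] = defaultdict(int)
    for rec in records:
        total[rec["category"]] += 1
        safe[rec["category"]] += judge(rec["instruction"])
    return {cat: safe[cat] / total[cat] for cat in total}
```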