3260 papers • 126 benchmarks • 313 datasets
Dialogue generation is the natural language processing task of "understanding" natural language inputs in order to produce conversational output. Such systems are usually intended for conversing with humans, for instance in back-and-forth dialogue with a conversational agent like a chatbot. Example benchmarks for this task (see others under Natural Language Understanding) include FusedChat and the Ubuntu Dialogue Corpus (UDC). Models can be evaluated with metrics such as BLEU, ROUGE, and METEOR, although these correlate weakly with human judgement; newer metrics such as UnSupervised and Reference-free (USR) and Metric for automatic Unreferenced dialog evaluation (MaUdE) aim to address this.
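To make the overlap-based metrics concrete, below is a minimal sketch of the modified n-gram precision and brevity penalty that underlie BLEU, written in plain Python with whitespace tokenization. It is an illustrative simplification (single reference, bigram order, no smoothing), not the full metric as implemented in standard toolkits.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(ref, hyp, n):
    """BLEU's clipped n-gram precision: hypothesis counts are capped
    by how often each n-gram appears in the reference."""
    ref_counts = Counter(ngrams(ref, n))
    hyp_counts = Counter(ngrams(hyp, n))
    overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
    return overlap / max(sum(hyp_counts.values()), 1)

def bleu(ref, hyp, max_n=2):
    """Geometric mean of n-gram precisions times a brevity penalty
    that punishes hypotheses shorter than the reference."""
    precisions = [modified_precision(ref, hyp, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(log_avg)

# Toy dialogue responses (hypothetical example, not from a benchmark)
ref = "i am fine thank you".split()
hyp = "i am fine thanks".split()
print(round(bleu(ref, hyp), 3))  # → 0.551
```

The low score for a perfectly acceptable response ("thanks" vs. "thank you") illustrates the weak correlation with human judgement noted above, which reference-free metrics like USR and MaUdE try to avoid.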
(Image credit: Papersgraph)
These leaderboards are used to track progress in Dialogue Generation.
Use these libraries to find Dialogue Generation models and implementations.
Adding a benchmark result helps the community track progress.