A prevalent use case of topic models is topic discovery. However, most topic model evaluation methods rely on abstract metrics such as perplexity or topic coherence. The topic coverage approach instead measures a model's performance by matching its generated topics to a fixed set of reference topics: topics discovered by humans and represented in a machine-readable format. In this way, models are evaluated in the context of their intended use, by essentially simulating topic modeling in a fixed setting defined by a text collection and a set of reference topics. The reference topics serve as a ground truth against which both topic models and other measures of model performance can be evaluated. This coverage approach enables large-scale automatic evaluation of existing and future topic models.
(Image credit: Papersgraph)
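As an illustration, the sketch below shows one simple way such a coverage measure could be computed: a reference topic counts as covered if at least one model topic matches it above a similarity threshold. The top-word representation, the Jaccard matcher, and the 0.4 threshold are illustrative assumptions, not the specific measures proposed in any of the papers listed here.

def jaccard(topic_a, topic_b):
    # Similarity between two topics given as lists of their top words.
    a, b = set(topic_a), set(topic_b)
    return len(a & b) / len(a | b)

def topic_coverage(model_topics, reference_topics, threshold=0.4):
    # Fraction of reference topics matched by at least one model topic.
    covered = sum(
        1 for ref in reference_topics
        if any(jaccard(ref, mod) >= threshold for mod in model_topics)
    )
    return covered / len(reference_topics)

# Toy example: the politics topic is matched, the sports topic is not, so coverage is 0.5.
reference = [["election", "vote", "party", "candidate"],
             ["match", "goal", "league", "season"]]
model = [["vote", "election", "poll", "party"],
         ["stock", "market", "trade", "price"]]
print(topic_coverage(model, reference))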
These leaderboards are used to track progress in topic coverage.
Use these libraries to find topic coverage models and implementations.
An approach to topic model evaluation based on measuring topic coverage is investigated, and measures of coverage based on matching between model topics and reference topics are proposed.
Experimental results show that RankAE significantly outperforms other unsupervised methods and is able to generate high-quality summaries in terms of relevance and topic coverage.
A large-scale general Meeting Understanding and Generation Benchmark (MUG) is established to benchmark the performance of a wide range of SLP tasks, including topic segmentation, topic-level and session-level extractive summarization and topic title generation, keyphrase extraction, and action item detection.
A retrieval-enhanced framework that creates training data from a general-domain unlabeled corpus for zero-shot learning, achieving a 4.3% gain over the strongest baselines and saving around 70% of the time compared to baselines using large NLG models.
This work leverages large language models for dialogue augmentation in the task of emotional support conversation (ESC) to prompt a fine-tuned language model to complete full dialogues from available dialogue posts of various topics, which are then postprocessed based on heuristics.