A classic problem in natural-language generation (NLG) involves taking structured data, such as a table, as input and producing text that adequately and fluently describes this data as output. Unlike machine translation, which aims for a complete transduction of the sentence to be translated, this form of NLG is usually taken to require addressing (at least) two separate challenges: what to say, the selection of an appropriate subset of the input data to discuss, and how to say it, the surface realization of the selected content. (Image credit: Data-to-Text Generation with Content Selection and Planning)
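To make the two sub-problems concrete, here is a minimal, self-contained sketch; the record format, salience scores, and templates are invented for illustration and are not taken from any particular system. Content selection picks a subset of input records, and surface realization verbalizes them with templates.

```python
def select_content(records, max_records=2):
    """Content selection ("what to say"): keep only the most salient records."""
    ranked = sorted(records, key=lambda r: r["salience"], reverse=True)
    return ranked[:max_records]


def realize(selected):
    """Surface realization ("how to say it"): verbalize the selected records with templates."""
    clauses = [f'{r["entity"]} has a {r["attribute"]} of {r["value"]}' for r in selected]
    text = ", and ".join(clauses) + "."
    return text[0].upper() + text[1:]


records = [
    {"entity": "the Eagle", "attribute": "food type", "value": "Italian", "salience": 0.9},
    {"entity": "the Eagle", "attribute": "customer rating", "value": "5 out of 5", "salience": 0.8},
    {"entity": "the Eagle", "attribute": "area", "value": "riverside", "salience": 0.3},
]
print(realize(select_content(records)))
# The Eagle has a food type of Italian, and the Eagle has a customer rating of 5 out of 5.
```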
These leaderboards are used to track progress in Data-to-Text Generation.
Use these libraries to find Data-to-Text Generation models and implementations.
A new, large-scale corpus of data records paired with descriptive documents is introduced, a series of extractive evaluation methods for analyzing performance are proposed, and baseline results are obtained using current neural generation methods.
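The extractive evaluation idea can be sketched as follows: an information-extraction step (assumed here and not shown) turns the generated text into relation tuples, which are then compared against the source records. The tuple format and example values below are made up for illustration.

```python
def relation_generation_precision(extracted, source_records):
    """Fraction of relations extracted from the generated text that are supported
    by the source records (higher means the text is more faithful to the data)."""
    if not extracted:
        return 0.0
    supported = sum(1 for tup in extracted if tup in source_records)
    return supported / len(extracted)


source = {("Heat", "WINS", "41"), ("Heat", "LOSSES", "7"), ("Hawks", "WINS", "11")}
extracted_from_generation = {("Heat", "WINS", "41"), ("Hawks", "WINS", "12")}
print(relation_generation_precision(extracted_from_generation, source))  # 0.5
```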
This paper investigates strategies to encode system and reference texts so as to devise a metric that shows a high correlation with human judgments of text quality, and validates the new metric, MoverScore, on a number of text generation tasks.
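The sketch below is not the MoverScore implementation; it only illustrates the underlying idea of scoring a system text against a reference via soft alignment of token embeddings. The random `embed` function is a toy stand-in for a contextualized encoder.

```python
import numpy as np

_rng = np.random.default_rng(0)
_cache = {}


def embed(tokens, dim=8):
    """Toy stand-in for a contextualized encoder: one random vector per token type."""
    for t in tokens:
        if t not in _cache:
            _cache[t] = _rng.normal(size=dim)
    return np.stack([_cache[t] for t in tokens])


def soft_alignment_score(system, reference):
    """Average cosine similarity of each system token to its closest reference token."""
    s, r = embed(system.split()), embed(reference.split())
    s = s / np.linalg.norm(s, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    return float((s @ r.T).max(axis=1).mean())


print(soft_alignment_score("the cat sat on the mat", "a cat was sitting on the mat"))
```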
It is suggested that pretrained language models (PLMs) benefit from similar facts seen during pretraining or fine-tuning, such that they perform well even when the input graph is reduced to a simple bag of node and edge labels.
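A small illustration of the contrast described above (the special tokens and triples are invented): the PLM input can be built either as a structured linearization of the graph or as an unordered bag of its node and edge labels.

```python
triples = [("John_Doe", "birthPlace", "London"), ("London", "country", "England")]


def linearize(triples):
    """Structured linearization: head/relation/tail markers preserve the graph structure."""
    return " ".join(f"<H> {h} <R> {r} <T> {t}" for h, r, t in triples)


def bag_of_labels(triples):
    """Ablated input: just the set of node and edge labels, with all structure removed."""
    return " ".join(sorted({x for h, r, t in triples for x in (h, r, t)}))


print(linearize(triples))
print(bag_of_labels(triples))
```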
The E2E dataset poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection, which promises more natural, varied and less template-like system utterances.
This work presents a simple and effective oversampling method based on k-means clustering and SMOTE (synthetic minority over-sampling technique), which avoids the generation of noise and effectively overcomes imbalances between and within classes.
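A rough sketch of the idea (not the paper's implementation, and without its cluster weighting): cluster the feature space with k-means, then apply SMOTE-style interpolation only within clusters that contain enough minority samples, so that synthetic points are not generated in noisy, majority-dominated regions.

```python
import numpy as np
from sklearn.cluster import KMeans


def kmeans_smote(X_min, X_all, n_clusters=3, n_new=12, seed=0):
    """Generate synthetic minority samples by interpolating within k-means clusters."""
    rng = np.random.default_rng(seed)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X_all)
    min_labels = km.predict(X_min)
    new_samples = []
    for c in range(n_clusters):
        cluster_pts = X_min[min_labels == c]
        if len(cluster_pts) < 2:  # too few minority points to interpolate safely
            continue
        for _ in range(n_new // n_clusters):
            a, b = cluster_pts[rng.choice(len(cluster_pts), size=2, replace=False)]
            new_samples.append(a + rng.random() * (b - a))  # SMOTE-style interpolation
    return np.array(new_samples)


X_maj = np.random.default_rng(1).normal(0.0, 1.0, size=(50, 2))  # majority class
X_min = np.random.default_rng(2).normal(2.0, 0.5, size=(8, 2))   # minority class
print(kmeans_smote(X_min, np.vstack([X_maj, X_min])).shape)
```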
This work presents a neural network architecture that incorporates content selection and planning without sacrificing end-to-end training, and shows that the model outperforms strong baselines, improving the state of the art on the recently released RotoWire dataset.
This paper proposes an alternative encoder based on graph convolutional networks that directly exploits the input structure and reports results on two graph-to-sequence datasets that empirically show the benefits of explicitly encoding the input graph structure.
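A minimal sketch of the kind of layer such an encoder stacks (simplified, not the paper's exact formulation): each node aggregates its neighbours' features through a normalized adjacency matrix.

```python
import numpy as np


def gcn_layer(A, H, W):
    """One simplified graph-convolutional layer: ReLU(D^-1 (A + I) H W)."""
    A_hat = A + np.eye(A.shape[0])             # add self-loops
    D_inv = np.diag(1.0 / A_hat.sum(axis=1))   # per-node degree normalization
    return np.maximum(D_inv @ A_hat @ H @ W, 0.0)


A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)             # a 3-node path graph
H = np.random.default_rng(0).normal(size=(3, 4))   # node features
W = np.random.default_rng(1).normal(size=(4, 4))   # layer weights
print(gcn_layer(A, H, W).shape)                    # (3, 4)
```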
It is shown that rare items strongly impact performance; that combining delexicalisation and copying yields the strongest improvement; that copying underperforms for rare and unseen items; and that the impact of these two mechanisms varies greatly depending on how the dataset is constructed and how it is split into train, dev and test sets.
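Delexicalisation can be illustrated with a toy sketch (field names and values invented): rare or unseen values are replaced by placeholder slots before generation and filled back in afterwards, so the generator never has to produce the rare token itself.

```python
def delexicalise(text, values):
    """Replace attribute values in the text with placeholder slots."""
    slots = {}
    for i, (field, value) in enumerate(values.items()):
        slot = f"__{field.upper()}_{i}__"
        text = text.replace(value, slot)
        slots[slot] = value
    return text, slots


def relexicalise(text, slots):
    """Fill the placeholder slots back in after generation."""
    for slot, value in slots.items():
        text = text.replace(slot, value)
    return text


values = {"name": "Zizzi", "food": "Basque"}
delex, slots = delexicalise("Zizzi serves Basque food.", values)
print(delex)                       # __NAME_0__ serves __FOOD_1__ food.
print(relexicalise(delex, slots))  # Zizzi serves Basque food.
```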
This work considers two pragmatic modeling methods for text generation: one where pragmatics is imposed by information preservation, and another where pragmatics is imposed by explicit modeling of distractors.
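The distractor-based variant can be sketched as a reranking step; the compatibility score below is a toy stand-in for a learned speaker/listener model. Candidates are preferred when they identify the true input better than the distractor inputs, not merely when they are likely under the generator.

```python
import math


def score(text, data):
    """Toy compatibility score: fraction of the data values mentioned in the text."""
    return sum(v.lower() in text.lower() for v in data.values()) / len(data)


def pragmatic_rerank(candidates, true_input, distractors):
    """Pick the candidate a literal listener would most likely map back to the true input."""
    def listener_prob(text):
        scores = [math.exp(score(text, d)) for d in [true_input] + distractors]
        return scores[0] / sum(scores)
    return max(candidates, key=listener_prob)


true_input = {"name": "Zizzi", "food": "Basque"}
distractors = [{"name": "Zizzi", "food": "Italian"}]
candidates = ["Zizzi is a restaurant.", "Zizzi serves Basque food."]
print(pragmatic_rerank(candidates, true_input, distractors))  # "Zizzi serves Basque food."
```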