3260 papers • 126 benchmarks • 313 datasets
Multi-Document Summarization is the task of representing a set of documents as a short piece of text that captures the relevant information and filters out the redundant information. Two prominent approaches are extractive and abstractive summarization: extractive systems aim to select salient snippets, sentences, or passages from the documents, while abstractive systems aim to concisely paraphrase the documents' content. Source: Multi-Document Summarization using Distributed Bag-of-Words Model
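The extractive approach described above can be illustrated with a minimal sketch: score each sentence by the average corpus frequency of its content words and return the top-scoring sentences. The function name and scoring scheme are illustrative assumptions, not from any of the systems listed here; real extractive systems also handle redundancy across documents.

```python
import re
from collections import Counter

def extractive_summary(documents, num_sentences=3):
    """Toy extractive multi-document summarizer: rank sentences by the
    average frequency of their words across all input documents."""
    sentences = []
    for doc in documents:
        sentences.extend(s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip())
    # Word frequencies pooled over the whole document set.
    freq = Counter(w for s in sentences for w in re.findall(r"[a-z]+", s.lower()))
    def score(sentence):
        tokens = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)
    return sorted(sentences, key=score, reverse=True)[:num_sentences]
```

An abstractive system would instead generate new sentences, typically with a sequence-to-sequence model, rather than copying sentences verbatim.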
This work explores the use of data-efficient content selectors to over-determine phrases in a source document that should be part of the summary, and shows that this approach improves the ability to compress text, while still generating fluent summaries.
It is shown that generating English Wikipedia articles can be approached as a multi-document summarization of source documents and a neural abstractive model is introduced, which can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles.
This proposed framework attempts to model human methodology by selecting either a single sentence or a pair of sentences, then compressing or fusing the sentence(s) to produce a summary sentence.
This work presents a centroid-based method for text summarization that exploits the compositional capabilities of word embeddings and achieves good performance even in comparison to more complex deep learning models.
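The centroid idea can be sketched in a few lines: embed every sentence, average the embeddings into a centroid, and rank sentences by cosine similarity to it. This is a minimal illustration, not the paper's implementation; `embed` stands in for any sentence-embedding function (here a toy one is assumed in the usage example).

```python
import numpy as np

def centroid_summary(sentences, embed, k=2):
    """Rank sentences by cosine similarity to the centroid of all
    sentence embeddings; `embed` maps a sentence to a vector."""
    vecs = np.stack([embed(s) for s in sentences])
    centroid = vecs.mean(axis=0)
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    return sorted(sentences, key=lambda s: cos(embed(s), centroid), reverse=True)[:k]
```

In practice the embeddings would come from pretrained word or sentence vectors; the centroid then acts as a cheap proxy for the "main topic" of the document set.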
This work introduces Multi-News, the first large-scale MDS news dataset, and proposes an end-to-end model which incorporates a traditional extractive summarization model with a standard SDS model and achieves competitive results on MDS datasets.
This work considers the problem of automatically generating a narrative biomedical evidence summary from multiple trial reports and proposes new approaches that capitalize on domain-specific models to inform summarization, e.g., by explicitly demarcating snippets of inputs that convey key findings, and emphasizing the reports of large and high-quality trials.
This study develops a calibrated beam-based algorithm with awareness of the global attention distribution for neural abstractive summarization, aiming to address the local optimality problem of the original beam search in a rigorous way. Specifically, a novel global protocol is proposed based on the attention distribution to stipulate how a globally optimal hypothesis should attend to the source. A global scoring mechanism is then developed to regulate beam search to generate summaries in a near-globally optimal fashion. This design enjoys a distinctive property: the global attention distribution can be predicted before inference, enabling step-wise improvements on the beam search through the global scoring mechanism. Extensive experiments on nine datasets show that the global (attention)-aware inference significantly improves state-of-the-art summarization models even with empirically chosen hyper-parameters. The algorithm also proves robust, as it continues to generate meaningful text even with corrupted attention distributions. The code and a comprehensive set of examples are available.
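For reference, the plain beam search that the global-aware variant improves upon can be sketched as follows. This is the standard baseline only, not the paper's attention-regulated version; `step_fn`, the token names, and the interface are illustrative assumptions.

```python
def beam_search(step_fn, start, beam_size=3, max_len=10, eos="</s>"):
    """Standard length-limited beam search. `step_fn(seq)` returns a list
    of (token, log_prob) continuations for the partial sequence `seq`.
    Returns the highest-scoring (sequence, log_prob) pair."""
    beams = [([start], 0.0)]
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            if seq[-1] == eos:          # finished hypotheses carry over
                candidates.append((seq, score))
                continue
            for tok, lp in step_fn(seq):
                candidates.append((seq + [tok], score + lp))
        beams = sorted(candidates, key=lambda x: x[1], reverse=True)[:beam_size]
        if all(seq[-1] == eos for seq, _ in beams):
            break
    return beams[0]
```

The local-optimality problem arises because each pruning step keeps only the locally best prefixes; the paper's global scoring rescores hypotheses against a predicted global attention distribution to counteract this.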
This work develops a method for automatic extraction of key points, which enables fully automatic analysis, and is shown to achieve performance comparable to a human expert, and demonstrates that the applicability of key point analysis goes well beyond argumentation data.
This work revisits the clustering approach, grouping together sub-sentential propositions to achieve more precise information alignment in multi-document summarization, and improves over the previous state-of-the-art MDS method on the DUC 2004 and TAC 2011 datasets.
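The proposition-grouping step can be sketched with a simple greedy clustering over embeddings: assign each proposition to the cluster whose centroid it is most similar to, or start a new cluster if no similarity exceeds a threshold. This is a toy stand-in for the alignment described above; the function names, the threshold, and the greedy scheme are all assumptions for illustration.

```python
import numpy as np

def _unit(v):
    return v / (np.linalg.norm(v) + 1e-12)

def cluster_propositions(props, embed, threshold=0.8):
    """Greedily group propositions whose cosine similarity to an existing
    cluster centroid is at least `threshold`; otherwise open a new cluster."""
    clusters = []  # each: {"centroid": unit vector, "members": [prop, ...]}
    for p in props:
        v = _unit(embed(p))
        best = max(clusters, key=lambda c: float(v @ c["centroid"]), default=None)
        if best is not None and float(v @ best["centroid"]) >= threshold:
            best["members"].append(p)
            mean = np.stack([_unit(embed(q)) for q in best["members"]]).mean(axis=0)
            best["centroid"] = _unit(mean)
        else:
            clusters.append({"centroid": v, "members": [p]})
    return [c["members"] for c in clusters]
```

Each resulting cluster would then be fused into a single summary sentence, which is where operating on sub-sentential propositions rather than full sentences pays off.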