3260 papers • 126 benchmarks • 313 datasets
A classic task to extract salient phrases that best summarize a document, which essentially has two stages: candidate generation and keyphrase ranking.
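The two stages mentioned above can be sketched in a few lines. This is a toy illustration (not any specific published method): candidates are short word n-grams standing in for noun-phrase chunks, and ranking uses simple word-frequency scores.

```python
import re
from collections import Counter

def extract_keyphrases(text, top_k=5):
    """Toy two-stage keyphrase extractor: candidate generation,
    then ranking. Illustrative only, not a published method."""
    # Stage 1: candidate generation -- contiguous runs of 1-3
    # word tokens (a crude stand-in for noun-phrase chunking).
    words = re.findall(r"[a-z]+", text.lower())
    candidates = set()
    for n in (1, 2, 3):
        for i in range(len(words) - n + 1):
            candidates.add(" ".join(words[i:i + n]))
    # Stage 2: ranking -- score each candidate by the average
    # document frequency of its component words.
    freq = Counter(words)
    scored = sorted(
        candidates,
        key=lambda c: sum(freq[w] for w in c.split()) / len(c.split()),
        reverse=True,
    )
    return scored[:top_k]
```

Real systems replace both stages with stronger components (POS-pattern chunking, graph- or embedding-based ranking), but the pipeline shape is the same.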

These leaderboards are used to track progress in Keyphrase Extraction.
Use these libraries to find Keyphrase Extraction models and implementations.
Empirical analysis on six datasets demonstrates that the proposed generative model for keyphrase prediction, built on an encoder-decoder framework, not only achieves a significant performance boost in extracting keyphrases that appear in the source text but can also generate absent keyphrases based on the semantic meaning of the text.
This paper tackles keyphrase extraction from single documents with EmbedRank, a novel unsupervised method that leverages sentence embeddings. It achieves higher F-scores than graph-based state-of-the-art systems on standard datasets and is suitable for real-time processing of large amounts of Web data.
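The core idea of embedding-based ranking can be sketched as follows. This is a minimal illustration, not the EmbedRank implementation: it substitutes toy bag-of-words count vectors for the sentence embeddings (e.g. Sent2Vec) the paper actually uses, and ranks candidate phrases by cosine similarity to the whole document.

```python
import math
import re
from collections import Counter

def bow_vector(text):
    """Toy bag-of-words 'embedding' -- a stand-in for real
    sentence embeddings such as Sent2Vec."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def embedrank_sketch(document, candidates, top_k=3):
    """Rank candidate phrases by similarity of their embedding
    to the document embedding (the core EmbedRank idea)."""
    doc_vec = bow_vector(document)
    return sorted(candidates,
                  key=lambda c: cosine(bow_vector(c), doc_vec),
                  reverse=True)[:top_k]
```

With real sentence embeddings, semantically related phrases score high even without lexical overlap, which count vectors cannot capture.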
This article introduces keyphrase extraction, provides a well-structured review of the existing work, offers interesting insights on the different evaluation approaches, highlights open issues and presents a comparative experimental study of popular unsupervised techniques on five datasets.
Experimental results on OpenKP confirm the effectiveness of BLING-KPE and the contributions of its neural architecture, visual features, and search-log weak supervision. Zero-shot evaluations on DUC-2001 demonstrate the improved generalization ability of learning from open-domain data compared to a specific domain.
JointKPE is presented, an open-domain KPE architecture built on pre-trained language models that captures both local phraseness and global informativeness when extracting keyphrases. Experiments reveal the significant advantages of JointKPE in predicting long and non-entity keyphrases, which are challenging for previous neural KPE methods.
Summaries generated by abstractive summarization are supposed to only contain statements entailed by the source documents. However, state-of-the-art abstractive methods are still prone to hallucinate content inconsistent with the source documents. In this paper, we propose constrained abstractive summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization by specifying tokens as constraints that must be present in the summary. We explore the feasibility of using lexically constrained decoding, a technique applicable to any abstractive method with beam search decoding, to fulfill CAS and conduct experiments in two scenarios: (1) Standard summarization without human involvement, where keyphrase extraction is used to extract constraints from source documents; (2) Interactive summarization with human feedback, which is simulated by taking missing tokens in the reference summaries as constraints. Automatic and human evaluations on two benchmark datasets demonstrate that CAS improves the quality of abstractive summaries, especially on factual consistency. In particular, we observe up to 11.2 ROUGE-2 gains when several ground-truth tokens are used as constraints in the interactive summarization scenario.
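The CAS setup above enforces constraint tokens during decoding. A minimal sketch of the idea (not the paper's grid-beam-search implementation): here hypothetical helpers simply rerank finished beam candidates by how many constraint tokens they cover, whereas true lexically constrained decoding enforces the constraints inside beam search itself.

```python
def constraint_coverage(summary, constraints):
    """Fraction of constraint tokens present in a candidate summary."""
    tokens = set(summary.lower().split())
    hits = sum(1 for c in constraints if c.lower() in tokens)
    return hits / len(constraints) if constraints else 1.0

def rerank_beams(beams, constraints):
    """Toy stand-in for lexically constrained decoding: prefer
    candidates covering more constraint tokens, breaking ties by
    model score. beams is a list of (model_score, summary_text)."""
    return sorted(
        beams,
        key=lambda b: (constraint_coverage(b[1], constraints), b[0]),
        reverse=True,
    )
```

In the standard scenario the constraints would come from a keyphrase extractor run on the source document; in the interactive scenario, from human feedback.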
This work proposes UCPhrase, a novel unsupervised context-aware quality phrase tagger that induces high-quality phrase spans as silver labels from consistently co-occurring word sequences within each document, thus having unique advantages in preserving contextual completeness and capturing emerging, out-of-KB phrases.
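The silver-labeling idea above can be illustrated with a short sketch. This is not the UCPhrase algorithm, only its intuition: word sequences that consistently co-occur (here, simply repeat) within a single document are taken as phrase labels, with no external knowledge base.

```python
from collections import Counter

def silver_phrase_labels(tokens, max_len=4, min_count=2):
    """Sketch of silver labeling in the spirit of UCPhrase:
    multi-word sequences repeating within one document become
    silver phrase labels; only maximal spans are kept."""
    counts = Counter()
    for n in range(2, max_len + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    phrases = {ngram for ngram, c in counts.items() if c >= min_count}
    # Keep only maximal phrases: drop any that is a sub-span of
    # a longer repeated phrase.
    maximal = set()
    for p in phrases:
        contained = any(
            len(q) > len(p) and
            any(q[j:j + len(p)] == p for j in range(len(q) - len(p) + 1))
            for q in phrases
        )
        if not contained:
            maximal.add(" ".join(p))
    return maximal
```

Because the labels come from each document's own repetitions, emerging out-of-KB phrases are captured with their context intact.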
This paper proposes a novel way of employing labeled data that also informs the LLM of undesired outputs, by extending demonstration examples with feedback about answers predicted by an off-the-shelf model.