3260 papers • 126 benchmarks • 313 datasets
Entity Typing is an important task in text analysis. Assigning types (e.g., person, location, organization) to mentions of entities in documents enables effective structured analysis of unstructured text corpora. The extracted type information serves a wide range of uses, e.g., as primitives for information extraction and knowledge base (KB) completion, or to assist question answering. Traditional Entity Typing systems focus on a small set of coarse types (typically fewer than 10). Recent studies target a much larger set of fine-grained types that form a tree-structured hierarchy (e.g., actor as a subtype of artist, and artist as a subtype of person).
Source: Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding
Image Credit: Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding
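The tree-structured type hierarchy described above can be sketched in a few lines. This is a minimal illustration, not any system's implementation; the toy hierarchy and function names are assumptions, built around the actor → artist → person example from the description.

```python
# Toy tree-structured type hierarchy: each type maps to its parent,
# and None marks a root (coarse) type. Illustrative only.
TYPE_HIERARCHY = {
    "person": None,
    "artist": "person",
    "actor": "artist",
    "location": None,
    "city": "location",
}

def type_path(t):
    """Return the path from a fine-grained type up to its root type."""
    path = []
    while t is not None:
        path.append(t)
        t = TYPE_HIERARCHY[t]
    return path

def is_subtype(child, ancestor):
    """True if `ancestor` lies on the path from `child` to the root."""
    return ancestor in type_path(child)

print(type_path("actor"))             # ['actor', 'artist', 'person']
print(is_subtype("actor", "person"))  # True
print(is_subtype("city", "person"))   # False
```

A mention typed with the most specific label (e.g., actor) implicitly carries all coarser labels on the path to the root, which is why fine-grained typing subsumes the traditional coarse setting.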
These leaderboards are used to track progress in Entity Typing.
Use these libraries to find Entity Typing models and implementations.
This work proposes new pretrained contextualized representations of words and entities based on a bidirectional transformer, together with an entity-aware self-attention mechanism that considers the types of tokens (words or entities) when computing attention scores.
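The entity-aware self-attention idea can be sketched as a head whose query projection is selected by the (query kind, key kind) pair. This is a toy, pure-Python sketch of the mechanism, not the paper's implementation; the function names, tiny matrices, and shared key projection are assumptions for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def entity_aware_attention(xs, kinds, Q, K):
    """One attention head where the query projection depends on the
    token kinds involved.

    xs:    list of input vectors (one per position)
    kinds: 'word' or 'entity' for each position
    Q:     dict mapping (query_kind, key_kind) -> query matrix
    K:     key matrix shared by all positions
    Returns the row-stochastic attention weight matrix.
    """
    d = len(xs[0])
    keys = [matvec(K, x) for x in xs]
    weights = []
    for i, xi in enumerate(xs):
        scores = []
        for j in range(len(xs)):
            # The query matrix is chosen by the pair of token kinds,
            # so word-entity interactions get their own parameters.
            q = matvec(Q[(kinds[i], kinds[j])], xi)
            scores.append(dot(q, keys[j]) / math.sqrt(d))
        weights.append(softmax(scores))
    return weights

# Usage with identity projections and one word plus one entity token.
I2 = [[1.0, 0.0], [0.0, 1.0]]
Q = {(a, b): I2 for a in ("word", "entity") for b in ("word", "entity")}
w = entity_aware_attention([[1.0, 0.0], [0.0, 1.0]], ["word", "entity"], Q, I2)
```

In a trained model the four query matrices would differ, letting word-to-entity attention behave differently from word-to-word attention.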
This work proposes a novel end-to-end recurrent neural model that incorporates an entity-aware attention mechanism with a latent entity typing (LET) method, and demonstrates that it outperforms existing state-of-the-art models without any high-level features.
A global objective is formulated for learning the embeddings from text corpora and knowledge bases, which adopts a novel margin-based loss that is robust to noisy labels and faithfully models type correlation derived from knowledge bases.
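A margin-based loss of the kind mentioned above can be illustrated with a generic hinge-style ranking objective. This is a sketch of the general idea, not the paper's exact formulation; the function name and scoring convention are assumptions.

```python
def margin_type_loss(pos_score, neg_scores, margin=1.0):
    """Hinge-style margin loss: push the score of the correct
    (mention, type) pair above each negative type's score by at
    least `margin`. Generic sketch, not the paper's objective."""
    return sum(max(0.0, margin - pos_score + s) for s in neg_scores)

# The second negative (1.5) violates the margin of 1.0 against the
# positive score 2.0, contributing 0.5; the first contributes nothing.
print(margin_type_loss(2.0, [0.5, 1.5]))  # 0.5
```

Because only margin violations contribute, a mislabeled negative whose score legitimately exceeds the margin adds a bounded penalty rather than dominating the objective, which is one reason margin losses are favored under label noise.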
TextEnt is a neural network model that learns distributed representations of entities and documents directly from a knowledge base (KB) and achieves state-of-the-art performance on both fine-grained entity typing and multiclass text classification.
New methods using real and complex bilinear mappings for integrating hierarchical information are presented, yielding substantial improvement over flat predictions in entity linking and fine-grained entity typing, and achieving new state-of-the-art results for end-to-end models on the benchmark FIGER dataset.
This paper utilizes both large-scale textual corpora and KGs to train an enhanced language representation model (ERNIE) which can take full advantage of lexical, syntactic, and knowledge information simultaneously, and is comparable with the state-of-the-art model BERT on other common NLP tasks.
This work proposes EntEval: a test suite of diverse tasks that require nontrivial understanding of entities including entity typing, entity similarity, entity relation prediction, and entity disambiguation, and develops training techniques for learning better entity representations by using natural hyperlink annotations in Wikipedia.
MTab combines a voting algorithm with probabilistic models to solve critical problems in the matching tasks, and achieves promising performance on the three matching tasks of SemTab 2019.
LEOPARD is trained with the state-of-the-art transformer architecture and shows better generalization than self-supervised pre-training or multi-task training to tasks not seen at all during training, with as few as 4 examples per label.
K-Adapter is proposed, which keeps the original parameters of the pre-trained model fixed, supports continual knowledge infusion, and captures richer factual and commonsense knowledge than RoBERTa.