Table annotation is the task of annotating a table with terms/concepts from a knowledge graph or database schema. It is typically broken down into the following five subtasks:

- Cell Entity Annotation (CEA)
- Column Type Annotation (CTA)
- Column Property Annotation (CPA)
- Table Type Detection
- Row Annotation

The SemTab challenge is closely related to the table annotation problem: it is a yearly challenge that focuses on the first three subtasks and aims to benchmark different table annotation systems.
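For concreteness, the sketch below shows what the three SemTab-style outputs (CEA, CTA, CPA) might look like on a toy two-column table. The data structures and the Wikidata-style identifiers are purely illustrative assumptions, not the output format of any particular system.

```python
# Minimal sketch (not any system's actual API) of the three SemTab-style
# annotation outputs for a toy table; the Wikidata IDs are illustrative.
from dataclasses import dataclass

@dataclass
class CellEntityAnnotation:      # CEA: cell -> KG entity
    row: int
    col: int
    entity: str                  # e.g. a Wikidata QID

@dataclass
class ColumnTypeAnnotation:      # CTA: column -> KG class
    col: int
    kg_class: str

@dataclass
class ColumnPropertyAnnotation:  # CPA: column pair -> KG property
    subject_col: int
    object_col: int
    kg_property: str

table = [
    ["Paris",  "France"],
    ["Berlin", "Germany"],
]

annotations = {
    "CEA": [CellEntityAnnotation(0, 0, "Q90"),       # Paris
            CellEntityAnnotation(0, 1, "Q142")],     # France
    "CTA": [ColumnTypeAnnotation(0, "Q515"),         # city
            ColumnTypeAnnotation(1, "Q6256")],       # country
    "CPA": [ColumnPropertyAnnotation(0, 1, "P17")],  # "country" property
}

for task, anns in annotations.items():
    print(task, anns)
```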
These leaderboards are used to track progress in Table Annotation.
No benchmarks available.
Use these libraries to find Table Annotation models and implementations.
Sherlock is introduced, a multi-input deep neural network for detecting semantic types that achieves a support-weighted F1 score of 0.89, exceeding that of machine learning baselines, dictionary and regular expression benchmarks, and the consensus of crowdsourced annotations.
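As a rough illustration of this style of semantic type detection (not Sherlock's actual feature set or model), the sketch below extracts a few hand-crafted features per column and trains an off-the-shelf classifier; the columns, labels, and features are made up for the example.

```python
# Hedged sketch of semantic type detection for columns: toy per-column
# features feed a generic classifier.  The real system uses several feature
# groups (character, word, paragraph, global stats) and a deep network.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def column_features(values):
    """Toy feature vector for one column of string cell values."""
    lengths = np.array([len(v) for v in values], dtype=float)
    digit_frac = np.mean([sum(c.isdigit() for c in v) / max(len(v), 1) for v in values])
    uniqueness = len(set(values)) / len(values)
    return np.array([lengths.mean(), lengths.std(), digit_frac, uniqueness])

# Tiny illustrative training set: one example column per semantic type.
columns = {
    "person":  ["Alice", "Bob", "Carol", "Dave"],
    "year":    ["1999", "2004", "2010", "2021"],
    "country": ["France", "Germany", "Spain", "Italy"],
    "city":    ["Paris", "Berlin", "Madrid", "Rome"],
}

X = np.stack([column_features(vals) for vals in columns.values()])
y = list(columns.keys())

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(clf.predict([column_features(["Portugal", "Greece", "Norway"])]))
```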
MTab combines a voting algorithm with probability models to address critical problems of the matching tasks and obtains promising performance on the three matching tasks of SemTab 2019.
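The summary above is terse; below is a hedged sketch of the general candidate-voting idea (aggregating ranked candidates from several lookup services), with mocked lookup results and illustrative Wikidata-style IDs, not MTab's actual scoring.

```python
# Hedged sketch of candidate aggregation by weighted voting: several (mocked)
# lookup services each propose ranked candidate entities for a cell, and the
# candidate with the highest weighted reciprocal-rank score wins.
from collections import Counter

def vote(candidate_lists, weights=None):
    """candidate_lists: one ranked candidate list per lookup service."""
    weights = weights or [1.0] * len(candidate_lists)
    scores = Counter()
    for w, candidates in zip(weights, candidate_lists):
        for rank, entity in enumerate(candidates):
            scores[entity] += w / (rank + 1)   # simple reciprocal-rank score
    return scores.most_common(1)[0]

# Mocked outputs of three lookup services for the cell "Paris"
# (the non-"Q90" IDs are arbitrary placeholders for other candidates).
lookups = [
    ["Q90", "Q167646"],      # e.g. a label search
    ["Q90", "Q830149"],      # e.g. a fuzzy search
    ["Q167646", "Q90"],      # e.g. a full-text search
]
print(vote(lookups, weights=[1.0, 0.8, 0.5]))
```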
A simple pretraining objective (corrupt cell detection) is devised that learns exclusively from tabular data, reaches the state of the art on a suite of table-based prediction tasks, and requires far less compute to train.
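A minimal sketch of how such a corrupt-cell-detection signal can be generated from tables alone, assuming random in-corpus replacements; this only illustrates the data-generation step and is not the paper's implementation.

```python
# Hedged sketch of a corrupt-cell-detection pretraining signal: randomly
# replace some cells with other values from the table and keep a binary
# label per cell; a model would then be trained to spot the corrupted cells.
import random

def corrupt_table(table, corruption_rate=0.15, seed=0):
    rng = random.Random(seed)
    pool = [cell for row in table for cell in row]   # replacement pool
    corrupted, labels = [], []
    for row in table:
        new_row, lab_row = [], []
        for cell in row:
            if rng.random() < corruption_rate:
                # swap in a different cell value and mark it as corrupted
                new_row.append(rng.choice([c for c in pool if c != cell]))
                lab_row.append(1)
            else:
                new_row.append(cell)
                lab_row.append(0)
        corrupted.append(new_row)
        labels.append(lab_row)
    return corrupted, labels

table = [["Paris", "France", "2.1M"],
         ["Berlin", "Germany", "3.6M"],
         ["Madrid", "Spain", "3.3M"]]
print(corrupt_table(table))
```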
GitTables, a corpus of 1M relational tables extracted from GitHub, is introduced, demonstrating its value for learned semantic type detection models, schema completion methods, and benchmarks for table-to-KG matching, data search, and preparation.
ERRANT, a grammatical ERRor ANnotation Toolkit, is introduced; it is designed to automatically extract edits from parallel original and corrected sentences and classify them according to a new, dataset-agnostic, rule-based framework, which facilitates error type evaluation at different levels of granularity.
A neural network based column type annotation framework named ColNet is proposed which is able to integrate KB reasoning and lookup with machine learning and can automatically train Convolutional Neural Networks for prediction.
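As a hedged illustration of the lookup-plus-learning combination described here, the sketch below mocks only the KB lookup step that produces candidate column classes; the per-class networks that such a system would then train are omitted, and the class identifiers are illustrative.

```python
# Hedged sketch of the lookup half of a ColNet-style pipeline: sample cells
# are looked up in a (mocked) knowledge base to collect candidate column
# classes, which a learned model would then confirm or reject.
from collections import Counter

# Mocked KB: entity label -> set of KB classes (illustrative identifiers).
KB_CLASSES = {
    "Paris":  {"dbo:City", "dbo:Place"},
    "Berlin": {"dbo:City", "dbo:Place"},
    "Nile":   {"dbo:River", "dbo:Place"},
}

def candidate_column_classes(cells, top_k=2):
    votes = Counter()
    for cell in cells:
        for kb_class in KB_CLASSES.get(cell, ()):
            votes[kb_class] += 1
    return votes.most_common(top_k)

print(candidate_column_classes(["Paris", "Berlin", "Nile"]))
```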
This study focuses on column type prediction for tables without any metadata and proposes a deep prediction model that can fully exploit a table's contextual semantics, including table locality features learned by a Hybrid Neural Network (HNN) and inter-column semantic features learned from a knowledge base (KB).
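One way to picture the inter-column KB features mentioned above (an assumption for illustration, not the paper's exact definition) is the fraction of cell pairs across two columns that are supported by a relation in the KB:

```python
# Hedged sketch of a toy inter-column KB feature: how many (subject, object)
# cell pairs across two columns match a known relation in a mocked KB.
KB_RELATION = {                       # mocked "capital of" facts
    ("Paris", "France"), ("Berlin", "Germany"), ("Madrid", "Spain"),
}

def relation_support(subject_col, object_col):
    pairs = list(zip(subject_col, object_col))
    return sum(pair in KB_RELATION for pair in pairs) / len(pairs)

print(relation_support(["Paris", "Berlin", "Rome"],
                       ["France", "Germany", "Italy"]))   # 2/3 supported
```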
Evaluations on a novel dataset of high-quality, manually curated tables with non-obviously linkable cells show that ambiguity is a key problem for entity linking algorithms and point to a promising direction for future work in the field.