3260 papers • 126 benchmarks • 313 datasets
Entity resolution (also known as entity matching, record linkage, or duplicate detection) is the task of finding records that refer to the same real-world entity across different data sources (e.g., data files, books, websites, and databases). (Source: Wikipedia) Blocking is a crucial step in any entity resolution pipeline because a pair-wise comparison of all records across two data sources is infeasible. Blocking applies a computationally cheap method to generate a smaller set of candidate record pairs, reducing the workload of the matcher. During matching, a more expensive pair-wise matcher then produces the final set of matching record pairs. Survey on blocking: Papadakis et al.: Blocking and Filtering Techniques for Entity Resolution: A Survey, 2020.
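The idea above can be sketched with standard key-based blocking: records are grouped by a cheap blocking key, and only records sharing a key become candidate pairs for the expensive matcher. The records and the key function here are hypothetical, chosen purely for illustration:

```python
from itertools import combinations
from collections import defaultdict

# Hypothetical records; in practice these come from one or more data sources.
records = [
    {"id": 1, "name": "John Smith", "city": "Boston"},
    {"id": 2, "name": "Jon Smith", "city": "Boston"},
    {"id": 3, "name": "Mary Jones", "city": "Denver"},
    {"id": 4, "name": "Marie Jones", "city": "Denver"},
    {"id": 5, "name": "John Smyth", "city": "Boston"},
]

def blocking_key(record):
    # Cheap key: first three letters of the surname plus the city.
    surname = record["name"].split()[-1]
    return (surname[:3].lower(), record["city"].lower())

# Standard blocking: group records by key, then pair only within each block.
blocks = defaultdict(list)
for r in records:
    blocks[blocking_key(r)].append(r["id"])

candidate_pairs = {
    pair
    for ids in blocks.values()
    for pair in combinations(sorted(ids), 2)
}

all_pairs = len(records) * (len(records) - 1) // 2
print(f"{len(candidate_pairs)} candidate pairs instead of {all_pairs}")
```

Note the trade-off this sketch exposes: the pair (1, 5) ("Smith" vs. "Smyth") is never compared because the two records land in different blocks, which is why blocking methods are evaluated on recall as well as on the reduction of the candidate set.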
(Image credit: Papersgraph)
The new extremely lightweight portrait segmentation model SINet is introduced, containing an information blocking decoder and spatial squeeze modules, and it is demonstrated that the method can be used for general semantic segmentation on the Cityscapes dataset.
A compact and efficient network for seamless attenuation of different compression artifacts is formulated and it is demonstrated that a deeper model can be effectively trained with the features learned in a shallow network.
A principled model for scalable Bayesian ER, called “distributed Bayesian linkage” or d-blink, is proposed, which jointly performs blocking and ER without compromising posterior correctness.
It is shown that the likelihood objective itself is at fault, producing a model that assigns too much probability to sequences containing repeats and frequent words, unlike sequences drawn from the human training distribution; the proposed objective thus provides a strong alternative to existing techniques.
A compact and efficient network for seamless attenuation of different compression artifacts is presented, showing superior performance over state-of-the-art methods both on benchmark datasets and in a real-world use case.
The proposed UniBlocker is a dense blocker that is pre-trained on a domain-independent, easily obtainable tabular corpus using self-supervised contrastive learning; it significantly outperforms previous self- and unsupervised dense blocking methods and is comparable and complementary to the state-of-the-art sparse blocking methods.
A new automated disambiguation solution is proposed that exploits more than one million crowdsourced annotations to learn an accurate classifier for identifying coreferring authors and to guide, in a semi-supervised way, the clustering of scientific publications by distinct authors.
Percival shows that image-based perceptual ad blocking is an attractive complement to today's dominant approach of block lists, and demonstrates the feasibility of deploying traditionally heavy models (i.e., deep neural networks) inside the critical path of a browser's rendering engine.
This work introduces several competitive multi-agent environments where agents compete in a 3D world with simulated physics and points out that such environments come with a natural curriculum, because for any skill level, an environment full of agents of this level will have the right level of difficulty.