Construction of… (2018-05-06T00:00:00.000000Z)

TL;DR

This paper reduces literature graph construction to familiar NLP tasks, points out research challenges that arise from differences from the standard formulations of these tasks, and reports empirical results for each task.

Abstract

We describe a deployed scalable system for organizing published scientific literature into a heterogeneous graph to facilitate algorithmic manipulation and discovery. The resulting literature graph consists of more than 280M nodes, representing papers, authors, entities and various interactions between them (e.g., authorships, citations, entity mentions). We reduce literature graph construction into familiar NLP tasks (e.g., entity extraction and linking), point out research challenges due to differences from standard formulations of these tasks, and report empirical results for each task. The methods described in this paper are used to enable semantic features in www.semanticscholar.org.
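
To make the graph structure named in the abstract concrete, here is a minimal sketch of a heterogeneous graph with typed nodes (papers, authors, entities) and typed edges (authorships, citations, entity mentions). The class name `LiteratureGraph`, its methods, the identifier scheme, and the edge directions are assumptions made for illustration; the abstract does not describe the deployed system's data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Type names follow the abstract; everything else here is hypothetical.
NODE_TYPES = {"paper", "author", "entity"}
EDGE_TYPES = {"authorship", "citation", "entity_mention"}


@dataclass
class LiteratureGraph:
    # node_id -> node type
    nodes: dict = field(default_factory=dict)
    # (source node_id, edge type) -> set of target node_ids
    edges: dict = field(default_factory=lambda: defaultdict(set))

    def add_node(self, node_id: str, node_type: str) -> None:
        if node_type not in NODE_TYPES:
            raise ValueError(f"unknown node type: {node_type}")
        self.nodes[node_id] = node_type

    def add_edge(self, src: str, edge_type: str, dst: str) -> None:
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges[(src, edge_type)].add(dst)

    def neighbors(self, src: str, edge_type: str) -> list:
        # Sorted so that results are deterministic.
        return sorted(self.edges.get((src, edge_type), set()))


graph = LiteratureGraph()
graph.add_node("paper:A", "paper")
graph.add_node("paper:B", "paper")
graph.add_node("author:1", "author")
graph.add_node("entity:lstm", "entity")

# Assumed directions: author -> paper, citing paper -> cited paper,
# paper -> mentioned entity.
graph.add_edge("author:1", "authorship", "paper:A")
graph.add_edge("paper:A", "citation", "paper:B")
graph.add_edge("paper:A", "entity_mention", "entity:lstm")

print(graph.neighbors("paper:A", "citation"))
```

Keying edges by `(source, edge type)` keeps each interaction type separate, so a query such as "which entities does this paper mention" does not have to filter out citations or authorships.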

Authors

Oren Etzioni

8 papers

Waleed Ammar

7 papers

Madeleine van Zuylen

7 papers

References (26 items)

Extracting Scientific Figures with Distantly Supervised Neural Networks

Content-Based Citation Recommendation

The AI2 system at SemEval-2017 Task 10 (ScienceIE): semi-supervised end-to-end entity and relation extraction

Learning to Predict Citation-Based Impact Measures

Learning a Neural Semantic Parser from User Feedback

SemEval 2017 Task 10: ScienceIE - Extracting Keyphrases and Relations from Scientific Publications

Explicit Semantic Ranking for Academic Search via Knowledge Graph Embedding

Semi-supervised sequence tagging with bidirectional language models

MetaMap Lite: an evaluation of a new Java implementation of MetaMap

BioCreative V CDR task corpus: a resource for chemical disease relation extraction

Neural Architectures for Named Entity Recognition

TabEL: Entity Linking in Web Tables

Design Challenges for Entity Linking

Identifying Meaningful Citations

CHEMDNER: The drugs and chemical names extraction challenge

GloVe: Global Vectors for Word Representation

CiteSeerX: AI in a Digital Library Search Engine

Clinical review: Efficacy of antimicrobial-impregnated catheters in external ventricular drainage - a systematic review and meta-analysis

Search needs a shake-up

Exploiting MeSH indexing in MEDLINE to generate a data set for word sense disambiguation

Natural Language Processing (Almost) from Scratch

TAGME: on-the-fly annotation of short text fragments (by wikipedia entities)

Frustratingly Easy Domain Adaptation

Long Short-Term Memory

Swanson linking revisited: Accelerating literature-based discovery across domains using a conceptual influence graph

Author Disambiguation using Error-driven Machine Learning with a Ranking Loss Function
