Named Entity Recognition for Chinese Social Media with Jointly Trained Embeddings

Published in

Conference on Empirical Methods in Natural Lang...(2015)

TL;DR

A new corpus of Weibo messages annotated for both name and nominal mentions is presented, and a joint training objective for the embeddings that uses both NER-labeled text and unlabeled raw text is proposed.

Abstract

We consider the task of named entity recognition for Chinese social media. The long line of work in Chinese NER has focused on formal domains, and NER for social media has been largely restricted to English. We present a new corpus of Weibo messages annotated for both name and nominal mentions. Additionally, we evaluate three types of neural embeddings for representing Chinese text. Finally, we propose a joint training objective for the embeddings that makes use of both (NER) labeled and unlabeled raw text. Our methods yield a 9% improvement over a state-of-the-art baseline.
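The abstract does not spell out the joint objective, so the following is only a minimal sketch of one common way such an objective can be formed: a supervised tagging loss on labeled characters plus a weighted skip-gram-style loss on unlabeled text, both reading the same embedding table. The per-character softmax tagger, the full-softmax skip-gram loss, the weight `lam`, and all function and variable names are assumptions made for illustration, not the authors' formulation.

```python
import math

def softmax_nll(scores, gold):
    """Negative log-likelihood of index `gold` under a softmax over `scores`."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_z - scores[gold]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ner_loss(emb, tag_weights, labeled):
    """Tagging loss on (character, tag_id) pairs.
    Assumption: an independent softmax per character stands in for a sequence model."""
    return sum(
        softmax_nll([dot(w, emb[ch]) for w in tag_weights], tag)
        for ch, tag in labeled
    )

def skipgram_loss(emb, ctx_emb, pairs, vocab):
    """Full-softmax skip-gram loss on (center, context) pairs from raw text."""
    return sum(
        softmax_nll([dot(ctx_emb[v], emb[center]) for v in vocab], vocab.index(context))
        for center, context in pairs
    )

def joint_loss(emb, ctx_emb, tag_weights, labeled, pairs, vocab, lam=1.0):
    """Hypothetical joint objective: both terms share the embedding table `emb`."""
    return ner_loss(emb, tag_weights, labeled) + lam * skipgram_loss(emb, ctx_emb, pairs, vocab)

# Toy data (hypothetical): two characters, three tags (0=O, 1=B, 2=I), zero-initialized parameters.
vocab = ["微", "博"]
emb = {ch: [0.0, 0.0] for ch in vocab}
ctx_emb = {ch: [0.0, 0.0] for ch in vocab}
tag_weights = [[0.0, 0.0] for _ in range(3)]
labeled = [("微", 1), ("博", 2)]
pairs = [("微", "博"), ("博", "微")]

loss = joint_loss(emb, ctx_emb, tag_weights, labeled, pairs, vocab, lam=0.5)
```

Because `emb` appears in both terms, a gradient step on `loss` would move the embeddings using signal from the labeled and the unlabeled data at once, which is the general idea behind training embeddings jointly rather than pre-training them and freezing them.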

Authors

Nanyun Peng

Mark Dredze
