Large-Scale Mul… (2019-06-01T00:00:00.000000Z)

TL;DR

This work releases a new dataset of 57k legislative documents from EUR-LEX, annotated with ∼4.3k EUROVOC labels and suitable for LMTC as well as few- and zero-shot learning, and shows that BIGRUs with label-wise attention outperform other current state-of-the-art methods.

Abstract

We consider Large-Scale Multi-Label Text Classification (LMTC) in the legal domain. We release a new dataset of 57k legislative documents from EUR-LEX, annotated with ∼4.3k EUROVOC labels, which is suitable for LMTC as well as few- and zero-shot learning. Experimenting with several neural classifiers, we show that BIGRUs with label-wise attention perform better than other current state-of-the-art methods. Domain-specific WORD2VEC and context-sensitive ELMO embeddings further improve performance. We also find that considering only particular zones of the documents is sufficient. This allows us to bypass BERT’s maximum text length limit and fine-tune BERT, obtaining the best results in all but zero-shot learning cases.

Authors

Prodromos Malakasiotis

Ion Androutsopoulos

Ilias Chalkidis

References (36 items)

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context

Deep learning in law: early adaptation and legal word embeddings trained on large corpora

AttentionXML: Extreme Multi-Label Text Classification with Multi-Label Attention Based Recurrent Neural Networks

A no-regret generalization of hierarchical softmax to extreme multi-label classification

Few-Shot and Zero-Shot Multi-Label Learning for Structured Label Spaces

The Hitchhiker’s Guide to Testing Statistical Significance in Natural Language Processing

Obligation and Prohibition Extraction Using Hierarchical RNNs

Explainable Prediction of Medical Codes from Clinical Text

Deep Contextualized Word Representations

Deep Learning for Extreme Multi-label Text Classification

Extracting contract elements

Attention is All you Need

Deep Extreme Multi-label Learning

Neural Machine Translation in Linear Time

Predicting judicial decisions of the European Court of Human Rights: a Natural Language Processing perspective

Hierarchical Attention Networks for Document Classification

MIMIC-III, a freely accessible critical care database

An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition

LSHTC: A Benchmark for Large-Scale Text Classification

Show, Attend and Tell: Neural Image Caption Generation with Visual Attention

GloVe: Global Vectors for Word Representation

Hidden factors and hidden topics: understanding rating dimensions with review text

Efficient Estimation of Word Representations in Vector Space

Enhancing Navigation on Wikipedia with Social Tags

Legal Docket Classification: Where Machine Learning Stumbles

RCV1: A New Benchmark Collection for Text Categorization Research

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Generalized Zero-Shot Learning with Deep Calibration Network

Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 3 (Industry Papers)

An Introduction to Information Retrieval

An Evaluation of Efficient Multilabel Classification Algorithms for Large-Scale Problems in the Legal Domain

Both measures over- or under-estimate performance on documents whose number of gold labels

Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA

The macro-averaged versions of R@K and P@K are defined

$$P@K = \frac{1}{T} \sum_{t=1}^{T} \frac{S_t(K)}{K}$$

P@K does the same for documents with fewer than K gold labels
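
The macro-averaged P@K fragment above can be illustrated with a minimal Python sketch. It assumes, since the page does not define the symbols, that T is the number of documents, K is the number of top-ranked labels kept per document, and S_t(K) is the number of gold labels among the top K predictions for document t. The function name and the label identifiers are hypothetical.

```python
from typing import Sequence, Set


def precision_at_k(
    ranked: Sequence[Sequence[str]],
    gold: Sequence[Set[str]],
    k: int,
) -> float:
    """Macro-averaged P@K: the mean over documents of S_t(K) / K.

    ranked[t] holds the predicted labels for document t, best first.
    gold[t] holds the gold labels for document t.
    S_t(K) is taken to be the number of gold labels in ranked[t][:k]
    (an assumption; the source page does not define it).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(ranked) != len(gold):
        raise ValueError("ranked and gold must have the same length")
    if not ranked:
        raise ValueError("at least one document is required")

    total = 0.0
    for labels, gold_labels in zip(ranked, gold):
        # S_t(K): correct labels among the top k predictions.
        hits = sum(1 for label in labels[:k] if label in gold_labels)
        total += hits / k
    return total / len(ranked)


# Hypothetical label identifiers, for illustration only.
ranked = [["a", "b", "c"], ["d", "e", "f"]]
gold = [{"a", "c"}, {"x"}]
score = precision_at_k(ranked, gold, k=2)
```

Under these assumptions, a document with a single gold label ranked first still scores only 1/K, which appears to be the effect the last fragment alludes to for documents with fewer than K gold labels.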
