3260 papers • 126 benchmarks • 313 datasets
According to Wikipedia: "In machine learning, multi-label classification and the strongly related problem of multi-output classification are variants of the classification problem where multiple labels may be assigned to each instance. Multi-label classification is a generalization of multiclass classification, which is the single-label problem of categorizing instances into precisely one of more than two classes; in the multi-label problem there is no constraint on how many of the classes the instance can be assigned to."
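The difference between the two settings is easiest to see in how the targets are encoded. The following is a minimal sketch; the label names and the `to_indicator` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical label set, for illustration only.
LABELS = ["cardiology", "oncology", "pediatrics"]


def to_indicator(assigned, labels=LABELS):
    """Encode a set of assigned labels as a 0/1 vector over the label set."""
    return [1 if label in assigned else 0 for label in labels]


# Multiclass: each instance gets exactly one class, so a single index suffices.
multiclass_target = LABELS.index("oncology")

# Multi-label: each instance may get any subset of the labels,
# including several labels at once or none at all.
multilabel_targets = [
    to_indicator({"cardiology", "pediatrics"}),
    to_indicator({"oncology"}),
    to_indicator(set()),
]
```

A multi-label model therefore typically produces one independent score per label rather than a single distribution over classes.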
An agreement score to evaluate the performance of routing processes at instance-level, an adaptive optimizer to enhance the reliability of routing, and capsule compression and partial routing to improve the scalability of capsule networks are introduced.
This work proposes three strategies to stabilize the dynamic routing process to alleviate the disturbance of some noise capsules which may contain “background” information or have not been successfully trained.
OBJECTIVE: In multi-label text classification, each textual document is assigned 1 or more labels. Because this is an important task with broad applications in biomedicine, a number of different computational methods have been proposed. Many of these methods, however, have only modest accuracy or efficiency and limited success in practical use. We propose ML-Net, a novel end-to-end deep learning framework, for multi-label classification of biomedical texts. MATERIALS AND METHODS: ML-Net combines a label prediction network with an automated label count prediction mechanism to provide an optimal set of labels. This is accomplished by leveraging both the predicted confidence score of each label and the deep contextual information (modeled by ELMo) in the target document. We evaluate ML-Net on 3 independent corpora in 2 text genres: biomedical literature and clinical notes. For evaluation, we use example-based measures, such as precision, recall, and the F measure. We also compare ML-Net with several competitive machine learning and deep learning baseline models. RESULTS: Our benchmarking results show that ML-Net compares favorably to state-of-the-art methods in multi-label classification of biomedical text. ML-Net is also shown to be robust when evaluated on different text genres in biomedicine. CONCLUSION: ML-Net is able to accurately represent biomedical document context and dynamically estimate the label count in a more systematic and accurate manner. Unlike traditional machine learning methods, ML-Net does not require human effort for feature engineering and is a highly efficient and scalable approach to tasks with a large set of labels, so there is no need to build individual classifiers for each separate label.
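The two ideas in this abstract, choosing a label set from confidence scores plus a predicted label count and scoring it with example-based measures, can be sketched as follows. This is not ML-Net's implementation: `select_labels` and `example_based_prf` are hypothetical helpers, the scores are made up, and treating an empty denominator as 0.0 is an assumed convention.

```python
def select_labels(scores, k):
    """Keep the k labels with the highest confidence scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def example_based_prf(true_sets, pred_sets):
    """Average per-document precision, recall, and F over all documents."""
    n = len(true_sets)
    p = r = f = 0.0
    for y, z in zip(true_sets, pred_sets):
        overlap = len(y & z)
        # Assumed convention: a ratio with an empty denominator counts as 0.0.
        p += overlap / len(z) if z else 0.0
        r += overlap / len(y) if y else 0.0
        f += 2 * overlap / (len(y) + len(z)) if (y or z) else 0.0
    return p / n, r / n, f / n


# Made-up confidence scores and a predicted label count of 2.
chosen = select_labels({"a": 0.9, "b": 0.6, "c": 0.2}, k=2)

# Two documents: the first is predicted exactly, the second has one extra label.
precision, recall, f_measure = example_based_prf(
    [{"a", "b"}, {"c"}],
    [chosen, {"a", "c"}],
)
```

Because each measure is computed per document and then averaged, a document with many labels does not outweigh a document with few.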
This work proposes a new label tree-based deep learning model for XMTC, called AttentionXML, with two unique features: a multi-label attention mechanism with raw text as input, which captures the part of the text most relevant to each label; and a shallow and wide probabilistic label tree (PLT), which makes it possible to handle millions of labels, especially "tail labels".
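A generic form of label-wise attention can be sketched as below: each label has its own vector, which scores every token state, and the softmax of those scores weights the tokens into a label-specific text representation. This illustrates the general idea only and is not AttentionXML's actual architecture; the function names, the two-dimensional token states, and the label vectors are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_states, label_vectors):
    """Return one attention-weighted text representation per label."""
    dim = len(token_states[0])
    outputs = []
    for w in label_vectors:
        # Score each token by its dot product with this label's vector.
        scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in token_states]
        alphas = softmax(scores)
        # Weighted sum of token states using this label's attention weights.
        rep = [sum(a * h[d] for a, h in zip(alphas, token_states)) for d in range(dim)]
        outputs.append(rep)
    return outputs


# Two toy token states and two toy label vectors (hypothetical values).
token_states = [[1.0, 0.0], [0.0, 1.0]]
label_vectors = [[5.0, 0.0], [0.0, 5.0]]
reps = label_attention(token_states, label_vectors)
```

In this toy input, the first label attends almost entirely to the first token and the second label to the second token, so each label sees a different summary of the same text.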
MIMIC-III (‘Medical Information Mart for Intensive Care’) is a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary care hospital. Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more. The database supports applications including academic and industrial research, quality improvement initiatives, and higher education coursework.
Design Type(s): data integration objective
Measurement Type(s): Demographics • clinical measurement • intervention • Billing • Medical History Dictionary • Pharmacotherapy • clinical laboratory test • medical data
Technology Type(s): Electronic Medical Record • Medical Record • Electronic Billing System • Medical Coding Process Document • Free Text Format
Factor Type(s):
Sample Characteristic(s): Homo sapiens
Machine-accessible metadata file describing the reported data (ISA-Tab format)
This work develops the Correlation Networks (CorNet) architecture for the extreme multi-label text classification (XMTC) task, where the objective is to tag an input text sequence with the most relevant subset of labels from an extremely large label set.
deepRAM, an end-to-end deep learning tool that provides an implementation of a wide selection of architectures, finds that deeper, more complex architectures provide a clear advantage with sufficient training data, and that hybrid CNN/RNN architectures outperform other methods in terms of accuracy.
X-Transformer, the first scalable approach to fine-tuning deep transformer models for the XMC problem, is proposed and achieves new state-of-the-art results on four XMC benchmark datasets.
A graph attention network-based model is proposed to capture the attentive dependency structure among the labels in multi-label text classification, achieving similar or better performance compared to the previous state-of-the-art models.