POPDx: an automated framework for patient phenotyping across 392 246 individuals in the UK Biobank study (2022-08-23)

TL;DR

POPDx can predict phenotypes that are rare or even unobserved in training. It substantially improves automated multiphenotype recognition across 22 disease categories and can be applied to identify key epidemiological features associated with each phenotype.

Abstract

Objective: For the UK Biobank, standardized phenotype codes are associated with patients who have been hospitalized but are missing for many patients who have been treated exclusively in an outpatient setting. We describe a method for phenotype recognition that imputes phenotype codes for all UK Biobank participants.

Materials and Methods: POPDx (Population-based Objective Phenotyping by Deep Extrapolation) is a bilinear machine learning framework for simultaneously estimating the probabilities of 1538 phenotype codes. We extracted phenotypic and health-related information of 392 246 individuals from the UK Biobank for POPDx development and evaluation. A total of 12 803 ICD-10 diagnosis codes of the patients were converted to 1538 phecodes as gold standard labels. The POPDx framework was evaluated and compared to other available methods on automated multiphenotype recognition.

Results: POPDx can predict phenotypes that are rare or even unobserved in training. We demonstrate substantial improvement of automated multiphenotype recognition across 22 disease categories, and its application in identifying key epidemiological features associated with each phenotype.

Conclusions: POPDx helps provide well-defined cohorts for downstream studies. It is a general-purpose method that can be applied to other biobanks with diverse but incomplete data.
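To make the phrase "bilinear machine learning framework" more concrete, the following is a minimal NumPy sketch of a generic bilinear multi-label scorer. It is not the authors' implementation: the abstract does not specify the architecture, so the function name `bilinear_phenotype_probs`, the projection matrix `W`, the phenotype embedding matrix `E`, the embedding size `k`, and the random inputs are all assumptions made for illustration. Only the count of 1538 phenotype codes comes from the abstract.

```python
import numpy as np


def bilinear_phenotype_probs(X, W, E):
    """Score every (patient, phenotype) pair with a bilinear form.

    X: (n_patients, n_features) patient feature matrix (hypothetical input).
    W: (n_features, k) projection of patient features into a shared space.
    E: (n_phenotypes, k) one embedding vector per phenotype code (assumed).
    Returns an (n_patients, n_phenotypes) matrix of probabilities.
    """
    logits = (X @ W) @ E.T               # logit[i, j] = x_i^T W e_j
    return 1.0 / (1.0 + np.exp(-logits))  # independent sigmoid per phenotype


# Toy data: all sizes except the 1538 phenotype codes are made up.
rng = np.random.default_rng(0)
n_patients, n_features, k, n_phenotypes = 4, 10, 3, 1538
X = rng.normal(size=(n_patients, n_features))
W = rng.normal(scale=0.1, size=(n_features, k))
E = rng.normal(size=(n_phenotypes, k))

probs = bilinear_phenotype_probs(X, W, E)
print(probs.shape)  # (4, 1538)

# A phenotype with no training labels can still be scored in this sketch,
# provided an embedding for it is available from some other source.
e_unseen = rng.normal(size=(1, k))
probs_unseen = bilinear_phenotype_probs(X, W, e_unseen)
```

In a design of this kind, the learned parameters live in `W` rather than in one output unit per phenotype, which is one way a model could score codes that are rare or absent in training. Whether POPDx obtains its phenotype representations this way is not stated in the abstract.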

Authors

R. Altman

Lu Yang

Sheng Wang

