Obtaining genetics insights from deep learning via explainable artificial intelligence

Published in Nature Reviews Genetics (2022-10-03)

TL;DR

This review describes advances in deep learning approaches in genomics, in which researchers are moving beyond the typical ‘black box’ nature of models to obtain biological insights through explainable artificial intelligence (xAI).

Abstract

Authors

  • Gherman Novakovsky
  • N. Dexter
  • Maxwell W. Libbrecht
  • W. Wasserman
  • S. Mostafavi

References (111 items)

1

A Survey

2

Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning

3

Deep Learning

4

Publisher's Note

5

Language Models are Few-Shot Learners

6

Navigating the pitfalls of applying machine learning in genomics

Research Impact

  • Citations: 172
  • References: 111
  • Datasets: 0

5

7

Attention is All you Need

8

Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization

9

Improving neural networks by preventing co-adaptation of feature detectors

10

Maximum entropy methods for extracting the learned features of deep neural networks

11

Deep learning for computational biology

12

Deep learning of the tissue-regulated splicing code

13

A Unified Approach to Interpreting Model Predictions

14

Axiomatic Attribution for Deep Networks

15

Using deep learning to model the hierarchical structure and function of a cell

16

Opportunities and obstacles for deep learning in biology and medicine

17

Predicting effects of noncoding variants with deep learning–based sequence model

18

Regulatory

19

ExplaiNN: interpretable and transparent neural networks for genomics

20

DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of synthetic enhancers

21

Evaluating deep learning for predicting epigenomic profiles

22

The Shapley Value in Machine Learning

23

Towards More Realistic Simulated Datasets for Benchmarking Deep Learning Models in Regulatory Genomics

24

JASPAR 2022: the 9th release of the open-access database of transcription factor binding profiles

25

Accelerating in-silico saturation mutagenesis using compressed sensing

26

DeepSTARR predicts enhancer activity from DNA sequence and enables the de novo design of enhancers

27

Biologically informed deep neural network for prostate cancer discovery

28

scBasset: sequence-based modeling of single-cell ATAC-seq using convolutional neural networks

29

Interpretable deep learning for chromatin-informed inference of transcriptional programs driven by somatic alterations across cancers

30

Reproducibility standards for machine learning in the life sciences

31

Perturbation-based methods for explaining deep neural networks: A survey

32

FastSHAP: Real-Time Shapley Value Estimation

33

Weakly supervised learning of RNA modifications from low-resolution epitranscriptome data

34

Global importance analysis: An interpretability method to quantify importance of genomic features in deep neural networks

35

Effective gene expression prediction from sequence by integrating long-range interactions

36

Chromatin interaction–aware gene regulatory modeling with graph attention networks

37

Sequence determinants of human gene regulatory elements

38

Discovering differential genome sequence activity with interpretable and efficient deep learning

39

Domain-adaptive neural networks improve cross-species prediction of transcription factor binding

40

Deep neural networks identify sequence context features predictive of transcription factor binding

41

The epigenetic basis of cellular heterogeneity

42

Explaining by Removing: A Unified Framework for Model Explanation

43

Interpretable Machine Learning - A Brief History, State-of-the-Art and Challenges

44

The dynamic, combinatorial cis-regulatory lexicon of epidermal differentiation

45

fastISM: Performant in-silico saturation mutagenesis for convolutional neural networks

46

Transparency and reproducibility in artificial intelligence

47

Deep learning of immune cell differentiation

48

DNABERT: pre-trained Bidirectional Encoder Representations from Transformers model for DNA-language in genome

49

AI for radiographic COVID-19 detection selects shortcuts over signal

50

Understanding the role of individual units in a deep neural network

51

Predicting 3D genome folding from DNA sequence with Akita

52

Enhancing the interpretability of transcription factor binding site prediction using attention mechanism

53

Deep learning decodes the principles of differential gene expression

54

Enhanced Integrated Gradients: improving interpretability of deep learning models using splicing codes as a case study

55

Improving representations of genomic sequence motifs in convolutional networks with exponential activations

56

Neural Additive Models: Interpretable Machine Learning with Neural Nets

57

Explaining Explanations: Axiomatic Feature Interactions for Deep Networks

58

A self-attention model for inferring cooperativity between regulatory features

59

Deep learning for inferring transcription factor binding sites

60

Biophysical models of cis-regulation as interpretable neural networks

61

Knowledge-primed neural networks enable biologically interpretable deep learning on single-cell sequencing data

62

Base-resolution models of transcription factor binding reveal soft motif syntax

63

Deep learning: new computational modelling techniques for genomics

64

Explanations can be manipulated and geometry is to blame

65

Expectation pooling: an effective and interpretable pooling method for predicting DNA–protein binding

66

A Deep Neural Network for Predicting and Engineering Alternative Polyadenylation

67

Is Attention Interpretable?

68

Fully interpretable deep learning model of transcriptional control

69

Machine learning and complex biological data

70

Deep learning: new computational modelling techniques for genomics

71

An Attentive Survey of Attention Models

72

Evolutionarily informed deep learning methods for predicting relative transcript abundance from DNA sequence

73

Chromatin accessibility and the regulatory epigenome

74

A primer on deep learning in genomics

75

Technical Note on Transcription Factor Motif Discovery from Importance Scores (TF-MoDISco) version 0.5.6.5

76

Organizational principles of 3D genome architecture

77

Deep learning sequence-based ab initio prediction of variant effects on expression and disease risk

78

Discovering epistatic feature interactions from neural network models of regulatory DNA sequences

79

The Human Transcription Factors

80

Modeling Enhancer-Promoter Interactions with Attention-Based Neural Networks

81

Predicting enhancers with deep convolutional neural networks

82

Sequential regulatory activity prediction across chromosomes with convolutional neural networks

83

Learning Important Features Through Propagating Activation Differences

84

Gradients of Counterfactuals

85

Deep Motif Dashboard: Visualizing and Understanding Genomic Sequences Using Deep Neural Networks

86

Deep learning in bioinformatics

87

DanQ: a hybrid convolutional and recurrent deep neural network for quantifying the function of DNA sequences

88

Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks

89

Evaluating the Visualization of What a Deep Neural Network Has Learned

90

Understanding Neural Networks Through Deep Visualization

91

The human splicing code reveals new insights into the genetic determinants of disease

92

Neural Machine Translation by Jointly Learning to Align and Translate

93

Determination and Inference of Eukaryotic Transcription Factor Sequence Specificity

94

Organization of the Drosophila melanogaster SF1 insulator and its role in transcription regulation in transgenic lines

95

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

96

Determining the specificity of protein–DNA interactions

97

Addendum: Regularization and variable selection via the elastic net

98

Gene Ontology: tool for the unification of biology

99

OUP accepted manuscript

100

A pioneering paper that shows how non-linear relationships between motifs and context-dependent spacing can be derived using various post-hoc model interpretation techniques

101

This textbook provides an overview of approaches for interpreting machine learning models

102

This review paper provides a succinct overview of deep learning in genomics

103

500,000 random sequences

104

A technical paper that describes the DeepLIFT feature attribution method, one of the most widely used propagation-based methods in genomics

105

A machine learning textbook that focuses on DNN models

106

One of the first papers to use a sequence-to-activity neural network for a broad class of regulatory genomics tasks

107

Author contributions: All authors contributed to all aspects of the article

108

A first paper that introduces transformers and the attention mechanism for improved prediction of gene expression from large input sequences

109

A paper that proposes one of the first hybrid CNN-RNN models in genomics applications

110

A first paper describing how occlusion can be used to detect significant motif-motif epistasis

111

or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s)

Field of Study

Medicine

Journal Information

Name: Nature Reviews Genetics
Volume: 24

Venue Information

Name: Nature reviews genetics
Type: journal
URL: https://www.nature.com/nrg/

Alternate Names

  • Nature Reviews Genetics
  • Nat rev genet
  • Nat Rev Genet