(i) To the best of our knowledge, iLearnPlus is the first GUI-based platform that facilitates machine learning-based analysis and prediction of biological sequences
(iii) iLearnPlus provides a variety of ways to visualize the user-defined data and prediction