1
OpenFold: Retraining AlphaFold2 yields new insights into its learning mechanisms and capacity for generalization
2
EquiFold: Protein Structure Prediction with a Novel Coarse-Grained Structure Representation
3
Protein structure generation via folding diffusion
4
PiFold: Toward effective and efficient protein inverse folding
5
Accurate prediction of nucleic acid and protein-nucleic acid complexes using RoseTTAFoldNA
6
SE(3) Equivalent Graph Attention Network as an Energy-Based Model for Protein Side Chain Conformation
7
Learning inverse folding from millions of predicted structures
8
Uni-Fold: An Open-Source Platform for Developing Protein Folding Models beyond AlphaFold
9
PeTriBERT: Augmenting BERT with tridimensional encoding for inverse protein folding and design
10
Atomic protein structure refinement using all-atom graph representations and SE(3)-equivariant graph neural networks
11
SPEACH_AF: Sampling protein ensembles and conformational heterogeneity with Alphafold2
12
ProtGPT2 is a deep unsupervised language model for protein design
13
High-resolution de novo structure prediction from primary sequence
14
Uncertainty-aware Multi-modal Learning via Cross-modal Random Network Prediction
15
Predicting the structure of large protein complexes using AlphaFold and Monte Carlo tree search
16
HelixFold: An Efficient Implementation of AlphaFold2 using PaddlePaddle
17
Accurate RNA 3D structure prediction using a language model-based deep learning approach
18
ProGen2: Exploring the Boundaries of Protein Language Models
19
PSP: Million-level Protein Sequence Dataset for Protein Structure Prediction
20
Equiformer: Equivariant Graph Attention Transformer for 3D Atomistic Graphs
21
State-of-the-Art Estimation of Protein Model Accuracy Using AlphaFold
22
PEER: A Comprehensive and Multi-Task Benchmark for Protein Sequence Understanding
23
Robust deep learning based protein sequence design using ProteinMPNN
24
NetSurfP-3.0: accurate and fast prediction of protein structural features by protein language models and deep learning
25
ColabFold: making protein folding accessible to all
26
Tranception: protein fitness prediction with autoregressive transformers and inference-time retrieval
27
RITA: a Study on Scaling Up Generative Protein Sequence Models
28
Reaching alignment-profile-based accuracy in predicting protein secondary and tertiary structural properties without alignment
29
Structure-aware protein self-supervised learning
30
Few Shot Protein Generation
31
Training Compute-Optimal Large Language Models
32
AlphaFold encodes the principles to identify high affinity peptide binders
33
Protein Representation Learning by Geometric Structure Pretraining
34
FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours
35
Learning functional properties of proteins with language models
36
Transformer Quality in Linear Time
37
SimGRACE: A Simple Framework for Graph Contrastive Learning without Data Augmentation
38
Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies
39
AlphaDesign: A graph protein design method and benchmark on AlphaFoldDB
40
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
41
OntoProtein: Protein Pretraining With Gene Ontology Embedding
42
Single-sequence protein structure prediction using supervised transformer protein language models
43
Controllable protein design with language models
44
Using metagenomic data to boost protein structure prediction and discovery
45
Can AlphaFold2 predict the impact of missense mutations on structure?
46
High-Resolution Image Synthesis with Latent Diffusion Models
47
AlphaFill: enriching the AlphaFold models with ligands and co-factors
48
Contrastive learning on protein embeddings enlightens midnight zone
49
Identification of Enzymatic Active Sites with Unsupervised Language Modeling
50
Pre-training Co-evolutionary Protein Representation via A Pairwise Masked Language Model
51
Colossal-AI: A Unified Deep Learning System For Large-Scale Parallel Training
52
Pre-trained Language Models in Biomedical Domain: A Systematic Survey
53
Improved prediction of protein-protein interactions using AlphaFold2
54
Protein complex prediction with AlphaFold-Multimer
55
A structural biology community assessment of AlphaFold2 applications
56
Using AlphaFold to predict the impact of single mutations on protein stability and function
57
Toward More General Embeddings for Protein Design: Harnessing Joint Representations of Sequence and Structure
58
Single-sequence protein structure prediction using language models from deep learning
59
Accurate prediction of protein structures and interactions using a three-track neural network
60
Deep neural language modeling enables functional protein generation across families
61
Highly accurate protein structure prediction with AlphaFold
62
Language models enable zero-shot prediction of the effects of mutations on protein function
63
Fast and effective protein model refinement using deep graph neural networks
64
SPOT-Contact-Single: Improving Single-Sequence-Based Prediction of Protein Contact Map using a Transformer Language Model
65
Learning the protein language: Evolution, structure, and function
66
BEiT: BERT Pre-Training of Image Transformers
67
CRASH: Raw Audio Score-based Generative Modeling for Controllable High-resolution Drum Sound Synthesis
68
Pre-Trained Models: Past, Present and Future
69
Evolutionary velocity with protein language models
70
Distillation of MSA Embeddings to Folded Protein Structures with Graph Transformers
71
ProteinBERT: a universal deep-learning model of protein sequence and function
72
SPOT-1D-Single: improving the single-sequence-based prediction of protein secondary structure, backbone angles, solvent accessibility and half-sphere exposures using a large training set and ensembled deep learning
73
Fold2Seq: A Joint Sequence(1D)-Fold(3D) Embedding-based Generative Model for Protein Design
74
Bidirectional Language Modeling: A Systematic Literature Review
75
Prediction of RNA–protein interactions using a nucleotide language model
76
The language of proteins: NLP, machine learning & protein sequences
77
Deep Generative Modelling: A Comparative Review of VAEs, GANs, Normalizing Flows, Energy-Based and Autoregressive Models
78
Accurate prediction of inter-protein residue–residue contacts for homo-oligomeric protein complexes
79
Short-Term Traffic Flow Prediction for Urban Road Sections Based on Time Series Analysis and LSTM_BILSTM Method
81
Multi-task deep learning for concurrent prediction of protein structural properties
82
Adversarial Contrastive Pre-training for Protein Sequences
83
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
84
Single Layers of Attention Suffice to Predict Protein Contacts
85
Transformer protein language models are unsupervised structure learners
86
A multi-task deep-learning system for predicting membrane associations and secondary structures of proteins
87
Profile Prediction: An Alignment-Based Pre-Training Task for Protein Sequence Models
88
REALDIST: Real-valued protein distance prediction
89
Study of Real-Valued Distance Prediction For Protein Structure Prediction with Deep Learning
90
Deep learning-based prediction of protein structure using learned representations of multiple sequence alignments
91
Protein Structural Alignments From Sequence
92
Two-Stage Distance Feature-based Optimization Algorithm for De novo Protein Structure Prediction
93
Scaling Hidden Markov Language Models
94
Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks
95
Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification
96
Self-Supervised Contrastive Learning of Protein Representations By Mutual Information Maximization
97
Combination of deep neural network with attention mechanism enhances the explainability of protein contact prediction
98
Learning from Protein Structure with Geometric Vector Perceptrons
99
Improved protein structure refinement guided by deep learning based accuracy estimation
100
ProtTrans: Towards Cracking the Language of Life’s Code Through Self-Supervised Deep Learning and High Performance Computing
101
DeepSpeed: System Optimizations Enable Training Deep Learning Models with Over 100 Billion Parameters
102
SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks
103
Knowledge Distillation: A Survey
104
Template-based prediction of protein structure with deep learning
105
Language Models are Few-Shot Learners
106
Secondary Structure and Contact Guided Differential Evolution for Protein Structure Prediction
107
A fully open-source framework for deep learning protein real-valued distances
108
Energy-based models for atomic-resolution protein conformations
109
ProteinGCN: Protein model quality assessment using Graph Convolutional Networks
110
DeepDist: real-value inter-residue distance prediction with deep residual convolutional network
111
ProGen: Language Modeling for Protein Generation
112
Directional Message Passing for Molecular Graphs
113
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
114
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
115
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
116
A Simple Framework for Contrastive Learning of Visual Representations
117
Scaling Laws for Neural Language Models
118
Deep learning methods in protein structure prediction
119
Improved protein structure prediction using potentials from deep learning
120
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
121
Improved protein structure prediction using predicted interresidue orientations
122
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
123
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
124
Critical assessment of methods of protein structure prediction (CASP)—Round XIII
125
Unified rational protein engineering with sequence-based deep representation learning
126
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
127
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
128
Axial Attention in Multidimensional Transformers
129
K-BERT: Enabling Language Representation with Knowledge Graph
130
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
131
Automatically Extracting Challenge Sets for Non-Local Phenomena in Neural Machine Translation
132
CTRL: A Conditional Transformer Language Model for Controllable Generation
133
Knowledge Enhanced Contextual Word Representations
134
FinBERT: Financial Sentiment Analysis with Pre-trained Language Models
135
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
136
VisualBERT: A Simple and Performant Baseline for Vision and Language
137
PconsC4: fast, accurate and hassle-free contact predictions
138
RoBERTa: A Robustly Optimized BERT Pretraining Approach
139
SpanBERT: Improving Pre-training by Representing and Predicting Spans
140
DeepPrime2Sec: Deep Learning for Protein Secondary Structure Prediction from the Primary Sequences
141
UDSMProt: universal deep sequence models for protein classification
142
Evaluating Protein Transfer Learning with TAPE
143
XLNet: Generalized Autoregressive Pretraining for Language Understanding
144
Defending Against Neural Fake News
145
ERNIE: Enhanced Language Representation with Informative Entities
146
MASS: Masked Sequence to Sequence Pre-training for Language Generation
147
Protein Structure Determination in Living Cells
148
Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences
149
Improving Multi-Task Deep Neural Networks via Knowledge Distillation for Natural Language Understanding
150
Modeling the language of life – Deep Learning Protein Sequences
151
ERNIE: Enhanced Representation through Knowledge Integration
152
75 Languages, 1 Model: Parsing Universal Dependencies Universally
153
VideoBERT: A Joint Model for Video and Language Representation Learning
154
Generative Models for Graph-Based Protein Design
155
DESTINI: A deep-learning approach to contact-driven protein structure prediction
156
HH-suite3 for fast remote homology detection and deep protein annotation
157
Learning protein sequence embeddings using information from structure
158
Multi-Task Deep Neural Networks for Natural Language Understanding
159
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
160
Cross-lingual Language Model Pretraining
161
Transformer-XL: Attentive Language Models beyond a Fixed-Length Context
162
Integrated NMR and cryo-EM atomic-resolution structure determination of a half-megadalton enzyme complex
163
A benchmark study of sequence alignment methods for protein clustering
164
Deep learning extends de novo protein modelling coverage of genomes using iteratively predicted structural constraints
165
Distance-based protein folding powered by deep learning
166
UniProt: a worldwide hub of protein knowledge
167
The Pfam protein families database in 2019
168
Protein Data Bank: the single global archive for 3D macromolecular structure data
169
Single‐sequence‐based prediction of protein secondary structures and solvent accessibility by deep whole‐sequence learning
170
High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features
171
Relational Graph Attention Networks
172
DeepCDpred: Inter-residue distance and contact prediction for improved prediction of protein structure
173
Protein-level assembly increases protein sequence recovery from metagenomic samples manyfold
174
Learned protein embeddings for machine learning
175
Accurate prediction of protein contact maps by coupling residual two-dimensional bidirectional long short-term memory with convolutional neural networks
176
NetSurfP-2.0: improved prediction of protein structural features by integrated deep learning
177
Deep Contextualized Word Representations
178
End-to-end differentiable learning of protein structure
179
Generative modeling for protein structures
180
Bayesian statistical approach for protein residue-residue contact prediction
181
Enhancing Evolutionary Couplings with Deep Convolutional Neural Networks
182
DNCON2: improved protein contact prediction using two-level deep convolutional neural networks
183
Generative Recurrent Networks for De Novo Drug Design
184
Improved protein contact predictions with the MetaPSICOV2 server in CASP12
185
Capturing non‐local interactions by long short‐term memory bidirectional recurrent neural networks for improving prediction of protein secondary structure, backbone angles, contact numbers and solvent accessibility
186
Conformational Space Sampling Method Using Multi-Subpopulation Differential Evolution for De novo Protein Structure Prediction
187
Recent Trends in Deep Learning Based Natural Language Processing
188
Attention is All you Need
189
A hybrid method for prediction of protein secondary structure based on multiple artificial neural networks
190
Balancing exploration and exploitation in population‐based sampling improves fragment‐based de novo protein structure prediction
191
Learning to Generate Reviews and Discovering Sentiment
192
A Structured Self-attentive Sentence Embedding
193
SCOPe: Manual Curation and Artifact Removal in the Structural Classification of Proteins - extended Database
194
Clustering huge protein sequence sets in linear time
195
High GC content causes orphan proteins to be intrinsically disordered
196
Uniclust databases of clustered and deeply annotated protein sequences and alignments
197
Protein Secondary Structure Prediction Using Deep Multi-scale Convolutional Neural Networks and Next-Step Conditioning
198
Multiplicative LSTM for sequence modelling
199
Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model
200
Generative Topic Embedding: a Continuous Representation of Documents
201
UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling
202
RaptorX-Property: a web server for protein structure property prediction
203
Identity Mappings in Deep Residual Networks
204
Neural Architectures for Named Entity Recognition
205
Improved de novo structure prediction in CASP11 by incorporating coevolution information into Rosetta
206
MUST-CNN: A Multilayer Shift-and-Stitch Deep Convolutional Architecture for Sequence-Based Protein Structure Prediction
207
Protein Secondary Structure Prediction Using Deep Convolutional Neural Fields
208
Combining Evolutionary Information and an Iterative Sampling Strategy for Accurate Protein Structure Prediction
209
The master algorithm: how the quest for the ultimate learning machine will remake our world
210
CONFOLD: Residue‐residue contact‐guided ab initio protein folding
211
Novor: Real-Time Peptide de Novo Sequencing Software
212
Improving prediction of secondary structure, local backbone angles, and solvent accessible surface area of proteins by iterative deep learning
213
RBO Aleph: leveraging novel information sources for protein structure prediction
214
Protein Secondary Structure Prediction with Long Short Term Memory Networks
215
UniRef clusters: a comprehensive and scalable alternative for improving sequence similarity searches
216
SSpro/ACCpro 5: almost perfect prediction of protein secondary structure and relative solvent accessibility using profiles, machine learning and structural similarity
217
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches
218
ImageNet Large Scale Visual Recognition Challenge
219
Neural Machine Translation by Jointly Learning to Align and Translate
220
De Novo Structure Prediction of Globular Proteins Aided by Sequence Variation-Derived Contacts
221
Practical aspects of protein co-evolution
222
Deep Supervised and Convolutional Generative Stochastic Network for Protein Secondary Structure Prediction
223
Protein Secondary Structure Prediction Using Support Vector Machines (SVMs)
224
Solving Ion Channel Kinetics with the QuB Software
225
Dropout Improves Recurrent Neural Networks for Handwriting Recognition
226
Semantic Parsing as Machine Translation
227
DNA motif elucidation using belief propagation
228
Efficient Estimation of Word Representations in Vector Space
229
Update on activities at the Universal Protein Resource (UniProt) in 2013
231
A Probabilistic Fragment-Based Protein Structure Prediction Algorithm
232
Ab initio protein structure assembly using continuous structure fragments and optimized knowledge‐based force field
233
Genomics-aided structure prediction
234
Protein structure determination from pseudocontact shifts using ROSETTA
235
De Novo Sequencing and Homology Searching
236
The Complex Folding Network of Single Calmodulin Molecules
237
A Survey on Transfer Learning
238
PyRosetta: a script-based interface for implementing molecular modeling algorithms using Rosetta
239
Identification of direct residue contacts in protein–protein interaction by message passing
240
A dynamic Bayesian network approach to protein secondary structure prediction
241
Fast model-based protein homology detection without alignment
242
Protein structure determination from NMR chemical shifts
243
PISCES: recent improvements to a PDB sequence culling server
244
TM-align: a protein structure alignment algorithm based on the TM-score
245
Predicting protein quaternary structure by pseudo amino acid composition
246
TOUCHSTONE II: a new approach to ab initio protein structure prediction
247
A Neural Probabilistic Language Model
248
Gene Ontology: tool for the unification of biology
249
SCOP: a structural classification of proteins database
250
Learning to Learn: Introduction and Overview
251
CATH – a hierarchic classification of protein domain structures
252
Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and Bayesian scoring functions
253
Funnels, pathways, and the energy landscape of protein folding: A synthesis
254
Improved prediction of protein secondary structure by use of sequence profiles and neural networks
255
Protein folding funnels: a kinetic approach to the sequence-structure relationship
256
Maximum likelihood alignment of DNA sequences
257
Generative De Novo Protein Design with Global Context
258
Pre-training Graph Neural Networks for Molecular Representations: Retrospect and Prospect
260
Improve the Protein Complex Prediction with Protein Language Models
261
Advancing protein language models with linguistics: a roadmap for improved interpretability
262
Evidence for and Applications of Physics-Based Reasoning in AlphaFold
263
Exploring evolution-based & -free protein language models as protein function predictors
264
Few-Shot Learning of Accurate Folding Landscape for Protein Structure Prediction
265
A Review of Protein Structure Prediction using Deep Learning
267
ELECTRA: Pre-training text encoders as discriminators rather than generators
268
Self-supervised representation learning of protein tertiary structures (PtsRep): Protein engineering as a case study
269
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
270
Language Models are Unsupervised Multitask Learners
271
Porter 5: fast, state-of-the-art ab initio prediction of protein secondary structure in 3 and 8 classes
272
Improving Language Understanding by Generative Pre-Training
273
Biomolecular Structure Prediction via Coevolutionary Analysis: A Guide to the Statistical Framework
274
Protein contact maps: A binary depiction of protein 3D structures
276
A Deep Learning Network Approach to ab initio Protein Secondary Structure Prediction
277
First Links in the Markov Chain
279
A novel connectionist system for improved unconstrained handwriting recognition
280
The Application of Hidden Markov Models in Speech Recognition
281
Orthologs, paralogs, and evolutionary genomics
282
Evolino: Hybrid Neuroevolution / Optimal Linear Search for Sequence Prediction
283
A threading approach to protein structure prediction: studies on TNF-like molecules, Rev proteins, and protein kinases
284
Protein Structure Prediction Using Rosetta
285
In Advances in Neural Information Processing Systems
286
Untersuchungen zu dynamischen neuronalen Netzen
288
Protein secondary structure prediction with a neural network
289
Experimental and theoretical aspects of protein folding
291
The arrangement of amino acids in proteins
292
Language models of protein sequences at the scale of evolution enable accurate structure prediction