Multiple sequence alignment (MSA) arranges three or more biological sequences (protein, DNA, or RNA) to identify regions of similarity that may reflect functional, structural, or evolutionary relationships.
These leaderboards are used to track progress in Multiple Sequence Alignment.
Use these libraries to find Multiple Sequence Alignment models and implementations.
This work demonstrates state-of-the-art protein structure prediction (PSP) results, using embeddings and deep learning models to predict backbone atom distance matrices and torsion angles, and introduces a comprehensive, easy-to-use gold-standard protein dataset.
MSA-Augmenter, a novel generative language model, leverages protein-specific attention mechanisms and large-scale MSAs to generate useful, novel protein sequences not currently found in databases; these generated sequences supplement shallow MSAs and improve the accuracy of structural property predictions.
This work presents an extension of Felsenstein's algorithm to indel models defined on entire sequences, without the need to condition on a single multiple alignment, using a hierarchical stochastic approximation technique that makes the algorithm tractable for alignment analyses of reasonable size.
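For context, the classic Felsenstein pruning algorithm computes a tree likelihood for one fixed alignment column by a post-order recursion; the paper above extends this idea beyond a single conditioning alignment. Below is a minimal sketch of the classic recursion only. The two-state substitution model, tree shape, and branch lengths are illustrative assumptions, not taken from the paper.

```python
import math

def transition(t, rate=1.0):
    """2-state substitution probabilities after branch length t."""
    p_same = 0.5 + 0.5 * math.exp(-2.0 * rate * t)
    return [[p_same, 1.0 - p_same], [1.0 - p_same, p_same]]

def prune(node):
    """Per-state partial likelihoods at `node` (Felsenstein's post-order recursion)."""
    if "state" in node:  # leaf: observed character
        return [1.0 if s == node["state"] else 0.0 for s in (0, 1)]
    like = [1.0, 1.0]
    for child, t in node["children"]:
        child_like = prune(child)
        P = transition(t)
        for s in (0, 1):
            # Sum over the child's states, weighted by transition probabilities.
            like[s] *= sum(P[s][c] * child_like[c] for c in (0, 1))
    return like

# Two-leaf tree, both leaves observed in state 0, branch lengths 0.1.
tree = {"children": [({"state": 0}, 0.1), ({"state": 0}, 0.1)]}
root = prune(tree)
likelihood = 0.5 * root[0] + 0.5 * root[1]  # uniform root prior
```

Because both leaves agree, the partial likelihood at the root is much larger for the matching state than for the other one.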
Experimental results on benchmark datasets demonstrate that ESM2 feature representations consistently outperform evolutionary-information-based hidden Markov model (HMM) features in prediction performance.
UPP is presented, a multiple sequence alignment method that uses a new machine learning technique, the ensemble of hidden Markov models, which produces highly accurate alignments for both nucleotide and amino acid sequences, even on ultra-large datasets or datasets containing fragmentary sequences.
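UPP builds an ensemble of profile HMMs over subsets of a backbone alignment and assigns each query (including fragments) to the best-scoring model. The toy sketch below illustrates only that selection step, with simple column-frequency profiles standing in for true profile HMMs; the sequences and subset names are invented.

```python
from collections import Counter
import math

def profile(aligned_seqs):
    """Per-column residue log-frequencies with add-one smoothing."""
    alphabet = "ACGT"
    return [
        {a: math.log((Counter(col)[a] + 1) / (len(col) + len(alphabet)))
         for a in alphabet}
        for col in zip(*aligned_seqs)
    ]

def score(seq, prof):
    """Log-likelihood of a (gapless, full-length) query under a profile."""
    return sum(col[ch] for ch, col in zip(seq, prof))

# An "ensemble" of two profiles built from different backbone subsets.
ensemble = {
    "subset1": profile(["ACGT", "ACGA"]),
    "subset2": profile(["TTTT", "TTGT"]),
}
query = "ACGT"
best = max(ensemble, key=lambda name: score(query, ensemble[name]))
```

A real profile HMM additionally models insert and delete states, which is what lets UPP handle fragmentary sequences.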
The presented algorithm has execution times similar to ClustalΩ and is orders of magnitude faster than full consistency approaches such as MSAProbs or PicXAA, making QuickProbs 2 an excellent tool for aligning families ranging from a few to hundreds of proteins.
Align-RUDDER is introduced, a variant of RUDDER with two major modifications that replaces RUDDER's LSTM model with a profile model obtained from a multiple sequence alignment of demonstrations; this considerably reduces the delay of rewards and thus speeds up learning.
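The profile-model idea can be sketched in a few lines: build a consensus from aligned demonstration event sequences, then score a new trajectory step by step against it, so that reward can be redistributed to the steps that match demonstrated behavior. This is a toy illustration with invented event strings, not Align-RUDDER's actual scoring.

```python
from collections import Counter

# Pre-aligned demonstration trajectories, one event character per step.
demos = ["ABCD", "ABCD", "ABED"]
columns = list(zip(*demos))
consensus = [Counter(col).most_common(1)[0][0] for col in columns]

def stepwise_scores(trajectory):
    """Per-step +1/-1 match score against the demonstration consensus."""
    return [1 if ev == c else -1 for ev, c in zip(trajectory, consensus)]

scores = stepwise_scores("ABCD")  # matches the consensus at every step
```

In the full method, differences in cumulative alignment score between consecutive steps serve as the redistributed rewards.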
This paper trains a decision-making model based on convolutional neural networks and bidirectional long short-term memory networks, and progressively aligns the input protein sequences by computing different posterior probability matrices, improving the accuracy of progressive multiple protein sequence alignment.
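Once a posterior probability matrix is available (from a pair HMM or, as above, a learned model), the standard downstream step is a maximum-expected-accuracy alignment: dynamic programming picks the monotone set of residue pairs with the largest total posterior. A minimal sketch, with a made-up 3×3 posterior matrix:

```python
def mea_score(P):
    """Best total posterior achievable by a monotone pairwise alignment.

    P[i][j] is the posterior probability that residue i of one sequence
    pairs with residue j of the other.
    """
    n, m = len(P), len(P[0])
    A = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            A[i][j] = max(A[i - 1][j - 1] + P[i - 1][j - 1],  # align i with j
                          A[i - 1][j],                        # leave i unaligned
                          A[i][j - 1])                        # leave j unaligned
    return A[n][m]

P = [[0.9, 0.1, 0.0],
     [0.1, 0.8, 0.1],
     [0.0, 0.1, 0.7]]
```

Tracing back through `A` (not shown) recovers the alignment itself; progressive methods apply this pairwise step repeatedly along a guide tree.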
A protein language model which takes as input a set of sequences in the form of a multiple sequence alignment and is trained with a variant of the masked language modeling objective across many protein families surpasses current state-of-the-art unsupervised structure learning methods by a wide margin.
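The training signal in that objective is simple to state: mask a fraction of positions across the whole two-dimensional alignment and ask the model to reconstruct them from row and column context. The sketch below shows only the masking and cross-entropy bookkeeping, with per-column frequencies standing in for the transformer; the tiny alignment is invented.

```python
import math
import random

msa = ["ACGT", "ACGA", "ACGT"]  # toy alignment: 3 rows x 4 columns
random.seed(0)

# Choose 25% of (row, column) positions to mask.
positions = [(r, c) for r in range(len(msa)) for c in range(len(msa[0]))]
masked = random.sample(positions, k=len(positions) // 4)

def column_prob(col, residue):
    """Stand-in 'model': empirical frequency of a residue in its column."""
    chars = [row[col] for row in msa]
    return chars.count(residue) / len(chars)

# Cross-entropy of the true residues at the masked positions.
loss = -sum(math.log(column_prob(c, msa[r][c])) for r, c in masked)
```

A real MSA-aware model replaces `column_prob` with attention over both rows (sequences) and columns (alignment positions), which is what lets it pick up covariation between columns.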