4. Protein Search, Alignment, & DB Management Tools
This page of BioMoDes lists state-of-the-art and emerging tools for Protein Structure Search, Alignment, and DB Management.
4.1. Sequence and Structure Alignment
2024
- Foldseek-Multimer: A method and webserver for fast, large-scale structural alignment of protein complexes. Foldseek-Multimer is orders of magnitude faster than US-align at similar accuracy, remains sensitive for complexes with low sequence identity, and supports all-vs-all searches (billions of complex pairs in 24 h).
Posted: April 14, 2024
Preprint | Code (GitHub) | Webserver
4.2. Protein Search
2024
- PS-GO: A parametric protein search engine that integrates protein structure, sequence information, and computable parameters. Inspired by Google’s classic PageRank algorithm, PS-GO aims to help users retrieve, compare, analyse, and interpret the growing body of protein data. Its interface lets users specify parameters to search databases for protein sequences and structures, and to optimize or redesign proteins via a parametric protein design approach. PS-GO supports range-based search over parameters such as RC.Score, hydrophobicity, instability, size, isoelectric point, and SA. It also supports natural language queries describing desired protein properties or functions; these are powered by OpenAI’s GPT-4, which converts the natural language query into structured parameter conditions. For its full functionality, PS-GO links with PROFASA (Protein Fragment And Structure Analysis), another web-based resource from the same research group.
Published: April 8, 2024
Paper | Webserver (PS-GO) | Webserver (PROFASA)
- PLMSearch/PLMAlign: A fast protein language model-based method for homologous protein search from sequence input. PLMSearch searches millions of query-target pairs in seconds, like MMseqs2, but with a 3x increase in sensitivity, making it comparable to state-of-the-art structure-based search methods. It also captures remote homology pairs, much as structure-based methods do. PLMSearch integrates three main steps/modules: (1) PfamClan filters query-target pairs by shared Pfam clan domain, (2) SS-predictor predicts the structural similarity of all remaining query-target pairs, and (3) PLMAlign provides sequence alignments and alignment scores.
Published: March 30, 2024
Paper | Code (GitHub) - PLMSearch | Code (Code Ocean) - PLMSearch | Webserver (PLMSearch) | Code (GitHub) - PLMAlign | Webserver (PLMAlign)
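To make the range-based parameter search described for PS-GO above more concrete, here is a minimal Python sketch that filters a small in-memory list of records by inclusive parameter ranges. It is not PS-GO's code or API: `ProteinRecord`, `range_search`, the field names, and all values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinRecord:
    """Toy record; the fields are illustrative, not PS-GO's actual schema."""
    identifier: str
    hydrophobicity: float
    instability: float
    isoelectric_point: float
    size: int  # number of residues


def range_search(records, ranges):
    """Return the records whose constrained parameters all fall within
    their inclusive (low, high) range. Unconstrained parameters are ignored."""
    hits = []
    for record in records:
        if all(low <= getattr(record, name) <= high
               for name, (low, high) in ranges.items()):
            hits.append(record)
    return hits


# Made-up values for demonstration only.
records = [
    ProteinRecord("toy_A", -0.40, 32.0, 6.1, 120),
    ProteinRecord("toy_B", 0.25, 45.5, 9.3, 310),
    ProteinRecord("toy_C", -0.10, 28.7, 5.4, 95),
]

# Hypothetical query: instability between 0 and 40, size between 100 and 400.
query = {"instability": (0.0, 40.0), "size": (100, 400)}

hits = range_search(records, query)
print([record.identifier for record in hits])  # ['toy_A']
```

In a setup like this, a natural language front end would only need to produce the `query` dictionary; the filtering step itself stays the same.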
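The three-step flow described for PLMSearch above (clan-based filtering, similarity prediction, then alignment) can be sketched as a simple filter-then-rank pipeline. This Python sketch is an assumption-laden toy, not the authors' implementation: cosine similarity over made-up two-dimensional vectors stands in for the trained SS-predictor, the clan labels and embeddings are invented, the names `shares_clan`, `cosine_similarity`, and `search` are hypothetical, and the alignment step is only marked by a comment.

```python
import math


def shares_clan(query_clans, target_clans):
    """Step 1 stand-in: keep a pair only if it shares at least one clan label."""
    return bool(query_clans & target_clans)


def cosine_similarity(u, v):
    """Step 2 stand-in: cosine similarity replaces the trained SS-predictor."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def search(query, targets, min_similarity=0.5):
    """Filter targets by shared clan, score the survivors, and rank them."""
    candidates = [t for t in targets if shares_clan(query["clans"], t["clans"])]
    scored = [
        (cosine_similarity(query["embedding"], t["embedding"]), t["id"])
        for t in candidates
    ]
    ranked = sorted((pair for pair in scored if pair[0] >= min_similarity),
                    reverse=True)
    # Step 3 (alignment of the top-ranked pairs) would run here; it is omitted.
    return [(target_id, round(score, 3)) for score, target_id in ranked]


# Invented clan labels and embeddings, for demonstration only.
query = {"id": "q", "clans": {"CL0001"}, "embedding": [1.0, 0.0]}
targets = [
    {"id": "t1", "clans": {"CL0001"}, "embedding": [0.8, 0.6]},
    {"id": "t2", "clans": {"CL0002"}, "embedding": [1.0, 0.0]},
    {"id": "t3", "clans": {"CL0001", "CL0003"}, "embedding": [0.0, 1.0]},
    {"id": "t4", "clans": {"CL0001"}, "embedding": [1.0, 0.0]},
]

results = search(query, targets)
print(results)
```

Here `t2` is dropped by the clan filter despite an identical embedding, and `t3` passes the filter but falls below the similarity threshold. This illustrates why a cheap filter placed before the scoring step can reduce the number of pairs that need to be scored and aligned.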
I try my best to make the information on this website as accurate as possible. If you find any errors on this page or any other page of this website, I would greatly appreciate it if you would get in touch with me at contact@abeebyekeen.com.