This page of BioMoDes lists state-of-the-art and emerging tools for Protein Structure Search, Alignment, and Database Management.

4.1. Sequence and Structure Alignment

  • Foldseek-Multimer: A method and webserver for fast, large-scale structural alignment of protein complexes. Foldseek-Multimer is orders of magnitude faster than US-align at similar accuracy, remains sensitive for complexes with low sequence identity, and supports all-vs-all searches (billions of complex pairs in 24 hours).
    Posted: April 14, 2024
    Preprint | Code (GitHub) | Webserver
  • PS-GO: A parametric protein search engine that integrates protein structure, sequence information, and computable parameters. Inspired by Google’s classic PageRank algorithm, PS-GO aims to help users retrieve, compare, analyse, and interpret the growing body of protein data. Its interface lets users specify parameters to search databases for protein sequences and structures, and to optimize or redesign proteins via a parametric protein design approach. PS-GO supports range-based searches over parameters such as RC.Score, hydrophobicity, instability, size, isoelectric point, and SA. It also supports natural language queries describing desired protein properties or functions, using OpenAI’s GPT-4 to convert the query into structured parameter conditions. For its full functionality, PS-GO links with PROFASA (Protein Fragment And Structure Analysis), another web-based resource from the same research group.
    Published: April 8, 2024
    Paper | Webserver (PS-GO) | Webserver (PROFASA)

  • PLMSearch/PLMAlign: A fast protein language model-based method for homologous protein search from sequence input. PLMSearch searches millions of query-target pairs in seconds, similar to MMseqs2, but with a 3x increase in sensitivity, which makes it comparable to state-of-the-art structure-based search methods. Like those methods, PLMSearch also captures remote homology pairs. PLMSearch integrates three main steps/modules: (1) PfamClan filters pairs for similarity based on Pfam clan domains; (2) SS-predictor predicts structural similarity for all query-target pairs; (3) PLMSearch-PLMAlign provides sequence alignments and scores.
    Published: March 30, 2024
    Paper | Code (GitHub) - PLMSearch | Code (Code Ocean) - PLMSearch | Webserver (PLMSearch) | Code (GitHub) - PLMAlign | Webserver (PLMAlign)
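
The range-based parameter search described in the PS-GO entry above can be illustrated with a minimal Python sketch. This is not PS-GO's implementation or API: the `ProteinRecord` fields, the `range_search` function, and the sample values are all hypothetical and chosen only for illustration. The `query` dictionary shows the kind of structured parameter conditions a natural language query might be converted into; the conversion step itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinRecord:
    # Hypothetical record layout; field names are illustrative, not PS-GO's schema.
    identifier: str
    length: int                # protein size in residues
    hydrophobicity: float
    instability: float
    isoelectric_point: float


def matches(record, ranges):
    """True if every constrained parameter lies within its closed [low, high] range."""
    return all(
        low <= getattr(record, name) <= high
        for name, (low, high) in ranges.items()
    )


def range_search(records, ranges):
    """Keep only the records that satisfy all range conditions."""
    return [record for record in records if matches(record, ranges)]


# Made-up example data.
records = [
    ProteinRecord("P1", 120, -0.4, 35.0, 6.2),
    ProteinRecord("P2", 310, 0.1, 45.5, 8.9),
    ProteinRecord("P3", 95, -0.2, 28.0, 5.1),
]

# Structured conditions: 90-150 residues and an instability value of at most 40.
query = {"length": (90, 150), "instability": (0.0, 40.0)}
hits = range_search(records, query)
print([record.identifier for record in hits])
```

Parameters that are left out of `query` are simply unconstrained, so the same function covers single-parameter and multi-parameter searches.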

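The first two PLMSearch steps listed above (a Pfam clan filter followed by a predicted-similarity score) can be pictured as a filter-then-rank pipeline. The sketch below is a toy illustration under stated assumptions, not the PLMSearch code: cosine similarity between made-up embedding vectors stands in for SS-predictor, the two stages are assumed to run strictly in sequence, the clan accessions are placeholders, and every function and field name is hypothetical. The alignment step is omitted.

```python
import math


def shares_clan(query_clans, target_clans):
    """Stage 1 stand-in: keep a pair only if it shares at least one clan label."""
    return bool(set(query_clans) & set(target_clans))


def cosine_similarity(u, v):
    """Stage 2 stand-in: cosine similarity replaces a learned similarity predictor."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def search(query, targets, top_k=2):
    """Filter targets by shared clan, then rank the survivors by similarity."""
    candidates = [t for t in targets if shares_clan(query["clans"], t["clans"])]
    scored = [
        (cosine_similarity(query["embedding"], t["embedding"]), t["id"])
        for t in candidates
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [(target_id, score) for score, target_id in scored[:top_k]]


# Made-up query and targets; embeddings are 2-D only to keep the example readable.
query = {"id": "q", "clans": ["CL0001"], "embedding": [1.0, 0.0]}
targets = [
    {"id": "t1", "clans": ["CL0001"], "embedding": [1.0, 0.0]},
    {"id": "t2", "clans": ["CL0002"], "embedding": [1.0, 0.0]},
    {"id": "t3", "clans": ["CL0001", "CL0003"], "embedding": [0.0, 1.0]},
]

hits = search(query, targets)
print(hits)
```

In this toy data, `t2` is dropped by the clan filter even though its embedding matches the query, which shows how the first stage narrows the set of pairs before any scoring happens. An alignment step would then be applied to the ranked hits.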
