
2. Biomolecular Design Tools


This page of BioMoDes lists state-of-the-art and emerging tools for Biomolecular Design.


2.1. Protein Sequence Design

2024
  • AF2seq-MPNN: A protocol, based on AF2seq and ProteinMPNN, for the computational design of topologically complex protein folds and soluble analogues of membrane proteins.
    Published: June 19, 2024
    Paper | Preprint | AF2seq Code (GitHub) | Colab Notebook 1 | Colab Notebook 2

  • CarbonDesign: A protein sequence design method that adapts the key ingredients behind the success of AF2. CarbonDesign utilizes Inverseformer, a network architecture adapted from AlphaFold’s Evoformer. Based on scTM score, sequence recovery rate, and BLOSUM score, CarbonDesign outperformed other published methods, including versions of ProteinMPNN and ESM-IF, on the CAMEO and CASP15 test sets, as well as on RFdiffusion-generated backbones. CarbonDesign is also capable of predicting the effects of mutations on protein function.
    Published: May 23, 2024
    Paper | Preprint | Code (GitHub) | Code (Code Ocean)

  • AntiFold: An inverse folding model for antibody sequence design based on ESM-IF1.
    Posted: May 06, 2024
    Preprint | Code (GitHub) | Webserver | Colab Notebook

  • SPDesign: A method that combines structural sequence profile, fast shape recognition, and pre-trained language models for protein sequence design.
    Published: April 09, 2024
    Paper | Webserver | Code

  • Evo: A long-context foundation model that generalizes across the central dogma of biology: DNA, RNA, and proteins. Evo is a 7-billion-parameter model trained to generate DNA sequences and is capable of predictive and generative tasks, from molecules to whole genomes.
    Posted: March 06, 2024
    Preprint | Code (GitHub) | Code (PyPI) | Blog | Playground | Colab Notebook

  • PocketGen: A method for generating full-atom ligand-binding pockets to design small molecule-binding proteins. PocketGen uses a co-design strategy that, given the ligand and the scaffold, simultaneously designs the sequence and structure of the protein pocket.
    Posted: Feb 28, 2024
    Preprint | Code (GitHub) | Blog

  • CoVES: A “simple” model to design functional and diverse combinatorial protein variants.
    Published: Feb 22, 2024
    Published: Jan 06, 2024
    Paper | Code (GitHub) | Code (Zenodo)

  • Multiflow: A generative model for protein sequence and structure co-design based on the DFM and FrameFlow models.
    Posted: Feb 07, 2024
    Preprint | Code (GitHub)

2023
2022
  • ESM-IF1: An inverse folding model that generates amino acid sequences from a given protein backbone. ESM-IF1 was trained on 12 million AlphaFold2-predicted protein structures and achieves 51% native sequence recovery. ESM-IF1 generalizes to other tasks, such as the design of protein complexes, partially masked structures, binding interfaces, and multiple states.
    Posted: Sep 06, 2022
    Preprint | Code (GitHub) | Colab Notebook
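
Several entries in this section report native sequence recovery (for example, the 51% figure for ESM-IF1). As a rough illustration of what that metric measures, here is a minimal sketch, assuming the designed and native sequences have equal length because both correspond to the same fixed backbone. The function name and the two example sequences are hypothetical and are not taken from any of the listed tools.

```python
def sequence_recovery(designed: str, native: str) -> float:
    """Fraction of positions where the designed residue matches the native one."""
    # Assumption: fixed-backbone design, so both sequences align position by position.
    if len(designed) != len(native):
        raise ValueError("designed and native sequences must have the same length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(d == n for d, n in zip(designed, native))
    return matches / len(native)


# Hypothetical example sequences (10 residues, 2 mismatches).
native = "MKTAYIAKQR"
designed = "MKSAYLAKQR"
print(f"{sequence_recovery(designed, native):.0%}")  # prints "80%"
```

Published benchmarks typically average this value over many backbones, and individual papers may restrict it to specific residue subsets, so numbers from different tools are not always directly comparable.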

2.2. Protein Structure Design/Generation

2024
  • Proteus: A diffusion model for de novo protein backbone generation that, based on multiple metrics, surpasses other leading methods, including RFdiffusion, Genie, FrameDiff, and Chroma.
    Posted: Feb 12, 2024
    Preprint | Code (GitHub)

  • Multiflow: A generative model for protein sequence and structure co-design based on the DFM and FrameFlow models.
    Posted: Feb 07, 2024
    Preprint | Code (GitHub)

  • FrameFlow: FrameFlow is an SE(3) flow matching model for fast protein backbone generation, adapted from the diffusion-based FrameDiff model. FrameFlow has also been extended to the motif-scaffolding task.
    Posted: Jan 08, 2024 | Oct 10, 2023
    Preprint 1 | Preprint 2 | Code (GitHub)

  • RFdiffusion All-Atom (RFdiffusionAA): A diffusion model that generates de novo protein structures around small molecules and other non-protein targets. RFdiffusionAA was developed by fine-tuning RoseTTAFold All-Atom (RFAA) on structure denoising.
    Published: March 07, 2024
    Paper | Code (GitHub)


2.3. RNA Sequence and Structure Design

2024
  • RNAFlow: A flow-matching model for simultaneous design of RNA structure and sequence, conditioned on protein structure and sequence. The denoising network of RNAFlow comprises (1) an RNA inverse folding model and (2) a pre-trained RoseTTAFold2NA network. In each design cycle, RNAFlow first designs an RNA sequence for a given noised protein-RNA complex, and then uses RoseTTAFold2NA to predict a denoised RNA structure.
    Posted: May 29, 2024
    Preprint | Code (GitHub)

  • gRNAde: The “ProteinMPNN for RNA sequence design”. gRNAde is a geometric deep learning-based model for computational design of RNA sequences given 3D RNA backbone structures. gRNAde enables both single- and multi-state fixed-backbone sequence design by generating candidate RNA sequences conditioned on one or more 3D backbone conformations.
    Posted: April 01, 2024
    Preprint | Code (GitHub) | Colab Notebook 1 | Colab Notebook 2

  • Evo: A long-context foundation model that generalizes across the central dogma of biology: DNA, RNA, and proteins. Evo is a 7-billion-parameter model trained to generate DNA sequences and is capable of predictive and generative tasks, from molecules to whole genomes.
    Posted: March 06, 2024
    Preprint | Code (GitHub) | Code (PyPI) | Blog | Playground | Colab Notebook

  • GenerRNA: A generative pre-trained language model for de novo RNA design.
    Posted: Feb 08, 2024
    Preprint | Code (GitHub)

  • RfamGen: A generative model for designing functional RNA family sequences. RfamGen incorporates alignment and consensus secondary structure information and generates novel and functional RNA family sequences.
    Published: Jan 18, 2024
    Paper | Code (GitHub)
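
The RNAFlow entry above describes a two-step design cycle: sequence design on a noised complex, followed by structure prediction. The loop below is a toy sketch of that control flow only. Every name in it is hypothetical: `inverse_fold` and `predict_structure` are placeholder stand-ins (random nucleotides and a straight-line structure) rather than the actual inverse folding model or RoseTTAFold2NA, and the interpolation step that moves the noised coordinates toward the prediction is an assumed, simplified update rule, not RNAFlow's published procedure.

```python
import random


def inverse_fold(rna_coords, protein, rng):
    # Hypothetical stand-in: a real inverse folding model would condition on geometry.
    return "".join(rng.choice("ACGU") for _ in rna_coords)


def predict_structure(sequence, protein):
    # Hypothetical stand-in for a structure predictor: nucleotides placed on a line.
    return [(float(i), 0.0, 0.0) for i in range(len(sequence))]


def design_rna(protein, length, steps=10, seed=0):
    rng = random.Random(seed)
    # Start from Gaussian noise for the RNA coordinates.
    coords = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in range(length)]
    sequence = ""
    for k in range(steps):
        # Step 1: design a sequence for the current noised complex.
        sequence = inverse_fold(coords, protein, rng)
        # Step 2: predict a denoised structure for that sequence.
        denoised = predict_structure(sequence, protein)
        # Assumed update: move a fraction of the remaining distance toward the prediction.
        frac = 1.0 / (steps - k)
        coords = [
            tuple(c + frac * (d - c) for c, d in zip(cur, den))
            for cur, den in zip(coords, denoised)
        ]
    return sequence, coords


sequence, coords = design_rna(protein="hypothetical-protein", length=8)
print(sequence, coords[-1])
```

With this update rule the final step uses a fraction of 1, so the returned coordinates coincide (up to floating-point error) with the last predicted structure, and the returned sequence is the one that structure was predicted from.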


2.4. DNA Sequence Design

2024
  • Evo: A long-context foundation model that generalizes across the central dogma of biology: DNA, RNA, and proteins. Evo is a 7-billion-parameter model trained to generate DNA sequences and is capable of predictive and generative tasks, from molecules to whole genomes.
    Posted: March 06, 2024
    Preprint | Code (GitHub) | Code (PyPI) | Blog | Playground | Colab Notebook

  • DNA-Diffusion: A diffusion model to generate context-/cell type-specific DNA regulatory sequences.
    Posted: Feb 01, 2024
    Preprint | Code (GitHub) | Documentation

2.5. Antibody Sequence/Structure Design

2024
  • GeoAB: A method for computational design and optimization (affinity maturation) of antibodies. GeoAB utilizes a co-design strategy, predicting the structure of a CDR together with optimized 1D sequences for that structure.
    Posted: May 17, 2024
    Preprint | Code (GitHub)

  • IgDiff: A model for de novo antibody design (structure generation) that adapted FrameDiff (a diffusion-based model for protein backbone generation) by fine-tuning it on artificial antibody structures.
    Posted: May 13, 2024
    Preprint | Code (Zenodo)

  • AntiFold: An inverse folding model for antibody sequence design based on ESM-IF1.
    Posted: May 06, 2024
    Preprint | Code (GitHub) | Webserver | Colab Notebook

  • tFold-Ab and tFold-Ag: Methods for antibody and antibody-antigen complex modelling and design by Tencent.
    Posted: Feb 08, 2024
    Preprint | Code (GitHub)



I try my best to make the information on this website as accurate as possible. If you find any errors on this page or any other page of this website, I would greatly appreciate it if you would get in touch with me at contact@abeebyekeen.com.

