Poster in Workshop: Structured Probabilistic Inference and Generative Modeling
Discrete Diffusion Posterior Sampling for Protein Design
Mert Cemri · Ajil Jalal · Kannan Ramchandran
Keywords: [ Protein Generation ] [ Posterior Sampling ] [ Diffusion ] [ Inverse Problems ]
Designing new protein sequences with desirable functionality has significant implications for medicine and biotechnology. Traditional protein design has relied predominantly on experimental methods, such as in vitro screening or animal experiments, which are costly and time-consuming. We propose a generative-model-based approach to protein sequence generation using guided discrete diffusion. We introduce a novel diffusion-based posterior sampling algorithm that uses a BERT-like transformer model to iteratively denoise discrete protein sequences. This approach provides an efficient way to leverage an oracle that is trained to predict the desired functionality and that guides the protein generation procedure. Our experiments show that our method outperforms the state of the art, achieving higher functionality scores as well as higher ProtGPT2 likelihood scores.
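As a rough illustration of what oracle-guided discrete denoising can look like, the toy sketch below starts from a fully masked sequence and fills in positions over a few steps, reweighting the denoiser's per-token distribution by an oracle score. It is not the authors' algorithm: `toy_denoiser`, `toy_oracle`, `guided_sample`, the `guidance` weight, and the tryptophan-counting objective are all hypothetical stand-ins chosen so the example runs with NumPy alone. In practice the denoiser would be a trained BERT-like model and the oracle a trained functionality predictor.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
MASK = -1                             # hypothetical id for a masked position
TARGET = AMINO_ACIDS.index("W")       # toy "function": tryptophan content


def toy_denoiser(tokens):
    """Stand-in for a BERT-like model: uniform logits at every position."""
    return np.zeros((len(tokens), len(AMINO_ACIDS)))


def toy_oracle(tokens):
    """Stand-in functionality predictor: fraction of positions equal to W."""
    return float(np.mean(tokens == TARGET))


def guided_sample(length, steps, guidance, seed=0):
    """Unmask a sequence in `steps` rounds, tilting each token choice
    by exp(guidance * oracle score). Assumed scheme, for illustration only."""
    rng = np.random.default_rng(seed)
    tokens = np.full(length, MASK)
    order = rng.permutation(length)  # random unmasking order
    for chunk in np.array_split(order, steps):
        logits = toy_denoiser(tokens)  # prior over tokens given current state
        for pos in chunk:
            # Score every candidate residue at this position with the oracle.
            scores = np.empty(len(AMINO_ACIDS))
            for a in range(len(AMINO_ACIDS)):
                candidate = tokens.copy()
                candidate[pos] = a
                scores[a] = toy_oracle(candidate)
            # Unnormalized log-posterior: denoiser prior plus weighted oracle term.
            log_post = logits[pos] + guidance * scores
            log_post -= log_post.max()  # numerical stability
            probs = np.exp(log_post)
            probs /= probs.sum()
            tokens[pos] = rng.choice(len(AMINO_ACIDS), p=probs)
    return "".join(AMINO_ACIDS[t] for t in tokens)


unguided = guided_sample(length=30, steps=5, guidance=0.0)
guided = guided_sample(length=30, steps=5, guidance=1e4)
print(unguided)
print(guided)
```

With `guidance=0.0` the sampler reduces to drawing from the denoiser alone, while a large `guidance` value concentrates the samples on sequences the oracle scores highly. Exhaustively scoring every candidate token is affordable only in a toy setting like this one.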