

Poster in Workshop: Sampling and Optimization in Discrete Space

Optimizing protein fitness using Gibbs sampling with Graph-based Smoothing

Andrew Kirjner · Jason Yim · Raman Samusevich · Tommi Jaakkola · Regina Barzilay · Ila R. Fiete


Abstract:

The ability to design novel proteins with higher fitness on a given task would be revolutionary for many fields of medicine. However, brute-force search through the combinatorially large space of sequences is infeasible. Prior methods constrain search to a small mutational radius from a reference sequence, but such heuristics drastically limit the design space. Our work seeks to remove the restriction on mutational distance while still enabling efficient exploration. We propose Gibbs sampling with Graph-based Smoothing (GGS), which iteratively applies Gibbs with gradients to propose advantageous mutations and uses graph-based smoothing to remove noisy gradients that lead to false positives. Our method achieves state-of-the-art results in discovering high-fitness proteins with up to 8 mutations from the training set. We study the GFP and AAV design problems, and use ablations and baseline comparisons to elucidate the results. Code: https://github.com/kirjner/GGS
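
The two ingredients named in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository linked above for that); it rests on several assumptions made only for illustration: a four-letter alphabet, an additive fitness model whose gradient with respect to the one-hot encoding is the weight table itself, and a one-step neighbor-averaging rule standing in for graph-based smoothing. All function names (`proposal`, `gwg_step`, `smooth_labels`, `sample`) are hypothetical.

```python
import math
import random

ALPHABET = "ACDE"  # toy alphabet (assumption); real proteins use 20 amino acids


def fitness(seq, weights):
    """Toy additive fitness: sum of per-position, per-residue weights."""
    return sum(weights[i][ALPHABET.index(c)] for i, c in enumerate(seq))


def proposal(seq, weights, temperature):
    """Distribution over all single mutations (position, new residue).

    For this additive model the gradient with respect to the one-hot
    encoding is `weights`, so the first-order estimate of the fitness
    change of a mutation equals the exact change.
    """
    moves, logits = [], []
    for i, c in enumerate(seq):
        current = weights[i][ALPHABET.index(c)]
        for a, res in enumerate(ALPHABET):
            if res != c:
                moves.append((i, res))
                logits.append((weights[i][a] - current) / (2.0 * temperature))
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]  # numerically stable softmax
    total = sum(exps)
    return moves, [e / total for e in exps]


def gwg_step(seq, weights, temperature, rng):
    """One gradient-informed proposal followed by a Metropolis-Hastings test."""
    moves, probs = proposal(seq, weights, temperature)
    k = rng.choices(range(len(moves)), weights=probs)[0]
    pos, res = moves[k]
    new_seq = seq[:pos] + res + seq[pos + 1:]
    # Probability of proposing the reverse mutation from the new sequence.
    rev_moves, rev_probs = proposal(new_seq, weights, temperature)
    rev_k = rev_moves.index((pos, seq[pos]))
    log_accept = (
        (fitness(new_seq, weights) - fitness(seq, weights)) / temperature
        + math.log(rev_probs[rev_k])
        - math.log(probs[k])
    )
    if rng.random() < math.exp(min(0.0, log_accept)):
        return new_seq
    return seq


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def smooth_labels(seqs, labels, alpha=0.5, radius=1):
    """Blend each label with the mean label of its graph neighbors.

    Neighbors are sequences within `radius` mutations. This single
    averaging pass is a simplification chosen for the sketch.
    """
    smoothed = []
    for i, s in enumerate(seqs):
        nbrs = [labels[j] for j, t in enumerate(seqs)
                if j != i and hamming(s, t) <= radius]
        if nbrs:
            smoothed.append((1 - alpha) * labels[i] + alpha * sum(nbrs) / len(nbrs))
        else:
            smoothed.append(labels[i])
    return smoothed


def sample(seq, weights, steps=50, temperature=0.5, seed=0):
    """Run a short chain from a starting sequence and return the final state."""
    rng = random.Random(seed)
    for _ in range(steps)    :
        seq = gwg_step(seq, weights, temperature, rng)
    return seq
```

In a fuller pipeline, the smoothed labels would be used to fit the fitness predictor whose gradients drive `proposal`; here the weight table plays that role directly so that the sampler stays self-contained.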
