Proximal Exploration for Model-guided Protein Sequence Design

Zhizhou Ren · Jiahan Li · Fan Ding · Yuan Zhou · Jianzhu Ma · Jian Peng

Hall E #120

Keywords: [ APP: Genetics, Cell Biology, etc ] [ APP: Chemistry and Drug Discovery ] [ Applications ]

Wed 20 Jul 3:30 p.m. PDT — 5:30 p.m. PDT
Spotlight presentation: Applications
Wed 20 Jul 7:30 a.m. PDT — 9 a.m. PDT


Designing protein sequences with a particular biological function is a long-standing challenge in protein engineering. Recent machine-learning-guided approaches focus on building a surrogate sequence-function model to reduce the burden of expensive in-lab experiments. In this paper, we study the exploration mechanism of model-guided sequence design. We leverage a natural property of the protein fitness landscape: a small set of mutations to the wild-type sequence is usually sufficient to enhance the desired function. Using this property, we propose the Proximal Exploration (PEX) algorithm, which prioritizes the evolutionary search for high-fitness mutants with low mutation counts. In addition, we develop a specialized model architecture, called Mutation Factorization Network (MuFacNet), to predict low-order mutational effects, which further improves the sample efficiency of model-guided evolution. We extensively evaluate our method on a suite of in-silico protein sequence design tasks and demonstrate substantial improvement over baseline algorithms.
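
To make the idea of favoring high-fitness mutants with few mutations concrete, here is a minimal sketch. It is not the PEX algorithm itself, whose selection rule is not given in this abstract. It assumes a simple stand-in: candidates are ranked by predicted fitness minus a penalty proportional to their Hamming distance from the wild type. The names `hamming`, `mutate`, `select_proximal`, and the `penalty` parameter are all hypothetical, and `predict` stands for any surrogate sequence-function model.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def mutate(seq, rng):
    """Return a copy of seq with one random single-site substitution."""
    pos = rng.randrange(len(seq))
    choices = [aa for aa in AMINO_ACIDS if aa != seq[pos]]
    return seq[:pos] + rng.choice(choices) + seq[pos + 1:]


def select_proximal(candidates, wild_type, predict, batch_size, penalty=0.5):
    """Pick the batch with the best penalized score.

    Assumption (not from the paper): score = predicted fitness
    minus penalty * (mutation count relative to the wild type).
    Ties are broken alphabetically so the result is deterministic.
    """
    def score(seq):
        return predict(seq) - penalty * hamming(seq, wild_type)

    unique = set(candidates)
    return sorted(unique, key=lambda s: (-score(s), s))[:batch_size]
```

In a model-guided loop, one would generate candidates with `mutate`, query the selected batch in the (simulated) experiment, and retrain the surrogate on the new measurements. The linear penalty is only one way to encode a preference for low mutation counts; the weight would need tuning per task.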
