Poster in Workshop: Sampling and Optimization in Discrete Space
Annealed Biological Sequence Optimization
Yuxuan Song · Botian Wang · Hao Zhou · Wei-Ying Ma
Designing biological sequences with desired properties is an impactful research problem with applications such as protein engineering, antibody design, and drug discovery. Machine learning algorithms can be applied either to fit the property landscape with supervised learning or to generatively propose reasonable candidates, reducing wet-lab effort. From the learning perspective, there are two key challenges: the sharp property landscape, i.e., a few mutations can dramatically change a protein's properties, and the large biological sequence space. In this paper, we propose annealed sequence optimization (ANSO), which aims to address both challenges simultaneously through a paired surrogate model training paradigm and sequence sampling procedure. Extensive experiments on a series of protein sequence design tasks demonstrate its effectiveness compared with several advanced baselines.
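The abstract does not spell out the ANSO procedure, so the following is only a generic illustration of annealed sampling over sequences, not the authors' method. It is a minimal sketch assuming a single-site mutation proposal, a Metropolis acceptance rule, and a geometric temperature schedule; `toy_score`, `annealed_search`, and all parameter names are hypothetical, and `toy_score` merely stands in for a learned surrogate model.

```python
import math
import random

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def toy_score(seq, target):
    """Hypothetical stand-in for a surrogate model:
    fraction of positions where seq matches a fixed target."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def annealed_search(start, score, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Generic simulated annealing over sequences (assumed setup, not ANSO).

    Proposes single-site mutations and accepts them with the Metropolis
    rule while the temperature decays geometrically from t_start to t_end.
    Returns the best sequence seen and its score.
    """
    rng = random.Random(seed)
    current, current_score = start, score(start)
    best, best_score = current, current_score
    for step in range(steps):
        # Geometric interpolation between t_start and t_end.
        frac = step / max(steps - 1, 1)
        temp = t_start * (t_end / t_start) ** frac
        # Propose a mutation at one random position.
        pos = rng.randrange(len(current))
        proposal = current[:pos] + rng.choice(ALPHABET) + current[pos + 1:]
        proposal_score = score(proposal)
        delta = proposal_score - current_score
        # Always accept improvements; accept worse moves with prob exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = proposal, proposal_score
            if current_score > best_score:
                best, best_score = current, current_score
    return best, best_score
```

A high initial temperature lets the sampler move across a large sequence space, while the low final temperature concentrates it on high-scoring regions. In practice the score function would be a model trained on measured properties rather than a fixed target match.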