We consider design problems wherein the goal is to maximize or specify the value of one or more properties of interest. For example, in protein design, one may wish to find the protein sequence that maximizes fluorescence. We assume access to one or more black-box stochastic "oracle" predictive functions, each of which maps from an input design space (e.g., protein sequences or images) to a distribution over a property of interest (e.g., protein fluorescence or image content). Given such stochastic oracles, our problem is to find an input that best achieves our goal. At first glance, this problem can be framed as one of optimizing the oracle with respect to the input. However, in most real-world settings, the oracle will not exactly capture the ground truth and, critically, may fail catastrophically when extrapolating beyond its training data. Thus, we frame the goal as one of modelling the density of some original set of training data (e.g., a set of real protein sequences), and then conditioning this distribution on the desired properties. This yields an annealed adaptive sampling method that is also well suited to rare conditioning events. We demonstrate experimentally that our approach outperforms other recently presented methods for tackling similar problems.
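As a rough illustration of the idea in the abstract, the following toy sketch conditions a one-dimensional Gaussian "training density" on a rare event defined by a noisy oracle, using importance-weighted refitting with an annealed threshold. It is not the authors' algorithm or code: the Gaussian prior, the identity oracle `oracle_mean`, the noise level, the quantile-based annealing schedule, and every function name here are assumptions chosen only to keep the example small.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # elementwise erf without requiring SciPy


def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return np.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))


def oracle_mean(x):
    # Hypothetical oracle: the predicted property equals the input.
    return x


def prob_exceeds(x, threshold, noise_std=0.5):
    # P(y >= threshold | x) when the oracle predicts
    # y ~ Normal(oracle_mean(x), noise_std^2). Assumed noise model.
    return 1.0 - normal_cdf((threshold - oracle_mean(x)) / noise_std)


def adaptive_conditioning(target=3.0, n_iters=20, n_samples=2000,
                          quantile=0.8, seed=0):
    rng = np.random.default_rng(seed)
    prior_mean, prior_std = 0.0, 1.0      # stands in for the training density
    q_mean, q_std = prior_mean, prior_std  # search model starts at the prior
    threshold = -np.inf
    for _ in range(n_iters):
        x = rng.normal(q_mean, q_std, size=n_samples)
        # Anneal: raise the threshold toward the target, never lowering it.
        step = np.quantile(oracle_mean(x), quantile)
        threshold = min(target, max(threshold, step))
        # Importance weights: prior / search model, times the probability
        # that the oracle places the property above the current threshold.
        weights = normal_pdf(x, prior_mean, prior_std) / normal_pdf(x, q_mean, q_std)
        weights = weights * prob_exceeds(x, threshold)
        weights = weights / weights.sum()
        # Weighted maximum-likelihood refit of the Gaussian search model.
        q_mean = float(np.sum(weights * x))
        q_std = float(np.sqrt(np.sum(weights * (x - q_mean) ** 2))) + 1e-6
    return q_mean, q_std


if __name__ == "__main__":
    mean, std = adaptive_conditioning()
    print(f"search model after annealing: mean={mean:.2f}, std={std:.2f}")
```

Because the weights always include the ratio of the training density to the current search model, the refit stays anchored to regions the training data supports, rather than chasing the oracle into regions where it extrapolates.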
Author Information
David Brookes (University of California, Berkeley)
Jennifer Listgarten (University of California, Berkeley)
Related Events (a corresponding poster, oral, or spotlight)
- 2019 Poster: Conditioning by adaptive sampling for robust design
Thu. Jun 13th 01:30 -- 04:00 AM Room Pacific Ballroom
More from the Same Authors
- 2020: "Machine learning-based design (of proteins, small molecules and beyond)"
Jennifer Listgarten
- 2019 Workshop: ICML 2019 Workshop on Computational Biology
Donna Pe'er · Sandhya Prabhakaran · Elham Azizi · Abdoulaye Baniré Diallo · Anshul Kundaje · Barbara Engelhardt · Wajdi Dhifli · Engelbert MEPHU NGUIFO · Wesley Tansey · Julia Vogt · Jennifer Listgarten · Cassandra Burdziak · Workshop CompBio