

Poster in Workshop: Next Generation of Sequence Modeling Architectures

Recurrent VAE with Gaussian Process Decoders for De novo Molecular Generation

Vidhi Lalchand · David Lines · Neil Lawrence


Abstract:

This work proposes a variational sequential architecture based on recurrent neural networks for de novo drug design. The variational autoencoding framework induces a compressed continuous representation of discrete molecules through a low-dimensional latent space. The continuous latent space allows for optimisation, interpolation, and unconditional and conditional generation of novel molecules through gradient-based techniques. However, the success of gradient-based optimisation is tied to the structure and smoothness of the latent space, and this is precisely what we target through our generative architecture. Beyond structure generation, we leverage non-parametric Gaussian process (GP) decoders for the auxiliary task of property prediction on the shared latent space. Training the architecture on shared latent embeddings for both structure and property generation enforces a soft stratification of the latent space as a function of the properties, making it amenable to gradient-based optimisation of objectives tied to molecular properties. We moderate the smoothness of the non-parametric GP decoder with the choice of the kernel function. We demonstrate several capabilities of our generative architecture on widely used benchmark datasets of small drug-like molecules: the ZINC-250K dataset and the QM9 dataset of molecules with fewer than nine heavy atoms.
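The abstract does not include an implementation, but the overall idea can be illustrated with a minimal sketch: a recurrent (GRU-based) VAE over tokenised molecules whose latent codes are shared with an exact-GP regression head for property prediction, where the RBF kernel lengthscale controls the smoothness imposed on the latent space. All module names, dimensions, datasets stand-ins, and hyper-parameters below are illustrative assumptions, not the authors' architecture or code.

```python
# Minimal sketch (assumed, not the authors' code): recurrent VAE + GP property head on shared latents.
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, EMB, HID, LATENT, MAX_LEN = 32, 16, 64, 8, 20  # toy sizes, chosen arbitrarily


class RecurrentVAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.to_mu = nn.Linear(HID, LATENT)
        self.to_logvar = nn.Linear(HID, LATENT)
        self.z_to_hidden = nn.Linear(LATENT, HID)
        self.decoder = nn.GRU(EMB, HID, batch_first=True)
        self.to_logits = nn.Linear(HID, VOCAB)

    def forward(self, tokens):
        # Encode the token sequence into a diagonal-Gaussian posterior over z.
        _, h = self.encoder(self.embed(tokens))
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # Decode with teacher forcing, conditioning the initial GRU state on z.
        h0 = torch.tanh(self.z_to_hidden(z)).unsqueeze(0)
        out, _ = self.decoder(self.embed(tokens[:, :-1]), h0)
        return self.to_logits(out), mu, logvar


def gp_property_nll(z, y, lengthscale=1.0, signal=1.0, noise=1e-2):
    """Negative log marginal likelihood of an exact GP over properties y given latents z.
    The RBF lengthscale moderates how smoothly the property varies over the latent space."""
    d2 = torch.cdist(z, z).pow(2)
    K = signal * torch.exp(-0.5 * d2 / lengthscale**2) + noise * torch.eye(len(z))
    L = torch.linalg.cholesky(K)
    alpha = torch.cholesky_solve(y.unsqueeze(1), L)
    return (0.5 * (y @ alpha.squeeze(1))
            + torch.log(torch.diag(L)).sum()
            + 0.5 * len(z) * math.log(2 * math.pi))


model = RecurrentVAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
tokens = torch.randint(0, VOCAB, (64, MAX_LEN))  # stand-in for tokenised SMILES strings
props = torch.randn(64)                          # stand-in for a molecular property (e.g. logP)

logits, mu, logvar = model(tokens)
recon = F.cross_entropy(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
# Shared latent embeddings drive both reconstruction and the GP property term,
# softly stratifying the latent space by the property.
loss = recon + 0.1 * kl + 0.01 * gp_property_nll(mu, props)
loss.backward()
opt.step()
```

In this sketch the same latent code feeds both the sequence decoder and the GP head, so gradients from the property likelihood shape the latent geometry alongside reconstruction; swapping the RBF kernel for a rougher or smoother kernel changes how strongly nearby latents are constrained to have similar properties.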
