Workshop: ML for Life and Material Science: From Theory to Industry Applications
Improving Fragment-Based Deep Molecular Generative Models
Presented by: Panukorn Taleongpong, Brooks Paige
Deep molecular generative models have shown promising results and opened a new route to drug discovery. Their ability to explore the molecular space, estimated to contain on the order of 10^60 molecules, is significantly greater than that of traditional methods, which virtually screen existing databases. We introduce a novel fragmentation algorithm that is particularly suitable for deep generative models. In contrast to existing fragmentation algorithms, our procedure sequentially breaks a molecule along BRIC bonds in such a way that the linearization of fragments is directly invertible: the original molecule is guaranteed to be reconstructible from the fragment sequence. This makes the algorithm appropriate for deep generative models that use sequential models as likelihoods. We compare against previous fragment-based SMILES VAE methods and observe that our approach significantly improves coverage of the molecular space and outperforms them on distribution learning benchmarks.