Poster in Workshop: The Synergy of Scientific and Machine Learning Modelling (SynS & ML) Workshop
Repurposing Density Functional Theory to Suit Deep Learning
Alexander Mathiasen · Hatem Helal · Paul Balanca · Kerstin Klaeser · Josef Dean · Carlo Luschi · Dominique Beaini · Andrew Fitzgibbon · Dominic Masters
Keywords: [ dataset creation, quantum chemistry, density functional theory, neural networks ]
Density Functional Theory (DFT) accurately predicts the properties of molecules given their atom types and positions, and often serves as ground truth for molecular property prediction tasks. Neural Networks (NNs) are popular tools for such tasks and are trained on DFT datasets, with the aim of approximating DFT at a fraction of the computational cost. Research in other areas of machine learning has shown that the generalisation performance of NNs tends to improve with increased dataset size; however, the computational cost of DFT limits the size of DFT datasets. We present PySCFIPU, a DFT library that allows us to iterate on both dataset generation and NN training. We create QM10X, a dataset with 100M conformers, in 13 hours, on which we subsequently train SchNet in 12 hours. We show that the predictions of SchNet improve solely by increasing the amount of training data, without incorporating further inductive biases.