Poster in Affinity Workshop: LatinX in AI (LXAI) Research Workshop
Towards Learning Activity Cliff-Aware Molecular Representations
César Miguel Valdez Córdova
Keywords: [ ChEMBL ] [ Molecular Property Prediction ] [ Pre-training ] [ Activity Cliffs ] [ Siamese Networks ]
Current deep-learning-based methods for molecular property prediction show pronounced shortcomings in the presence of activity cliffs (AC): pairs of structurally similar molecules with significant differences in potency. We investigate how inductive biases of increasing complexity, from simple Multilayer Perceptrons (MLPs) to self-supervised models, impact the learning of representations from Extended-Connectivity Fingerprints (ECFPs). Leveraging the Matched Molecular Pair (MMP) abstraction, we explore various pre-training schemes designed to capture AC relationships. While simple models remain competitive, we show substantial differences in performance across inductive bias choices and pre-training strategies, along with avenues for potential improvement, paving the way for AC-aware and, consequently, chemically robust model design.
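To make the activity cliff definition concrete, the following is a minimal sketch, not the method used in this work. It flags pairs of molecules whose fingerprints are highly similar but whose potencies differ sharply. It uses Tanimoto similarity over sets of fingerprint on-bits as a stand-in for structural similarity, rather than the MMP abstraction described above. The function names, the toy fingerprints, and both thresholds (similarity of at least 0.9, potency gap of at least 1.0 log unit) are illustrative assumptions.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint on-bits."""
    if not a and not b:
        return 1.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def find_activity_cliffs(fingerprints, potencies,
                         sim_threshold=0.9, potency_gap=1.0):
    """Return index pairs (i, j) that are similar in structure but not in potency.

    Both thresholds are assumptions for illustration, not values from the abstract.
    Potencies are assumed to be on a log scale (for example pIC50).
    """
    cliffs = []
    for i, j in combinations(range(len(fingerprints)), 2):
        similar = tanimoto(fingerprints[i], fingerprints[j]) >= sim_threshold
        potency_differs = abs(potencies[i] - potencies[j]) >= potency_gap
        if similar and potency_differs:
            cliffs.append((i, j))
    return cliffs


# Toy data: hypothetical on-bit sets standing in for ECFP bit vectors.
fps = [
    frozenset(range(10)),         # molecule 0
    frozenset(range(11)),         # molecule 1: one extra bit relative to molecule 0
    frozenset(range(20, 30)),     # molecule 2: unrelated bits
]
potencies = [6.0, 8.5, 6.1]       # hypothetical log-scale potencies

cliff_pairs = find_activity_cliffs(fps, potencies)
print(cliff_pairs)
```

In this toy set, molecules 0 and 1 share 10 of 11 bits yet differ by 2.5 log units, so they form the only flagged pair. A pairwise scan like this is quadratic in the number of molecules, which is acceptable for a sketch but would need indexing or MMP-based grouping at ChEMBL scale.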