DISSOLVR: An Interpretable and Fast Framework for Aqueous and Organic Solubility Prediction
Vansh Ramani ⋅ Har Ashish Arora ⋅ Dhairya Kuchhal ⋅ Sayan Ranu ⋅ Tarak Karmakar
Abstract
High-fidelity solubility prediction is fundamental to pharmaceutical development and environmental partitioning, where accurate modeling must couple molecular structure with thermodynamic behavior across diverse chemical environments. Recent advances, however, have been dominated by deep learning architectures that often sacrifice physical interpretability for predictive power. We challenge this trend by showing that state-of-the-art performance does not require such opaque architectures. We introduce $Dissolvr$, a transparent framework for molecular solubility prediction. We also conduct a comprehensive literature review and benchmark $Dissolvr$ against a range of existing methods. We show that $Dissolvr$ approaches the aleatoric limit set by experimental uncertainty and achieves out-of-distribution (OOD) generalization through structural invariance, obtained by mapping molecules to physically grounded descriptors. We then present an LLM-assisted post-hoc explanation pipeline that bridges the gap between symbolic model artifacts and chemically grounded narratives. Finally, a comparative benchmark based on a survey of 22 expert chemists reveals that expert evaluators provide deep insights.