CoarseBind: Fast and Accurate Binding Affinity Prediction through Coarse Structural Representations
Matteo Rossi ⋅ Ryan Pederson ⋅ Miles Wang-Henderson ⋅ Benjamin Kaufman ⋅ Edward Williams ⋅ Carl Underkoffler ⋅ Owen Howell ⋅ Adrian Layer ⋅ Stephan Thaler ⋅ Narbe Mardirossian ⋅ John Parkhill
Abstract
We present CoarseBind, a foundation model for protein-ligand structure and binding affinity prediction that achieves 26$\times$ faster inference than state-of-the-art methods while improving affinity prediction accuracy by up to 20\%. Current deep learning approaches to structure-based drug design rely on expensive all-atom diffusion to generate 3D coordinates, creating inference bottlenecks that render large-scale compound screening computationally intractable. We challenge this paradigm with the hypothesis that full all-atom resolution is unnecessary for accurate small-molecule pose and binding affinity prediction. CoarseBind tests this hypothesis with a coarse pocket-level representation (protein C$_\beta$ atoms and ligand heavy atoms only) within a multimodal architecture that combines pretrained molecular encoders with ESM-2 protein embeddings to learn rich structural representations. These representations feed two modules: a diffusion-free optimization module for pose generation and a binding affinity likelihood prediction module. On structure prediction benchmarks, CoarseBind matches diffusion-based baselines in ligand pose accuracy. For binding affinity, CoarseBind outperforms Boltz-2 by 16-20\% in Pearson correlation on both a public benchmark (CASP16) and a diverse private dataset (18 assays). The affinity module also provides well-calibrated uncertainty estimates, addressing a critical gap in compound prioritization for drug discovery. Furthermore, this module enables a continual learning framework and a hedged batch selection strategy that, in simulated drug discovery cycles, achieves 6$\times$ greater affinity improvement than greedy approaches.
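To make the coarse pocket-level representation concrete, the following is a minimal sketch of how a complex might be reduced to protein C$_\beta$ atoms and ligand heavy atoms. It is illustrative only and is not the authors' implementation: the `Atom` record, the function names, the 10 Å pocket cutoff, and the use of C$_\alpha$ as a stand-in for glycine (which has no C$_\beta$) are all assumptions that the abstract does not specify.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    """Hypothetical atom record; not a type from any real library."""
    name: str      # atom name, e.g. "CA", "CB", "N"
    element: str   # element symbol, e.g. "C", "H"
    residue: str   # residue name, e.g. "GLY" (empty string for ligand atoms)
    xyz: tuple     # (x, y, z) coordinates in angstroms


def ligand_heavy_atoms(ligand):
    """Keep every ligand atom that is not a hydrogen."""
    return [a for a in ligand if a.element.upper() != "H"]


def pocket_cb_atoms(protein, ligand_heavy, cutoff=10.0):
    """Keep one representative atom per residue close to the ligand.

    The representative is CB, with CA used for glycine (assumption).
    The cutoff distance is also an assumption, chosen for illustration.
    """
    picked = []
    for a in protein:
        is_rep = a.name == "CB" or (a.name == "CA" and a.residue == "GLY")
        if is_rep and any(
            math.dist(a.xyz, lig.xyz) <= cutoff for lig in ligand_heavy
        ):
            picked.append(a)
    return picked


def coarse_complex(protein, ligand, cutoff=10.0):
    """Return (pocket representative atoms, ligand heavy atoms)."""
    heavy = ligand_heavy_atoms(ligand)
    return pocket_cb_atoms(protein, heavy, cutoff), heavy
```

Under these assumptions, a pocket of a few dozen residues is described by a few dozen protein points rather than several hundred atoms, which is the kind of reduction the abstract credits for the faster inference.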