Support-Proximity Augmented Diffusion Estimation for Offline Black-Box Optimization
Yonghan Yang ⋅ Ye Yuan ⋅ Zipeng Sun ⋅ Linfeng Du ⋅ Bowei He ⋅ Haolun Wu ⋅ Can Chen ⋅ Xue Liu
Abstract
Offline black-box optimization aims to discover novel designs with high property scores using only a static dataset, a task fundamentally challenged by the out-of-distribution (OOD) extrapolation problem. Existing approaches typically bifurcate into inverse methods, which struggle with the ill-posed nature of mapping scores to designs, and forward methods, which often lack the distributional expressivity to quantify uncertainty effectively. In this work, we propose \textbf{SPADE} (\textbf{S}upport-\textbf{P}roximity \textbf{A}ugmented \textbf{D}iffusion \textbf{E}stimation), a novel framework that reimagines forward surrogate modeling through the lens of conditional generative modeling. SPADE models the forward likelihood $p(y|\boldsymbol{x})$ using a diffusion model, but with two critical enhancements to tailor it for optimization: (1) a \emph{Calibrated Diffusion Estimation} module that enforces global consistency in statistical moments and pairwise rankings, and (2) a \emph{Support-Proximity Regularization} mechanism that implicitly internalizes the data manifold constraint $p(\boldsymbol{x})$ via kNN-based density estimation. Theoretically, we prove that our regularization is first-order equivalent to maximizing a Bayesian posterior with a valid design prior. Empirically, SPADE achieves state-of-the-art performance across Design-Bench tasks and an LLM data mixture optimization benchmark. Our code is available through the anonymous repo \href{https://anonymous.4open.science/r/diffsurr-icml2026-C4FD/}{here}.
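The Support-Proximity Regularization described above rests on a kNN-based density proxy: a candidate design is penalized in proportion to its distance from the offline dataset's support. A minimal sketch of such a proxy is given below; the function name `knn_proximity_penalty` and the plain Euclidean/mean formulation are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def knn_proximity_penalty(x, dataset, k=5):
    """Mean Euclidean distance from candidate design `x` to its k nearest
    neighbors in the offline dataset -- a simple kNN density proxy.
    Larger values indicate designs farther from the data support.
    (Illustrative sketch; not the authors' exact regularizer.)"""
    dists = np.linalg.norm(dataset - x, axis=1)  # distance to every point
    knn = np.sort(dists)[:k]                     # k smallest distances
    return knn.mean()
```

A candidate near the dataset manifold receives a small penalty, while an out-of-distribution candidate receives a large one, which is the qualitative behavior the regularizer needs to discourage OOD extrapolation.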