Generative Large Neighborhood Search: Scalable Set Cover Optimization via Discrete Diffusion
Abstract
Large-scale instances of the Set Cover Problem (SCP) with millions of variables and complex cost structures require high-quality solutions within seconds, yet remain beyond the reach of exact solvers and pose severe generalization challenges for neural methods. Such problems necessitate decomposition into bounded subproblems; however, when the induced subproblem topology differs from that observed during training, existing neural approaches often fail to transfer reliably. We introduce Generative Large Neighborhood Search (GLNS), which reframes neighborhood selection as generation using a discrete diffusion model. Our key insight is that the diffusion denoising trajectory exposes variables with high prediction instability across timesteps and identifies regions where local repair yields downstream improvement. GLNS exploits this trajectory-level signal to construct high-impact neighborhoods via a localized, bounded-complexity generative sampling procedure, enabling robust neighborhood selection without retraining. As a result, GLNS transfers effectively across cost regimes and instance scales within SCP. Under tight, equal wall-clock budgets, GLNS consistently outperforms established neural baselines and achieves performance competitive with state-of-the-art MIP solvers. These results demonstrate trajectory-guided generation as a scalable framework for large-scale SCP and suggest potential relevance to other constrained optimization settings.
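To make the trajectory-level signal concrete, the following is a minimal sketch of one plausible reading of it: given per-timestep variable predictions from a discrete diffusion denoiser, measure each variable's instability as its variance across timesteps and select the most unstable variables as the neighborhood to re-optimize. The `trajectory` array, the variance-based instability measure, and `select_neighborhood` are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def select_neighborhood(trajectory: np.ndarray, k: int) -> np.ndarray:
    """Pick the k variables with the highest prediction instability.

    trajectory: (T, n) array of predicted inclusion probabilities for
    n binary set-selection variables over T denoising timesteps,
    produced by a (hypothetical) discrete diffusion denoiser.
    Instability here is per-variable variance across timesteps; the
    paper's actual trajectory signal may be defined differently.
    """
    instability = trajectory.var(axis=0)        # (n,) variance per variable
    return np.argsort(instability)[-k:][::-1]   # indices, most unstable first

# Toy trajectory: 5 denoising steps over 6 variables.
# Variables 2 and 4 oscillate between steps, so they are "unstable"
# and become the neighborhood handed to the repair step.
traj = np.array([
    [0.9, 0.1, 0.2, 0.0, 0.8, 1.0],
    [0.9, 0.1, 0.8, 0.0, 0.2, 1.0],
    [0.9, 0.1, 0.3, 0.1, 0.9, 1.0],
    [0.9, 0.1, 0.7, 0.0, 0.1, 1.0],
    [0.9, 0.1, 0.2, 0.0, 0.8, 1.0],
])
neighborhood = select_neighborhood(traj, k=2)
print(sorted(neighborhood.tolist()))  # → [2, 4]
```

In a full large neighborhood search loop, the selected variables would be freed while all others stay fixed at the incumbent solution, and the induced subproblem would be re-solved by a local repair routine.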