LeakGFN: Robust Molecular Generation in Generative Flow Networks via Flow Decomposition
Abstract
Generative Flow Networks (GFlowNets) have emerged as a powerful framework for molecular generation, sampling diverse candidates with probability proportional to a reward function. However, the vastness of chemical space necessitates truncating trajectory length, forcing models to treat incomplete molecular fragments as terminal states alongside valid molecules. This conflation distorts the learned distribution by allocating probability mass to chemically meaningless states. We propose LeakGFN, a dual-head architecture that decomposes flow into two components: a chemical head modeling flow over the full chemical space, and a valid head estimating the fraction of flow that reaches valid molecules within the truncation boundary. Through this decomposition, the valid head implicitly learns molecular reachability without explicit supervision. We prove that, under mild assumptions, LeakGFN recovers the correct distribution over accessible molecules. Experiments on five molecular optimization tasks demonstrate consistent improvements over flow-matching baselines, with state-of-the-art performance on four of the five tasks. LeakGFN integrates into existing frameworks as a plug-and-play module, improving performance on both pocket-conditioned and multi-objective generation tasks.