Compositional Generative Modeling from Decentralized Data
Abstract
The physical world is fundamentally compositional, yet empirical data are often fragmented across decentralized silos that cannot be aggregated due to privacy, legal, or economic constraints. Such scenarios pose a core challenge for generative modeling: learning models that collectively cover the union of these sources while enabling compositional generalization when the factors required for composition are distributed across isolated data sources. We introduce Decentralized Compositional Flow Matching (DCFM), a framework for learning generative models from decentralized private data without exchanging raw samples. DCFM enforces structural constraints that induce conditional independence across the global set of generative factors. As a result, DCFM allows novel combinations to emerge through interactions across peers, even when no single data source contains sufficient information to support composition on its own. Empirically, DCFM substantially outperforms federated learning and mixture-of-experts baselines across conditional image generation, robotic spatial planning, and medical attribute co-occurrence modeling.
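To make the compositional mechanism concrete, the following is a minimal illustrative sketch, not the paper's implementation: each peer holds a flow-matching velocity field conditioned on its own local generative factor, and a novel factor combination is sampled by integrating the sum of the peers' velocity fields. The toy linear fields, factor assignments, and all function names here are hypothetical stand-ins for learned models.

```python
import numpy as np

def make_peer_field(target_dim, target_value):
    """Toy velocity field steering one coordinate toward target_value.

    Stands in for a peer's learned conditional flow-matching model on its
    local factor: for a straight-line flow from x0 ~ N(0, I) to a target,
    the conditional velocity is (target - x_t) / (1 - t).
    """
    def v(x, t):
        u = np.zeros_like(x)
        u[target_dim] = (target_value - x[target_dim]) / max(1.0 - t, 1e-3)
        return u
    return v

def compose_and_integrate(fields, dim=2, steps=100, seed=0):
    """Euler-integrate the summed peer velocity fields from t=0 to t=1.

    Summing the fields composes the peers' factors without any peer
    exchanging raw samples, mirroring the conditional-independence setup.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)  # sample from the shared base distribution
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * sum(v(x, t) for v in fields)
    return x

# Peer A controls factor 0 (e.g. shape), peer B controls factor 1 (e.g. color);
# no single peer's data covers the joint combination (3.0, -2.0).
peers = [make_peer_field(0, 3.0), make_peer_field(1, -2.0)]
x1 = compose_and_integrate(peers)
```

Because each toy field acts on a disjoint coordinate, the composed ODE drives the sample to the novel joint combination even though neither field alone encodes it; in the paper's setting, the analogous independence is induced by the structural constraints rather than assumed.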