Path-Coupled Bellman Flows for Distributional Reinforcement Learning
Boyang Xu ⋅ Qing Zou ⋅ Siqin Yang ⋅ Hao Yan
Abstract
Distributional RL models the full return distribution, but common categorical/quantile approaches rely on projection and independently sampled Bellman targets, which ignore the Bellman operator’s affine transport structure and yield high-variance learning signals. We introduce Path-Coupled Bellman Flows, a flow-matching framework that shares base noise to couple the generative trajectories of consecutive states, inducing a geometric Bellman scaling law between their velocity fields. This geometry motivates a $\lambda$-family of Bellman-flow objectives that functions as a control variate, reducing variance while retaining the same Bellman-consistent fixed point. Across toy diagnostics and offline RL benchmarks (OGBench, D4RL), our method improves training stability and achieves competitive or improved performance relative to prior distributional baselines.
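The "geometric Bellman scaling law" is only named in the abstract; as a minimal illustrative sketch (not necessarily the paper's exact construction), assume straight-line flow-matching paths, a shared base-noise sample $\epsilon$, and the one-step distributional Bellman target $Z(s) \stackrel{d}{=} r + \gamma Z(s')$. Coupling the current-state path to the next-state path through the affine Bellman map then ties their velocities together:
\begin{align*}
  x_t^{s'} &= (1 - t)\,\epsilon + t\,Z(s') && \text{next-state path, shared base noise } \epsilon,\\
  x_t^{s}  &= r + \gamma\, x_t^{s'} && \text{current-state path via the affine Bellman map},\\
  \tfrac{d}{dt} x_t^{s} &= \gamma\,\tfrac{d}{dt} x_t^{s'} && \Longrightarrow\quad v^{s}\!\bigl(r + \gamma x,\, t\bigr) = \gamma\, v^{s'}(x, t).
\end{align*}
Under such a coupling the current-state velocity field would be a $\gamma$-scaled, $r$-shifted copy of the next-state field, the kind of shared structure a $\lambda$-family of Bellman-flow objectives could exploit as a control variate.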