Learning Interpretable Options by Identifying Reward Diffusion Bottlenecks in Reinforcement Learning
Abstract
Bottleneck states, which connect distinct regions of the state space, provide a principled and interpretable basis for constructing temporal abstractions in Hierarchical Reinforcement Learning (HRL). However, existing bottleneck identification methods primarily rely on topological analysis of the state-transition graph, limiting their scalability to high-dimensional or continuous domains. To address this challenge, we introduce Value Power Strength (VPS), a value function-based metric inspired by the analogy between the Bellman equation and Kirchhoff’s current law, which quantifies the bottleneck property of states via the diffusion of reward in Markov Decision Processes (MDPs). VPS is estimated efficiently from value functions learned under random reward signals and captures reward diffusion bottlenecks in both discrete and continuous state spaces. Leveraging VPS, we design options that guide agents toward or away from bottleneck regions. Experimental results on classic tabular domains, MiniGrid, and Atari 2600 games demonstrate that the VPS-based framework discovers semantically meaningful subgoals and substantially improves exploration efficiency.
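The abstract does not give the VPS formula itself, but the Bellman–Kirchhoff analogy it describes can be illustrated concretely. Below is a minimal Python sketch, assuming a small tabular MDP with a known transition matrix `P` under a fixed (e.g. uniform random) policy: value functions are solved exactly for many random reward vectors, and each state is scored by a hypothetical "current" proxy, the transition-weighted absolute value differences it carries. The function names and the scoring rule are illustrative assumptions, not the paper's actual VPS definition.

```python
import numpy as np

def value_functions_for_random_rewards(P, gamma=0.95, n_rewards=64, seed=0):
    """Solve (I - gamma * P) V = r exactly for many random reward vectors r.

    P : (S, S) transition matrix of the MDP under a fixed policy.
    Returns an (n_rewards, S) array with one value function per reward.
    """
    rng = np.random.default_rng(seed)
    S = P.shape[0]
    A = np.eye(S) - gamma * P                 # Bellman equation in matrix form
    rewards = rng.standard_normal((n_rewards, S))
    return np.linalg.solve(A, rewards.T).T    # one V per random reward

def bottleneck_score(P, gamma=0.95, n_rewards=64, seed=0):
    """Hypothetical proxy for VPS: by analogy with Kirchhoff's current law,
    treat P(s'|s) * |V(s) - V(s')| as the 'current' flowing across an edge,
    and score each state by the mean total current it carries, averaged
    over many random reward functions."""
    V = value_functions_for_random_rewards(P, gamma, n_rewards, seed)
    # diffs[k, s, s'] = |V_k(s) - V_k(s')|
    diffs = np.abs(V[:, :, None] - V[:, None, :])
    # flow[k, s] = sum_{s'} P(s'|s) * |V_k(s) - V_k(s')|
    flow = (diffs * P[None]).sum(axis=2)
    return flow.mean(axis=0)                  # high score = candidate bottleneck
```

On a two-room gridworld with a single connecting doorway, one would expect such a score to peak at the doorway state, since reward injected in one room can only diffuse to the other through that state; this matches the intuition the abstract appeals to, though the paper's exact estimator may differ.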