InfraRL: A Benchmark for Constrained Resource Allocation in Large-Scale Infrastructure Asset Management
Abstract
Optimizing maintenance strategies for large-scale infrastructure is a critical sequential decision-making problem, exemplified by the high-stakes domain of bridge management. While Reinforcement Learning (RL) offers a theoretical framework for such problems, practical deployment necessitates offline constrained RL: learning policies solely from static historical datasets under strict budgetary limits, without dangerous on-policy exploration. However, current research is hindered by benchmarks that fail to capture the confluence of distributional shift and hard constraints typical of real-world assets. We introduce InfraRL, a high-fidelity benchmark that uses bridge maintenance as a rigorous testbed for general infrastructure asset management challenges. Constructed from the U.S. National Bridge Inventory, InfraRL defines a challenging offline task for optimizing maintenance strategies under hard budgetary constraints. We benchmark a diverse suite of baselines, ranging from industry-standard heuristics to state-of-the-art single-agent and multi-agent offline RL algorithms. Through a comprehensive evaluation protocol, we analyze performance across structural utility, constraint adherence, and behavioral fidelity, revealing critical trade-offs between safety and long-term efficiency. Our code and data are available at https://anonymous.4open.science/r/ICML-6656