Quantum Robust Inner Minimization for Reinforcement Learning with Quadratic Speed-Up in Query Complexity
Hyun Lee ⋅ Joongheon Kim ⋅ Sung Whan Yoon
Abstract
Robust reinforcement learning (RRL) aims to tackle unexpected environmental changes by optimizing policies against the worst case. However, RRL remains impractical due to the cost of the Max-Min optimization, which suffers exhaustive query complexity for finding the worst case (dubbed 'Min') within the environmental uncertainty set $\mathcal{U}$, i.e., $\mathcal{O}(|\mathcal{U}|)$. Viewing this from a quantum perspective, we raise a pivotal question: *If we can query the environment with quantum superpositions, is it possible to accelerate the Max-Min optimization of RRL?* Our answer is 'Yes'. Our method, called quantum robust inner minimization (QRIM), encodes the uncertainty set with quantum superposition and amplifies low-return cases, thus enabling RL to solve the robust (i.e., worst-case) Bellman equation. Importantly, QRIM achieves a quadratic speed-up in query complexity, i.e., $\mathcal{O}(\sqrt{|\mathcal{U}|})$, without altering the outer RL pipeline. Validated from classical simulation to real quantum hardware execution, QRIM learns more robust policies with quadratically fewer queries than classical RL.
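The quadratic speed-up stated above rests on Grover-style amplitude amplification: prepare a uniform superposition over the uncertainty set and repeatedly amplify the amplitude of low-return cases, so that roughly $(\pi/4)\sqrt{|\mathcal{U}|}$ oracle queries suffice where a classical scan needs $|\mathcal{U}|$. The following is a minimal classical statevector sketch of this primitive, not the paper's actual QRIM construction; the per-environment `returns` list and the single-worst-case oracle are illustrative assumptions.

```python
import math

def amplify_worst_case(returns):
    """Grover-style search for the minimum-return environment index.

    Simulates amplitude amplification over an uncertainty set of size N
    with ~(pi/4)*sqrt(N) oracle iterations, versus N classical queries.
    """
    n = len(returns)
    worst = min(range(n), key=lambda i: returns[i])  # index the oracle marks
    amp = [1.0 / math.sqrt(n)] * n                   # uniform superposition
    iters = max(1, round(math.pi / 4 * math.sqrt(n)))
    for _ in range(iters):
        amp[worst] = -amp[worst]                     # oracle: flip phase of the worst case
        mean = sum(amp) / n
        amp = [2.0 * mean - a for a in amp]          # diffusion: inversion about the mean
    probs = [a * a for a in amp]                     # measurement probabilities
    return max(range(n), key=lambda i: probs[i]), probs

# Hypothetical returns for 8 candidate environments in the uncertainty set.
returns = [0.9, 0.7, 0.95, 0.2, 0.8, 0.85, 0.6, 0.75]
idx, probs = amplify_worst_case(returns)
print(idx, round(probs[idx], 3))  # the worst case (index 3) dominates the measurement
```

After only two Grover iterations for $N = 8$, the worst-case index is measured with probability above 0.94, illustrating how the inner 'Min' can be located with $\mathcal{O}(\sqrt{|\mathcal{U}|})$ queries.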