Mind the budget: Accelerating Deep Reinforcement Learning using Early Exit Neural Networks
Abstract
Early exit neural networks, which adapt computation to input complexity, have proven effective in supervised learning but remain largely unexplored in deep reinforcement learning (DRL). In this paper, we propose the Budgeted EXit Actor (BEXA), a novel actor-critic architecture that integrates early exit branches into the actor network. These branches are trained via the underlying DRL method and use a constrained value-based criterion to decide when to exit, allowing the policy to dynamically adjust its computation. BEXA is general, easy to tune, and compatible with any off-policy actor-critic method. We evaluate BEXA with multiple DRL methods, including SAC and TD3, on a suite of MuJoCo tasks. Our results demonstrate a substantial improvement in inference efficiency with minimal or no loss in performance. These findings highlight early exits as a promising direction for improving computational efficiency in DRL.
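To illustrate the idea of an actor with early exit branches gated by a value-based criterion, the following is a minimal sketch. The class name, network shapes, and the specific exit rule (exit at the first branch whose action the critic scores above a threshold) are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)


class EarlyExitActor:
    """Hypothetical sketch of an actor with per-block early exit heads.

    Each hidden block has its own action head; the forward pass stops at
    the first branch whose action a caller-supplied critic scores above a
    threshold, trading depth (compute) for estimated return. This exit
    criterion is an assumption for illustration only.
    """

    def __init__(self, obs_dim, act_dim, hidden=32, n_blocks=3, threshold=0.05):
        # Random untrained weights; in practice these are trained by the
        # underlying off-policy actor-critic method.
        self.blocks = [
            (rng.normal(0, 0.1, (obs_dim if i == 0 else hidden, hidden)),
             np.zeros(hidden))
            for i in range(n_blocks)
        ]
        self.heads = [rng.normal(0, 0.1, (hidden, act_dim)) for _ in range(n_blocks)]
        self.threshold = threshold

    def act(self, obs, value_fn):
        """Return (action, exit_index), stopping at the first good-enough exit."""
        h = obs
        for i, ((w, b), head) in enumerate(zip(self.blocks, self.heads)):
            h = np.tanh(h @ w + b)          # hidden block
            a = np.tanh(h @ head)           # this branch's candidate action
            # Exit early if the critic deems this branch's action good enough.
            if value_fn(obs, a) >= self.threshold:
                return a, i
        # Otherwise fall through to the deepest (final) exit.
        return a, len(self.blocks) - 1
```

A policy built this way spends the full network depth only on states where shallow branches produce low-value actions, which is the source of the inference savings.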