Neuro-evolutionary Continual Reinforcement Learning
Abstract
Deploying robots in open‑ended real‑world environments demands continual learning capabilities to adapt to an ever-expanding range of tasks. This requires retaining previously acquired skills while effectively leveraging prior knowledge to learn new ones. Inspired by neuroscience, we propose Neuro-evolutionary Continual Reinforcement Learning (Nevo-CRL). Nevo-CRL maintains a fixed-capacity monolithic policy network, solving tasks by optimizing inter-layer connectivity and neuron parameters. For each new task, Nevo-CRL constructs a mask population that selectively activates the outputs of each hidden layer, thereby forming a task-specific policy population. Upon completing each task, the best-performing mask is stored, and its activated neurons are frozen to prevent catastrophic forgetting. To facilitate knowledge transfer, Nevo-CRL reuses neurons from acquired skills based on semantic similarity between tasks, while dynamically allocating additional neurons for task-specific adaptation. During learning, Nevo-CRL iteratively adjusts masks via importance-based crossover to optimize the policy network's connectivity. To improve neuron utilization, we prune low-activity connections to recycle neurons. Experiments demonstrate that Nevo-CRL significantly outperforms existing continual RL and multi-task learning methods in overall performance, forgetting reduction, and generalization ability.
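The masking-and-freezing mechanism described above can be sketched in a few lines. The following is a minimal, hypothetical illustration (not the paper's implementation): each hidden layer carries per-task binary masks over neuron outputs; when a task finishes, its best mask is stored and the activated neurons are frozen; a new task's mask may inherit neurons from a semantically similar task's stored mask and allocate a few free neurons for adaptation. All class and method names here are assumptions for illustration.

```python
import random

class MaskedLayer:
    """One hidden layer of a fixed-capacity network with per-task binary
    output masks (illustrative sketch, not the actual Nevo-CRL code)."""

    def __init__(self, n_neurons):
        self.n = n_neurons
        self.frozen = set()      # neurons locked in by earlier tasks
        self.task_masks = {}     # task_id -> binary mask (list of 0/1)

    def sample_mask(self, reuse_from=None, n_new=2):
        """Build a candidate mask: optionally reuse neurons from a similar
        task's stored mask, then activate a few free neurons for adaptation."""
        mask = [0] * self.n
        if reuse_from in self.task_masks:
            for i, bit in enumerate(self.task_masks[reuse_from]):
                mask[i] = bit    # inherit activated neurons for transfer
        free = [i for i in range(self.n)
                if i not in self.frozen and mask[i] == 0]
        for i in random.sample(free, min(n_new, len(free))):
            mask[i] = 1          # task-specific new neurons
        return mask

    def commit(self, task_id, mask):
        """Store the best mask for a finished task and freeze its neurons
        to prevent catastrophic forgetting."""
        self.task_masks[task_id] = mask
        self.frozen.update(i for i, bit in enumerate(mask) if bit)

    def trainable(self, mask):
        """Indices whose parameters may still be updated under this mask:
        active but not frozen by a previous task."""
        return [i for i, bit in enumerate(mask) if bit and i not in self.frozen]
```

A typical sequence: train task `t1`, commit its best mask, then let task `t2` reuse `t1`'s neurons (read-only, since they are frozen) while updating only its newly allocated neurons.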