Timezone: »

GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Cédric Colas · Olivier Sigaud · Pierre-Yves Oudeyer

Thu Jul 12 09:15 AM -- 12:00 PM (PDT) @ Hall B #174

In continuous action domains, standard deep reinforcement learning algorithms like DDPG suffer from inefficient exploration when facing sparse or deceptive reward problems. Conversely, evolutionary and developmental methods focusing on exploration like Novelty Search, Quality-Diversity or Goal Exploration Processes explore more robustly but are less efficient at fine-tuning policies using gradient-descent. In this paper, we present the GEP-PG approach, taking the best of both worlds by sequentially combining a Goal Exploration Process and two variants of DDPG . We study the learning performance of these components and their combination on a low dimensional deceptive reward problem and on the larger Half-Cheetah benchmark. We show that DDPG fails on the former and that GEP-PG improves over the best DDPG variant in both environments.

Author Information

Cédric Colas (Inria)
Olivier Sigaud (Sorbonne University)
Pierre-Yves Oudeyer (Inria)

Dr. Pierre-Yves Oudeyer is Research Director (DR1) at Inria and head of the Inria and Ensta-ParisTech FLOWERS team (France). Before, he has been a permanent researcher in Sony Computer Science Laboratory for 8 years (1999-2007). After working on computational models of language evolution, he is now working on developmental and social robotics, focusing on sensorimotor development, language acquisition and life-long learning in robots. Strongly inspired by infant development, the mechanisms he studies include artificial curiosity, intrinsic motivation, the role of morphology in learning motor control, human-robot interfaces, joint attention and joint intentional understanding, and imitation learning. He has published a book, more than 80 papers in international journals and conferences, holds 8 patents, gave several invited keynote lectures in international conferences, and received several prizes for his work in developmental robotics and on the origins of language. In particular, he is laureate of the ERC Starting Grant EXPLORERS. He is editor of the IEEE CIS Newsletter on Autonomous Mental Development, and associate editor of IEEE Transactions on Autonomous Mental Development, Frontiers in Neurorobotics, and of the International Journal of Social Robotics. He is also working actively for the diffusion of science towards the general public, through the writing of popular science articles and participation to radio and TV programs as well as science exhibitions. Web:http://www.pyoudeyer.com and http://flowers.inria.fr

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors