Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Victor Campos · Alexander Trott · Caiming Xiong · Richard Socher · Xavier Giro-i-Nieto · Jordi Torres

Tue Jul 14 11:00 AM -- 11:45 AM & Tue Jul 14 11:00 PM -- 11:45 PM (PDT)

Acquiring abilities in the absence of a task-oriented reward function is at the frontier of reinforcement learning research. This problem has been studied through the lens of empowerment, which draws a connection between option discovery and information theory. Information-theoretic skill discovery methods have garnered much interest from the community, but little research has been conducted on understanding their limitations. Through theoretical analysis and empirical evidence, we show that existing algorithms suffer from a common limitation -- they discover options that provide poor coverage of the state space. In light of this, we propose Explore, Discover and Learn (EDL), an alternative approach to information-theoretic skill discovery. Crucially, EDL optimizes the same information-theoretic objective derived from the empowerment literature, but addresses the optimization problem using different machinery. We perform an extensive evaluation of skill discovery methods on controlled environments and show that EDL offers significant advantages, such as overcoming the coverage problem, reducing the dependence of learned skills on the initial state, and allowing the user to define a prior over which behaviors should be learned.
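
For readers unfamiliar with the setting, the information-theoretic objective mentioned above is commonly written as the mutual information between states and latent skill variables. The sketch below uses assumed notation (S, Z, H, I) that does not appear in the abstract itself; it is the standard identity, not a statement of the authors' exact formulation.

```latex
% Assumed notation (not stated in the abstract):
%   S : state random variable, Z : latent skill (option) variable,
%   H : Shannon entropy,       I : mutual information.
\begin{align}
I(S;Z) &= H(Z) - H(Z \mid S) \\
       &= H(S) - H(S \mid Z)
\end{align}
```

The two lines are equal by the symmetry of mutual information. Which decomposition a particular method works with, and how each term is estimated, is not specified in the abstract.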

Author Information

Victor Campos (Barcelona Supercomputing Center)
Alexander Trott (Salesforce Research)
Caiming Xiong (Salesforce)
Richard Socher (Salesforce)
Xavier Giro-i-Nieto (Universitat Politecnica de Catalunya)

Xavier Giro-i-Nieto is an associate professor at the Universitat Politecnica de Catalunya (UPC) in Barcelona, a member of the Intelligent Data Science and Artificial Intelligence Research Center (IDEAI-UPC) and the Image Processing Group (GPI), and a visiting researcher at the Barcelona Supercomputing Center (BSC). He graduated in Telecommunications Engineering at ETSETB (UPC) in 2000, after completing his master's thesis on image compression at the Vrije Universiteit in Brussels (VUB) with Prof. Peter Schelkens. After working for one year at Sony Brussels, he started a PhD on computer vision, supervised by Prof. Ferran Marqués. In parallel, he designed and taught courses at the ESEIAAT (video content delivery) and ETSETB (deep learning) schools at UPC, as well as in the Master in Computer Vision of Barcelona (video analysis). Between 2008 and 2014 he made multiple visits to the Digital Video and MultiMedia laboratory at Columbia University in New York, directed by Prof. Shih-Fu Chang, with whom he continues to collaborate. He also works closely with the Insight Centre for Data Analytics at Dublin City University, as well as with his industrial partners at Vilynx, Mediapro, and Crisalix. He serves as associate editor at IEEE Transactions on Multimedia and reviews for top-tier conferences in machine learning (NeurIPS, ICML), computer vision (CVPR, ECCV, ICCV) and multimedia (ACMMM, ICMR).

Jordi Torres (Barcelona Supercomputing Center)

More from the Same Authors