COLLIE: Guiding Skill Discovery in Semantically Coherent Latent Space
Abstract
Unsupervised skill discovery (USD) aims to learn diverse behaviors without reward functions, but its uniform exploration often yields task-irrelevant or hazardous behaviors. Guided skill discovery (GSD) addresses this issue by incorporating human intent to focus exploration on meaningful regions. However, existing GSD methods typically require training additional guidance models from scratch, which can be ineffective when human feedback is sparse. To address this limitation, we propose COLLIE, a GSD framework that uses sparse human feedback effectively by constructing a semantically coherent skill latent space. This semantic coherence allows the guidance signal to be constructed without training, eliminating the need for any model training beyond skill learning. Furthermore, because this property is derived from dense unsupervised data, the latent space is well structured, which keeps the guidance reliable even with sparse human feedback. Theoretical analysis justifies the effectiveness of our training-free guidance signal, while experiments across diverse state-based and pixel-based tasks show that COLLIE learns diverse, human-aligned skills, avoids hazardous behaviors, and achieves superior downstream performance with minimal human feedback.