Making Learner Weakness Actionable for Learning from Demonstration with Novice Teachers
Abstract
Learning from demonstration can be an effective way to teach robots task-oriented policies. However, in interactive settings where demonstrations are limited by time or other budgetary constraints, it is challenging to find the demonstrations that fix the learner's remaining errors. This is especially difficult for novice teachers: although they may provide task-valid trajectories, these often fail to meaningfully improve the policy because the teachers lack knowledge of the robot's internal learning mechanisms. This paper introduces CLASP (Collaborative Learning with Anchored State-space Partitions), which summarises the teaching process as a compact map of behavioural regions anchored in the teacher's own demonstrations. The map connects task failures to actionable changes in the demonstrations by indicating, in an intuitive way, what is going wrong. It also enables difficulty-aware training that emphasises regions where learning is failing. Across diverse benchmarks, CLASP improves success by up to 20\% over offline and interactive baselines under the same demonstration budget, improves robustness under distribution shift by 14–20\%, and preserves behavioural diversity.