

Poster
in
Workshop: Data-centric Machine Learning Research (DMLR)

How to Improve Imitation Learning Performance with Sub-optimal Supplementary Data?

Ziniu Li · Tian Xu · Zeyu Qin · Yang Yu · Zhiquan Luo


Abstract:

Imitation learning (IL) is a machine learning technique in which a policy is learned from examples provided by an expert. IL algorithms can solve sequential decision-making tasks, but their performance usually suffers when the amount of expert data is limited. To address this challenge, a data-centric framework called (offline) IL with supplementary data has emerged, which additionally utilizes an imperfect dataset collected inexpensively from sub-optimal policies. However, the supplementary data may contain out-of-expert-distribution samples, making it tricky to exploit this data to improve performance. In this paper, we focus on a classic offline IL algorithm, behavioral cloning (BC), and its variants, studying imitation gap bounds in the context of IL with supplementary data. Our theoretical results show that a naive method, which applies BC to the union of the expert and supplementary data, incurs a non-vanishing imitation error; as a result, it may perform worse than BC trained solely on the expert data. To address this issue, we propose an importance-sampling-based approach for selecting in-expert-distribution samples from the supplementary dataset. The proposed method provably eliminates the gap of the naive method. Empirical studies demonstrate that our method outperforms prior state-of-the-art methods on tasks including locomotion control, Atari games, and object recognition.
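To make the selection idea concrete, here is a minimal toy sketch (not the paper's exact algorithm) of filtering a supplementary dataset by an estimated importance weight w(x) = p_expert(x) / p_supp(x). The 1-D Gaussian data, the Gaussian density-ratio estimate, and the threshold of 0 on the log-weight are all illustrative assumptions; the paper works with full state-action trajectories and its own weight estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "states": expert data from N(0, 1); supplementary data is a
# mixture of in-expert-distribution samples (N(0, 1)) and
# out-of-expert-distribution samples (N(5, 1)).
expert = rng.normal(0.0, 1.0, size=500)
supp = np.concatenate([rng.normal(0.0, 1.0, size=300),
                       rng.normal(5.0, 1.0, size=300)])

def gaussian_logpdf(x, mu, sigma):
    """Log-density of N(mu, sigma^2) evaluated at x."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2)

# Crude density-ratio estimate: fit a single Gaussian to each dataset and
# form the log importance weight log w(x) = log p_expert(x) - log p_supp(x).
mu_e, sd_e = expert.mean(), expert.std()
mu_s, sd_s = supp.mean(), supp.std()
log_w = gaussian_logpdf(supp, mu_e, sd_e) - gaussian_logpdf(supp, mu_s, sd_s)

# Keep only supplementary samples with weight above the threshold; these are
# (approximately) the in-expert-distribution samples. BC would then be run
# on the union of the expert data and `selected`.
selected = supp[log_w > 0.0]
print(f"kept {selected.size} of {supp.size} supplementary samples")
```

In this toy run, the filter keeps roughly the half of the supplementary data drawn near the expert mode and discards the far-away samples, which mirrors why the naive union (no filtering) would bias the cloned policy toward out-of-expert-distribution behavior.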
