
Efficient Variance Reduction for Meta-learning
Hansi Yang · James Kwok

Tue Jul 19 08:30 AM -- 08:35 AM (PDT)

Meta-learning aims to extract meta-knowledge from a large number of tasks. However, the stochastic meta-gradient can have large variance, arising both from data sampling (within each task) and from task sampling (from the whole task distribution), which leads to slow convergence. In this paper, we propose a novel approach that integrates variance reduction with first-order meta-learning algorithms such as Reptile. It retains the bilevel formulation, which better captures the structure of meta-learning, but does not require storing the vast number of task-specific parameters needed by general bilevel variance reduction methods. Theoretical results show that it enjoys a fast convergence rate due to variance reduction. Experiments on benchmark few-shot classification datasets demonstrate its effectiveness over state-of-the-art meta-learning algorithms with and without variance reduction.
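To make the idea concrete, the sketch below combines a Reptile-style first-order meta-gradient with a generic recursive (STORM-style) variance-reduced estimator on a toy family of noisy quadratic tasks. This is an illustrative assumption, not the paper's actual algorithm: the task model, the inner loop, and all hyper-parameters (`lr`, `beta`, `a`) are made up for the example. The key point it shows is that the same sampled task and minibatch noise are re-evaluated at both the current and previous meta-parameters, so no per-task parameters need to be stored.

```python
import numpy as np

def inner_adapt(theta, task_center, noise, lr=0.1, steps=5):
    """A few SGD steps on the noisy quadratic loss 0.5 * ||phi - c - eps||^2.

    The noise draws are passed in explicitly so the same meta-gradient can be
    re-evaluated at a different meta-parameter (needed for variance reduction).
    """
    phi = theta.copy()
    for eps in noise:
        phi -= lr * (phi - task_center - eps)
    return phi

def reptile_grad(theta, task_center, noise):
    """Reptile's first-order meta-gradient: theta minus the adapted params."""
    return theta - inner_adapt(theta, task_center, noise)

rng = np.random.default_rng(0)
dim, beta, a = 3, 0.5, 0.3          # beta: meta step size, a: STORM momentum
theta = rng.normal(size=dim)
task_mean = np.ones(dim)            # task optima are centered around all-ones

# Initialise the variance-reduced estimate with one plain meta-gradient.
center = task_mean + 0.1 * rng.normal(size=dim)
noise = 0.5 * rng.normal(size=(5, dim))
d = reptile_grad(theta, center, noise)
theta_prev = theta.copy()
theta = theta - beta * d

for _ in range(200):
    # Sample one task and its minibatch noise, then evaluate the
    # meta-gradient at BOTH the current and previous meta-parameters.
    center = task_mean + 0.1 * rng.normal(size=dim)
    noise = 0.5 * rng.normal(size=(5, dim))
    g_new = reptile_grad(theta, center, noise)
    g_old = reptile_grad(theta_prev, center, noise)
    d = g_new + (1.0 - a) * (d - g_old)   # STORM-style recursive estimator
    theta_prev = theta.copy()
    theta = theta - beta * d

print(np.round(theta, 2))  # converges toward the task mean (all ones)
```

The recursive estimator `d` reuses the previous estimate and corrects it with the gradient difference at a shared sample, which is what reduces the variance from both sampling sources without keeping task-specific parameters around.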

Author Information

Hansi Yang (The Hong Kong University of Science and Technology)
James Kwok (Hong Kong University of Science and Technology)
