Spotlight in Workshop: Understanding and Improving Generalization in Deep Learning
Zero-Shot Learning from scratch: leveraging local compositional representations
Authors: Tristan Sylvain, Linda Petrini and Devon Hjelm
Abstract: Zero-shot classification is a generalization task in which no instance of the target classes is seen during training. To allow for test-time transfer, each class is annotated with semantic information, commonly in the form of attributes or text descriptions. While classical zero-shot learning does not specify how this problem should be solved, the most successful approaches rely on features extracted from encoders pre-trained on large datasets, most commonly ImageNet. This reliance raises important questions that can distract researchers from more fundamental questions about representation learning and generalization. For instance, one should wonder to what extent these methods actually learn representations that are robust with respect to the task, rather than simply exploiting information stored in the encoder. To remove these distractors, we propose a more challenging setting: Zero-Shot Learning from scratch, which effectively forbids the use of encoders fine-tuned on other datasets. Our analysis of this setting highlights the importance of local information and compositional representations.
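
As a rough illustration of the zero-shot setup the abstract describes (not the authors' method), the sketch below fits a map from features to attributes on seen classes only, then labels samples from unseen classes by matching predicted attributes to each class's attribute annotation. Everything in it is an assumption made for illustration: the toy data, the linear "encoder", the ridge regressor, and the helper names `fit_attribute_regressor`, `zero_shot_predict`, and `make_toy_data`.

```python
import numpy as np


def fit_attribute_regressor(features, attributes, ridge=1.0):
    """Ridge regression from features to attribute vectors (seen classes only)."""
    dim = features.shape[1]
    gram = features.T @ features + ridge * np.eye(dim)
    return np.linalg.solve(gram, features.T @ attributes)


def zero_shot_predict(features, weights, class_attributes):
    """Pick, per sample, the class whose attribute vector has highest cosine similarity."""
    predicted = features @ weights
    predicted = predicted / (np.linalg.norm(predicted, axis=1, keepdims=True) + 1e-12)
    prototypes = class_attributes / (
        np.linalg.norm(class_attributes, axis=1, keepdims=True) + 1e-12
    )
    return (predicted @ prototypes.T).argmax(axis=1)


def make_toy_data(class_attributes, mixing, per_class, rng, noise=0.01):
    """Hypothetical data: features are a noisy linear function of class attributes."""
    labels = np.repeat(np.arange(len(class_attributes)), per_class)
    clean = class_attributes[labels] @ mixing
    return clean + noise * rng.standard_normal(clean.shape), labels


rng = np.random.default_rng(0)

# Attribute annotations (made up): rows are classes, columns are binary attributes.
seen_attrs = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]], dtype=float)
unseen_attrs = np.array([[1, 0, 1], [0, 1, 1]], dtype=float)

# Stand-in for an encoder: a fixed random linear map from attributes to 8-d features.
mixing = rng.standard_normal((3, 8))

train_x, train_y = make_toy_data(seen_attrs, mixing, 50, rng)
test_x, test_y = make_toy_data(unseen_attrs, mixing, 50, rng)

# Training never touches the unseen classes or their attribute vectors.
weights = fit_attribute_regressor(train_x, seen_attrs[train_y])

# At test time, only the unseen classes' attribute annotations are used.
predictions = zero_shot_predict(test_x, weights, unseen_attrs)
accuracy = float((predictions == test_y).mean())
print(f"zero-shot accuracy on unseen classes: {accuracy:.2f}")
```

The toy works because unseen classes are new combinations of attributes already observed during training, which loosely echoes the compositional theme of the abstract. A from-scratch setting in the abstract's sense would replace the fixed linear map with an encoder trained only on the seen classes.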