UOTA: Unsupervised Open-Set Task Adaptation Using a Vision-Language Foundation Model
Youngjo Min · Kwangrok Ryoo · Bumsoo Kim · Taesup Kim
Event URL: https://openreview.net/forum?id=aXOLXSu2B7

Human-labeled data is essential for deep learning models, but annotation costs hinder their use in real-world applications. Recently, however, models such as CLIP have shown remarkable zero-shot capabilities through vision-language pre-training. Although fine-tuning with human-labeled data can further improve the performance of zero-shot models, it is often impractical in low-budget real-world scenarios. In this paper, we propose an alternative algorithm, dubbed Unsupervised Open-Set Task Adaptation (UOTA), which fully leverages the large amounts of open-set unlabeled data collected in the wild to improve pre-trained zero-shot models in real-world scenarios.

Author Information

Youngjo Min (Seoul National University)
Kwangrok Ryoo (LG AI Research)
Bumsoo Kim (LG AI Research)
Taesup Kim (Seoul National University)
