Vision-Language Pre-training (VLP) has advanced the performance for many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has been largely achieved by scaling up the dataset with noisy image-text pairs collected from the web, which is a suboptimal source of supervision. In this paper, we propose BLIP, a new VLP framework which transfers flexibly to both vision-language understanding and generation tasks. BLIP effectively utilizes the noisy web data by bootstrapping the captions, where a captioner generates synthetic captions and a filter removes the noisy ones. We achieve state-of-the-art results on a wide range of vision-language tasks, such as image-text retrieval (+2.7% in average recall@1), image captioning (+2.8% in CIDEr), and VQA (+1.6% in VQA score). BLIP also demonstrates strong generalization ability when directly transferred to video-language tasks in a zero-shot manner. Code and models are available at https://github.com/salesforce/BLIP.
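The caption-bootstrapping idea can be sketched as a simple filtering loop. This is a minimal illustration, not the actual BLIP implementation: `captioner` and `matcher` are hypothetical stand-ins for BLIP's image-grounded text decoder and image-text matching head, and the fixed threshold is an assumption for the sketch.

```python
def bootstrap_captions(web_pairs, captioner, matcher, threshold=0.5):
    """Clean a noisy web dataset by bootstrapping captions (CapFilt-style sketch).

    web_pairs: iterable of (image, web_caption) pairs collected from the web
    captioner: image -> synthetic caption (stand-in for BLIP's decoder)
    matcher:   (image, caption) -> match score in [0, 1]
               (stand-in for BLIP's image-text matching head)
    """
    cleaned = []
    for image, web_caption in web_pairs:
        # The captioner proposes a synthetic caption for each web image.
        synthetic = captioner(image)
        # The filter keeps only captions it judges to match the image,
        # whether they came from the web or from the captioner.
        for caption in (web_caption, synthetic):
            if matcher(image, caption) >= threshold:
                cleaned.append((image, caption))
    return cleaned
```

The cleaned pairs would then serve as the pre-training corpus in place of the raw noisy web data.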
Author Information
Junnan Li (Salesforce)
Dongxu Li (Salesforce)
Caiming Xiong (Salesforce)
Steven Hoi (Salesforce)
Related Events (a corresponding poster, oral, or spotlight)
- 2022 Spotlight: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
  Wed. Jul 20th, 08:40 -- 08:45 PM, Room Hall F
More from the Same Authors
- 2021: Policy Finetuning: Bridging Sample-Efficient Offline and Online Reinforcement Learning
  Tengyang Xie · Nan Jiang · Huan Wang · Caiming Xiong · Yu Bai
- 2021: Sample-Efficient Learning of Stackelberg Equilibria in General-Sum Games
  Yu Bai · Chi Jin · Huan Wang · Caiming Xiong
- 2023 Poster: BLIP-2: Bootstrapping Language-Image Pre-training with Frozen Image Encoders and Large Language Models
  Junnan Li · Dongxu Li · Silvio Savarese · Steven Hoi
- 2021 Poster: Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
  Stanislaw Jastrzebski · Devansh Arpit · Oliver Astrand · Giancarlo Kerg · Huan Wang · Caiming Xiong · Richard Socher · Kyunghyun Cho · Krzysztof J Geras
- 2021 Poster: How Important is the Train-Validation Split in Meta-Learning?
  Yu Bai · Minshuo Chen · Pan Zhou · Tuo Zhao · Jason Lee · Sham Kakade · Huan Wang · Caiming Xiong
- 2021 Spotlight: Catastrophic Fisher Explosion: Early Phase Fisher Matrix Impacts Generalization
  Stanislaw Jastrzebski · Devansh Arpit · Oliver Astrand · Giancarlo Kerg · Huan Wang · Caiming Xiong · Richard Socher · Kyunghyun Cho · Krzysztof J Geras
- 2021 Spotlight: How Important is the Train-Validation Split in Meta-Learning?
  Yu Bai · Minshuo Chen · Pan Zhou · Tuo Zhao · Jason Lee · Sham Kakade · Huan Wang · Caiming Xiong
- 2021 Poster: Don’t Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification
  Yu Bai · Song Mei · Huan Wang · Caiming Xiong
- 2021 Spotlight: Don’t Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification
  Yu Bai · Song Mei · Huan Wang · Caiming Xiong
- 2020 Poster: Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
  Victor Campos · Alexander Trott · Caiming Xiong · Richard Socher · Xavier Giro-i-Nieto · Jordi Torres
- 2019 Poster: Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting
  Xilai Li · Yingbo Zhou · Tianfu Wu · Richard Socher · Caiming Xiong
- 2019 Poster: Taming MAML: Efficient unbiased meta-reinforcement learning
  Hao Liu · Richard Socher · Caiming Xiong
- 2019 Poster: On the Generalization Gap in Reparameterizable Reinforcement Learning
  Huan Wang · Stephan Zheng · Caiming Xiong · Richard Socher
- 2019 Oral: Learn to Grow: A Continual Structure Learning Framework for Overcoming Catastrophic Forgetting
  Xilai Li · Yingbo Zhou · Tianfu Wu · Richard Socher · Caiming Xiong
- 2019 Oral: On the Generalization Gap in Reparameterizable Reinforcement Learning
  Huan Wang · Stephan Zheng · Caiming Xiong · Richard Socher
- 2019 Oral: Taming MAML: Efficient unbiased meta-reinforcement learning
  Hao Liu · Richard Socher · Caiming Xiong