Poster
$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
Chengyue Wu · Teng Wang · Yixiao Ge · Zeyu Lu · Ruisong Zhou · Ying Shan · Ping Luo
Foundation models have achieved great advances in multi-task learning with a unified interface of unimodal and multimodal tasks. However, the potential of such multi-task learners has not been exploited during transfer learning. In this work, we present a universal parameter-efficient transfer learning method, termed Predict-Interpolate Tuning ($\pi$-Tuning), for vision, language, and vision-language tasks. It aggregates the parameters of lightweight task-specific experts learned from similar tasks to aid the target downstream task. The task similarities are predicted in a unified modality-independent space, yielding a scalable graph that demonstrates task relationships. $\pi$-Tuning has several appealing benefits. First, it flexibly explores both intra- and inter-modal transferability between similar tasks to improve the accuracy and robustness of transfer learning, especially in data-scarce scenarios. Second, it offers a systematic solution for transfer learning with multi-task prediction-and-then-interpolation, and is compatible with diverse types of parameter-efficient experts, such as prompts and adapters. Third, an extensive study of task-level mutual benefits on 14 unimodal and 6 multimodal datasets shows that $\pi$-Tuning surpasses fine-tuning and other parameter-efficient transfer learning methods in both full-shot and low-shot regimes. The task graph also enables an in-depth, interpretable analysis of task transferability across modalities. The code will be available at https://github.com/TencentARC/pi-Tuning.
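At a high level, the interpolation step described above is a weighted average of expert parameters: the target task's expert is combined with the experts of its most similar tasks, with coefficients derived from the predicted similarities. Below is a minimal sketch of that idea, assuming each lightweight expert (e.g., an adapter) is stored as a PyTorch state_dict and that similarity scores have already been predicted; interpolate_experts and all variable names are illustrative and not the released API.

import torch

def interpolate_experts(experts, weights):
    # Return a parameter-wise weighted average of expert state_dicts.
    merged = {}
    for name in experts[0]:
        merged[name] = sum(w * e[name] for w, e in zip(weights, experts))
    return merged

# Toy example: tiny tensors standing in for real expert modules.
expert_target = {"w": torch.tensor([1.0, 0.0])}
expert_a = {"w": torch.tensor([0.5, 0.5])}
expert_b = {"w": torch.tensor([0.0, 1.0])}

# Predicted similarities of the target task with itself and two auxiliary
# tasks (illustrative numbers), normalized into interpolation coefficients.
sims = torch.tensor([1.0, 0.8, 0.6])
weights = torch.softmax(sims, dim=0).tolist()

merged = interpolate_experts([expert_target, expert_a, expert_b], weights)
# The merged expert would then be loaded back into the lightweight module
# and further tuned on the target task's training data.

In the actual method the coefficients are not hand-set as in this toy example; the "optimal" interpolation of the title refers to coefficients chosen to benefit the target task, and the similarities come from task embeddings in the unified modality-independent space mentioned above.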
Author Information
Chengyue Wu (The University of Hong Kong)
Teng Wang (Southern University of Science and Technology)
Yixiao Ge (Tencent)
Zeyu Lu (Shanghai Jiao Tong University)
Ruisong Zhou (Fudan University)
Ying Shan (Center of Applied Research, Tencent PCG)
Ping Luo (The University of Hong Kong)
More from the Same Authors
- 2023 Poster: DeSRA: Detect and Delete the Artifacts of GAN-based Real-World Super-Resolution Models
  Liangbin Xie · Xintao Wang · Xiangyu Chen · Gen Li · Ying Shan · Jiantao Zhou · Chao Dong
- 2023 Poster: AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
  Zhixuan Liang · Yao Mu · Mingyu Ding · Fei Ni · Masayoshi Tomizuka · Ping Luo
- 2023 Poster: ChiPFormer: Transferable Chip Placement via Offline Decision Transformer
  Yao Lai · Jinxin Liu · Zhentao Tang · Bin Wang · Jianye Hao · Ping Luo
- 2023 Oral: AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
  Zhixuan Liang · Yao Mu · Mingyu Ding · Fei Ni · Masayoshi Tomizuka · Ping Luo
- 2022 Poster: Flow-based Recurrent Belief State Learning for POMDPs
  Xiaoyu Chen · Yao Mu · Ping Luo · Shengbo Li · Jianyu Chen
- 2022 Spotlight: Flow-based Recurrent Belief State Learning for POMDPs
  Xiaoyu Chen · Yao Mu · Ping Luo · Shengbo Li · Jianyu Chen
- 2022 Poster: VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
  Teng Wang · Wenhao Jiang · Zhichao Lu · Feng Zheng · Ran Cheng · Chengguo Yin · Ping Luo
- 2022 Spotlight: VLMixer: Unpaired Vision-Language Pre-training via Cross-Modal CutMix
  Teng Wang · Wenhao Jiang · Zhichao Lu · Feng Zheng · Ran Cheng · Chengguo Yin · Ping Luo
- 2017 Poster: Learning Deep Architectures via Generalized Whitened Neural Networks
  Ping Luo
- 2017 Talk: Learning Deep Architectures via Generalized Whitened Neural Networks
  Ping Luo