Poster
$\pi$-Tuning: Transferring Multimodal Foundation Models with Optimal Multi-task Interpolation
CHENGYUE WU · Teng Wang · Yixiao Ge · Zeyu Lu · Ruisong Zhou · Ying Shan · Ping Luo

Thu Jul 27 04:30 PM -- 06:00 PM (PDT) @ Exhibit Hall 1 #316
Foundation models have achieved great advances in multi-task learning with a unified interface for unimodal and multimodal tasks. However, the potential of such multi-task learners has not been exploited during transfer learning. In this work, we present a universal parameter-efficient transfer learning method, termed Predict-Interpolate Tuning ($\pi$-Tuning), for vision, language, and vision-language tasks. It aggregates the parameters of lightweight task-specific experts learned from similar tasks to aid the target downstream task. The task similarities are predicted in a unified modality-independent space, yielding a scalable graph that captures task relationships. $\pi$-Tuning has several appealing benefits. First, it flexibly explores both intra- and inter-modal transferability between similar tasks to improve the accuracy and robustness of transfer learning, especially in data-scarce scenarios. Second, it offers a systematic solution for transfer learning through multi-task prediction followed by interpolation, and it is compatible with diverse types of parameter-efficient experts, such as prompts and adapters. Third, an extensive study of task-level mutual benefits on 14 unimodal and 6 multimodal datasets shows that $\pi$-Tuning surpasses fine-tuning and other parameter-efficient transfer learning methods in both full-shot and low-shot regimes. The task graph also enables an in-depth interpretable analysis of task transferability across modalities. The code will be available at https://github.com/TencentARC/pi-Tuning.
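
The predict-then-interpolate idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how task similarity is measured or how the interpolation weights are obtained, so the cosine similarity, the fixed weights, and the names `cosine_similarity`, `top_k_similar_tasks`, and `interpolate_experts` are all assumptions made for illustration. Expert parameters are represented as flat lists of floats.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two task embeddings (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def top_k_similar_tasks(target_embedding, task_embeddings, k):
    """Predict step: rank candidate tasks by similarity to the target task."""
    scores = {
        name: cosine_similarity(target_embedding, emb)
        for name, emb in task_embeddings.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

def interpolate_experts(experts, weights):
    """Interpolate step: weighted average of expert parameter vectors.

    Weights are normalized to sum to 1 (hypothetical choice; they could
    instead be learned on the target task).
    """
    total = sum(weights)
    normalized = [w / total for w in weights]
    size = len(experts[0])
    return [
        sum(w * expert[i] for w, expert in zip(normalized, experts))
        for i in range(size)
    ]

# Toy example: made-up embeddings and expert parameters for three source tasks.
task_embeddings = {"a": [1.0, 0.1], "b": [0.0, 1.0], "c": [0.9, 0.2]}
expert_params = {"a": [1.0, 2.0], "b": [9.0, 9.0], "c": [3.0, 4.0]}

selected = top_k_similar_tasks([1.0, 0.0], task_embeddings, k=2)
merged = interpolate_experts([expert_params[t] for t in selected], [1.0, 3.0])
```

In this toy setup the two tasks closest to the target are selected, and their expert parameters are merged into a single parameter vector that could initialize or augment the target task's expert.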

Author Information

CHENGYUE WU (The University of Hong Kong)
Teng Wang (Southern University of Science and Technology)
Yixiao Ge (Tencent)
Zeyu Lu (Shanghai Jiao Tong University)
Ruisong Zhou (Fudan University)
Ying Shan (Center of Applied Research, Tencent PCG)
Ping Luo (The University of Hong Kong)
