Cross-Modal Fine-Tuning: Align then Refine
Fine-tuning large-scale pretrained models has led to tremendous progress in well-studied modalities such as vision and NLP. However, similar gains have not been observed in many other modalities due to a lack of relevant pretrained models. In this work, we propose ORCA, a general cross-modal fine-tuning framework that extends the applicability of a single large-scale pretrained model to diverse modalities. ORCA adapts to a target task via an align-then-refine workflow: given the target input, ORCA first learns an embedding network that aligns the embedded feature distribution with the pretraining modality. The pretrained model is then fine-tuned on the embedded data to exploit the knowledge shared across modalities. Through extensive experiments, we show that ORCA obtains state-of-the-art results on 3 benchmarks containing over 60 datasets from 12 modalities, outperforming a wide range of hand-designed, AutoML, general-purpose, and task-specific cross-modal methods. We highlight the importance of data alignment via a series of ablation studies and exemplify ORCA's utility in data-limited regimes.
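To make the align-then-refine workflow concrete, below is a minimal PyTorch sketch of the two stages. It is an illustration under stated assumptions, not the authors' implementation: the TargetEmbedder module, the RBF-kernel MMD alignment loss, and the toy data are hypothetical stand-ins, and ORCA itself measures alignment with an optimal-transport dataset distance (OTDD) and embeds into a real large-scale pretrained transformer rather than the single encoder layer used here.

```python
# Minimal sketch of ORCA's align-then-refine workflow (illustrative only).
# Assumptions: TargetEmbedder, the MMD loss, and all toy data are stand-ins;
# the paper uses an OTDD alignment metric and a real pretrained transformer.
import torch
import torch.nn as nn

class TargetEmbedder(nn.Module):
    """Maps raw target-modality inputs into the pretrained model's token space."""
    def __init__(self, in_dim: int, embed_dim: int):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(in_dim, embed_dim), nn.GELU(),
                                  nn.Linear(embed_dim, embed_dim))
    def forward(self, x):
        return self.proj(x)

def mmd_loss(x, y):
    """Biased RBF-kernel MMD estimate, a stand-in for ORCA's OTDD metric."""
    def k(a, b):
        d = torch.cdist(a, b) ** 2
        return torch.exp(-d / (2 * a.shape[1]))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Setup: stand-in pretrained body + reference features from the pretraining modality.
embed_dim = 64
pretrained_body = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=4,
                                             batch_first=True)  # stand-in body
ref_feats = torch.randn(256, embed_dim)   # pretraining-modality feature sample
target_x = torch.randn(256, 1, 32)        # toy target inputs (length-1 sequences)
target_y = torch.randint(0, 10, (256,))   # toy classification labels

embedder = TargetEmbedder(32, embed_dim)
head = nn.Linear(embed_dim, 10)

# Stage 1 (align): train only the embedder so that embedded target features
# match the reference distribution; the pretrained body stays frozen.
opt = torch.optim.Adam(embedder.parameters(), lr=1e-3)
for _ in range(100):
    z = embedder(target_x).squeeze(1)
    loss = mmd_loss(z, ref_feats)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2 (refine): fine-tune embedder, pretrained body, and task head
# jointly on the downstream objective.
opt = torch.optim.Adam(list(embedder.parameters()) +
                       list(pretrained_body.parameters()) +
                       list(head.parameters()), lr=1e-4)
ce = nn.CrossEntropyLoss()
for _ in range(100):
    z = embedder(target_x)              # (B, 1, D) token sequence
    h = pretrained_body(z).mean(dim=1)  # pooled representation
    loss = ce(head(h), target_y)
    opt.zero_grad(); loss.backward(); opt.step()
```

Note the division of labor: stage 1 updates only the embedder against a frozen reference distribution, so the pretrained model's feature space serves as a fixed alignment target; stage 2 then fine-tunes the whole stack on the downstream loss, which is where the knowledge shared across modalities is exploited.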
Author Information
Junhong Shen (Carnegie Mellon University)
Liam Li (Hewlett Packard Enterprise)
Lucio Dery (Carnegie Mellon University)
Corey Staten (Ohio State University, Columbus)
Mikhail Khodak (Carnegie Mellon University)
Graham Neubig (Carnegie Mellon University)
Ameet Talwalkar (Carnegie Mellon University)
Related Events (a corresponding poster, oral, or spotlight)
- 2023 Oral: Cross-Modal Fine-Tuning: Align then Refine »
  Fri. Jul 28th 01:32 -- 01:40 AM, Room: Ballroom A
More from the Same Authors
- 2021 : Interpretable Machine Learning: Moving From Mythos to Diagnostics »
  Valerie Chen · Jeffrey Li · Joon Kim · Gregory Plumb · Ameet Talwalkar
- 2022 : Meta-Learning Adversarial Bandits »
  Nina Balcan · Keegan Harris · Mikhail Khodak · Steven Wu
- 2022 : SimpleSpot and Evaluating Systemic Errors using Synthetic Image Datasets »
  Gregory Plumb · Nari Johnson · Ángel Alexander Cabrera · Marco Ribeiro · Ameet Talwalkar
- 2022 : Perspectives on Incorporating Expert Feedback into Model Updates »
  Valerie Chen · Umang Bhatt · Hoda Heidari · Adrian Weller · Ameet Talwalkar
- 2023 : Where Does My Model Underperform?: A Human Evaluation of Slice Discovery Algorithms »
  Nari Johnson · Ángel Alexander Cabrera · Gregory Plumb · Ameet Talwalkar
- 2023 : Learning-augmented private algorithms for multiple quantile release »
  Mikhail Khodak · Kareem Amin · Travis Dick · Sergei Vassilvitskii
- 2023 Poster: Learning-augmented private algorithms for multiple quantile release »
  Mikhail Khodak · Kareem Amin · Travis Dick · Sergei Vassilvitskii
- 2023 Poster: PAL: Program-aided Language Models »
  Luyu Gao · Aman Madaan · Shuyan Zhou · Uri Alon · Pengfei Liu · Yiming Yang · Jamie Callan · Graham Neubig
- 2023 Poster: Why do Nearest Neighbor Language Models Work? »
  Frank Xu · Uri Alon · Graham Neubig
- 2022 Poster: Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval »
  Uri Alon · Frank Xu · Junxian He · Sudipta Sengupta · Dan Roth · Graham Neubig
- 2022 Spotlight: Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval »
  Uri Alon · Frank Xu · Junxian He · Sudipta Sengupta · Dan Roth · Graham Neubig
- 2022 Poster: Sanity Simulations for Saliency Methods »
  Joon Kim · Gregory Plumb · Ameet Talwalkar
- 2022 Poster: Symmetric Machine Theory of Mind »
  Melanie Sclar · Graham Neubig · Yonatan Bisk
- 2022 Spotlight: Symmetric Machine Theory of Mind »
  Melanie Sclar · Graham Neubig · Yonatan Bisk
- 2022 Spotlight: Sanity Simulations for Saliency Methods »
  Joon Kim · Gregory Plumb · Ameet Talwalkar
- 2021 : Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing (Q&A) »
  Ameet Talwalkar
- 2021 : Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing »
  Ameet Talwalkar
- 2021 Poster: Examining and Combating Spurious Features under Distribution Shift »
  Chunting Zhou · Xuezhe Ma · Paul Michel · Graham Neubig
- 2021 Poster: Few-shot Language Coordination by Modeling Theory of Mind »
  Hao Zhu · Graham Neubig · Yonatan Bisk
- 2021 Spotlight: Few-shot Language Coordination by Modeling Theory of Mind »
  Hao Zhu · Graham Neubig · Yonatan Bisk
- 2021 Spotlight: Examining and Combating Spurious Features under Distribution Shift »
  Chunting Zhou · Xuezhe Ma · Paul Michel · Graham Neubig
- 2020 : Lightning Talks Session 2 »
  Jichan Chung · Saurav Prakash · Mikhail Khodak · Ravi Rahman · Vaikkunth Mugunthan · xinwei zhang · Hossein Hosseini
- 2020 : 2.7 A Simple Setting for Understanding Neural Architecture Search with Weight-Sharing »
  Mikhail Khodak
- 2020 Poster: Optimizing Data Usage via Differentiable Rewards »
  Xinyi Wang · Hieu Pham · Paul Michel · Antonios Anastasopoulos · Jaime Carbonell · Graham Neubig
- 2020 Poster: FACT: A Diagnostic for Group Fairness Trade-offs »
  Joon Kim · Jiahao Chen · Ameet Talwalkar
- 2020 Poster: XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation »
  Junjie Hu · Sebastian Ruder · Aditya Siddhant · Graham Neubig · Orhan Firat · Melvin Johnson
- 2020 Poster: A Sample Complexity Separation between Non-Convex and Convex Meta-Learning »
  Nikunj Umesh Saunshi · Yi Zhang · Mikhail Khodak · Sanjeev Arora
- 2020 Poster: Explaining Groups of Points in Low-Dimensional Representations »
  Gregory Plumb · Jonathan Terhorst · Sriram Sankararaman · Ameet Talwalkar
- 2019 : ARUBA: Efficient and Adaptive Meta-Learning with Provable Guarantees (Ameet Talwalkar) »
  Ameet Talwalkar
- 2019 Workshop: Adaptive and Multitask Learning: Algorithms & Systems »
  Maruan Al-Shedivat · Anthony Platanios · Otilia Stretcu · Jacob Andreas · Ameet Talwalkar · Rich Caruana · Tom Mitchell · Eric Xing
- 2019 : Panel Discussion »
  Wenpeng Zhang · Charles Sutton · Liam Li · Rachel Thomas · Erin LeDell
- 2019 : Contributed Talk 3: Random Search and Reproducibility for Neural Architecture Search »
  Liam Li
- 2019 : Poster Session 1 (all papers) »
  Matilde Gargiani · Yochai Zur · Chaim Baskin · Evgenii Zheltonozhskii · Liam Li · Ameet Talwalkar · Xuedong Shang · Harkirat Singh Behl · Atilim Gunes Baydin · Ivo Couckuyt · Tom Dhaene · Chieh Lin · Wei Wei · Min Sun · Orchid Majumder · Michele Donini · Yoshihiko Ozaki · Ryan P. Adams · Christian Geißler · Ping Luo · zhanglin peng · · Ruimao Zhang · John Langford · Rich Caruana · Debadeepta Dey · Charles Weill · Xavi Gonzalvo · Scott Yang · Scott Yak · Eugen Hotaj · Vladimir Macko · Mehryar Mohri · Corinna Cortes · Stefan Webb · Jonathan Chen · Martin Jankowiak · Noah Goodman · Aaron Klein · Frank Hutter · Mojan Javaheripi · Mohammad Samragh · Sungbin Lim · Taesup Kim · SUNGWOONG KIM · Michael Volpp · Iddo Drori · Yamuna Krishnamurthy · Kyunghyun Cho · Stanislaw Jastrzebski · Quentin de Laroussilhe · Mingxing Tan · Xiao Ma · Neil Houlsby · Andrea Gesmundo · Zalán Borsos · Krzysztof Maziarz · Felipe Petroski Such · Joel Lehman · Kenneth Stanley · Jeff Clune · Pieter Gijsbers · Joaquin Vanschoren · Felix Mohr · Eyke Hüllermeier · Zheng Xiong · Wenpeng Zhang · Wenwu Zhu · Weijia Shao · Aleksandra Faust · Michal Valko · Michael Y Li · Hugo Jair Escalante · Marcel Wever · Andrey Khorlin · Tara Javidi · Anthony Francis · Saurajit Mukherjee · Jungtaek Kim · Michael McCourt · Saehoon Kim · Tackgeun You · Seungjin Choi · Nicolas Knudde · Alexander Tornede · Ghassen Jerfel
- 2019 Poster: A Theoretical Analysis of Contrastive Unsupervised Representation Learning »
  Nikunj Umesh Saunshi · Orestis Plevrakis · Sanjeev Arora · Mikhail Khodak · Hrishikesh Khandeparkar
- 2019 Oral: A Theoretical Analysis of Contrastive Unsupervised Representation Learning »
  Nikunj Umesh Saunshi · Orestis Plevrakis · Sanjeev Arora · Mikhail Khodak · Hrishikesh Khandeparkar
- 2019 Poster: Provable Guarantees for Gradient-Based Meta-Learning »
  Nina Balcan · Mikhail Khodak · Ameet Talwalkar
- 2019 Oral: Provable Guarantees for Gradient-Based Meta-Learning »
  Nina Balcan · Mikhail Khodak · Ameet Talwalkar