Poster in Workshop: ICML 2024 Workshop on Foundation Models in the Wild
When Do Language Models Need to Be Large?
Zhixun Chen · Yali Du · David Mguni
Keywords: [ large language models ] [ optimal switching ] [ budget ]
Many leading language models (LMs) demand intensive computational resources during both training and inference. This raises the challenge of reducing deployment costs and enabling faster execution in decision-making tasks, among others. We introduce a novel plug & play LM framework named Language Optimising Network Distribution (LONDI). LONDI learns to selectively employ large LMs only where complex decision-making and reasoning are required, while using low-resource LMs (i.e., LMs that require less GPU usage but may not be able to solve the problem alone) everywhere else. LONDI consists of a system of two (off-)policy networks, an LM, a large LM (LLM), and a reinforcement learning module that uses switching controls to quickly learn in which system states to call the LLM. We then introduce a variant of LONDI that maintains budget constraints on LLM calls and hence on its resource usage. We test LONDI's performance on a range of tasks in ScienceWorld and BabyAI-Text and demonstrate that LONDI can solve tasks only solvable by resource-intensive LLMs while reducing GPU usage by up to 30%.
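To make the switching-control idea concrete, here is a minimal sketch of a per-state switch with a cap on LLM calls. It is not the authors' implementation; the names `SwitchController`, `switch_policy`, `small_lm`, and `large_llm` are illustrative assumptions, and the threshold and budget handling are simplified for exposition.

```python
# Minimal sketch of the per-state switching idea (not the authors' code).
# `small_lm` and `large_llm` are assumed to be callables mapping a prompt to text.

import torch


class SwitchController:
    """Decides, per state, whether to answer with the small LM or call the large LLM.

    A learned switching policy scores each state; the LLM is invoked only when the
    score exceeds a threshold and the remaining call budget allows it.
    """

    def __init__(self, switch_policy: torch.nn.Module, llm_budget: int):
        self.switch_policy = switch_policy  # maps a state embedding to a scalar call score
        self.llm_budget = llm_budget        # maximum number of LLM calls allowed

    def act(self, state_embedding: torch.Tensor, small_lm, large_llm, prompt: str) -> str:
        # Score how much the current state is expected to benefit from the large model.
        with torch.no_grad():
            call_score = torch.sigmoid(self.switch_policy(state_embedding)).item()

        # Call the LLM only in states the switching policy flags, and only while
        # budget remains; otherwise fall back to the low-resource LM.
        if call_score > 0.5 and self.llm_budget > 0:
            self.llm_budget -= 1
            return large_llm(prompt)
        return small_lm(prompt)
```

In this toy version the budget is a hard counter decremented per call; the budgeted variant described in the abstract instead maintains the constraint through the reinforcement learning objective while training the switching policy.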