Large language models (LLMs) have demonstrated an impressive ability to perform arithmetic and symbolic reasoning tasks when provided with a few examples at test time ("few-shot prompting"). Much of this success can be attributed to prompting methods such as chain-of-thought, which employ LLMs both to understand the problem description by decomposing it into steps and to solve each step of the problem. While LLMs seem to be adept at this sort of step-by-step decomposition, they often make logical and arithmetic mistakes in the solution step, even when the problem is decomposed correctly. In this paper, we present Program-Aided Language models (PAL): a novel approach that uses the LLM to read natural language problems and generate programs as the intermediate reasoning steps, but offloads the solution step to a runtime such as a Python interpreter. With PAL, decomposing the natural language problem into runnable steps remains the only learning task for the LLM, while solving is delegated to the interpreter. We demonstrate this synergy between a neural LLM and a symbolic interpreter across 13 mathematical, symbolic, and algorithmic reasoning tasks from BIG-Bench Hard and other benchmarks. In all these natural language reasoning tasks, generating code with an LLM and reasoning with a Python interpreter leads to more accurate results than much larger models. For example, PAL using Codex achieves state-of-the-art few-shot accuracy on GSM8K, surpassing PaLM with chain-of-thought by an absolute 15% in top-1 accuracy.
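The abstract's split between decomposition (done by the LLM) and solving (done by the interpreter) can be illustrated with a minimal sketch. The problem and the "generated" program below are illustrative placeholders, not examples from the paper: they stand in for what an LLM would emit under a PAL-style prompt, while the Python runtime performs the arithmetic.

```python
# A minimal sketch of the PAL idea: the LLM emits Python statements as its
# intermediate reasoning steps, and the interpreter computes the answer.
# The problem and program below are illustrative, not taken from the paper.

problem = (
    "Roger has 5 tennis balls. He buys 2 cans of tennis balls. "
    "Each can has 3 tennis balls. How many tennis balls does he have now?"
)

# What a PAL-style prompt asks the LLM to produce: each reasoning step as a
# Python statement, with the final value bound to `answer`.
generated_program = """
tennis_balls = 5
bought_cans = 2
balls_per_can = 3
answer = tennis_balls + bought_cans * balls_per_can
"""

# Offload the solution step to the Python runtime instead of the LLM.
namespace = {}
exec(generated_program, namespace)
print(namespace["answer"])  # → 11
```

The key design point is that the LLM never performs the addition or multiplication itself; even if a model decomposes the problem correctly, delegating the arithmetic to the interpreter removes the calculation errors the abstract describes.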
Author Information
Luyu Gao (Carnegie Mellon University)
Aman Madaan (Carnegie Mellon University)
Shuyan Zhou (Carnegie Mellon University)
Uri Alon (Carnegie Mellon University)
Pengfei Liu (Carnegie Mellon University)
Yiming Yang (Carnegie Mellon University)
Jamie Callan (Carnegie Mellon University)
Graham Neubig (Carnegie Mellon University)
More from the Same Authors
- 2023: Accelerating Diffusion-based Combinatorial Optimization Solvers by Progressive Distillation (Junwei Huang · Zhiqing Sun · Yiming Yang)
- 2023 Oral: Cross-Modal Fine-Tuning: Align then Refine (Junhong Shen · Liam Li · Lucio Dery · Corey Staten · Mikhail Khodak · Graham Neubig · Ameet Talwalkar)
- 2023 Poster: Cross-Modal Fine-Tuning: Align then Refine (Junhong Shen · Liam Li · Lucio Dery · Corey Staten · Mikhail Khodak · Graham Neubig · Ameet Talwalkar)
- 2023 Poster: A Neural PDE Solver with Temporal Stencil Modeling (Zhiqing Sun · Yiming Yang · Shinjae Yoo)
- 2023 Poster: Why do Nearest Neighbor Language Models Work? (Frank Xu · Uri Alon · Graham Neubig)
- 2022: FLOWGEN: Fast and slow graph generation (Aman Madaan · Yiming Yang)
- 2022 Poster: Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval (Uri Alon · Frank Xu · Junxian He · Sudipta Sengupta · Dan Roth · Graham Neubig)
- 2022 Spotlight: Neuro-Symbolic Language Modeling with Automaton-augmented Retrieval (Uri Alon · Frank Xu · Junxian He · Sudipta Sengupta · Dan Roth · Graham Neubig)
- 2022 Poster: Symmetric Machine Theory of Mind (Melanie Sclar · Graham Neubig · Yonatan Bisk)
- 2022 Spotlight: Symmetric Machine Theory of Mind (Melanie Sclar · Graham Neubig · Yonatan Bisk)
- 2021 Poster: Examining and Combating Spurious Features under Distribution Shift (Chunting Zhou · Xuezhe Ma · Paul Michel · Graham Neubig)
- 2021 Poster: Few-shot Language Coordination by Modeling Theory of Mind (Hao Zhu · Graham Neubig · Yonatan Bisk)
- 2021 Spotlight: Few-shot Language Coordination by Modeling Theory of Mind (Hao Zhu · Graham Neubig · Yonatan Bisk)
- 2021 Spotlight: Examining and Combating Spurious Features under Distribution Shift (Chunting Zhou · Xuezhe Ma · Paul Michel · Graham Neubig)
- 2020 Poster: Optimizing Data Usage via Differentiable Rewards (Xinyi Wang · Hieu Pham · Paul Michel · Antonios Anastasopoulos · Jaime Carbonell · Graham Neubig)
- 2020 Poster: XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalisation (Junjie Hu · Sebastian Ruder · Aditya Siddhant · Graham Neubig · Orhan Firat · Melvin Johnson)
- 2020 Poster: An EM Approach to Non-autoregressive Conditional Sequence Generation (Zhiqing Sun · Yiming Yang)
- 2017 Poster: Analogical Inference for Multi-relational Embeddings (Hanxiao Liu · Yuexin Wu · Yiming Yang)
- 2017 Talk: Analogical Inference for Multi-relational Embeddings (Hanxiao Liu · Yuexin Wu · Yiming Yang)