Timezone: »
Spreadsheet formula prediction has been an important program synthesis problem with many real-world applications. Previous works typically utilize input-output examples as the specification for spreadsheet formula synthesis, where each input-output pair simulates a separate row in the spreadsheet. However, this formulation does not fully capture the rich context in real-world spreadsheets. First, spreadsheet data entries are organized as tables, thus rows and columns are not necessarily independent from each other. In addition, many spreadsheet tables include headers, which provide high-level descriptions of the cell data. However, previous synthesis approaches do not consider headers as part of the specification. In this work, we present the first approach for synthesizing spreadsheet formulas from tabular context, which includes both headers and semi-structured tabular data. In particular, we propose SpreadsheetCoder, a BERT-based model architecture to represent the tabular context in both row-based and column-based formats. We train our model on a large dataset of spreadsheets, and demonstrate that SpreadsheetCoder achieves top-1 prediction accuracy of 42.51%, which is a considerable improvement over baselines that do not employ rich tabular context. Compared to the rule-based system, SpreadsheetCoder assists 82% more users in composing formulas on Google Sheets.
Author Information
Xinyun Chen (UC Berkeley)
Petros Maniatis (Google Research)
Rishabh Singh (Google Brain)
Charles Sutton (Google)
Hanjun Dai (Google Brain)
Max Lin
Denny Zhou (Google Brain)
Related Events (a corresponding poster, oral, or spotlight)
-
2021 Spotlight: SpreadsheetCoder: Formula Prediction from Semi-structured Context »
Thu. Jul 22nd 02:45 -- 02:50 AM Room
More from the Same Authors
-
2023 : DISCS: A Benchmark for Discrete Sampling »
Katayoon Goshvadi · Haoran Sun · Xingchao Liu · Azade Nova · Ruqi Zhang · Will Grathwohl · Dale Schuurmans · Hanjun Dai -
2023 Workshop: Sampling and Optimization in Discrete Space »
Haoran Sun · Hanjun Dai · Priyank Jaini · Ruqi Zhang · Ellen Vitercik -
2023 Poster: Revisiting Sampling for Combinatorial Optimization »
Haoran Sun · Katayoon Goshvadi · Azade Nova · Dale Schuurmans · Hanjun Dai -
2023 Poster: Not All Semantics are Created Equal: Contrastive Self-supervised Learning with Automatic Temperature Individualization »
Zi-Hao Qiu · Quanqi Hu · Zhuoning Yuan · Denny Zhou · Lijun Zhang · Tianbao Yang -
2023 Poster: Can Large Language Models Reason about Program Invariants? »
Kexin Pei · David Bieber · Kensen Shi · Charles Sutton · Pengcheng Yin -
2023 Poster: The Flan Collection: Designing Data and Methods for Effective Instruction Tuning »
Shayne Longpre · Le Hou · Tu Vu · Albert Webson · Hyung Won Chung · Yi Tay · Denny Zhou · Quoc Le · Barret Zoph · Jason Wei · Adam Roberts -
2023 Poster: Large Language Models Can Be Easily Distracted by Irrelevant Context »
Haoyue Shi · Xinyun Chen · Kanishka Misra · Nathan Scales · David Dohan · Ed Chi · Nathanael Schärli · Denny Zhou -
2023 Poster: Gradient-Free Structured Pruning with Unlabeled Data »
Azade Nova · Hanjun Dai · Dale Schuurmans -
2022 : Program Synthesis, Program Semantics, and Large Language Models »
Charles Sutton -
2022 : Session 3: New Computational Technologies for Reasoning »
Armando Solar-Lezama · Guy Van den Broeck · Jan-Willem van de Meent · Charles Sutton -
2022 Poster: Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance »
Zhuoning Yuan · Yuexin Wu · Zi-Hao Qiu · Xianzhi Du · Lijun Zhang · Denny Zhou · Tianbao Yang -
2022 Spotlight: Provable Stochastic Optimization for Global Contrastive Learning: Small Batch Does Not Harm Performance »
Zhuoning Yuan · Yuexin Wu · Zi-Hao Qiu · Xianzhi Du · Lijun Zhang · Denny Zhou · Tianbao Yang -
2022 Poster: Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization »
Hanjun Dai · Mengjiao Yang · Yuan Xue · Dale Schuurmans · Bo Dai -
2022 Spotlight: Marginal Distribution Adaptation for Discrete Sets via Module-Oriented Divergence Minimization »
Hanjun Dai · Mengjiao Yang · Yuan Xue · Dale Schuurmans · Bo Dai -
2021 Workshop: Workshop on Socially Responsible Machine Learning »
Chaowei Xiao · Animashree Anandkumar · Mingyan Liu · Dawn Song · Raquel Urtasun · Jieyu Zhao · Xueru Zhang · Cihang Xie · Xinyun Chen · Bo Li -
2021 Poster: Learn-to-Share: A Hardware-friendly Transfer Learning Framework Exploiting Computation and Parameter Sharing »
Cheng Fu · Hanxian Huang · Xinyun Chen · Yuandong Tian · Jishen Zhao -
2021 Oral: Learn-to-Share: A Hardware-friendly Transfer Learning Framework Exploiting Computation and Parameter Sharing »
Cheng Fu · Hanxian Huang · Xinyun Chen · Yuandong Tian · Jishen Zhao -
2021 Poster: LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs »
Hongyu Ren · Hanjun Dai · Bo Dai · Xinyun Chen · Michihiro Yasunaga · Haitian Sun · Dale Schuurmans · Jure Leskovec · Denny Zhou -
2021 Poster: Latent Programmer: Discrete Latent Codes for Program Synthesis »
Joey Hong · David Dohan · Rishabh Singh · Charles Sutton · Manzil Zaheer -
2021 Spotlight: LEGO: Latent Execution-Guided Reasoning for Multi-Hop Question Answering on Knowledge Graphs »
Hongyu Ren · Hanjun Dai · Bo Dai · Xinyun Chen · Michihiro Yasunaga · Haitian Sun · Dale Schuurmans · Jure Leskovec · Denny Zhou -
2021 Oral: Latent Programmer: Discrete Latent Codes for Program Synthesis »
Joey Hong · David Dohan · Rishabh Singh · Charles Sutton · Manzil Zaheer -
2020 Poster: Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection »
Mao Ye · Chengyue Gong · Lizhen Nie · Denny Zhou · Adam Klivans · Qiang Liu -
2020 Poster: Energy-Based Processes for Exchangeable Data »
Mengjiao Yang · Bo Dai · Hanjun Dai · Dale Schuurmans -
2020 Poster: Go Wide, Then Narrow: Efficient Training of Deep Thin Networks »
Denny Zhou · Mao Ye · Chen Chen · Tianjian Meng · Mingxing Tan · Xiaodan Song · Quoc Le · Qiang Liu · Dale Schuurmans -
2020 Poster: Retro*: Learning Retrosynthetic Planning with Neural Guided A* Search »
Binghong Chen · Chengtao Li · Hanjun Dai · Le Song -
2020 Poster: Learning and Evaluating Contextual Embedding of Source Code »
Aditya Kanade · Petros Maniatis · Gogul Balakrishnan · Kensen Shi -
2020 Poster: Incremental Sampling Without Replacement for Sequence Models »
Kensen Shi · David Bieber · Charles Sutton -
2020 Poster: Learning To Stop While Learning To Predict »
Xinshi Chen · Hanjun Dai · Yu Li · Xin Gao · Le Song -
2020 Poster: Scalable Deep Generative Modeling for Sparse Graphs »
Hanjun Dai · Azade Nova · Yujia Li · Bo Dai · Dale Schuurmans -
2020 Poster: Generating Programmatic Referring Expressions via Program Synthesis »
Jiani Huang · Calvin Smith · Osbert Bastani · Rishabh Singh · Aws Albarghouthi · Mayur Naik -
2019 : Panel Discussion »
Wenpeng Zhang · Charles Sutton · Liam Li · Rachel Thomas · Erin LeDell -
2019 : Keynote by Charles Sutton: Towards Semi-Automated Machine Learning »
Charles Sutton -
2019 Poster: Variational Russian Roulette for Deep Bayesian Nonparametrics »
Kai Xu · Akash Srivastava · Charles Sutton -
2019 Oral: Variational Russian Roulette for Deep Bayesian Nonparametrics »
Kai Xu · Akash Srivastava · Charles Sutton -
2017 Poster: Learning Continuous Semantic Representations of Symbolic Expressions »
Miltiadis Allamanis · pankajan Chanthirasegaran · Pushmeet Kohli · Charles Sutton -
2017 Talk: Learning Continuous Semantic Representations of Symbolic Expressions »
Miltiadis Allamanis · pankajan Chanthirasegaran · Pushmeet Kohli · Charles Sutton