Prediction sets have recently been shown to be a promising strategy for quantifying the uncertainty of deep neural networks in a way that provides theoretical guarantees. However, existing techniques have largely targeted settings where the space of labels is simple, so prediction sets can be arbitrary subsets of labels. For structured prediction problems where the space of labels is exponential in size, even prediction sets containing a small fraction of all labels can be exponentially large. In the context of code generation, we propose a solution that restricts attention to prediction sets that can be represented compactly as partial programs, which are programs with portions replaced by holes. Given a trained code generation model, our algorithm leverages a programming language's abstract syntax tree to generate a set of programs such that the correct program is in the set with high confidence. One valuable application of our algorithm is a Codex-style code generator that places holes in the uncertain parts of the generated code, yielding a partial program with theoretical guarantees. We evaluate our approach on PICARD (a T5 model for SQL semantic parsing) and Codex (a GPT model for over a dozen programming languages, including Python), demonstrating that our approach generates compact PAC prediction sets. This is the first research contribution that generates PAC prediction sets for generative code models.
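To make the idea of a partial program concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes Python 3.9+ and uses the standard `ast` module; the names `HOLE`, `HolePuncher`, `toy_score`, `to_partial`, and `covers` are hypothetical, and `toy_score` is a stand-in for model confidence rather than anything produced by PICARD or Codex. The sketch replaces low-confidence expression subtrees with a hole and checks whether a concrete program belongs to the set that the partial program represents. It does not perform the calibration step that would be needed for a PAC guarantee.

```python
import ast

HOLE = "HOLE"  # hypothetical placeholder name standing for an unknown expression


class HolePuncher(ast.NodeTransformer):
    """Replace expression subtrees whose score falls below a threshold with a hole."""

    def __init__(self, score, threshold):
        self.score = score
        self.threshold = threshold

    def visit(self, node):
        # Only expression nodes can become holes in this sketch.
        if isinstance(node, ast.expr) and self.score(node) < self.threshold:
            return ast.copy_location(ast.Name(id=HOLE, ctx=ast.Load()), node)
        return self.generic_visit(node)


def toy_score(node):
    # Assumption: pretend the model is unsure about literals and sure about the rest.
    return 0.2 if isinstance(node, ast.Constant) else 0.9


def to_partial(source, score, threshold):
    """Return the source of a partial program with holes in low-confidence subtrees."""
    tree = HolePuncher(score, threshold).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)


def covers(partial, program):
    """True if `program` can be obtained by filling every hole in `partial`."""
    if isinstance(partial, ast.Name) and partial.id == HOLE:
        return isinstance(program, ast.expr)  # a hole matches any expression
    if isinstance(partial, ast.AST):
        return type(partial) is type(program) and all(
            covers(getattr(partial, f, None), getattr(program, f, None))
            for f in partial._fields
        )
    if isinstance(partial, list):
        return (
            isinstance(program, list)
            and len(partial) == len(program)
            and all(covers(a, b) for a, b in zip(partial, program))
        )
    return partial == program  # plain values such as identifiers or literals


partial_src = to_partial("total = price * 3 + tax", toy_score, threshold=0.5)
print(partial_src)
```

Here a single partial program stands for every program obtained by filling its hole with some expression, which is what keeps the representation compact even when the set it denotes is very large. For example, `covers(ast.parse(partial_src), ast.parse("total = price * (a + b) + tax"))` holds, while a program that changes `price` to another name is not covered.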
Author Information
Adam Khakhar (University of Pennsylvania)
Stephen Mell (Department of Computer and Information Science, School of Engineering and Applied Science)
Osbert Bastani (University of Pennsylvania)
More from the Same Authors
-
2021 : Robust Generalization of Quadratic Neural Networks via Function Identification »
Kan Xu · Hamsa Bastani · Osbert Bastani -
2021 : Mind the Gap: Safely Bridging Offline and Online Reinforcement Learning »
Wanqiao Xu · Kan Xu · Hamsa Bastani · Osbert Bastani -
2021 : Improving Human Decision-Making with Machine Learning »
Hamsa Bastani · Osbert Bastani · Wichinpong Sinchaisri -
2023 : TRAC: Trustworthy Retrieval Augmented Chatbot »
Shuo Li · Sangdon Park · Insup Lee · Osbert Bastani -
2023 Poster: LIV: Language-Image Representations and Rewards for Robotic Control »
Yecheng Jason Ma · Vikash Kumar · Amy Zhang · Osbert Bastani · Dinesh Jayaraman -
2023 Poster: Robust Subtask Learning for Compositional Generalization »
Kishor Jothimurugan · Steve Hsu · Osbert Bastani · Rajeev Alur -
2022 : Spotlight Presentations »
Adrian Weller · Osbert Bastani · Jake Snell · Tal Schuster · Stephen Bates · Zhendong Wang · Margaux Zaffran · Danielle Rasooly · Varun Babbar -
2022 Poster: Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching »
Yecheng Jason Ma · Andrew Shen · Dinesh Jayaraman · Osbert Bastani -
2022 Spotlight: Versatile Offline Imitation from Observations and Examples via Regularized State-Occupancy Matching »
Yecheng Jason Ma · Andrew Shen · Dinesh Jayaraman · Osbert Bastani -
2022 Poster: Understanding Robust Generalization in Learning Regular Languages »
Soham Dan · Osbert Bastani · Dan Roth -
2022 Spotlight: Understanding Robust Generalization in Learning Regular Languages »
Soham Dan · Osbert Bastani · Dan Roth -
2022 Poster: Sequential Covariate Shift Detection Using Classifier Two-Sample Tests »
Sooyong Jang · Sangdon Park · Insup Lee · Osbert Bastani -
2022 Spotlight: Sequential Covariate Shift Detection Using Classifier Two-Sample Tests »
Sooyong Jang · Sangdon Park · Insup Lee · Osbert Bastani -
2021 Poster: Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings »
Kan Xu · Xuanyi Zhao · Hamsa Bastani · Osbert Bastani -
2021 Spotlight: Group-Sparse Matrix Factorization for Transfer Learning of Word Embeddings »
Kan Xu · Xuanyi Zhao · Hamsa Bastani · Osbert Bastani -
2020 Poster: Robust and Stable Black Box Explanations »
Hima Lakkaraju · Nino Arsov · Osbert Bastani -
2020 Poster: Generating Programmatic Referring Expressions via Program Synthesis »
Jiani Huang · Calvin Smith · Osbert Bastani · Rishabh Singh · Aws Albarghouthi · Mayur Naik -
2019 Poster: Learning Neurosymbolic Generative Models via Program Synthesis »
Halley R Young · Osbert Bastani · Mayur Naik -
2019 Oral: Learning Neurosymbolic Generative Models via Program Synthesis »
Halley R Young · Osbert Bastani · Mayur Naik