

Spotlight Poster

Exploiting Code Symmetries for Learning Program Semantics

Kexin Pei · Weichen Li · Qirui Jin · Shuyang Liu · Scott Geng · Lorenzo Cavallaro · Junfeng Yang · Suman Jana

Hall C 4-9 #1000
[ Paper PDF ] [ Slides ] [ Poster ]
Wed 24 Jul 2:30 a.m. PDT — 4 a.m. PDT

Abstract:

This paper tackles the challenge of teaching code semantics to Large Language Models (LLMs) for program analysis by incorporating code symmetries into the model architecture. We introduce a group-theoretic framework that defines code symmetries as semantics-preserving transformations, where forming a code symmetry group enables precise and efficient reasoning about code semantics. Our solution, SymC, introduces a novel variant of self-attention that is provably equivariant to code symmetries from the permutation group defined over the program dependence graph. SymC obtains superior performance on five program analysis tasks, outperforming state-of-the-art code models, including GPT-4, without any pre-training. Our results suggest that code LLMs that encode the code structural prior via the code symmetry group generalize better and faster.
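To make the equivariance claim concrete, the sketch below illustrates the underlying group-theoretic property in plain NumPy: single-head self-attention without positional encodings is equivariant to permutations of its input tokens, i.e., attn(PX) = P · attn(X) for any permutation matrix P. This is only the generic permutation-equivariance fact that SymC builds on; the paper's actual contribution, a variant restricted to the symmetry subgroup induced by the program dependence graph, is not reproduced here, and all names in the sketch (self_attention, the weight matrices) are illustrative rather than taken from the authors' code.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Standard single-head self-attention, no positional encodings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Row-wise softmax over attention scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
n, d = 6, 8                      # 6 tokens, hidden dimension 8
X = rng.normal(size=(n, d))      # stand-in for code token embeddings
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

# A permutation matrix P acting on token positions, standing in for a
# semantics-preserving reordering (e.g., of independent statements).
P = np.eye(n)[rng.permutation(n)]

# Equivariance: attending to permuted inputs equals permuting the outputs.
lhs = self_attention(P @ X, Wq, Wk, Wv)
rhs = P @ self_attention(X, Wq, Wk, Wv)
assert np.allclose(lhs, rhs)
```

Because attention weights depend only on pairwise token interactions, permuting the input rows permutes both the rows and columns of the score matrix, and the row-wise softmax commutes with that column permutation; the output rows are therefore permuted in exactly the same way as the input. Restricting the allowed permutations to those that preserve program dependence graph edges, as the paper does, is what ties this architectural property to semantics-preserving code transformations.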
