

Poster
in
Workshop: 2nd Workshop on Formal Verification of Machine Learning

Meaning in Language Models: A Formal Semantics Approach

Charles Jin · Martin Rinard


Abstract:

We present a framework for studying the emergence of meaning in language models based on the formal semantics of programs. Working with programs enables us to precisely define concepts relevant to meaning in language (e.g., correctness and semantics), making this domain well-suited as an intermediate testbed for characterizing the presence (or absence) of meaning in language models. Specifically, we first train a Transformer model on a corpus of programs, then probe the trained model's hidden states as it completes a program given a specification. Our findings include evidence that (1) the model's states linearly encode an abstraction of the program semantics, (2) such encodings emerge nearly in lockstep with the model's ability to generate correct code during training, and (3) the model learns to generate correct programs that are, on average, shorter than those in the training set. In summary, this paper does not propose any new techniques for improving language models, but develops an experimental framework for, and provides insights into, the acquisition and representation of (formal) meaning in language models.
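The linear-encoding finding rests on probing: fitting a linear classifier that reads an abstract semantic property off a model's hidden states. A minimal sketch of that idea, with synthetic stand-ins for the hidden states and semantic labels (the actual experiments use states from a Transformer trained on programs):

```python
# Sketch of a linear probe over model hidden states.
# Assumptions (not from the paper): 64-dim states, a binary abstract
# semantic label, and synthetic data in place of real model activations.
import numpy as np

rng = np.random.default_rng(0)

n, d = 500, 64
w_true = rng.normal(size=d)                 # hypothetical encoding direction
H = rng.normal(size=(n, d))                 # stand-in for hidden states
y = (H @ w_true + 0.1 * rng.normal(size=n) > 0).astype(int)

# Fit a linear probe: logistic regression by plain gradient descent.
w = np.zeros(d)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(H @ w)))      # predicted probabilities
    w -= 0.1 * H.T @ (p - y) / n            # gradient step on log-loss

acc = ((H @ w > 0).astype(int) == y).mean()
print(f"probe accuracy: {acc:.2f}")
```

High probe accuracy on held-out states is the kind of evidence the paper uses to argue the semantics is linearly encoded; the probe's simplicity matters, since a powerful probe could compute the property itself rather than read it out.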
