Tutorial

The Underlying Logic of Language Models


Abstract:

The formal basis of the theory of computation lies in the study of languages, i.e., subsets of Σ*, the set of all strings over an alphabet Σ. Models of computation can be taxonomized according to the languages they can decide, i.e., the languages for which a model can determine membership. For instance, finite-state automata decide membership in exactly the regular languages. Language models are probabilistic generalizations of languages in which the notion of a set is relaxed to that of a probability distribution over Σ*. Recently, language models parameterized using recurrent neural networks, transformers, and state-space models have achieved enormous success in natural language processing. Just as theorists have taxonomized models of deterministic computation, researchers have sought to taxonomize the expressivity of language models based on various architectures in terms of the distributions over strings they can represent. This tutorial presents a self-contained overview of the formal methods used to taxonomize the expressivity of language models, encompassing formal language and automata theory, various forms of formal logic, circuit complexity, and programming languages such as RASP. For example, we illustrate how transformers, under varying assumptions, can be characterized by different fragments of formal logic.
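To make the set-to-distribution relaxation concrete, the sketch below (not part of the tutorial materials; the automaton, its states, and all probabilities are illustrative) contrasts a finite-state automaton that decides a regular language, i.e., computes a characteristic function on Σ*, with a probabilistic automaton over the same state graph that instead assigns every string in Σ* a probability:

```python
from itertools import product

SIGMA = ("a", "b")

# DFA over {a, b} accepting strings with an even number of 'a's:
# a classic regular language.
DFA_TRANS = {("even", "a"): "odd", ("even", "b"): "even",
             ("odd", "a"): "even", ("odd", "b"): "odd"}

def dfa_accepts(string: str) -> bool:
    """Decide membership: the characteristic function of the language."""
    state = "even"
    for symbol in string:
        state = DFA_TRANS[(state, symbol)]
    return state == "even"

# Probabilistic relaxation: from each state, a distribution over the next
# symbol plus a stop event (None); outgoing probabilities sum to 1 per state.
PFSA = {
    "even": {"a": 0.3, "b": 0.3, None: 0.4},
    "odd":  {"a": 0.5, "b": 0.4, None: 0.1},
}

def pfsa_prob(string: str) -> float:
    """Probability the automaton assigns to a string in Sigma*."""
    state, p = "even", 1.0
    for symbol in string:
        p *= PFSA[state][symbol]
        state = DFA_TRANS[(state, symbol)]
    return p * PFSA[state][None]  # stop probability in the final state

if __name__ == "__main__":
    print(dfa_accepts("abba"))          # True: two 'a's
    print(round(pfsa_prob("abba"), 5))  # a single string's probability
    # Total mass over all strings up to a length bound approaches 1
    # as the bound grows, since the stop event has positive probability
    # in every state.
    total = sum(pfsa_prob("".join(s))
                for n in range(12) for s in product(SIGMA, repeat=n))
    print(round(total, 4))
```

The same relaxation underlies neural language models: an RNN, transformer, or state-space model replaces the finite state set with a learned hidden representation, but still defines a distribution over Σ* one symbol (or stop event) at a time.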
