Tutorial

The Underlying Logic of Language Models


Abstract:

The formal basis of the theory of computation lies in the study of languages, i.e., subsets of Σ*, the set of all strings over an alphabet Σ. Models of computation can be taxonomized according to the languages they can decide, i.e., the languages for which a model can determine membership. For instance, finite-state automata decide membership in exactly the regular languages. Language models are probabilistic generalizations of languages in which the notion of a set is relaxed to that of a probability distribution over Σ*. Recently, language models parameterized using recurrent neural networks, transformers, and state-space models have achieved enormous success in natural language processing. Just as theorists have taxonomized models of deterministic computation, researchers have sought to taxonomize the expressivity of language models based on various architectures in terms of the distributions over strings they can represent. This tutorial presents a self-contained overview of the formal methods used to taxonomize the expressivity of language models, encompassing formal language and automata theory, various forms of formal logic, circuit complexity, and programming languages such as RASP. For example, we illustrate how transformers, under varying assumptions, can be characterized by different fragments of formal logic.
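To make the set-to-distribution relaxation concrete, the sketch below (not part of the tutorial materials; the automaton, its states, and all probabilities are illustrative) contrasts a finite-state automaton that decides a regular language, i.e., computes a characteristic function on Σ*, with a probabilistic automaton over the same state graph that instead assigns every string in Σ* a probability:

```python
from itertools import product

SIGMA = ("a", "b")

# DFA over {a, b} accepting strings with an even number of 'a's:
# a classic regular language.
DFA_TRANS = {("even", "a"): "odd", ("even", "b"): "even",
             ("odd", "a"): "even", ("odd", "b"): "odd"}

def dfa_accepts(string: str) -> bool:
    """Decide membership: the characteristic function of the language."""
    state = "even"
    for symbol in string:
        state = DFA_TRANS[(state, symbol)]
    return state == "even"

# Probabilistic relaxation: from each state, a distribution over the next
# symbol plus a stop event (None); outgoing probabilities sum to 1 per state.
PFSA = {
    "even": {"a": 0.3, "b": 0.3, None: 0.4},
    "odd":  {"a": 0.5, "b": 0.4, None: 0.1},
}

def pfsa_prob(string: str) -> float:
    """Probability the automaton assigns to a string in Sigma*."""
    state, p = "even", 1.0
    for symbol in string:
        p *= PFSA[state][symbol]
        state = DFA_TRANS[(state, symbol)]
    return p * PFSA[state][None]  # stop probability in the final state

if __name__ == "__main__":
    print(dfa_accepts("abba"))          # True: two 'a's
    print(round(pfsa_prob("abba"), 5))  # a single string's probability
    # Total mass over all strings up to a length bound approaches 1
    # as the bound grows, since the stop event has positive probability
    # in every state.
    total = sum(pfsa_prob("".join(s))
                for n in range(12) for s in product(SIGMA, repeat=n))
    print(round(total, 4))
```

The same relaxation underlies neural language models: an RNN, transformer, or state-space model replaces the finite state set with a learned hidden representation, but still defines a distribution over Σ* one symbol (or stop event) at a time.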
