Language Modeling in People and Machines
ICML Workshop on Large Language Models and Cognition
Abstract
Decades of scientific research have revealed that humans are sensitive to the statistical regularities of linguistic input (Saffran et al., 1996; Smith and Levy, 2013), implying that they learn and compute probabilities over the strings of their language. In this talk, I ask how close contemporary LLMs are to the human language model. In the first half, I focus on language processing and discuss the relationship between LLMs’ outputs and reading times of individual words. I show that models’ surprisal values (i.e., negative log probabilities) correlate well with human word-by-word reading times across languages, but that this correlation breaks down when people read ungrammatical texts. In the second half, I turn to language acquisition. I discuss the results of the BabyLM Challenge, an ongoing shared task that asks participants to train an LM on 100 million words or fewer, roughly the amount of linguistic experience available to a typical child. Although scaled-down models achieve impressive performance at learning the structure of language, they fall short of human-level competence. I conclude by identifying future opportunities and challenges for cognitive modeling with LLMs.
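As background for the reading-time analyses mentioned above, surprisal is standardly defined as the negative log probability a model assigns to a word given its preceding context:

$$
\mathrm{surprisal}(w_t) = -\log p(w_t \mid w_1, \dots, w_{t-1})
$$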