
Looped Transformers are Better at Learning Learning Algorithms
Liu Yang · Kangwook Lee · Robert Nowak · Dimitris Papailiopoulos
Event URL: https://openreview.net/forum?id=XpVoUnPuYV

Transformers can “learn” to solve data-fitting problems generated by a variety of (latent) models, including linear models, sparse linear models, decision trees, and neural networks, as demonstrated by Garg et al. (2022). These tasks fall under well-defined function-class learning problems and can be solved by iterative algorithms that repeatedly apply the same function to the input, potentially an unbounded number of times. In this work, we aim to train a transformer to emulate this iterative behavior by utilizing a looped transformer architecture (Giannou et al., 2023). Our experimental results reveal that the looped transformer performs as well as the standard (unlooped) transformer on these numerical tasks, while offering the advantage of far fewer parameters.
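The iterative structure the abstract refers to can be illustrated with a minimal, hypothetical sketch in plain Python (function names are illustrative, not from the paper): the same update function is reused at every loop step, analogous to how a looped transformer reuses a single block of weights across iterations instead of stacking distinct layers. Here the fixed update is one gradient-descent step for 1-D least squares, one of the function classes studied by Garg et al. (2022).

```python
def make_step(xs, ys, lr):
    # Build one fixed gradient-descent update for 1-D least squares.
    # The same function is applied at every iteration, mirroring a
    # weight-tied (looped) block.
    def step(w):
        grad = sum(x * (x * w - y) for x, y in zip(xs, ys)) / len(xs)
        return w - lr * grad
    return step

def loop(step, w0, n_iters):
    # Repeatedly apply the same function, as a looped transformer
    # repeatedly applies one block of shared weights.
    w = w0
    for _ in range(n_iters):
        w = step(w)
    return w

# Fit y = 2x by looping the same update 100 times.
step = make_step([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], lr=0.1)
w_hat = loop(step, 0.0, 100)
```

Because the looped model shares one set of weights across all iterations, its parameter count is independent of the number of loop steps, which is the source of the parameter savings the abstract mentions.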

Author Information

Liu Yang (University of Wisconsin - Madison)
Kangwook Lee (KAIST)
Robert Nowak (University of Wisconsin-Madison)

Robert Nowak holds the Nosbusch Professorship in Engineering at the University of Wisconsin-Madison, where his research focuses on signal processing, machine learning, optimization, and statistics.

Dimitris Papailiopoulos (University of Wisconsin-Madison)
