Unifying Attention and Diffusion with Kan Extension Transformers: Structured Deep Learning with Diagrammatic Backpropagation
Abstract
Modern foundation models are powerful, but their representations, training dynamics, and agentic workflows remain difficult to audit, compose, and trust. This tutorial presents a categorical and geometric framework for trustworthy foundation-model systems. The major scientific components of the tutorial are:
- Diagrammatic Backpropagation (DB), which generalizes deep learning to include curvature loss functions over categorical diagrams
- Infinitesimal Causality (IC), which generalizes the chain rule in calculus to functors in tangent categories
- Kan Extension Transformers (KET), which define a structured computation substrate, unifying attention and diffusion, and providing a universal machine learning framework for mapping finite experience into infinite futures
- Universal Decision Learning (UDL), which is a rigorous categorical framework for building foundries, or building blocks of foundation models
- Lie-algebra-based neural adapters (ALLORA), which show how to compose LoRA adapters by detecting non-commutativity using Lie brackets
- Agentic skill optimization using Lie algebroids (LASKO), which formalizes optimization over tangent Markdown categories
- Odyssey, a demonstration system for automatic foundry construction
The tutorial is designed as a conceptual 2.5-hour overview. Technical details are deferred to associated arXiv papers and the Categories for AGI book. Participants will leave with a solid understanding of a powerful categorical and geometric design language for foundation-model systems that learn locally, transfer cautiously, expose obstructions, and glue global conclusions only when the evidence permits.