Low-dimensional topology of deep neural networks
Junyu Ren ⋅ Lek-Heng Lim
Abstract
We study layered models, including feedforward networks and transformers, by limiting each layer to a width of $d = 3$ neurons, i.e., a representation space of $\mathbb{R}^3$. This allows us to examine how a neural network changes low-dimensional topological invariants like links and knots, as well as more sophisticated measures like Milnor's $\mu$-invariant, through the layers. Note that one may simplify or even trivialize just about any topological structure by simply increasing dimension; for example, any knot is equivalent to an unknot in $\mathbb{R}^4$. By restricting to $\mathbb{R}^3$, we not only isolate the effects of activation and depth from those of width, but also work in a space that lends itself to easy visualization. We provide full mathematical proofs and empirical experiments to justify the following insights: When measured by their power to effect topological changes, ResNets are as powerful as transformers; both are strictly more powerful than feedforward neural networks, which are in turn more powerful than invertible architectures such as flow-based models; but using a non-monotone activation allows feedforward networks to become as powerful as ResNets and transformers. These results suggest that low-dimensional topology can be an important tool to guide future designs of AI architectures. We then generalize our results from $d = 3$ to arbitrary $d > 3$.
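To make the setting concrete, the following is a minimal sketch of the abstract's setup: a depth-4 feedforward network whose layers all have width $d = 3$, applied to points sampled from a trefoil knot in $\mathbb{R}^3$. The trefoil parametrization, the random Gaussian weights, and the choice of depth are illustrative assumptions, not the specific networks analyzed in the paper; the non-monotone activation shown (absolute value) is likewise only one possible choice.

    import numpy as np

    # Points on a trefoil knot in R^3: a closed embedded curve whose
    # knot type the width-3 layers may or may not preserve.
    t = np.linspace(0.0, 2.0 * np.pi, 500, endpoint=False)
    knot = np.stack([
        np.sin(t) + 2.0 * np.sin(2.0 * t),
        np.cos(t) - 2.0 * np.cos(2.0 * t),
        -np.sin(3.0 * t),
    ], axis=1)  # shape (500, 3)

    rng = np.random.default_rng(0)

    def layer(x, activation):
        """One width-3 layer: affine map R^3 -> R^3, then activation."""
        W = rng.normal(size=(3, 3))  # illustrative random weights
        b = rng.normal(size=3)
        return activation(x @ W.T + b)

    relu = lambda z: np.maximum(z, 0.0)  # monotone activation
    absval = lambda z: np.abs(z)         # a non-monotone alternative

    image = knot
    for _ in range(4):  # depth-4 feedforward network, width 3 throughout
        image = layer(image, relu)
    # Plotting `knot` and `image` side by side visualizes how the layers
    # deform the embedded circle without ever leaving R^3.

Because every representation stays in $\mathbb{R}^3$, both the input curve and its image under the network can be plotted directly, which is the visualization advantage the abstract refers to.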