Poster in Workshop: Workshop on Theoretical Foundations of Foundation Models (TF2M)
Sparse Neural Architectures and Deterministic Ramanujan Graphs
Arindam Biswas · Suryam Arnav Kalra · Pabitra Mitra · Biswajit Basu
Abstract:
We introduce a sparsely connected neural network architecture inspired by Ramanujan graphs that achieves performance comparable to dense networks. The sparse connectivity patterns are constructed from Cayley graphs of specific algebraic groups, or as Ramanujan $r$-coverings of the complete $(k,l)$-biregular bipartite graph on $k + l$ vertices. The method performs zero-shot, data-independent, deterministic pruning at initialization, enabling winning lottery tickets to be identified before training begins: unlike traditional approaches, which rely on iterative prune-retrain cycles to find such tickets, our technique identifies them at the outset. Our ultimate goal is to construct sparse, scalable foundation models. Experimental results demonstrate that the proposed architecture achieves accuracy and sparsity ratios competitive with previous pruning-at-initialization algorithms.
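To make the idea concrete, below is a minimal PyTorch sketch of pruning a linear layer at initialization with a fixed, deterministic bipartite mask. The circulant, Cayley-graph-style generator pattern and the helper names (`cayley_bipartite_mask`, `SparseLinear`) are illustrative assumptions, not the paper's exact construction; the last lines merely compare the mask's nontrivial singular values against the biregular Ramanujan bound $\sqrt{k-1} + \sqrt{l-1}$, which the abstract's constructions are designed to satisfy.

```python
import torch
import torch.nn as nn

def cayley_bipartite_mask(n_out: int, n_in: int, degree: int) -> torch.Tensor:
    """Deterministic (degree, n_out*degree/n_in)-biregular bipartite mask.

    Output unit i connects to inputs {(i + j*step) mod n_in}, a circulant,
    Cayley-graph-style pattern (a hypothetical placeholder construction,
    not guaranteed to be Ramanujan).
    """
    step = n_in // degree  # spread the generators evenly over the inputs
    mask = torch.zeros(n_out, n_in)
    for i in range(n_out):
        for j in range(degree):
            mask[i, (i + j * step) % n_in] = 1.0
    return mask

class SparseLinear(nn.Module):
    """Linear layer pruned at initialization by a fixed graph mask
    (zero-shot and data-independent: no data or training signal is used)."""
    def __init__(self, n_in: int, n_out: int, degree: int):
        super().__init__()
        self.linear = nn.Linear(n_in, n_out)
        self.register_buffer("mask", cayley_bipartite_mask(n_out, n_in, degree))
        with torch.no_grad():
            self.linear.weight *= self.mask

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Re-apply the mask so pruned weights stay zero throughout training.
        return nn.functional.linear(
            x, self.linear.weight * self.mask, self.linear.bias
        )

layer = SparseLinear(n_in=256, n_out=128, degree=8)
print(f"density: {layer.mask.mean().item():.3f}")  # 8/256 = 0.031

# Spectral check: for a (k,l)-biregular bipartite graph, the Ramanujan
# condition bounds every nontrivial singular value of the biadjacency
# matrix by sqrt(k-1) + sqrt(l-1).
k = int(layer.mask.sum(dim=1)[0].item())  # out-degree per output unit
l = int(layer.mask.sum(dim=0)[0].item())  # in-degree per input unit
singular_values = torch.linalg.svdvals(layer.mask)
bound = (k - 1) ** 0.5 + (l - 1) ** 0.5
print(f"second singular value: {singular_values[1]:.3f}, bound: {bound:.3f}")
```

The design point is that the mask is a buffer fixed before any data is seen, so the "winning ticket" is determined entirely by the graph construction; training only adjusts the surviving weights.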