HelioX: A GPU-Native Framework for Simulation and Training of Biophysically Detailed Networks
Abstract
Biophysically detailed neural networks are a promising frontier for brain-inspired AI, offering intrinsic spatio-temporal dynamics that can enhance the expressivity and computational density of deep learning systems. However, general-purpose deep learning frameworks suffer from a fundamental mismatch between their dense parallel optimizations and the irregular, tree-structured computations of biological mechanisms. In this work, we propose HelioX, a GPU-native framework designed to unify high-performance simulation with scalable training. Unlike approaches that adapt biology to existing deep learning tools, HelioX adopts a "GPU-to-Biophysics" paradigm: we tailor the underlying GPU parallelism to biological structures by implementing custom fused CUDA kernels for both the Dendritic Hierarchical Scheduling (DHS) algorithm and its gradient propagation. This design eliminates the runtime overhead of generic automatic differentiation and enables multi-stream concurrency for spike generation and equation assembly. Experimental results demonstrate that HelioX outperforms the standard simulator NEURON by orders of magnitude and surpasses prior GPU-based solvers in both speed and scalability. We successfully train deep biophysical MLPs and organism-scale biophysical neural networks (e.g., the BAAIWorm C. elegans model) on a single consumer-grade GPU. HelioX establishes a new standard for computational efficiency, enabling the training of biophysically detailed models at scales previously unattainable.
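To make the "tree-structured" mismatch concrete: compartmental neuron models are conventionally reduced to a quasi-tridiagonal linear system on the dendritic tree and solved with Hines elimination, whose parent-indexed, branch-dependent access pattern is exactly the kind of irregularity that dense deep-learning kernels handle poorly. The following CPU sketch is purely illustrative (it is not HelioX or DHS code; the function name, argument layout, and example values are ours):

```python
# Minimal illustrative sketch: Hines elimination for the tree-structured
# linear system A v = b arising from a compartmental cable equation.
# Compartments are ordered so that parent[i] < i (root at index 0).
# NOT HelioX code -- a plain-Python stand-in for the solver pattern.

def hines_solve(parent, d, u, l, b):
    """Solve the quasi-tridiagonal system defined on a tree.

    parent[i] -- index of compartment i's parent (parent[0] == -1)
    d[i]      -- diagonal entry of row i
    u[i]      -- coupling of row i to its parent's voltage
    l[i]      -- coupling of the parent's row to v[i]
    d and b are modified in place; returns the voltage vector v.
    """
    n = len(d)
    # Leaf-to-root sweep: eliminate each child's coupling into its parent.
    # Note the data-dependent indirection through parent[i] -- this is the
    # irregular access pattern dense GPU kernels are not built for.
    for i in range(n - 1, 0, -1):
        p = parent[i]
        f = l[i] / d[i]
        d[p] -= f * u[i]
        b[p] -= f * b[i]
    # Root-to-leaf back-substitution.
    v = [0.0] * n
    v[0] = b[0] / d[0]
    for i in range(1, n):
        v[i] = (b[i] - u[i] * v[parent[i]]) / d[i]
    return v

# Example: a soma (0) with two dendritic children, one of which bifurcates.
parent = [-1, 0, 0, 1, 1]
v = hines_solve(parent,
                d=[3.0, 4.0, 3.0, 2.0, 2.5],
                u=[0.0, -1.0, -1.0, -0.5, -0.5],
                l=[0.0, -1.0, -1.0, -0.5, -0.5],
                b=[1.0, 2.0, 0.5, 1.5, 1.0])
```

Because each elimination step touches only a compartment and its parent, the solve is inherently sequential along each branch; approaches like DHS reorganize this work so that independent subtrees can proceed concurrently on the GPU.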