Information Dynamics and Memory in Neural Networks through Fisher Information Diffusion
Abstract
We present a general theoretical framework for analyzing how recurrent networks encode information about past inputs in their evolving dynamics, rather than as convergence to static attractors. Using dynamical mean-field theory and diffusion methods from physics, we derive a Fisher information diffusion operator that links network connectivity structure to the time-resolved propagation of information across interacting subpopulations. The analysis reveals that operating near criticality (spectral radius near one) is necessary but not sufficient for reliable memory in structured or non-normal recurrent networks: effective information retention also requires alignment between the input–output structure and the stable dynamical subspaces. The theory yields principled initialization rules that balance stability and sensitivity, mitigating vanishing and exploding gradients. Experiments on the copy task and sequential MNIST show faster convergence and higher accuracy than standard random initialization. Together, these results provide principled design guidelines for recurrent networks and new theoretical insight into how their dynamics can preserve information over time.
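To make the "spectral radius near one" condition concrete, the following is a minimal illustrative sketch of a spectral-radius-scaled recurrent initializer. The abstract does not specify the paper's actual initialization rule; the function name `spectral_init` and the target radius `rho` here are hypothetical, showing only the generic near-critical scaling the framework builds on.

```python
# Illustrative sketch (not the paper's method): initialize recurrent weights
# at random, then rescale so the spectral radius equals a target rho < 1.
import numpy as np

def spectral_init(n_hidden, rho=0.99, seed=0):
    """Random Gaussian recurrent weights rescaled to spectral radius rho."""
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))
    radius = np.max(np.abs(np.linalg.eigvals(W)))  # largest |eigenvalue|
    return W * (rho / radius)

W = spectral_init(256, rho=0.99)
print(np.max(np.abs(np.linalg.eigvals(W))))  # ~0.99, i.e. near criticality
```

Note that for non-normal connectivity, as the abstract emphasizes, a spectral radius near one alone does not guarantee memory; transient amplification along misaligned subspaces can still degrade information retention, which is why the stated alignment condition matters.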