

Poster

ReLU Network with Width $d+\mathcal{O}(1)$ Can Achieve Optimal Approximation Rate

Chenghao LIU · Minghua Chen


Abstract: The prevalent employment of narrow neural networks, characterized by their minimal parameter count per layer, has led to a surge in research exploring their potential as universal function approximators. A notable result in this field states that networks of width just $d+1$ can approximate any continuous function of a $d$-dimensional input arbitrarily well. However, the optimal approximation rate for these narrowest networks, i.e., the optimal relation between the count of tunable parameters and the approximation error, has remained unclear. In this paper, we address this gap by proving that ReLU networks with width $d+1$ can achieve the optimal approximation rate for continuous functions over the domain $[0,1]^d$ under the $L^p$ norm for $p\in[1,\infty)$. We further show that for the uniform norm, a width of $d+11$ is sufficient. We also extend the results to narrow feed-forward networks with various activations, confirming that they too can approximate at the optimal rate. This work adds to the understanding of the universal approximation capability of narrow networks.
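
For orientation, the display below sketches what "optimal approximation rate" typically means in this line of work; the parameter count $W$, the network class $\mathcal{N}_W$, and the modulus of continuity $\omega_f$ are notation introduced here for illustration rather than taken from the paper, and the rate is stated only up to constants and possible logarithmic factors.

$$
\inf_{\phi \in \mathcal{N}_W} \|f - \phi\|_{L^p([0,1]^d)} \;\lesssim\; \omega_f\!\left(W^{-2/d}\right), \qquad f \in C([0,1]^d),\ p \in [1,\infty),
$$

where $\mathcal{N}_W$ denotes ReLU networks with at most $W$ tunable parameters and $\omega_f(\delta) = \sup_{\|x-y\|\le\delta} |f(x)-f(y)|$. A matching lower bound of the same order is known for unconstrained-width deep ReLU networks, so a family of networks is said to achieve the optimal rate when its error matches this bound up to factors independent of $W$; the result above asserts this for networks whose width is constrained to $d+1$ (or $d+11$ for the uniform norm).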
