

Invited talk in Workshop: Continuous Time Perspectives in Machine Learning

ResNet after all? How (not) to design continuous neural network architectures

Katharina Ott


Abstract:

Can Neural ODE architectures provide a continuous-time extension of residual neural networks? I will show that this depends on the specific numerical solver chosen for training Neural ODE models. If the trained model is truly a flow generated by an ODE, it should be possible to switch to another numerical solver with equal or smaller numerical error without loss of performance. But if training relies on a solver with an overly coarse discretization, then testing with another solver of equal or smaller numerical error results in a sharp drop in accuracy. In such cases, the combination of vector field and numerical method cannot be interpreted as a flow generated by an ODE, which arguably poses a fatal breakdown of the continuous-in-time concept. I will examine the specific effects that lead to this breakdown and discuss how to ensure that the model maintains continuous-time properties.
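The solver-swap test described in the abstract can be illustrated with a toy example (everything here is an illustrative stand-in, not the talk's actual experiments): a simple linear vector field plays the role of the trained network, a two-step Euler scheme plays the role of an overly coarse training-time solver, and a fine Runge-Kutta integration plays the role of a more accurate test-time solver. If the coarse-solver trajectory differs sharply from the accurate one, a model fit to the coarse dynamics cannot be interpreted as the ODE's flow.

```python
import math

# Stand-in for a trained vector field: in a Neural ODE this would be a
# neural network; here the linear ODE dy/dt = -2*y keeps things exact.
def f(y):
    return -2.0 * y

def euler(y0, t1, n_steps):
    # Explicit Euler with n_steps equal steps on [0, t1].
    h = t1 / n_steps
    y = y0
    for _ in range(n_steps):
        y = y + h * f(y)
    return y

def rk4(y0, t1, n_steps):
    # Classical 4th-order Runge-Kutta on [0, t1].
    h = t1 / n_steps
    y = y0
    for _ in range(n_steps):
        k1 = f(y)
        k2 = f(y + 0.5 * h * k1)
        k3 = f(y + 0.5 * h * k2)
        k4 = f(y + h * k3)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y

exact = math.exp(-2.0)        # true flow of dy/dt = -2y at t = 1
coarse = euler(1.0, 1.0, 2)   # coarse "training-time" discretization
fine = rk4(1.0, 1.0, 100)     # accurate "test-time" solver

# The coarse Euler trajectory collapses to 0, far from the true flow,
# while the fine solver matches it closely. A model whose performance
# depends on the coarse trajectory will break when the solver is swapped.
print(coarse, fine, exact)  # → 0.0, ~0.1353, ~0.1353
```

The design point is that solver-invariance is a checkable property: re-solving the same vector field with a more accurate method and comparing outputs is exactly the diagnostic the abstract proposes.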
