Poster
"Hey, that's not an ODE": Faster ODE Adjoints via Seminorms
Patrick Kidger · Ricky T. Q. Chen · Terry Lyons
Neural differential equations may be trained by backpropagating gradients via the adjoint method, which is itself another differential equation, typically solved using an adaptive-step-size numerical differential equation solver. A proposed step is accepted if its error, \emph{relative to some norm}, is sufficiently small; else it is rejected, the step is shrunk, and the process is repeated. Here, we demonstrate that the particular structure of the adjoint equations makes the usual choices of norm (such as $L^2$) unnecessarily stringent. By replacing this norm with a more appropriate (semi)norm, fewer steps are unnecessarily rejected and the backpropagation is made faster. This requires only minor code modifications.
Experiments on a wide range of tasks---including time series, generative modeling, and physical control---demonstrate a median improvement of 40\% fewer function evaluations. On some problems we see as much as 62\% fewer function evaluations, so that the overall training time is roughly halved.
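The key observation behind the speedup is that the parameter-gradient components of the augmented adjoint state are plain integrals rather than ODE states: errors in them do not feed back into the dynamics, so they need not constrain the step size. The sketch below illustrates the idea and how one might enable it in torchdiffeq; the `adjoint_options=dict(norm="seminorm")` option and the toy `Func` model are assumptions made for illustration, not details taken from this page.

```python
import torch
from torchdiffeq import odeint_adjoint  # pip install torchdiffeq


def rms_norm(x):
    # The RMS norm adaptive solvers typically apply to their error estimate.
    return x.pow(2).mean().sqrt()


def make_seminorm(num_param_grads):
    # A (semi)norm that ignores the trailing parameter-gradient entries of the
    # flattened augmented adjoint state: errors there do not feed back into
    # the dynamics, so they should not force the step size down.
    def seminorm(aug_err):
        return rms_norm(aug_err[:-num_param_grads])
    return seminorm


class Func(torch.nn.Module):
    # Toy vector field f(t, y); stands in for a real neural ODE model.
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(2, 2)

    def forward(self, t, y):
        return torch.tanh(self.linear(y))


func, y0, t = Func(), torch.randn(8, 2), torch.linspace(0.0, 1.0, 10)

# Assumption: recent torchdiffeq releases expose the paper's seminorm via
# `adjoint_options`; only the backward-in-time adjoint solve is affected.
ys = odeint_adjoint(func, y0, t, adjoint_options=dict(norm="seminorm"))
ys.sum().backward()  # parameter gradients, computed with fewer solver steps
```

Only the adjoint (backward) pass uses the seminorm here; the forward solve keeps its usual norm, matching the setup described in the paper.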
Author Information
Patrick Kidger (University of Oxford)
Maths+ML PhD student at Oxford. Neural ODEs+SDEs+CDEs, time series, rough analysis. (Also ice skating, martial arts and scuba diving!)
Ricky T. Q. Chen (University of Toronto)
Terry Lyons (University of Oxford)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Spotlight: "Hey, that's not an ODE": Faster ODE Adjoints via Seminorms
  Tue. Jul 20th, 02:30 -- 02:35 PM
More from the Same Authors
- 2023 Poster: Sampling-based Nyström Approximation and Kernel Quadrature
  Satoshi Hayakawa · Harald Oberhauser · Terry Lyons
- 2021 Workshop: INNF+: Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models
  Chin-Wei Huang · David Krueger · Rianne Van den Berg · George Papamakarios · Ricky T. Q. Chen · Danilo J. Rezende
- 2021 Poster: SigGPDE: Scaling Sparse Gaussian Processes on Sequential Data
  Maud Lemercier · Cristopher Salvi · Thomas Cass · Edwin V. Bonilla · Theodoros Damoulas · Terry Lyons
- 2021 Spotlight: SigGPDE: Scaling Sparse Gaussian Processes on Sequential Data
  Maud Lemercier · Cristopher Salvi · Thomas Cass · Edwin V. Bonilla · Theodoros Damoulas · Terry Lyons
- 2021 Poster: Neural SDEs as Infinite-Dimensional GANs
  Patrick Kidger · James Foster · Xuechen Li · Terry Lyons
- 2021 Spotlight: Neural SDEs as Infinite-Dimensional GANs
  Patrick Kidger · James Foster · Xuechen Li · Terry Lyons
- 2021 Poster: Neural Rough Differential Equations for Long Time Series
  James Morrill · Cristopher Salvi · Patrick Kidger · James Foster
- 2021 Spotlight: Neural Rough Differential Equations for Long Time Series
  James Morrill · Cristopher Salvi · Patrick Kidger · James Foster
- 2020 Workshop: INNF+: Invertible Neural Networks, Normalizing Flows, and Explicit Likelihood Models
  Chin-Wei Huang · David Krueger · Rianne Van den Berg · George Papamakarios · Chris Cremer · Ricky T. Q. Chen · Danilo J. Rezende
- 2019 Workshop: Invertible Neural Networks and Normalizing Flows
  Chin-Wei Huang · David Krueger · Rianne Van den Berg · George Papamakarios · Aidan Gomez · Chris Cremer · Aaron Courville · Ricky T. Q. Chen · Danilo J. Rezende
- 2019 Poster: Invertible Residual Networks
  Jens Behrmann · Will Grathwohl · Ricky T. Q. Chen · David Duvenaud · Joern-Henrik Jacobsen
- 2019 Oral: Invertible Residual Networks
  Jens Behrmann · Will Grathwohl · Ricky T. Q. Chen · David Duvenaud · Joern-Henrik Jacobsen