A Fully First-Order Layer for Differentiable Optimization
Zihao Zhao ⋅ Kai-Chia Mo ⋅ Shing-Hei Ho ⋅ Brandon Amos ⋅ Kai Wang
Abstract
Differentiable optimization studies how to embed a mathematical program as a differentiable layer in machine learning pipelines. However, existing approaches typically rely on implicit differentiation, which requires expensive Hessian computations to differentiate through the optimality conditions. To address this challenge, we formulate the differentiable optimization problem as a bilevel optimization instance. We construct a new active-set Lagrangian as a proxy to compute an $\epsilon$-approximate hypergradient using only near-constant $O(\log (1/\epsilon))$ first-order information. We also show that applying this efficient hypergradient oracle to constrained bilevel optimization improves the overall gradient complexity to $\tilde{O}(\delta^{-1}\epsilon^{-3})$ for reaching a $(\delta, \epsilon)$-Goldstein stationary point. We implement our method, `FFOLayer`, as a drop-in Python library compatible with existing differentiable optimization solvers. Our algorithm achieves significantly faster computation while converging comparably to existing solvers. Our code is available [here](https://anonymous.4open.science/anonymize/FFOLayer-B78B).
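As a sketch of the bilevel view the abstract refers to (the notation here is assumed for illustration, not taken from the paper body), embedding a parametric program as a layer amounts to

$$\min_{\theta}\; F(\theta) := f\bigl(y^\star(\theta)\bigr) \quad \text{s.t.} \quad y^\star(\theta) \in \operatorname*{arg\,min}_{y:\, h(y,\theta)\le 0}\; g(y, \theta),$$

where $\theta$ are the upstream parameters, $y^\star(\theta)$ is the solution of the embedded program, and the hypergradient $\nabla_\theta F(\theta)$ is the quantity that implicit differentiation would obtain by differentiating through the optimality conditions.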
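Below is a minimal usage sketch of the drop-in pattern described above. The `FFOLayer` import, constructor, and argument names are assumptions made for illustration, mirroring the `cvxpylayers`-style interface that existing differentiable optimization solvers expose; consult the linked repository for the actual API.

```python
# Hypothetical usage sketch: the FFOLayer import, constructor, and argument
# names below are illustrative assumptions, not the documented interface.
import cvxpy as cp
import torch
from ffolayer import FFOLayer  # assumed module/class name

# Inner problem: project x onto the probability simplex,
#   min_z ||z - x||^2  s.t.  z >= 0,  sum(z) == 1.
n = 3
x = cp.Parameter(n)
z = cp.Variable(n)
problem = cp.Problem(cp.Minimize(cp.sum_squares(z - x)),
                     [z >= 0, cp.sum(z) == 1])

# Wrap the program as a differentiable layer (assumed drop-in signature).
layer = FFOLayer(problem, parameters=[x], variables=[z])

x_torch = torch.randn(n, requires_grad=True)
(z_star,) = layer(x_torch)   # forward pass: solve the inner program
z_star.sum().backward()      # backward pass: first-order hypergradient, no Hessians
print(x_torch.grad)
```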