Timezone: »

Debiasing a First-order Heuristic for Approximate Bi-level Optimization
Valerii Likhosherstov · Xingyou Song · Krzysztof Choromanski · Jared Quincy Davis · Adrian Weller

Wed Jul 21 05:20 AM -- 05:25 AM (PDT) @ None
Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory complexity proportional to the length $r$ of its inner optimization loop. To address this complexity, an earlier first-order method (FOM) was proposed as a heuristic which omits second derivative terms, yielding significant speed gains and requiring only constant memory. Despite FOM's popularity, there is a lack of theoretical understanding of its convergence properties. We contribute by theoretically characterizing FOM's gradient bias under mild assumptions. We further demonstrate a rich family of examples where FOM-based SGD does not converge to a stationary point of the ABLO objective. We address this concern by proposing an unbiased FOM (UFOM) enjoying constant memory complexity as a function of $r$. We characterize the introduced time-variance tradeoff, demonstrate convergence bounds, and find an optimal UFOM for a given ABLO problem. Finally, we propose an efficient adaptive UFOM scheme.

Author Information

Valerii Likhosherstov (University of Cambridge)
Xingyou Song (Google Brain)
Krzysztof Choromanski (Google Brain Robotics)
Jared Quincy Davis (DeepMind & Stanford University)
Adrian Weller (University of Cambridge, Alan Turing Institute)

Adrian Weller is a Senior Research Fellow in the Machine Learning Group at the University of Cambridge, a Faculty Fellow at the Alan Turing Institute for data science and an Executive Fellow at the Leverhulme Centre for the Future of Intelligence (CFI). He is very interested in all aspects of artificial intelligence, its commercial applications and how it may be used to benefit society. At the CFI, he leads their project on Trust and Transparency. Previously, Adrian held senior roles in finance. He received a PhD in computer science from Columbia University, and an undergraduate degree in mathematics from Trinity College, Cambridge.

Related Events (a corresponding poster, oral, or spotlight)

More from the Same Authors