Abstract:
We explore whether quantum advantages can be found for the zeroth-order feedback online exp-concave optimization problem, which is also known as bandit exp-concave optimization with multi-point feedback. We present quantum online quasi-Newton methods to tackle the problem and show that there exists quantum advantages for such problems. Our method approximates the Hessian by quantum estimated inexact gradient and can achieve $O(n\log T)$ regret with $O(1)$ queries at each round, where $n$ is the dimension of the decision set and $T$ is the total decision rounds. Such regret improves the optimal classical algorithm by a factor of $T^{2/3}$.
Chat is not available.