Abstract:
We consider the setting of online convex optimization with adversarial time-varying constraints in which actions must be feasible w.r.t. a fixed constraint set, and are also required on average to approximately satisfy additional time-varying constraints. Motivated by scenarios in which the fixed feasible set (hard constraint) is difficult to project on, we consider projection-free algorithms that access this set only through a linear optimization oracle (LOO). We present an algorithm that, on a sequence of length and using overall calls to the LOO, guarantees regret w.r.t. the losses and constraints violation (ignoring all quantities except for ). In particular, these bounds hold w.r.t. any interval of the sequence. This algorithm however also requires access to an oracle for minimizing a strongly convex nonsmooth function over a Euclidean ball. We present a more efficient algorithm that does not require the latter optimization oracle but only first-order access to the time-varying constraints, and achieves similar bounds w.r.t. the entire sequence. We extend the latter to the setting of bandit feedback and obtain similar bounds (as a function of ) in expectation.
Chat is not available.