Paper ID: 199
Title: Adaptive Algorithms for Online Convex Optimization with Long-term Constraints
Review #1
=====
Summary of the paper (Summarize the main claims/contributions of the paper.): This paper proposed an adaptive version of the online learning algorithms in Mahdavi et al. (2012a) for solving the online convex optimization problem with long-term constraints. The proposed algorithm shows faster empirical convergence, new general regret bounds, related domain for regret bounds of O(T^2/3), and tighter bounds for strongly convex losses.
Clarity - Justification: The paper is well written, with basic problem introduction and its backgrounds in the introduction section. The problem statement and adaptive algorithm is well formulated and presented, with references to its most related work. The detailed theoretical analysis is solid. And the final empirical evaluation is clear and convincing.
Significance - Justification: The major novelty of this paper are the adaptive algorithm, and the new general regret bounds, considerably improving the previous work of Mahdavi et al. (2012a).
Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): 1. The paper presented an adaptive online algorithm for solving the OCO problem with long-term constraints as proposed in Mahdavi et al.(2012a). This seems to be new for solving the OCO problem in this specific settings, although adaptive algorithms have been used commonly in solving other OCO problems. 2. The authors figured out more general regret bounds for the proposed adaptive algorithm, with its non-adaptive algorithm as a special case. 3. The restriction of polyhedral sets for the regret bounds of O(T^2/3) is released. 4. Tighter bounds for strongly convex losses are obtained in this paper. 5. The empirical study is clear and quite convincing. 6. A minor limitation is that the authors did not well explain why the new algorithm is able to achieve a faster convergence, as observed from an empirical view.
=====
Review #2
=====
Summary of the paper (Summarize the main claims/contributions of the paper.): This paper focuses on the minimization of regret on a compact and convex action set defined by some inequalities, but on which the projections are prohibitively expensive to compute. Regret bounds of the order T^{max(b,1-b)} while the overall errors with respect to the constraint scales as T^{1-b/2}. For instance, for b=1/2, it recovers the classic regret rate of T^{1/2} and an error of the order of T^{3/4}.
Clarity - Justification: This paper is rather clear, even though the algorithms employed somehow lacks motivations and/or explanation and/or intuition in the choice of parameters.
Significance - Justification: This paper improves the results of a precedent paper of Mahdavi et al. and they somehow use the same techniques (combined with those of Hazan et al.). Yet the results are interesting and not that easy to prove with different techniques from online learning (for instance, it reminds me of regret minimization under global/path constraint, global cost, approachability etc. of Mannor and his co-authors. For that reason, it is a bit disappointing that the probably optimal rates of T^{1/2} for regret and T^{1/2} for the constraints is not attained (or any lower bounds provided). I think it could be a really great improvement is such results are obtained.
Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): As already said, this paper is a bit incremental but still interesting. My guess is that the rates are not optimal but that is still ok with me. For the final version, I would recommend more details and intuitions on the choices of the parameters of their algorithms.
=====
Review #3
=====
Summary of the paper (Summarize the main claims/contributions of the paper.): The paper considers the online convex optimization problem with globle constraints of the form \sum_t g_i(x_t)\leq 0, where g_i is a convex function. The paper is a followup of Mahdavi et al 2012 where only linear constraints were considered (or equivalently, polyhedra constraint). The paper provides a primal-dual saddle-point formulation by relaxing the constraints as part of the objective function and perform online projected gradient descent alternately for the primal and dual variables. The idea and the analysis are very similar to Mahdavi et al. as well. The main result is a sublinear regret bound, with a sublinear voilation of the constraints (there is user specified parameter \beta to tradeoff these two quantities). By setting \beta to some specific values, the paper recovers some bound obtained in the previous work. The experiments test the algorithm for two specific convex constraints, one is the set of doubly stochastic matrices (a polytope constraint) and the other is a elastic net convex constaint. The algorithm proposed in the paper outperforms the mirror prox in Mahdavi et al. experimentally (even they have the similar asymptotic regret bounds) for the first problem, and is better than the OGD (Mahdavi et al.) for both problems.
Clarity - Justification: The paper is generally well-written. Sec 3.3. is a bit sketchy.
Significance - Justification: In general, I think the extension to general convex constraints is nice. From the technical view point, the paper seems be to qutie similar to the prior work and it is unclear what the critical technical innovation is. Nevertheless, the proofs seems to be sound and not quite straightfoward.
Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): It may be helpful to explain the meaning of assumption (4). There are several recent related work on bandit with global constraints. It is useful to discuss those work. For example, Bandits with Knapsacks Fast algorithms for online stochastic convex programming
=====