To Reviewers 5 and 6: we first address Thm 3.2.
- Regarding Thm 3.2 and Lem 3.1: we apologize for the confusion regarding the space over which G is defined. It is true that G is originally defined over a vector x with entries x_i and x_ij for all a_i and all a_ij, and that it is later defined over a vector x whose entries are only the x_ij's for all a_ij. We note that it is possible to keep G defined over the original space (both the x_i's and the x_ij's): the mathematical program (6) then has exactly the same objective function as (1), which is defined over the same space, and the constraints are equivalently reformulated with the x_i's unconstrained. This resolves both issues: (1) multilinearity: G is multilinear by Lem 3.1 since we have not changed G; (2) the feasible region is still downward closed after adding back the x_i's, since the x_i's are unconstrained in (6) and therefore trivially downward closed. We originally removed the x_i's from the space of G for clarity, but following Reviewer 6's comments we realize this is a source of confusion; we will add the x_i's back to avoid it. We hope this (as well as the comments below) addresses your main concerns, and we would greatly appreciate it if you revised your scores.
Reviewer 5: thank you for the interesting extensions and questions.
- Running time: the running time for the general continuous approach is indeed a high-degree polynomial (n^2 iterations, with n^6 samples required at each iteration). The number of samples could be decreased to make this method run faster. For coverage functions, the LP approach is much faster than continuous greedy. Note that the local search algorithm has near-quadratic running time in n, that in most applications k, l, and m are much smaller than n, and that we found it to be very practical.
- Different cardinality constraints: all our results indeed extend to such a setting; we will include this.
- Non-monotone functions: the unified continuous greedy (Feldman et al., 11), which builds upon the original continuous greedy, can be applied (the apx ratio in this case is 1/e instead of 1 - 1/e). It is thus possible to obtain an apx ratio arbitrarily close to 1/e for non-monotone submodular functions; we will include this as well. The LP and local search approaches offer guarantees for coverage functions, which are always monotone.
Reviewer 6: thank you for the detailed comments and questions.
- Motivation for k < l: in several multi-objective data summarization tasks, we wish to limit differently the total number of elements used across all categories (l) and the number of elements used by each category (k). The reason is that for each category we want elements that represent it very well, and we may also want to limit the total number of elements picked (l) to be less than m*k. If we used all the elements in S to represent each category (k = l), then there is a high chance that for some categories there would be elements that do not represent that category well at all.
- k >= 311: this is a typo and it should be k < 311. We obtained 311 by numerically optimizing, for each k, the apx ratio in Thm 3.3 over all choices of epsilon. We then observed that for k < 311 the apx ratio obtained with the best epsilon is less than 1/2, and greater than 1/2 otherwise (a small numerical sketch of this search is included below).
- Approximating G(x): G(x) is approximated during continuous greedy by sampling sets from the distribution that picks each element ij independently with probability x_ij. It is possible to show that with polynomially many samples, the average of the function values on these samples is arbitrarily close to G(x); we will include this as well (a small sketch of this estimator is included below).
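To make the sampling-based estimate of G(x) concrete, here is a minimal Python sketch (not from the paper); the set function f, the dict x of marginals x_ij, and the sample count are placeholders introduced for illustration only.

```python
import random

def estimate_G(x, f, num_samples=1000):
    """Monte Carlo estimate of G(x): average f over random sets R that
    include each element ij independently with probability x[ij].

    x: dict mapping elements ij to their marginal probability x_ij (placeholder).
    f: the underlying set function, evaluated on a set of elements (placeholder).
    """
    total = 0.0
    for _ in range(num_samples):
        # Sample a set from the product distribution induced by x.
        R = {e for e, p in x.items() if random.random() < p}
        total += f(R)
    return total / num_samples
```

In the analysis, num_samples would be taken polynomially large so that the estimate is arbitrarily close to G(x) with high probability.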
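As an illustration of how the k < 311 threshold was located, the following hypothetical sketch assumes a function approx_ratio(k, eps) implementing the expression from Thm 3.3 (not reproduced here); for each k it maximizes the ratio over a grid of eps > 2/k (the upper end of the grid, 1.0, is an assumption) and reports the smallest k whose best ratio reaches 1/2.

```python
import numpy as np

def best_ratio(k, approx_ratio, grid_size=10000):
    """Best apx ratio achievable for a given k, maximizing over eps > 2/k.

    approx_ratio(k, eps): placeholder for the expression in Thm 3.3.
    """
    eps_grid = np.linspace(2.0 / k + 1e-9, 1.0, grid_size)
    return max(approx_ratio(k, eps) for eps in eps_grid)

def smallest_k_above_half(approx_ratio, k_max=1000):
    """Smallest k for which the best apx ratio reaches 1/2."""
    for k in range(3, k_max + 1):
        if best_ratio(k, approx_ratio) >= 0.5:
            return k
    return None
```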
Reviewer 1: thank you for your review.
- Thm 3.3 does not make sense for epsilon = 0: the theorem statement begins by assuming that epsilon > 2/k.
- Unclear how to optimize coverage functions via LP in Section 3.1: in Section 3.1 we note that a complete discussion of the LP approach can be found in the appendix.
- Thm 3 does not include a discussion of the running time in terms of epsilon: the running time does not depend on epsilon, only the apx ratio does. epsilon is left as a variable since the optimal epsilon for the best apx ratio depends on k.
- Missing definition of the potential function: the potential function is defined in line 450.
- In line 453, what is meant by the piecewise relaxation of f_j? Should L_j be the continuous relaxation of f_j by definition?: L_j is the piecewise relaxation of f_j, defined in line 455. It is not the continuous relaxation.
- Unclear how piecewise linearity enables computing the optimal solution efficiently: it enables the LP formulation, which can be solved efficiently. See line 418: Apdx C contains a detailed explanation.
- Without a thorough discussion of motivation, the studied problem lacks practical usage: the motivation is discussed thoroughly in the intro and experiments sections.
- The authors may want to add some discussion comparing the local search and continuous optimization algorithms: see the intro (starting at line 125).