Paper ID: 642 Title: Fast Constrained Submodular Maximization: Personalized Data Summarization Review #1 ===== Summary of the paper (Summarize the main claims/contributions of the paper.): The paper shows a new fast algorithm for maximizing a (non-monotone) submodular function subject to a p-system and l knapsack constraints. The algorithms and the analysis are a combination of previous work by Gupta et al. ’10 and Badanidiyuru and Vondrak ’14. The paper also obtains a slight tightening of the analysis by Gupta et al. ’10, which leads to an improvement in the approximation factor. Clarity - Justification: The proof and the experiments are well explained. Significance - Justification: Both the algorithms and the analysis are largely adaptations of previous work. Roughly speaking, the algorithm by Badanidiyuru and Vondrak ’14 for the same set of constraints but only monotone functions works by calling a version of greedy as a subroutine. Greedy does not work for the non-monotone setting, but fortunately an algorithm with a few calls to greedy by Gupta et al. does the job. Thus, the current work replaces greedy in the algorithm by Badanidiyuru and Vondrak ’14 with the algorithm by Gupta et al. In the proofs, instead of using the proof for greedy, one instead uses the proof from Gupta et al. While the plan seems simple, a nontrivial amount of work is needed to synthesize them into a single proof. The authors also show an improvement to the original analysis by Gupta et al. Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): The comparisons in the experiments are reasonable but can be improved for future work. For instance, all the scenarios considered can be modeled with exclusively knapsack constraints (all matroid constraints are either uniform or partition matroids, which are just a collection of cardinality constraints).
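To make the reviewer's point concrete: a cardinality constraint |S| <= k is exactly a knapsack constraint with unit costs, so any uniform (or, per part, partition) matroid constraint can be re-encoded as a knapsack. A minimal sketch, with hypothetical helper names (not from the paper):

```python
# Illustrative only: a cardinality constraint is the unit-cost special
# case of a knapsack constraint, so the two feasibility checks coincide.

def feasible_knapsack(S, cost, budget):
    """Knapsack feasibility: total cost of the chosen set within budget."""
    return sum(cost[e] for e in S) <= budget

def feasible_cardinality(S, k):
    """Uniform-matroid (cardinality) feasibility: at most k elements."""
    return len(S) <= k

ground = range(10)
unit_cost = {e: 1 for e in ground}  # unit costs make the knapsack a cardinality bound

# With unit costs and budget k, the two constraints agree on every set.
for S in [{0, 3, 7}, set(ground), set()]:
    assert feasible_knapsack(S, unit_cost, 3) == feasible_cardinality(S, 3)
```

This is what makes the reviewer's experimental question meaningful: the same feasible region can be handed to the algorithm either as a matroid or as a knapsack, and the two encodings may behave differently in practice.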
Even with the same algorithm from this paper, what is the relative performance of the algorithm when modelling a constraint as a knapsack vs a matroid constraint? Line 474, 477: descritize -> discretize The years are missing in the citations Hartline et al. and Lindgren et al. ===== Review #2 ===== Summary of the paper (Summarize the main claims/contributions of the paper.): This work proposes a new algorithm, FANTOM, and proves that it is a (1 + epsilon)(p + 1)(2p + 2\ell + 1)/p approximation to the problem of non-monotone submodular maximization subject to a constraint that consists of the intersection of a p-system with \ell knapsack constraints. The algorithm has query complexity O(nrp log(n) / epsilon), where r is the size of the largest feasible solution. As indicated by Table 1 of the paper, the algorithm applies to a constraint setting for which no approximation guarantees previously existed. For many simpler constraint settings, it also provides a faster runtime than was previously known. Clarity - Justification: The example applications are nice, as is the long introduction. But the origin of some of the contents in Table 1 is unclear, and the proofs could really use some cleaning up. (Additionally, there are a significant number of typos.) I've made a bunch of notes in the "detailed comments" section that might help with getting the cleanup process started. Significance - Justification: As mentioned in the paper summary, FANTOM addresses a constraint setting never before covered, and does so with a relatively efficient algorithm. This constraint setting is clearly relevant to many practical applications (3 concrete examples are given in the paper), and thus the result is of substantial significance. Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): Line 95: "Gomes and Krause" reference doesn't have a date. (Several other references in the paper are also missing their dates.
Maybe check in the bib file to make sure the year is always in a separate field?) Line 99: "Kernel" should be "kernel" (no caps) Line 148: "that make them" should be "that makes them" Line 172: "Map inference" should be "MAP inference" Line 173: "Determinantal Point Processes" should be "determinantal point processes" Line 176: "divers" should be "diverse" Table 1: Can you make the references in Table 1 more precise (give Theorem numbers from the cited papers)? In particular, it's not clear to me where, in either of the Buchbinder et al. citations, there is an algorithm for general 1-matroid + \ell-knapsack constraints. Their 2012 paper handles the unconstrained setting and their 2014 paper the cardinality-constrained setting, but neither seems to address 1-matroid + \ell-knapsack. Also, it would be nice to have a Theorem and proof somewhere in the paper stating how the p-matroid + l-knapsack result simplifies to the p-system and p-matroid results given at the bottom right of the table. Line 237: "a family of subset of E" should be "a family of subsets of E" Line 434: Don't think the notation f_S(j) is defined anywhere. Might want to add "f_S(j) = f(S \cup {j}) - f(S)" somewhere in Section 5.1. Line 435: This line would have been much clearer to me if it had read simply "T = argmax_{j \in E} f(j)". Line 473: "we consider density threshold" should be "we consider the density threshold" Line 474: "descritize" should be "discretize" Line 478: "threshold" should be "thresholds" Line 496: "on three" should be "on the three" Line 508: "due to expensive" should be "due to an expensive" Line 517: "baseline start with" should be "baseline starts with" Line 522: Incomplete sentence. Experiments: Can you give some intuition about the performance of the baselines? (For instance, it would be nice to know why Greedy frequently beats Density Greedy.)
Line 757: "\sum_i^{\ell}" should be "\sum_{i=1}^{\ell}" Line 766: Up until this line things seem pretty straightforward, but the "max{...} >= 0.5\rho" expression really should have more explanation than "by submodularity". For example, something like the following would suffice: "Normalization requires that f(\emptyset) = 0, and submodularity implies f(j) - f(\emptyset) >= f(S_{\rho}) - f(T_{\rho}). Thus, if f(j) < 0.5\rho, then f(S_{\rho}) - f(T_{\rho}) < 0.5\rho. Combining this with the fact that f(S_{\rho}) > \rho, we know f(T_{\rho}) > 0.5\rho. Therefore, either f(j) >= 0.5\rho, or f(T_{\rho}) > 0.5\rho." In general, more detailed explanations in the proofs would be nice. If you're running out of space, please just push one of the proofs to the appendix rather than skimping on explanation. Also, for clarity, perhaps use something other than T in Lines 435 and 436 to denote the singleton set. (In the proof T_{\rho} is used to denote the non-singleton solution possibility, while in the algorithm T is the singleton solution.) Line 771: "Now we divide elements in C into two sets." I don't think a specific C is defined anywhere? Does this statement mean "Now consider any C \in I (any independent set), and divide its elements into two sets."? If so, then state it that way in the proof. Lines 780 and 782: "j" should be "e" Line 788: "then we still S_{\rho} as" should be "then we still get S_{\rho} as" Lines 794 and 797: "S" should be "S_{\rho}" General question about the proof of Theorem 5.1: The two cases analyzed consider FANTOM terminating because of violated constraints. What about the case where FANTOM terminates not because adding the next item would violate any constraint, but simply because it cannot find an item of sufficient density? Notation suggestion: It's a bit confusing to have both p and \rho in the paper, since they look so similar. Replacing \rho with any other letter would improve the clarity.
Line 866: "in lemma 5.2" should be "in Theorem 5.2" Line 873: "showed it's applications" should be "showed its applications" ===== Review #3 ===== Summary of the paper (Summarize the main claims/contributions of the paper.): This paper considers fast algorithms for maximization of (monotone and non-monotone) submodular functions under the intersection of p-matroid and l-knapsack constraints. The technique builds on a line of work in this area, and provides the state-of-the-art guarantees for this problem. The main contribution of the paper is the FANTOM algorithm: this algorithm obtains a desirable approximation guarantee of (1+\epsilon)(p+1)(2p+2l+1)/p using O(nrp log n / epsilon) queries; what is nice about this algorithm is that its performance is comparable to state-of-the-art algorithms that are specialized for a particular set of constraints (e.g. matroid or cardinality constraints), but it is somewhat of a bulldozer in that it can take on very general problems. The personalized data summarization is a nice example. Clarity - Justification: The paper is well written and easy to follow. The comparison with previous work is quite clear. Significance - Justification: The significance is somewhere between above average and excellent -- I think this paper is a nice contribution but I'm discounting since there has been a rich line of work on this topic and the algorithmic techniques are new but are somewhat generalizations of previous ones. Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): The authors first present several nice applications of submodular maximization where one would care about fast algorithms. The nicest application is the personalized data summarization, where the authors give a new model for this problem. In particular, the authors show how submodular maximization under matroid intersection and multiple knapsack constraints can be used.
This is a new and interesting application, and fully utilizes the power of submodular optimization. The other applications presented are nice, but are not new to the paper. This is an advantage since the authors show how their algorithms can be applied to problems we care about. The applications discussed are personalized image summarization (here similar to facility location problems), and revenue maximization in social networks, which reduces to non-monotone submodular maximization. The algorithm suggested here applies to all the applications. The algorithm is called FANTOM and has a similar flavor to previous algorithms for fast submodular maximization: at a very high level it chooses a threshold and keeps all the elements whose bang-per-buck is above this threshold. The crux here is the analysis, which is not trivial for the general constraints here. Regarding the experiments, they are quite extensive, compare against natural benchmarks (variants of greedy), and mainly showcase the applicability of FANTOM. I appreciate the fact that the authors state that they do not see immediate applications of non-monotone submodular maximization. It's nevertheless useful to have the result hold for this case, and makes the result have an even more general appeal. =====
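The density-threshold rule Review #3 describes (keep an element if its marginal gain per unit cost clears a threshold rho) can be sketched in a few lines. This is an illustration of the general idea only, not the paper's FANTOM algorithm; f, cost, budget, and rho are placeholder names, and real variants sweep a geometric grid of thresholds rather than using a single fixed one.

```python
# Minimal density-threshold greedy sketch (illustrative, not FANTOM):
# scan the ground set once and keep any element whose marginal gain per
# unit cost ("bang-per-buck") is at least rho and that fits the budget.

def density_threshold_greedy(ground, f, cost, budget, rho):
    S = []
    spent = 0.0
    for e in ground:
        gain = f(S + [e]) - f(S)  # marginal gain f_S(e)
        if cost[e] > 0 and gain / cost[e] >= rho and spent + cost[e] <= budget:
            S.append(e)
            spent += cost[e]
    return S

# Toy run with a modular (hence submodular) f: a set's value is the sum
# of its elements' individual values.
values = {0: 5.0, 1: 1.0, 2: 4.0}
costs = {0: 2.0, 1: 2.0, 2: 1.0}
f = lambda S: sum(values[e] for e in S)
print(density_threshold_greedy([0, 1, 2], f, costs, budget=3.0, rho=2.0))
# → [0, 2]  (element 1's density 1.0/2.0 = 0.5 falls below rho = 2.0)
```

Element 0 (density 2.5) and element 2 (density 4.0) clear the threshold and fit within the budget, while element 1 is rejected on density, matching the "keep only high bang-per-buck elements" behavior the review summarizes.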