Paper ID: 1225 Title: Algorithms for Optimizing the Ratio of Submodular Functions Review #1 ===== Summary of the paper (Summarize the main claims/contributions of the paper.): The paper develops several algorithms that optimize ratios of submodular functions with approximation guarantees. Three algorithms (GreedRatio, EllipsoidApprox and MajorizationMinimization) provide a nice trade-off between computational efficiency and approximation factor, while two procedures reduce the optimization to the (harder)problem of optimizing difference of submodular functions and submodular optimization with submodular constraints respectively. Simulation on a toy experiment shows that these approaches are better than a random baseline. Clarity - Justification: The paper is well written. The toy experiment is simple and the procedures are clearly explained, so they should be easy to reproduce. Significance - Justification: Although ratios of submodular functions can be optimized by using a black-box optimizer for differences of submodular functions with a log-transform, the authors rightly point out that ratios are a useful and tractable subclass. Moreover, there are interesting open questions on deriving good approximation algorithms for ratios of non-monotone submodular functions. Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): In 1.1, I believe the measure you define is the Dice coefficient, not the F-measure. I would appreciate a more rigorous lower bound argument (i.e., make the oracle model explicit: does it apply to oracles that only return f(X)/g(X) or also for oracular access to f(X) and g(X),etc.) Figure labels are far too small to read. In Figure 1, how was the optimal value calculated? At present, Section 6 feels quite pointless (and I'm willing to accept the paper just for a collection of theoretically interesting procedures). For a comprehensive evaluation section, it would make sense to apply these procedures to the normalized cuts and other image segmentation problems and study computational scaling issues there (and reserve the synthetic setup just to verify the approximation factors). Similarly, it is often interesting to see if the approximation factors derived by analysis are tight or loose in real world (small-scale) instances. ===== Review #2 ===== Summary of the paper (Summarize the main claims/contributions of the paper.): This paper introduces a new problem, which is minimizing the ratio f(S)/g(S) of two submodular functions f() and g(). This problem is motivated by three machine learning applications. In the general case, a strong lower bound of n^{1/2 ­ eps}, which is tight up to a log factor, is provided and positive results are given in terms of the curvature of the function. Approximation factors of e/(e­1) and 1+ eps are obtained in the special cases where f() and g(), respectively, are additive. Clarity - Justification: The paper is well written and easy to follow; the authors also do a good job with positioning the contributions in relationship to previous work. Significance - Justification: The main problem is that this paper suggests a new optimization model which has bad approximation guarantees (a sqrt n lower bound). The authors discuss some related optimization models like maximizing the difference between submodular functions. It is mentioned that these problems can also be formulated as a difference of submodular functions or as constrained submodular optimization but it is not argued why it would be more relevant to formulate these problems as the ratio of two submodular functions instead of the known formulations. Is it possible that these models better are more suitable? There is no discussion about this, nor is there experimental comparison. Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): - The positive results listed above are interesting; - The approximation ratios obtained when f() or g() are additive are nice (note that these are special cases, and follow from either the curvature result or known results for minimizing the difference of submodular functions). Regarding applications, maximizing the F­measure in information retrieval seemed like a good application. I have some reservations. Regarding applications, maximizing diversity does not seem to be a natural application in this case. A minor comment regarding the second application is that g() which is defined as m(A)(m(V) ­ m(A)) seems to be an additive functions. ===== Review #3 ===== Summary of the paper (Summarize the main claims/contributions of the paper.): This paper discusses the minimization of the ratio of submodular functions, where applications, the relation to other problems (such as discrete dc problem), approximation algorithms and hardness bounds are discussed. This problem is an instance of combinatorial fractional programming, which has been discussed for a few decades in combinatorial optimization. Although the paper does not mention at all about the relation to such existing papers, it includes several novel theoretical results that could be useful in applications including machine learning. Clarity - Justification: The paper is well organized. On the other hand, differences from existing works are not clear (although I admit this paper includes several novel results). The authors should survey more seriously the related papers in combinatorial optimization. I think it would be mandatory because this paper discusses almost only theoretical/algorithmic aspects of this problem without any mention about applications in machine learning. Significance - Justification: This paper includes several important novel results. However, relations to existing works in combinatorial fractional programming problems are not addressed at all. Detailed comments. (Explain the basis for your ratings while providing constructive feedback.): The minimization of the ratio of submodular functions (which is an instance of the so-called combinatorial fractional problem) is a topic that has been often discussed in combinatorial optimization. However, I think that this paper includes several novel theoretical results (although the paper does not cite any such papers). I recommend the authors to include the connections of the existing papers in combinatorial optimization about combinatorial fractional programming. I think that the most important novelty of paper is the results based on the Curvature of a submodular function (for example, Theorem 3.4) and the majorization-minimization scheme. The other results look somehow minor (straightforward applications of the existing techniques to the problem). The experiment looks like a giveaway attached. I think that the authors should perform it more seriously. At least, there are much space for that in the current paper. The results in Figure 1 just show that all algorithms behave almost equivalently. =====