We thank all reviewers for their valuable time and comments.

1) Dice's coefficient or the F-measure. (R1) As far as we understand, Dice's coefficient shares the same definition and interpretation as the F-measure. Although we feel it is more conventional to call the metric the F-measure in the context of Information Retrieval, we appreciate your pointing this out.

2) Make the oracle model explicit: does it apply to oracles that only return f(X)/g(X) or also for oracle access to f(X) and g(X)? (R1) In our hardness analysis (Section 4), we give the lower bound of \Omega(\sqrt{n}) assuming oracle access to both f and g (which is exactly the problem statement we consider, and also the more natural model in applications, since we are generally given both functions f and g). Moreover, our algorithms also assume this oracle model. It would also be interesting to analyze the inapproximability of the problem under a more restricted oracle model that has access only to the ratio f/g. In this case, the lower bound should be at least \Omega(\sqrt{n}), since any ratio query can be answered using the oracles for f and g, but it remains open whether this lower bound could be improved. We will clarify this in the paper, should it be accepted.

3) In Figure 1, how was the optimal value calculated? (R1) We neither computed nor reported the optimal value in our experiments. It is computationally infeasible to search for the optimal solution with a ground set of size 100, even using the most efficient integer programming solvers.

4) A wish for stronger empirical validation. (R1, R2) The main focus of our paper is the theoretical study of the RS optimization problem (which includes several novel results, among them new lower and upper bounds). We performed the synthetic data experiments as a proof-of-concept study of the proposed algorithms.
Though preliminary, this empirical study yields some interesting observations: GreedRatio, though very efficient, consistently achieves the best performance, and EllipsoidApprox, though it has the tightest approximation guarantee, does not work as well empirically (line 865). We believe that this could be of significance in several of the applications that motivate the study of this problem.

5) Lack of discussion on the combinatorial fractional programming. (R2) We thank the reviewer for pointing out this line of work to us. We will add a discussion of this work to our paper, should it be accepted. Though related in spirit, we think the RS optimization problem as studied in our paper differs significantly from this work in both the optimization techniques and the optimization objective (submodularity is assumed in our work, whereas linearity in the objective is often required for efficient combinatorial fractional programming).

6) RS minimization vs known formulations (e.g. DS minimization and SCSC/SCSK) (R3) As demonstrated in Section 1, RS minimization naturally occurs in a number of machine learning applications, such as Information Retrieval, Normalized Cuts (Ratio Cuts) and cost-normalized diversity maximization. We believe that RS minimization, along with our proposed algorithmic framework, adds to the growing body of tractable optimization problems indirectly and structurally involving submodularity. Moreover, from a theoretical perspective, we show several interesting connections between RS minimization and the above known formulations.

7) Maximizing diversity is not a natural application. (R3) In applications where the goal is to find the best cost-normalized subset and the cost is submodular (which is common), we think that RS optimization directly optimizes a very natural utility measure and is hence a more natural formulation for such cases.

8) Is g(A)=m(A)(m(V)-m(A)) additive? (R3) No, g is in fact a submodular function.
A quick proof of submodularity is as follows: for all A\subseteq B\subseteq V and a\notin B, g(a|A)=m(A\cup\{a\})(m(V)-m(A\cup\{a\}))-m(A)(m(V)-m(A))=m(a)m(V)-2m(A)m(a)-m^2(a)\geq g(a|B), where the inequality holds because m(A)\leq m(B) for a nonnegative modular m. Therefore, g is submodular and not additive in general. Alternatively, we see that g(A) = c*m(A) - \sum_{i,j \in A} m(i)m(j), with the constant c = m(V), which is a modular function minus a supermodular function and hence submodular.
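
The diminishing-returns inequality in point 8 can also be sanity-checked numerically. The following is a minimal sketch, not code from the paper: it assumes nonnegative weights m drawn at random on a hypothetical ground set of size 5, and the helper names `g` and `is_submodular` are ours, introduced only for illustration. It enumerates all pairs A \subseteq B and all a \notin B and checks g(a|A) \geq g(a|B).

```python
from itertools import combinations
import random


def g(A, m):
    """g(A) = m(A) * (m(V) - m(A)) for a nonnegative modular function m."""
    m_A = sum(m[i] for i in A)
    return m_A * (sum(m) - m_A)


def is_submodular(f, n, tol=1e-9):
    """Brute-force check of f(a|A) >= f(a|B) for all A <= B and a not in B."""
    V = range(n)
    subsets = [frozenset(c) for r in range(n + 1) for c in combinations(V, r)]
    for B in subsets:
        for A in subsets:
            if not A <= B:
                continue
            for a in V:
                if a in B:
                    continue
                gain_A = f(A | {a}) - f(A)
                gain_B = f(B | {a}) - f(B)
                if gain_A < gain_B - tol:
                    return False
    return True


# Assumed example weights: nonnegative, chosen at random (not from the paper).
random.seed(0)
m = [random.random() for _ in range(5)]
print(is_submodular(lambda A: g(A, m), len(m)))
```

The check is exponential in the ground set size, so it is only meant for small instances; it illustrates the claim and does not replace the proof above.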