We thank the reviewers for their time and effort in reviewing the paper and for their generous and valuable feedback. In what follows, we address the specific questions raised by the reviewers.
-------------- Assigned_Reviewer_1
(1) “achieve better bound for C_PAR if restricted to high snr as in C_HIGH?” As you point out, the two classes C_PAR and C_HIGH are incomparable, and in general an algorithm tailored to one of them might perform poorly on the other. C_SST, on the other hand, contains both of these classes. The SVT algorithm is consistent over C_SST and can outperform algorithms designed for either C_PAR or C_HIGH when the corresponding assumptions of C_PAR or C_HIGH are violated. To clarify, our claim refers to the superiority of algorithms for C_SST (and not C_HIGH) over those for C_PAR. In our problem framework (where \gamma is constant for C_HIGH and the weights are bounded for C_PAR), the intersection between these classes is asymptotically empty. We plan to investigate other regimes (where these parameters are not held fixed) and provide a more adequate answer to your question in future work.
(2) “observing entries more than once” In most cases, we expect the results to be scaled by 1/m. In each case, the lower bounds are a straightforward extension of our existing results, while the upper bounds require more careful calculations.
-------------- Assigned_Reviewer_3
(1) “power law distribution over item frequency” Our observation model is realistic in situations where the pairs compared are chosen uniformly at random (a common practice in crowdsourcing applications). Handling highly heterogeneous observation probabilities, or loss functions that focus on the estimation error of a few important pairs, is a useful direction that we hope to address in future work.
(2) “weaker notions of SST” In modeling heterogeneous preferences, where each user might have a different underlying ranking of objects, it is practically important to allow flexibility in the underlying permutation. Your suggestion of allowing a random permutation is very relevant in this context, and is an interesting area for future work on both the modeling and analysis fronts. We should also point out that the assumptions of moderate and weak stochastic transitivity are sometimes considered in the psychology and economics literature. However, it can be shown that these weaker assumptions lead to comparison probabilities that are not estimable without several observations of each pair.
-------------- Assigned_Reviewer_5
(1) “why SST is acceptable in the psychology literature” Broadly, two considerations in psychology have led to the widespread acceptance of this model. First, it was originally proposed as an axiomatic generalization of “strong utility” models (see Davidson & Marschak, 1959). In this literature, models are derived from a reasonable set of behavioral axioms, and the SST model arises naturally in this fashion. Second, from an empirical standpoint, the SST assumption is testable from data in a strong sense. This second aspect led to further acceptance, as several empirical studies have confirmed the validity of the assumption in a variety of contexts. We thank the reviewer for this suggestion and will include these additional details in the revision.
(2) “entries of Y mutually independent” In some sense, the outcomes do depend on each other, since the underlying probabilities obey the stochastic transitivity constraints. It is only the “noise” in the observations that is assumed to be independent. We feel that assuming independent stochastic fluctuations in the outcomes is reasonable in a variety of applications.
(3) “SVT not guaranteed to be SST” Yes, the SVT estimator does not guarantee “proper learning”, and we will clarify this point in the revision. There are practical remedies: for instance, one can estimate a reasonable permutation from the output of the SVT estimator and then project the SVT estimate onto the SST constraints induced by this permutation.
(4) “empirical improvement” In our simulations (and theory) we show that the SVT estimator can significantly outperform parametric estimators when the parametric model is a bad fit, and can perform comparably even when the parametric assumption is correct. Figure 1, Figure 2d and Figure 2e show that the SVT estimator gives several orders of magnitude improvement over Thurstone MLE: while the error of Thurstone MLE does not decrease as the problem size n increases, the error of SVT falls polynomially with n. Extrapolating the plots to n = 2^20 (roughly 1 million), the Thurstone MLE error will be of order 2^{-4} and the SVT error will be about 2^{-10} (we are unable to simulate such high values of n due to the high computational complexity of Thurstone MLE). Figures 2a-2c show that even when the underlying setting is extremely favorable for Thurstone MLE, the SVT estimator performs quite comparably.
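For illustration, a minimal sketch of a singular value thresholding step, together with a simple way of reading off a permutation from its output as mentioned in (3), might look as follows. The threshold `lam`, the hard-thresholding rule, the row-sum ordering heuristic, and the simulated logistic comparison matrix are all assumptions made for this sketch, and it should not be read as the exact procedure or constants used in the paper.

```python
import numpy as np

def svt_estimate(Y, lam):
    """Keep only singular values of Y above lam (hard thresholding).

    lam is a hypothetical tuning parameter chosen for this sketch.
    """
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_kept = np.where(s > lam, s, 0.0)
    # Scale the columns of U by the surviving singular values and recombine.
    return (U * s_kept) @ Vt

def ordering_from_estimate(M_hat):
    """Order items by decreasing row sum (an assumed heuristic)."""
    return np.argsort(-M_hat.sum(axis=1))

# Simulated example: a logistic comparison matrix, which satisfies SST.
rng = np.random.default_rng(0)
n = 50
scores = np.linspace(1.0, -1.0, n)  # item 0 is the strongest
M = 1.0 / (1.0 + np.exp(-(scores[:, None] - scores[None, :])))

# One noisy comparison per pair: Y[i, j] = 1 if i beats j, Y[j, i] = 1 - Y[i, j].
upper = np.triu(rng.random((n, n)) < M, k=1).astype(float)
Y = upper + np.tril(1.0 - upper.T, k=-1) + 0.5 * np.eye(n)

M_hat = svt_estimate(Y, lam=2.0 * np.sqrt(n))  # threshold value is an assumption
order = ordering_from_estimate(M_hat)
```

The output `M_hat` need not satisfy the SST constraints, which is the point raised in (3); `order` is one candidate permutation onto whose constraints the estimate could subsequently be projected.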