We thank the reviewers for their valuable comments (marked as R). Please find our feedback below (marked as A).

------

R1. It would be interesting to devise an algorithm that works for general hemimetrics while exploiting partial symmetry, if there is any.

A. This is indeed an interesting direction for future work. We will add a brief discussion to the final version of the paper. (A short note recalling the hemimetric definition and what partial symmetry would mean is appended at the end of this response.)

------

R4. How reliable is the method proposed in lines 634-636? Especially in view of line 1345... However, the algorithm is only applicable to very small problems, since the running time scales with n^5. Do you have results for the running time?

A. The speedup method proposed in Section 6.2 has the same guarantees as stated in Section 6.1 and reduces the runtime from n^5 to n^3. An approximate update can only tighten the LU-bounds and hence ensures lower sample complexity (compared to Ind-Greedy). Furthermore, the speedup method is competitive in practice: for the experimental setting of clustered hemimetrics in Figure 2(a), the full LearnHM algorithm performs only slightly better than the speedup variant proposed in Section 6.2. (A schematic sketch of this kind of O(n^3) bound propagation is appended at the end of this response.)

------

R4. I don't see why you don't want to utilize features of the input data (this is made explicit in lines 795-797).

A. It is indeed an interesting direction to utilize the (partial) feature space available to the system (e.g., location and type of items). In this work, we focus on the setting where features may not be accessible to the system, for example, because they correspond to human perception.

------

R4. Does it really make sense to learn the whole metric in order to obtain reviews? Maybe only the ones that are closest.

A. Indeed, it could be advantageous to incorporate this idea into the learning objective to reduce the sample complexity (while still maintaining the usefulness of the model). We will add a brief discussion to the final version of the paper.

------

R4. For example, why not use (Jamieson and Nowak, 2011a)? You are using triplets anyway in IndGreedy-SIT?

A. IndGreedy-SIT is indeed an application of (Jamieson and Nowak, 2011a), as discussed in lines 649-669. Running (Jamieson and Nowak, 2011a) learns an embedding and thus provides triplet side-information for n^3 triplets using triplet-based comparison queries. However, the complete n^3 triplet side-information does not uniquely specify the underlying D^* that we seek to learn. As discussed in lines 660-669, IndGreedy-SIT is provided with this complete n^3 triplet side-information and performs additional queries to learn the unique underlying D^*, while exploiting the side information to update the LU-bounds. (A small example illustrating why triplet comparisons alone cannot pin down D^* is appended at the end of this response.)

------

R4. A thorough discussion on the existing methods for learning asymmetric distances

A. We will extend the discussion of existing methods in the final version of the paper. In particular, we will provide a more detailed comparison with (Fetaya & Ullman, 2015) and (Liu et al., 2015).

------

R4. ...preference model in Section 7.2. The authors did not provide convincing motivation.

A. We also performed experiments on various other synthetic instances (generating D^* synthetically without any preference model), and the experimental results were similar to those reported in the paper for the considered preference model.

------
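Appended note (re R1; this is the standard definition, recalled here for clarity, not new material from the paper): a hemimetric on a ground set X is a map D assigning each ordered pair a value D(x, y) >= 0, with D(x, x) = 0 and the triangle inequality D(x, z) <= D(x, y) + D(y, z) for all x, y, z; unlike for a metric, symmetry D(x, y) = D(y, x) is not required. Partial symmetry in R1's sense would mean the symmetric constraint holds for some (but not all) pairs, which an algorithm could exploit to share queries between (x, y) and (y, x).

------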
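Appended sketch (re R4's runtime question): the following is a minimal Python sketch of the kind of O(n^3) triangle-inequality propagation of lower/upper (LU) bounds that an update step of the sort discussed in Section 6.2 could use. It is illustrative only; the function and variable names are ours, and the paper's actual update rule may differ.

    import numpy as np

    def tighten_lu_bounds(L, U):
        """One O(n^3) pass propagating the hemimetric triangle
        inequality D(x,z) <= D(x,y) + D(y,z) through the current
        lower bounds L and upper bounds U on all pairwise distances."""
        n = L.shape[0]
        for y in range(n):
            for x in range(n):
                for z in range(n):
                    # Upper bounds can only shrink: route x -> y -> z.
                    U[x, z] = min(U[x, z], U[x, y] + U[y, z])
                    # Lower bounds can only grow:
                    # D(x,y) >= D(x,z) - D(y,z) and D(y,z) >= D(x,z) - D(x,y).
                    L[x, y] = max(L[x, y], L[x, z] - U[y, z])
                    L[y, z] = max(L[y, z], L[x, z] - U[x, y])
        return L, U

------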
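Appended example (re R4's question on (Jamieson and Nowak, 2011a)): a small self-contained Python illustration of why complete triplet side-information cannot uniquely specify D^*: scaling all distances preserves every triplet comparison while changing the distances themselves. The names and toy data are ours, for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    D1 = rng.random((n, n))   # toy asymmetric "distances"
    np.fill_diagonal(D1, 0.0)
    D2 = 2.0 * D1             # a different D with the same orderings

    def triplet_signs(D):
        # sign of D(i,j) - D(i,k) for every triplet (i, j, k):
        # exactly the information a triplet comparison query reveals
        return np.sign(D[:, :, None] - D[:, None, :])

    # Identical triplet side-information, yet D1 != D2.
    assert np.array_equal(triplet_signs(D1), triplet_signs(D2))

------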