We thank all the reviewers for the positive feedback and the helpful comments. Our responses consist of the following (non-major) clarifications:

* About acknowledging SRL/ILP approaches (suggested by Reviewer 1)
We will acknowledge SRL/ILP as indirectly related fields in the final version. Recall that SRL/ILP approaches focus on first-order logic (and/or probabilistic reasoning over graphical models). Our method, in contrast, uses graph products to enable transductive learning (semi-supervised label propagation) over heterogeneous input graphs, together with a novel convex formulation for tractable approximation of tensor operations across multiple large graphs. Although our method shares the high-level goal of SRL/ILP in terms of multirelational learning, the problem settings and technical foundations differ substantially.

* About possible evaluation on “multiple completions in each DB” (suggested by Reviewer 1)
We agree that evaluating the proposed method on multiple prediction tasks per DB is interesting. In fact, this is part of our ongoing work (and we will report the results if space allows). We chose the paper field of DBLP in the current manuscript because it is the most difficult task: the paper graph is the largest of the three graphs in DBLP.

* About reporting random guessing for MAP (suggested by Reviewer 1)
Sure, we will add that. Note that there are about 12K candidate answers for each missing slot in DBLP; the corresponding MAP of random guessing is 0.00072. On the Enzyme dataset, random guessing has a MAP of 0.014. Both are much lower than the MAP of our system.

* About why we call our approach a framework (instead of a specific method), and about “to which extent the novel method can be extended/generalized” (questioned by Reviewer 2)
We view our proposed approach as a framework for the following reasons:
1. The formulation is compatible with a wide range of graph product operators. That is, we can choose different functions (Tensor, Cartesian, or any other member of the SGP family) for combining multi-graph association patterns by specifying \kappa in SGP (see the illustrative sketch at the end of this response).
2. The approximation scheme supports a variety of tensor models such as Tucker, CP, and Tensor-Train, with different tradeoffs (lines 482-489).
3. Although for discussion purposes we optimize a ranking loss (Eq. (10)) using stochastic gradient descent, our graph-product-based approach and approximation technique do not rely on the particular choice of loss function and can be used in conjunction with any optimization solver.

* Clarification on the comment that “the solution to scalability of if seems simply to be a sampling process as in Equation (10)” (Reviewer 2)
The scalability of our algorithm comes from the convex formulation of the approximation technique described in Section 3.2, which involves no sampling.

* “If we have similarity graphs for papers, authors, and venues, then why would we need to predict (author, paper, venue) tuples? Perhaps there is something else about the model structure that has interesting insights, such as suggesting which authors are likely to publish in which venues in the future.” (Reviewer 3)
Yes. Accurately predicting emerging cross-graph associations (tuples) will enable a wide range of potential applications (e.g., citation/expert recommendation systems, as described in lines 69-73), which is exactly the motivation of our problem setting.
* “Is there an implicit clustering/dimensionality reduction taking place, as in matrix factorization methods?” (Reviewer 3)
One of the building blocks of our approximation algorithm is to efficiently reduce the large tensor f to a tensor \alpha of much smaller dimensionality by exploiting the spectral properties of the product graph. Compared with matrix/tensor factorization methods, which typically suffer from a lack of convexity, our algorithm enjoys convexity (and therefore a guaranteed global optimum) and shows significantly better empirical performance on cross-graph multirelational learning tasks (see TOP vs. TF/GRTF in Fig. 5 and Fig. 6).
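For concreteness, below is a minimal sketch (in NumPy; not our actual implementation) of the two ingredients referred to above: the \kappa-parameterized spectral graph product (Tensor vs. Cartesian) and the reduction of the full cross-graph score tensor f to a small core tensor \alpha spanned by the top eigenvectors of each graph. The graph sizes, rank k, and helper names are illustrative assumptions only.

```python
import numpy as np

# Illustrative sketch only; not the paper's implementation.
# Assumptions: three symmetric similarity graphs, a rank-k truncation per
# graph, and two example choices of kappa from the SGP family.

def topk_eigs(A, k):
    """Top-k eigenpairs of a symmetric graph matrix (e.g., adjacency/Laplacian)."""
    w, V = np.linalg.eigh(A)
    idx = np.argsort(w)[::-1][:k]
    return w[idx], V[:, idx]

def sgp_spectrum(spectra, kappa):
    """Combine per-graph eigenvalues into the product-graph spectrum via kappa."""
    grids = np.meshgrid(*spectra, indexing="ij")
    return kappa(grids)

kappa_tensor = lambda g: g[0] * g[1] * g[2]      # Tensor (Kronecker) product
kappa_cartesian = lambda g: g[0] + g[1] + g[2]   # Cartesian product

def expand_scores(alpha, bases):
    """Map the small core tensor alpha (k x k x k) to the full score tensor
    f (n1 x n2 x n3) via mode products with the retained eigenvectors."""
    f = np.einsum("abc,ia->ibc", alpha, bases[0])
    f = np.einsum("ibc,jb->ijc", f, bases[1])
    f = np.einsum("ijc,kc->ijk", f, bases[2])
    return f

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sizes, k = (30, 25, 20), 5                    # toy graph sizes, toy rank
    graphs = [(lambda M: (M + M.T) / 2)(rng.random((n, n))) for n in sizes]
    spectra, bases = zip(*(topk_eigs(A, k) for A in graphs))
    lam_t = sgp_spectrum(spectra, kappa_tensor)   # product-graph eigenvalues
    lam_c = sgp_spectrum(spectra, kappa_cartesian)
    alpha = rng.standard_normal((k, k, k))        # the small object to be learned
    f = expand_scores(alpha, bases)               # full cross-graph scores
    print(lam_t.shape, lam_c.shape, alpha.shape, f.shape)
```

The point of the sketch is that swapping kappa_tensor for kappa_cartesian (or any other member of the SGP family) changes the product-graph spectrum without changing anything else in the pipeline, and that learning operates on the small \alpha rather than the full tensor f.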