We thank all reviewers for their time and comments.

R1 Significance: By uprooting and rerooting, we demonstrate an equivalence class of models, allowing a user to select whichever is most beneficial for analysis or inference (MAP or marginal). This has theoretical and practical implications, and suggests avenues for future work. One implication is that singleton and edge potentials are in a sense equivalent, which casts fresh light on much previous theoretical and empirical work, and allows generalizations of many earlier results (Sec 5).

Sec 5.2: Thank you. We shall carefully justify these results, which we believe are theoretically strong with practical significance, and add background material on the various polytopes (as in Sontag 2007, or Deza and Laurent 2009 ‘DL09’ Sec 27.1, which shows that TRI equates to enforcing triplet constraints across all variables in M+, whereas LOC enforces just those that include X0). It is remarkable that TRI may be regarded as simultaneously rooted at all variables. This yields Theorem 3, a significant strengthening of the recent result of Weller and Sontag 2016 ‘WS16’ (available at Weller’s website), and is promising for future work. We shall add context to clarify the significance. LP relaxations have been studied for many years by many communities and are widely used to try to find a MAP configuration. LP+LOC is the most common but "Unfortunately, as various authors have noted (Meltzer et al 2005; Kolmogorov 2006; Yanover et al 2006; Sontag et al 2008b; Komodakis and Paragios 2008; Werner 2008) the standard LP relaxation is rarely tight on real applications. This motivates the need for cluster-based LPs..." [Batra et al AISTATS 2011]. Work by these authors showed that tighter relaxations often yield exact results empirically, but WS16 provided the first theoretical understanding of when and why this can happen, beyond restrictions only on treewidth or on potentials separately.
WS16’s hybrid conditions (combining both types of restrictions) are an exciting research area with little prior work. WS16 showed that LP+TRI is always tight for an almost balanced model, a significant result. Our Theorem 3 is substantially stronger, proving that many more models are guaranteed to be tight for LP+TRI, e.g. see Fig 2. See also below.

Detailed comments: A _random_ large model with attractive/repulsive potentials is unlikely to be balanced or almost balanced. But attractive (and more generally balanced) models are still of great interest, and almost balanced models form a much larger class of interest. Vision has many important practical examples due to spatial smoothness of objects, e.g. consider image segmentation as in the horses dataset of Domke TPAMI 2013, where LP+LOC is loose but LP+TRI is tight for models that are close to balanced (the ground truth segmentation is close to attractive). Further, rerooting has negligible cost. In Sec 6, we provide a guideline to estimate whether it will be beneficial to reroot, and our experiments show that there are common situations where there is a significant advantage. We shall clarify that rerooting switches an initial implicit clamp choice at X0 in the uprooted model (perhaps a poor choice) to a carefully selected clamp choice instead, almost for free. This applies even for large models where it is desirable to clamp a series of variables: by rerooting, we may obtain one of the series of clampings for free, leading to a significant overall speedup of approximately 2x; see also R3 comments on components.

R2 Thank you. We shall clarify the proof of Lemma 1 (see text above it for a sketch). Please see the proof of Lemma 2 in the Supplement. We shall add detail to Sec 5.2 as above, including the proof of Theorem 3, and on the maxtW heuristic as below.

R3 maxtW is a helpful contribution (see Sec 6.1). Our tanh form follows work on the loop series method (see Sec 4.2.1).
Your suggestion is interesting and sensible: why not instead use a non-saturating sublinear function? We shall add more details. On our examples, log(|W|+1) performs better than maxW but worse than maxtW. Intuitively, we suggest that a sufficiently strong edge effectively forces one end variable given the other. This was examined by Weller and Jebara NIPS 2014, see Sec 7.1 in their Supplement through to Lemma 12, which shows that the direct influence of one variable on another indeed has a hard saturation level. On Sec 5.2, we agree, thank you; please see above.

Components: We agree with your helpful suggestions and shall discuss them with acknowledgment, thank you. Each separate connected component should be handled separately, with its own added variable. Indeed, as you say, this could be useful for (repeatedly) composing clamping and then rerooting each separated component to obtain a ‘free clamping’ in each. We shall explore whether there are other settings where it may be helpful to introduce multiple variables when uprooting.
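
To illustrate the R3 discussion of saturating versus non-saturating scores, here is a minimal sketch of how the three transforms might rank candidate roots. It is not the paper's implementation: the toy edge weights, the function names `root_scores` and `pick_root`, and the tanh scale of 1/4 are all assumptions made for illustration, and each score is simply the sum of the transformed |W| over edges incident to a variable.

```python
import math

def root_scores(edges, transform):
    """Sum transform(|W|) over the edges incident to each variable."""
    scores = {}
    for (i, j), w in edges.items():
        s = transform(abs(w))
        scores[i] = scores.get(i, 0.0) + s
        scores[j] = scores.get(j, 0.0) + s
    return scores

def pick_root(edges, transform):
    """Return the variable with the highest score under the given transform."""
    scores = root_scores(edges, transform)
    return max(scores, key=scores.get)

# Hypothetical toy model: one very strong edge at "a",
# several moderate edges at "c".
edges = {
    ("a", "b"): 40.0,
    ("a", "c"): 0.1,
    ("c", "d"): 2.0,
    ("c", "e"): 2.0,
    ("c", "f"): 2.0,
}

by_abs = pick_root(edges, lambda w: w)                    # maxW-style: raw |W|
by_log = pick_root(edges, lambda w: math.log(w + 1.0))    # sublinear, non-saturating
by_tanh = pick_root(edges, lambda w: math.tanh(w / 4.0))  # saturating; scale 1/4 is an assumption

print(by_abs, by_log, by_tanh)
```

On this toy example, the raw and logarithmic scores are both dominated by the single strong edge, whereas the saturating score caps that edge's contribution and prefers the variable with several moderate edges. This is only meant to show the qualitative difference between the transforms, not to reproduce the experimental comparison.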