We thank the reviewers for their helpful feedback and for catching the typos. We address the individual points below.

REV_1

- Positioning of this work

Our results consist of theoretical and algorithmic parts, both of which we consider significant. Concretely:

Theoretical contributions: the idea of bounding BIFs with Gauss-type quadrature has existed for decades. However, no prior result states explicit relations between the different quadrature rules, and the convergence rates of the Gauss-Radau and Gauss-Lobatto rules were unknown until now. Our theoretical results resolve these long-standing questions and provide a guide for practitioners who use these rules: they will know which rule to choose and what convergence rate to expect.

Algorithmic contributions: in addition to the theoretical analyses, we propose a framework that integrates lower and upper bounds into algorithms to accelerate them, enabling early stopping while provably retaining correctness. A one-sided approximation cannot guarantee correctness in the same fashion. Although various approaches addressing BIFs exist (as the reviewer points out, and as noted below), these approaches only approximate the target values instead of producing intervals that bound them from below and above. While the latter approach (ours) enables exact transitions in algorithms with efficient early stopping, the former can result in approximate, perhaps even incorrect, transitions. This is the main point distinguishing our approach from all existing methods. We illustrated many applications to support our claim that BIFs arise across a wide range of problems, and consequently our accelerated algorithmic framework is widely applicable.

- Alternative approaches in the literature

We thank the reviewer for kindly pointing out alternative approaches in the literature. We acknowledge that these should have been included, and we will discuss them in the final version.
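To make the correctness argument concrete, the following is a minimal sketch of why two-sided bounds permit provably correct early stopping when an algorithm must branch on a BIF value. The names (`decide_exceeds`, `bound_iterates`) and the simple interval test are illustrative, not the paper's actual algorithm:

```python
def decide_exceeds(bound_iterates, t, max_iters=1000):
    """Decide whether a value v sandwiched by iteratively tightening
    bounds lo_1 <= lo_2 <= ... <= v <= ... <= up_2 <= up_1 exceeds a
    threshold t. We stop as soon as the interval [lo_i, up_i] lies
    entirely on one side of t; a one-sided approximation could never
    certify the branch this way.
    """
    for i, (lo, up) in enumerate(bound_iterates):
        if lo > t:           # whole interval above t: certainly v > t
            return True, i + 1
        if up <= t:          # whole interval below t: certainly v <= t
            return False, i + 1
        if i + 1 >= max_iters:
            break
    return None, max_iters   # undecided within the iteration budget

# Toy usage: bounds shrinking toward v = 2.0, threshold t = 1.5.
bounds = ((2.0 - 1.0 / k, 2.0 + 1.0 / k) for k in range(1, 100))
result, iters = decide_exceeds(bounds, 1.5)
```

The decision is certified as soon as it is returned, which is what allows an algorithm built on such bounds to stop early without risking a wrong transition.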
As pointed out by the reviewer, there exist various related methods: (Brezinski 1999, Brezinski et al. 2011) use extrapolation of matrix moments and interpolation to estimate the 2-norm error of linear systems and the trace of the matrix inverse (our work essentially focuses on the A-norm error of linear systems). These works provide a basis for other related methods for the general approximation of x^T f(A) y. In (Fika et al. 2014) the authors extend the extrapolation method to BIFs and show (in Lemma 2.8) that the derived one-term and two-term approximations for positive definite A coincide with Gauss quadrature, hence providing lower bounds. This method is further generalized by (Fika & Mitrouli 2015) to address y^* f(A) x, where A is Hermitian. Other methods for estimating the trace of a matrix function (Bai & Golub 1996, Brezinski et al. 2011, Fika & Koukouvinos 2015) or the diagonal elements of the matrix inverse (Bekas et al. 2007, Tang & Saad 2012) also exist in the literature. But again, all these methods rely on approximations of the target values, so they cannot guarantee that an algorithm using them will make the correct transitions every time. In our framework, in contrast, we obtain iteratively tighter lower and upper bounds, so the algorithm is guaranteed to make the correct transition.

REV_2

- Significance of theoretical results

Our theoretical results answer long-standing questions in the vast literature on Gauss quadrature, and they provide a guide for practitioners who want to use these rules in their applications. Please also see our comments to REV_1, which further position our work.

- Line 235: Yes, if we compute strict bounds for BIFs using the polarization identity, we need to run Lanczos twice. Note that here we run two Lanczos iterations in turn to obtain tighter bounds on both the first and the second term of the polarization. If the matrix is sparse, we still obtain significant speedups even with two Lanczos iterations; see Sec. 5. We will add this comment here.
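For readers unfamiliar with the identity behind the "run Lanczos twice" remark, the following sketch verifies the polarization identity numerically with f(A) = A^{-1}; in the actual framework each of the two quadratic forms would be bounded by its own Lanczos run rather than computed exactly, which is why two runs are needed:

```python
# Polarization identity for a symmetric matrix function M = f(A):
#   u^T M v = 1/4 * [ (u+v)^T M (u+v) - (u-v)^T M (u-v) ].
# Each of the two quadratic forms admits Gauss-type lower/upper bounds.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 5))
A = B @ B.T + 5 * np.eye(5)       # symmetric positive definite
u, v = rng.standard_normal(5), rng.standard_normal(5)

fA = np.linalg.inv(A)             # here f(A) = A^{-1}
bif = u @ fA @ v                  # target bilinear form
polar = 0.25 * ((u + v) @ fA @ (u + v) - (u - v) @ fA @ (u - v))

assert np.isclose(bif, polar)
```

Since each quadratic form is bounded separately, combining a lower bound on the first term with an upper bound on the second (and vice versa) yields strict two-sided bounds on the bilinear form.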
- Line 294: It should be Wilf's work; we thank the reviewer for pointing this out.

- Line 335: Alg. 1 is an iterative version of running Lanczos for N steps: in each iteration i we compute the Gauss quadrature approximation [e_1^T J_i^{-1} e_1], ultimately reaching [e_1^T J^{-1} e_1], which is the output of running N iterations of Lanczos. There is essentially little difference in efficiency if the iterative method is run until i = N, but in practice the iterative method benefits greatly from early stopping, which substantially improves the efficiency of the algorithm.

- Line 377: The convergence of the Gauss-Radau rule depends on that of the Gauss rule, which in turn depends on how fast the Ritz values converge to the eigenvalues of A (shown explicitly in Thm. 3 and proved in previous work; see e.g. (Golub et al. 2009)).

REV_3

- Abbreviations: We will correct all of them immediately.