We thank all reviewers for their comments and feedback. We comment on the main points raised below.

Reviewer 1. Q1: Here we use the term stability in a general sense, referring both to numerical stability and to stability with respect to sampling and noise. We will change the text to make this more precise. Q2: Yes, there is a typo here. Q3: Good point. We will surely add some comments, and possibly some plots if we can squeeze them in. Thanks also for pointing out all the typos etc.; we will fix the text!

Reviewer 2. Q1: The theoretical insight into the inferior experimental performance of SIGM with a decaying step-size, as compared with SIGM with a constant step-size, can be found in lines 390-392, in a remark based on Corollaries 1 and 2. Indeed, there seem to be some gaps between theory and experiments. One aspect that needs further investigation is the stopping rule; we were also surprised by these results. Larger-scale experiments are probably needed here. Q2: The definition is correct, and we were not precise in lines 590-592. The correct statement should be "the last term in the error decomposition can be upper bounded by the approximation error". Q3: Yes, we will revise it. Thanks for pointing this out.

Reviewer 4. Q1: $\beta$ need not be 1, since the hypothesis space may be chosen as a general infinite-dimensional space, for example in non-parametric regression. However, in certain circumstances, for example when the hypothesis parameter set is compact or finite dimensional, Assumption 2 is satisfied with $\beta=1$. We will add a further comment. Note that $\beta$ is generally not given in advance; according to our results, for SGM or SIGM to achieve optimal performance it is necessary to tune the step-size or the number of epochs. Q2: For a smooth loss function, the convergence rates from Corollaries 1-4 are of order $O(m^{-1/2})$ when $\beta=1$, the same as those in the previous literature, while for a non-smooth loss function the convergence rates are $O(m^{-1/3})$ for $\beta=1$, which are worse than those in previous results. Comments will be added on this point as well.
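For reference, a compact restatement of the rates quoted in the answer to Reviewer 4, Q2 (both for $\beta=1$, as given by Corollaries 1-4; constants and lower-order factors are omitted here for brevity):

\[
\text{smooth loss: } O\big(m^{-1/2}\big), \qquad \text{non-smooth loss: } O\big(m^{-1/3}\big).
\]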