We thank all reviewers for their helpful comments. Both Reviewer 1 and Reviewer 4 raised issues concerning the optimality of the dependencies on d and b, or the necessity of the warm-start phase. To clarify, we make no claim that our bounds are optimal in all parameters, and it could very well be that the analysis has artifacts which can be removed. Currently, unlike in the batch setting, we simply do not know the precise optimal parameter dependencies of streaming methods for this problem in the settings that we consider (e.g., memory-efficient methods; performance measured in terms of optimization error; no eigengap). We hope that our work will spur further research in this direction.