We thank all the reviewers for their positive and constructive comments.

R1: "Neural Turing Machine, the Stack RNN, Neural DeQue, and End-to-End Memory": We did not compare our approach to these methods because they use continuous interfaces and can therefore be trained by backpropagation. In contrast, our model uses discrete interfaces, which require RL to train. Allowing discrete actions makes a far wider range of interfaces permissible, and our paper is one of the first (along with the RLNTM) to explore them. One version of the Stack RNN does use a discrete interface, but it is unclear how it would scale to large-scale problems such as those explored in our paper. (An illustrative sketch of this distinction appears at the end of this response.)

R4: - Thank you for the many citations that we missed. We will revise the related work section to cover them.
- Thanks also for finding the typos. We will fix them in the final version.

R5: - Will fix the typos, thanks.
- "For the kinds of results claimed in the paper, I'd want to see consistent behavior over maybe 20 or so problems." We do show results on six different problems, which is more than in related work. However, we agree that more problems are needed, and we will tone down the claims made in the paper.
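
For reviewers unfamiliar with the distinction raised in R1, the following is a minimal, self-contained sketch (not our actual model; the memory M, the addressing logits, and the toy reward are all hypothetical stand-ins). A continuous interface reads a soft mixture of memory slots, so gradients flow by backprop; a discrete interface samples a single slot, which is non-differentiable, so we fall back on a score-function (REINFORCE-style) gradient estimate.

import numpy as np

rng = np.random.default_rng(0)

N, d = 8, 4                      # number of memory slots, slot width
M = rng.normal(size=(N, d))      # hypothetical external memory
logits = rng.normal(size=N)      # controller's pre-softmax addressing scores

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

p = softmax(logits)

# Continuous interface (soft attention): the read is a convex
# combination of ALL slots, so it is differentiable end-to-end.
soft_read = p @ M

# Discrete interface: sample ONE slot. The sampling step blocks
# backprop, which is why RL-style training is required.
i = rng.choice(N, p=p)
hard_read = M[i]

# Toy scalar reward, standing in for downstream task performance.
reward = float(hard_read.sum())

# REINFORCE estimate of the gradient of E[reward] w.r.t. logits:
#   reward * d log p(i) / d logits = reward * (onehot(i) - p)
grad_logits = reward * (np.eye(N)[i] - p)
print(grad_logits)

The point of the sketch is the last two statements: with a discrete read there is no gradient path through the sampled index, so the controller must be trained from the reward signal alone, exactly the setting our paper addresses.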