Do RNN and LSTM have Long Memory?

Jingyu Zhao · Feiqing Huang · Jia Lv · Yanjie Duan · Zhen Qin · Guodong Li · Guangjian Tian

Keywords: [ Time Series and Sequence Models ] [ Sequential, Network, and Time-Series Modeling ]


The LSTM network was proposed to overcome the difficulty of learning long-term dependence, and it has led to significant advances in applications. With its success and drawbacks in mind, this paper raises the question: do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.
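
To give a rough sense of why the decay rate matters, the following sketch contrasts exponentially decaying weights with polynomially decaying ones. It is only an illustration and not the construction used in the paper: the function names, the rate 0.9, and the exponent 0.6 are all assumptions chosen for the example.

```python
# Illustrative comparison of two decay patterns for the weight a model
# places on an input observed k steps in the past.
# All names and constants here are hypothetical, chosen for illustration.

def exponential_weights(rate, n_lags):
    """Weights rate**k for k = 1..n_lags (exponential decay)."""
    return [rate ** k for k in range(1, n_lags + 1)]

def polynomial_weights(exponent, n_lags):
    """Weights k**(-exponent) for k = 1..n_lags (polynomial decay)."""
    return [k ** (-exponent) for k in range(1, n_lags + 1)]

n_lags = 1000
exp_w = exponential_weights(0.9, n_lags)   # assumed rate
poly_w = polynomial_weights(0.6, n_lags)   # assumed exponent

# Print the weight each scheme assigns at a few selected lags.
for lag in (1, 10, 100, 1000):
    print(f"lag {lag:4d}: exponential={exp_w[lag - 1]:.3e}  "
          f"polynomial={poly_w[lag - 1]:.3e}")

# Partial sums: the exponential sum is bounded by rate / (1 - rate),
# while the polynomial sum keeps growing as more lags are included
# whenever the exponent is at most 1.
print("sum of exponential weights:", sum(exp_w))
print("sum of polynomial weights: ", sum(poly_w))
```

Under these assumed constants, the exponential weights become negligible after a few hundred lags, while the polynomial weights remain comparatively large, which is the intuition behind requiring polynomial decay for long memory.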
