Spotlight
Training Recurrent Neural Networks via Forward Propagation Through Time
Anil Kag · Venkatesh Saligrama
Back-propagation through time (BPTT) has been widely used for training Recurrent Neural Networks (RNNs). BPTT updates RNN parameters per instance by back-propagating the error in time over the entire sequence length, and as a result suffers from poor trainability due to the well-known exploding/vanishing-gradient phenomena. While a number of prior works have proposed to mitigate this effect through careful RNN architecture design, these RNN variants still train with BPTT. We propose a novel forward-propagation algorithm, FPTT, in which, at each time step and for each instance, we update the RNN parameters by optimizing an instantaneous risk function. Our proposed risk is a regularization penalty at time $t$ that evolves dynamically based on previously observed losses, and it allows the RNN parameter updates to converge to a stationary solution of the empirical RNN objective. We consider both sequence-to-sequence and terminal-loss problems. Empirically, FPTT outperforms BPTT on a number of well-known benchmark tasks, enabling architectures such as LSTMs to solve problems with long-range dependencies.
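The per-timestep update described above can be sketched schematically. This is a minimal illustration, not the paper's exact algorithm: the dynamic regularizer is stood in for by a simple proximal penalty pulling the parameters toward a running average, and all function names (`fptt_sketch`, `step_fn`, `loss_grad_fn`) and the penalty form are assumptions for illustration only.

```python
import numpy as np

def fptt_sketch(xs, ys, w, step_fn, loss_grad_fn, alpha=0.5, lr=0.1):
    """Schematic FPTT-style loop: one parameter update per time step.

    Unlike BPTT, no gradient flows backward through the whole sequence;
    each step optimizes an instantaneous objective = step loss + a
    proximal penalty (a stand-in for the paper's evolving regularizer).
    """
    w = np.asarray(w, dtype=float).copy()
    w_bar = w.copy()              # running parameter average (regularizer anchor)
    h = np.zeros_like(w)          # hidden state (toy: same shape as w)
    for t, (x, y) in enumerate(zip(xs, ys), start=1):
        h = step_fn(w, h, x)                  # forward one step only
        g = loss_grad_fn(w, h, x, y)          # gradient of the step loss
        g_total = g + alpha * (w - w_bar)     # add proximal-penalty gradient
        w = w - lr * g_total                  # instantaneous parameter update
        w_bar = w_bar + (w - w_bar) / (t + 1) # update running average
    return w
```

For a toy scalar "RNN" one might use `step_fn = lambda w, h, x: np.tanh(w * x + 0.3 * h)` with a squared-error step loss; the key point is simply that memory cost is constant in sequence length because no unrolled computation graph is retained.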
Author Information
Anil Kag (Boston University)
Venkatesh Saligrama (Boston University)
Related Events (a corresponding poster, oral, or spotlight)
- 2021 Poster: Training Recurrent Neural Networks via Forward Propagation Through Time »
  Thu. Jul 22nd 04:00 -- 06:00 AM
More from the Same Authors
- 2022 : Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk »
  Tianrui Chen · Aditya Gangrade · Venkatesh Saligrama
- 2022 : ActiveHedge: Hedge meets Active Learning »
  Bhuvesh Kumar · Jacob Abernethy · Venkatesh Saligrama
- 2022 : Acting Optimistically in Choosing Safe Actions »
  Tianrui Chen · Aditya Gangrade · Venkatesh Saligrama
- 2022 : Achieving High TinyML Accuracy through Selective Cloud Interactions »
  Anil Kag · Igor Fedorov · Aditya Gangrade · Paul Whatmough · Venkatesh Saligrama
- 2022 : FedHeN: Federated Learning in Heterogeneous Networks »
  Durmus Alp Emre Acar · Venkatesh Saligrama
- 2022 Poster: Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk »
  Tianrui Chen · Aditya Gangrade · Venkatesh Saligrama
- 2022 Spotlight: Strategies for Safe Multi-Armed Bandits with Logarithmic Regret and Risk »
  Tianrui Chen · Aditya Gangrade · Venkatesh Saligrama
- 2022 Poster: Faster Algorithms for Learning Convex Functions »
  Ali Siahkamari · Durmus Alp Emre Acar · Christopher Liao · Kelly Geyer · Venkatesh Saligrama · Brian Kulis
- 2022 Poster: ActiveHedge: Hedge meets Active Learning »
  Bhuvesh Kumar · Jacob Abernethy · Venkatesh Saligrama
- 2022 Spotlight: ActiveHedge: Hedge meets Active Learning »
  Bhuvesh Kumar · Jacob Abernethy · Venkatesh Saligrama
- 2022 Spotlight: Faster Algorithms for Learning Convex Functions »
  Ali Siahkamari · Durmus Alp Emre Acar · Christopher Liao · Kelly Geyer · Venkatesh Saligrama · Brian Kulis
- 2021 Poster: Debiasing Model Updates for Improving Personalized Federated Training »
  Durmus Alp Emre Acar · Yue Zhao · Ruizhao Zhu · Ramon Matas · Matthew Mattina · Paul Whatmough · Venkatesh Saligrama
- 2021 Spotlight: Debiasing Model Updates for Improving Personalized Federated Training »
  Durmus Alp Emre Acar · Yue Zhao · Ruizhao Zhu · Ramon Matas · Matthew Mattina · Paul Whatmough · Venkatesh Saligrama
- 2021 Poster: Memory Efficient Online Meta Learning »
  Durmus Alp Emre Acar · Ruizhao Zhu · Venkatesh Saligrama
- 2021 Spotlight: Memory Efficient Online Meta Learning »
  Durmus Alp Emre Acar · Ruizhao Zhu · Venkatesh Saligrama
- 2020 Poster: Piecewise Linear Regression via a Difference of Convex Functions »
  Ali Siahkamari · Aditya Gangrade · Brian Kulis · Venkatesh Saligrama
- 2020 Poster: Minimax Rate for Learning From Pairwise Comparisons in the BTL Model »
  Julien Hendrickx · Alex Olshevsky · Venkatesh Saligrama
- 2019 Poster: Graph Resistance and Learning from Pairwise Comparisons »
  Julien Hendrickx · Alex Olshevsky · Venkatesh Saligrama
- 2019 Oral: Graph Resistance and Learning from Pairwise Comparisons »
  Julien Hendrickx · Alex Olshevsky · Venkatesh Saligrama
- 2019 Poster: Learning Classifiers for Target Domain with Limited or No Labels »
  Pengkai Zhu · Hanxiao Wang · Venkatesh Saligrama
- 2019 Oral: Learning Classifiers for Target Domain with Limited or No Labels »
  Pengkai Zhu · Hanxiao Wang · Venkatesh Saligrama
- 2018 Poster: Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers »
  Yao Ma · Alex Olshevsky · Csaba Szepesvari · Venkatesh Saligrama
- 2018 Oral: Gradient Descent for Sparse Rank-One Matrix Completion for Crowd-Sourced Aggregation of Sparsely Interacting Workers »
  Yao Ma · Alex Olshevsky · Csaba Szepesvari · Venkatesh Saligrama
- 2017 Workshop: ML on a budget: IoT, Mobile and other tiny-ML applications »
  Manik Varma · Venkatesh Saligrama · Prateek Jain
- 2017 Poster: Adaptive Neural Networks for Efficient Inference »
  Tolga Bolukbasi · Joseph Wang · Ofer Dekel · Venkatesh Saligrama
- 2017 Talk: Adaptive Neural Networks for Efficient Inference »
  Tolga Bolukbasi · Joseph Wang · Ofer Dekel · Venkatesh Saligrama
- 2017 Poster: Connected Subgraph Detection with Mirror Descent on SDPs »
  Cem Aksoylar · Orecchia Lorenzo · Venkatesh Saligrama
- 2017 Talk: Connected Subgraph Detection with Mirror Descent on SDPs »
  Cem Aksoylar · Orecchia Lorenzo · Venkatesh Saligrama