Sophisticated gated recurrent neural network architectures like LSTMs and GRUs have been shown to be highly effective in a myriad of applications. We develop an un-gated unit, the statistical recurrent unit (SRU), that is able to learn long-term dependencies in data by keeping only moving averages of statistics. The SRU's architecture is simple, un-gated, and contains a comparable number of parameters to LSTMs; yet SRUs compare favorably with more sophisticated LSTM and GRU alternatives, often outperforming one or both in various tasks. We show the efficacy of SRUs relative to LSTMs and GRUs in an unbiased manner by optimizing each architecture's hyperparameters for both synthetic and real-world tasks.
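To make the idea of "keeping moving averages of statistics" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the averages are exponential moving averages kept at several decay rates, and it uses the raw input as a stand-in for the statistics a trained network would compute. The function name `update_moving_averages`, the decay rates in `alphas`, and the impulse input are all hypothetical choices made for illustration.

```python
def update_moving_averages(averages, stats, alphas):
    """Blend new statistics into one running average per decay rate.

    averages: list of lists, one running-average vector per decay rate
    stats:    list of floats, the statistics computed at this time step
    alphas:   list of decay rates in [0, 1); larger means slower forgetting
    """
    return [
        [a * m + (1.0 - a) * s for m, s in zip(mu, stats)]
        for a, mu in zip(alphas, averages)
    ]


# Illustrative decay rates (an assumption, not values taken from the abstract).
alphas = [0.0, 0.5, 0.9, 0.99]

# An impulse at the first step followed by 19 steps of zeros.
sequence = [1.0] + [0.0] * 19

# One single-element running average per decay rate, initialised to zero.
averages = [[0.0] for _ in alphas]

for x in sequence:
    stats = [x]  # hypothetical stand-in for learned statistics of the input
    averages = update_moving_averages(averages, stats, alphas)

for a, mu in zip(alphas, averages):
    print(f"alpha={a}: {mu[0]:.6g}")
```

With a decay rate `a`, the impulse enters the average with weight `1 - a` and is then multiplied by `a` at every later step, so after 19 further steps a fraction `a ** 19` of its initial contribution remains. A rate of 0 forgets the impulse immediately, while rates close to 1 still carry a trace of it, which is the sense in which averages at several scales can summarise both recent and distant inputs without gates.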
Author Information
Junier Oliva (Carnegie Mellon University)
Barnabás Póczos (CMU)
Jeff Schneider (CMU/Uber)
Related Events (a corresponding poster, oral, or spotlight)
- 2017 Talk: The Statistical Recurrent Unit
  Tue. Aug 8th 01:24 -- 01:42 AM Room Parkside 1
More from the Same Authors
- 2023: Distributional Distance Classifiers for Goal-Conditioned Reinforcement Learning
  Ravi Tej Akella · Benjamin Eysenbach · Jeff Schneider · Ruslan Salakhutdinov
- 2023: Kernelized Offline Contextual Dueling Bandits
  Viraj Mehta · Ojash Neopane · Vikramjeet Das · Sen Lin · Jeff Schneider · Willie Neiswanger
- 2023 Poster: Continuously Parameterized Mixture Models
  Christopher Bender · Yifeng Shi · Marc Niethammer · Junier Oliva
- 2023 Poster: Learning Temporally Abstract World Models without Online Experimentation
  Benjamin Freed · Siddarth Venkatraman · Guillaume Sartoretti · Jeff Schneider · Howie Choset
- 2020 Poster: VideoOneNet: Bidirectional Convolutional Recurrent OneNet with Trainable Data Steps for Video Processing
  Zoltán Á. Milacski · Barnabás Póczos · Andras Lorincz
- 2019 Poster: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments
  Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos
- 2019 Oral: Myopic Posterior Sampling for Adaptive Goal Oriented Design of Experiments
  Kirthevasan Kandasamy · Willie Neiswanger · Reed Zhang · Akshay Krishnamurthy · Jeff Schneider · Barnabás Póczos
- 2018 Poster: Transformation Autoregressive Networks
  Junier Oliva · Kumar Avinava Dubey · Manzil Zaheer · Barnabás Póczos · Ruslan Salakhutdinov · Eric Xing · Jeff Schneider
- 2018 Oral: Transformation Autoregressive Networks
  Junier Oliva · Kumar Avinava Dubey · Manzil Zaheer · Barnabás Póczos · Ruslan Salakhutdinov · Eric Xing · Jeff Schneider
- 2018 Poster: Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
  Simon Du · Jason Lee · Yuandong Tian · Aarti Singh · Barnabás Póczos
- 2018 Oral: Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
  Simon Du · Jason Lee · Yuandong Tian · Aarti Singh · Barnabás Póczos
- 2017 Poster: Multi-fidelity Bayesian Optimisation with Continuous Approximations
  Kirthevasan Kandasamy · Gautam Dasarathy · Barnabás Póczos · Jeff Schneider
- 2017 Talk: Multi-fidelity Bayesian Optimisation with Continuous Approximations
  Kirthevasan Kandasamy · Gautam Dasarathy · Barnabás Póczos · Jeff Schneider
- 2017 Poster: Nonparanormal Information Estimation
  Shashank Singh · Barnabás Póczos
- 2017 Talk: Nonparanormal Information Estimation
  Shashank Singh · Barnabás Póczos
- 2017 Poster: Equivariance Through Parameter-Sharing
  Siamak Ravanbakhsh · Jeff Schneider · Barnabás Póczos
- 2017 Talk: Equivariance Through Parameter-Sharing
  Siamak Ravanbakhsh · Jeff Schneider · Barnabás Póczos