Oral
Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations
Tri Dao · Albert Gu · Matthew Eichhorn · Atri Rudra · Christopher Re
Fast linear transforms are ubiquitous in machine learning, including the discrete Fourier transform, discrete cosine transform, and other structured transformations such as convolutions. All of these transforms can be represented by dense matrix-vector multiplication, yet each has a specialized and highly efficient (subquadratic) algorithm. We ask to what extent hand-crafting these algorithms and implementations is necessary, what structural prior they encode, and how much knowledge is required to automatically learn a fast algorithm for a provided structured transform. Motivated by a characterization of fast matrix-vector multiplication as products of sparse matrices, we introduce a parameterization of divide-and-conquer methods that is capable of representing a large class of transforms. This generic formulation can automatically learn an efficient algorithm for many important transforms; for example, it recovers the $O(N \log N)$ Cooley-Tukey FFT algorithm to machine precision, for dimensions $N$ up to $1024$. Furthermore, our method can be incorporated as a lightweight replacement of generic matrices in machine learning pipelines to learn efficient and compressible transformations. On a standard task of compressing a single hidden-layer network, our method exceeds the classification accuracy of unconstrained matrices on CIFAR-10 by 3.9 points (the first time a structured approach has done so) with 4X faster inference speed and 40X fewer parameters.
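The characterization the abstract alludes to (fast matrix-vector multiplication as a product of sparse factors) can be illustrated concretely with the Cooley-Tukey FFT: the dense $N \times N$ DFT matrix factors into $\log_2 N$ sparse "butterfly" matrices, each with only $2N$ nonzeros, applied after a bit-reversal permutation. The sketch below is an illustrative construction of that classical factorization, not the paper's learned parameterization; function and variable names are our own.

```python
import numpy as np

def butterfly_factors(n):
    """Build log2(n) sparse butterfly factors B_1, ..., B_log(n) and a
    bit-reversal permutation P such that the DFT matrix equals
    B_log(n) @ ... @ B_1 @ P. Each factor has only 2n nonzeros, so
    applying them all costs O(n log n) instead of O(n^2)."""
    logn = int(np.log2(n))
    # bit-reversal permutation: index i maps to its reversed binary digits
    rev = [int(format(i, f"0{logn}b")[::-1], 2) for i in range(n)]
    P = np.eye(n)[rev]
    factors = []
    size = 2
    while size <= n:
        B = np.zeros((n, n), dtype=complex)
        w = np.exp(-2j * np.pi / size)  # primitive root of unity for this stage
        half = size // 2
        for start in range(0, n, size):      # each butterfly block of width `size`
            for k in range(half):
                t = w ** k                   # twiddle factor
                i, j = start + k, start + k + half
                B[i, i], B[i, j] = 1, t      # top output:    x_i + t * x_j
                B[j, i], B[j, j] = 1, -t     # bottom output: x_i - t * x_j
        factors.append(B)
        size *= 2
    return factors, P

n = 8
factors, P = butterfly_factors(n)
F = P.astype(complex)
for B in factors:
    F = B @ F
# the product of sparse factors reproduces the dense DFT matrix
dft = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)
assert np.allclose(F, dft)
```

Each stage matrix is block-diagonal with 2x2-structured butterfly blocks, which is exactly the sparsity pattern that makes the product representable with $O(N \log N)$ parameters rather than $N^2$.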
Author Information
Tri Dao (Stanford)
Albert Gu (Stanford University)
Matthew Eichhorn (University at Buffalo)
Atri Rudra (University at Buffalo, SUNY)
Christopher Re (Stanford)
Related Events (a corresponding poster, oral, or spotlight)

2019 Poster: Learning Fast Algorithms for Linear Transforms Using Butterfly Factorizations
Fri Jun 14th 01:30 – 04:00 AM, Room: Pacific Ballroom
More from the Same Authors

2020 Poster: Fast and Three-rious: Speeding Up Weak Supervision with Triplet Methods
Dan Fu · Mayee Chen · Frederic Sala · Sarah Hooper · Kayvon Fatahalian · Christopher Re
2020 Poster: On the Generalization Effects of Linear Transformations in Data Augmentation
Sen Wu · Hongyang Zhang · Gregory Valiant · Christopher Re
2020 Poster: Improving the Gating Mechanism of Recurrent Neural Networks
Albert Gu · Caglar Gulcehre · Thomas Paine · Matthew Hoffman · Razvan Pascanu
2019 Poster: Learning Dependency Structures for Weak Supervision Models
Paroma Varma · Frederic Sala · Ann He · Alexander J Ratner · Christopher Re
2019 Oral: Learning Dependency Structures for Weak Supervision Models
Paroma Varma · Frederic Sala · Ann He · Alexander J Ratner · Christopher Re
2019 Poster: A Kernel Theory of Modern Data Augmentation
Tri Dao · Albert Gu · Alexander J Ratner · Virginia Smith · Christopher De Sa · Christopher Re
2019 Oral: A Kernel Theory of Modern Data Augmentation
Tri Dao · Albert Gu · Alexander J Ratner · Virginia Smith · Christopher De Sa · Christopher Re
2018 Poster: Representation Tradeoffs for Hyperbolic Embeddings
Frederic Sala · Christopher De Sa · Albert Gu · Christopher Re
2018 Oral: Representation Tradeoffs for Hyperbolic Embeddings
Frederic Sala · Christopher De Sa · Albert Gu · Christopher Re
2017 Poster: Learning the Structure of Generative Models without Labeled Data
Stephen Bach · Bryan He · Alexander J Ratner · Christopher Re
2017 Talk: Learning the Structure of Generative Models without Labeled Data
Stephen Bach · Bryan He · Alexander J Ratner · Christopher Re