ICML Many Perception Tasks are Highly Redundant Functions of their Input Data

Poster in Workshop: Data-centric Machine Learning Research (DMLR): Datasets for Foundation Models

Rahul Ramesh · Anthony Bisulco · Ronald Di Tullio · Linran Wei · Vijay Balasubramanian · Kostas Daniilidis · Pratik Chaudhari

Abstract:

We show that many perception tasks, from visual recognition, semantic segmentation, optical flow, depth estimation, to vocalization discrimination, are highly redundant functions of their input data. Images or spectrograms projected into different subspaces formed by orthogonal bases in pixel, Fourier or wavelet domains can be used to solve these tasks remarkably well, regardless of whether it is the top subspace where data varies the most, some intermediate subspace with moderate variability, or the bottom subspace where data varies the least. This phenomenon occurs because different subspaces carry a large degree of redundant information about the task. We also find that masked auto-encoders (MAEs) zero out certain frequencies and perform a whitening procedure in Fourier space.
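To make the idea of top, intermediate and bottom subspaces concrete, here is a minimal sketch of projecting data onto bands of a PCA basis in the pixel domain. It is an illustration under stated assumptions, not the authors' code: the function names `pca_basis` and `project_onto_band`, the synthetic data, and the band boundaries are all hypothetical. The experiment described in the abstract would additionally train a model on each projected dataset and compare task accuracy, which is not shown here.

```python
import numpy as np


def pca_basis(X):
    """Return the data mean and an orthonormal basis sorted by decreasing variance."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered from most to least variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt


def project_onto_band(X, mean, Vt, start, stop):
    """Project X onto the subspace spanned by basis vectors start..stop-1."""
    B = Vt[start:stop]  # (k, d) matrix with orthonormal rows
    return (X - mean) @ B.T @ B + mean


rng = np.random.default_rng(0)
# Synthetic stand-in for flattened images (assumption): 200 samples, 64 "pixels",
# with per-dimension scale decaying from 3.0 to 0.1.
X = rng.normal(size=(200, 64)) * np.linspace(3.0, 0.1, 64)

mean, Vt = pca_basis(X)
top = project_onto_band(X, mean, Vt, 0, 16)      # most variable directions
middle = project_onto_band(X, mean, Vt, 24, 40)  # moderately variable directions
bottom = project_onto_band(X, mean, Vt, 48, 64)  # least variable directions
```

Each of `top`, `middle` and `bottom` has the same shape as `X` but retains only the components that lie in its band. The same pattern would apply to a Fourier or wavelet basis by swapping the rows of `Vt` for the corresponding orthonormal basis vectors.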
