

Poster in Workshop: Beyond Bayes: Paths Towards Universal Reasoning Systems

P30: Meta-Learning Real-Time Bayesian AutoML For Small Tabular Data

Frank Hutter · Katharina Eggensperger


Authors: Noah Hollmann, Samuel Müller, Katharina Eggensperger, Frank Hutter

Abstract: We present TabPFN, an AutoML method that is competitive with the state of the art on small tabular datasets while being over 1,000$\times$ faster. TabPFN not only outperforms boosted trees, the state-of-the-art standalone method, but is on par with complex AutoML systems that tune and select ensembles of a range of methods. Our method is fully contained in the weights of a single neural network, and a single forward pass directly yields predictions for a new dataset. TabPFN is meta-learned using the Transformer-based Prior-Data Fitted Network (PFN) architecture and approximates Bayesian inference with a prior based on assumptions of simplicity and causal structure. The prior spans a large space of structural causal models with a bias toward small architectures and thus low complexity. Furthermore, we extend the PFN approach to differentiably calibrate the prior's hyperparameters on real data, thereby separating our abstract prior assumptions from their heuristic calibration. Once calibrated, the hyperparameters are fixed and TabPFN can be applied to any new tabular dataset at the push of a button. Finally, on 30 datasets from the OpenML-CC18 suite we show that our method outperforms boosted trees and performs on par with complex state-of-the-art AutoML systems, with predictions produced in less than a second. Our code and pretrained models are available at https://anonymous.4open.science/r/TabPFN-2AEE.
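The key interface point in the abstract is that prediction requires no gradient-based fitting: the labeled training set and the unlabeled test points are consumed in a single call, and class probabilities come straight out. The sketch below illustrates only that I/O shape, not the method itself. In the actual TabPFN, a meta-learned Transformer forward pass sits where the stand-in function below uses a softmax over negative distances; the function name `pfn_predict` and all details of the stand-in are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pfn_predict(X_train, y_train, X_test, n_classes, tau=1.0):
    """PFN-style prediction interface (illustrative stand-in).

    One call takes the whole labeled training set plus the test
    features and returns class probabilities directly; there is no
    separate fit step. A real TabPFN would run a single Transformer
    forward pass here. This stand-in mimics the interface with
    attention-like weights derived from pairwise distances.
    """
    # pairwise squared Euclidean distances, shape (n_test, n_train)
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d / tau)                 # unnormalized "attention" weights
    w /= w.sum(axis=1, keepdims=True)    # normalize over training points
    onehot = np.eye(n_classes)[y_train]  # labels as (n_train, n_classes)
    return w @ onehot                    # (n_test, n_classes) probabilities

# toy dataset: two well-separated clusters
rng = np.random.default_rng(0)
X_tr = np.vstack([rng.normal(0.0, 0.1, (20, 2)),
                  rng.normal(3.0, 0.1, (20, 2))])
y_tr = np.array([0] * 20 + [1] * 20)
X_te = np.array([[0.0, 0.0], [3.0, 3.0]])

proba = pfn_predict(X_tr, y_tr, X_te, n_classes=2)
print(proba.argmax(axis=1))  # predicts cluster membership: [0 1]
```

The point of the shape of this API is speed: because all "learning" at prediction time happens inside one forward pass, the per-dataset cost is a single inference call, which is what enables the sub-second predictions reported in the abstract.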
