

Spotlight in Workshop: AI for Science: Scaling in AI for Scientific Discovery

AstroPT: Scaling Large Observation Models for Astronomy

Michael J. Smith · Ryan Roberts · Eirini Angeloudi · Marc Huertas-Company

Keywords: [ Galaxies ] [ Astronomy ] [ Self-Supervised Learning ] [ Neural Scaling Laws ]


Abstract: This work presents AstroPT, an autoregressive pretrained transformer developed with astronomical use cases in mind. The AstroPT models presented here have been pretrained on 8.6 million $512 \times 512$ pixel grz-band galaxy postage stamp observations from the DESI Legacy Survey DR8. We train a selection of models of increasing size, from 1 million to 2.1 billion parameters, and find that AstroPT follows a saturating log-log scaling law similar to that of textual models. We also find that the models' performance on downstream tasks, as measured by linear probing, improves with model size up to the model parameter saturation point. To ensure that this work can provide a basis for building towards an open source `Large Observation Model', we will release the code, model weights, and dataset for AstroPT under the MIT license with the deanonymised version of this manuscript.
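The headline claim is a saturating log-log scaling law over model size. As a minimal sketch of what fitting such a law looks like in practice, the snippet below fits a saturating power law $L(N) = L_\infty + (N_c / N)^\alpha$ to (parameter count, validation loss) pairs with SciPy. The functional form, the fitting routine, and the data points are illustrative assumptions, not the authors' code or measurements.

```python
"""Sketch: fitting a saturating power-law scaling curve.

Assumes the form L(N) = L_inf + (N_c / N) ** alpha, one common
parameterisation of a saturating log-log scaling law; the paper's
exact fit is not reproduced here.
"""
import numpy as np
from scipy.optimize import curve_fit


def saturating_power_law(n_params, l_inf, n_c, alpha):
    # Loss decays as a power law in model size and flattens out at l_inf.
    return l_inf + (n_c / n_params) ** alpha


# Illustrative placeholder points (model size, validation loss);
# NOT measurements from the paper.
n = np.array([1e6, 1e7, 1e8, 1e9, 2.1e9])
loss = np.array([1.30, 1.05, 0.92, 0.88, 0.875])

popt, _ = curve_fit(saturating_power_law, n, loss,
                    p0=[0.85, 1e6, 0.3], maxfev=10000)
l_inf, n_c, alpha = popt
print(f"fitted floor={l_inf:.3f}, scale={n_c:.2e}, exponent={alpha:.3f}")
```

On a log-log plot, curves of this family look linear at small $N$ and bend toward the floor $L_\infty$ as $N$ grows, which is the "saturating" behaviour the abstract describes.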
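The abstract also measures downstream performance via linear probing, i.e. training only a linear head on frozen pretrained representations. The sketch below illustrates that protocol generically: the embeddings and labels are random stand-ins, and the logistic-regression probe is a common choice assumed here, not necessarily the authors' exact setup.

```python
"""Sketch: linear probing frozen embeddings on a downstream task.

Assumes embeddings have already been extracted from a frozen
pretrained model; X and y are hypothetical stand-ins.
"""
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 256))   # placeholder frozen-model embeddings
y = rng.integers(0, 2, size=1000)  # placeholder binary galaxy labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Only the linear head is trained; the backbone stays frozen, so probe
# accuracy reflects how linearly separable the representations are.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"probe accuracy: {probe.score(X_test, y_test):.3f}")
```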
