The Square Kilometre Array (SKA) will be the world's largest radio telescope, producing data volumes approaching the exascale within a few years of operation. Extracting scientific value from those data in a timely manner will quickly exceed the reach of traditional analyses and will instead require robust, domain-specific AI solutions. Here I will discuss how we have been building foundation models that can be adapted across different SKA precursor instruments, by applying self-supervised learning with instance differentiation to learn a multi-purpose representation for use in radio astronomy. For a standard radio astronomy use case, our models exceed baseline supervised classification performance by a statistically significant margin for most label volumes in the in-distribution case and for all label volumes in the out-of-distribution case. I will also show how such learned representations can be scientifically useful more broadly, for example in similarity searches that allow us to find hybrid radio galaxies without any pre-labelled examples.