An agnostic PAC learning algorithm finds a predictor that is competitive with the best predictor in a benchmark hypothesis class, where competitiveness is measured with respect to a given loss function. However, its predictions might be quite sub-optimal for structured subgroups of individuals, such as protected demographic groups. Motivated by such fairness concerns, we study ``multi-group agnostic PAC learnability'': fixing a measure of loss, a benchmark class $\H$ and a (potentially) rich collection of subgroups $\G$, the objective is
to learn a single predictor such that the loss experienced by every group $g \in \G$ is not much larger than the best possible loss for this group within $\H$. Under natural conditions, we provide a characterization of the loss functions for which such a predictor is guaranteed to exist. For any such loss function we construct a learning algorithm whose sample complexity is logarithmic in the size of the collection $\G$. Our results unify and extend previous positive and negative results from the multi-group fairness literature, which applied for specific loss functions.