Optimal Pricing for Data-Augmented AutoML Marketplaces
Abstract
Data markets promise to unlock the value of data by matching data suppliers with ML consumers. However, designing such markets raises intricate challenges, including data pricing, fairness, and robustness. We propose a pragmatic data-augmented AutoML market that integrates with existing cloud-based AutoML platforms, such as Google’s Vertex AI. Unlike standard AutoML solutions, our design automatically augments buyer-submitted training data with valuable external datasets and prices the resulting models by their measurable performance improvement rather than by computational cost, as is the status quo. Our key innovation is a pricing mechanism grounded in the instrumental value of externally sourced data, that is, the marginal improvement in model quality that the data yields. This approach bypasses the complexities of pricing datasets directly and accommodates diverse buyer valuations through menu-based options, thus providing an economically sustainable framework for monetizing external data.