How to Price Data: A Market Equilibrium Based Approach
Pooja Kulkarni ⋅ Parnian Shahkar ⋅ Ruta Mehta
Abstract
High-quality data is a key input to modern machine learning models, which has led to the emergence of platforms that facilitate the buying and selling of data. A central challenge for these platforms is how to price data so as to balance the interests of both buyers and sellers. Traditional market equilibrium notions, in which demand meets supply, are commonly used to price goods, but they do not extend naturally to data because data is non-rivalrous: multiple buyers can simultaneously benefit from the same dataset. We therefore introduce a new notion of equilibrium for data pricing based on Nash equilibrium and study it in settings where data may be complementary or substitutable, focusing on the canonical utility model for each, namely Leontief and linear, respectively. We show that equilibrium prices can fail to exist for linear utilities even with homogeneous buyers and two sellers, whereas for Leontief utilities we establish strong existence, efficiency, and polynomial-time computation guarantees in general markets with $n$ homogeneous buyers and $m$ sellers. We further examine the role of platform mediation and price discrimination in enabling *optimal* equilibrium outcomes efficiently. On the technical front, we develop a novel proof technique that systematically reduces the space of candidate equilibria through the *graph-of-deviations*, which may be of independent interest.
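For readers unfamiliar with the two utility models named above, their standard textbook forms are sketched below. The notation is an assumption made for illustration and is not taken from the paper: $x_j \ge 0$ denotes the amount of data obtained from seller $j$, and $a_j > 0$ is a buyer-specific coefficient. The paper's exact formulation may differ.

```latex
% Linear utility (substitutes): data from different sellers is interchangeable,
% so value adds up across sellers.
u^{\mathrm{lin}}(x) = \sum_{j=1}^{m} a_j \, x_j

% Leontief utility (complements): data from every seller is needed in fixed
% proportions, so value is capped by the scarcest input.
u^{\mathrm{Leon}}(x) = \min_{j \in \{1,\dots,m\}} \frac{x_j}{a_j}
```

Under the linear form, a buyer can compensate for less data from one seller with more from another; under the Leontief form, extra data from one seller adds nothing unless the others increase proportionally.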