Desirable Effort Fairness and Optimality Trade-offs in Strategic Learning
Abstract
Strategic classification examines how decision rules interact with agents who strategically adapt their features. Most existing models focus on maximizing predictive performance, assuming agents best respond to the learned classifier. However, real decision-making systems are rarely optimized solely for accuracy: ethical, economic, and institutional considerations often make some feature changes more desirable than others. At the same time, principals may wish to incentivize these changes fairly across heterogeneous agents. While prior work has studied causal structure between features, notions of desirability, and information disparities in isolation, this work initiates a unified treatment of these components within a single framework. We formulate the problem as a constrained optimization that captures the trade-offs among optimality, desirability, and fairness. We provide theoretical guarantees on the principal's optimality loss under a given desirability-fairness tolerance for several broad classes of fairness measures. Finally, through experiments on real datasets, we demonstrate the explicit trade-off between maximizing accuracy and ensuring fairness in desirable effort.