CARE: Adaptive Calibration for Reliable Recommendations
Abstract
Modern recommender systems are typically trained offline and deployed with parameters held fixed between periodic refreshes, yet user behavior can evolve substantially during deployment. This drift can degrade ranking utility over time and make it difficult to provide formal guarantees about recommendation quality. We propose CARE, an adaptive calibration framework that wraps an arbitrary backbone recommender and outputs variable-size recommendation sets with finite-sample performance guarantees over interaction streams. CARE combines (i) a loss-based monitoring module that localizes behavioral changes and triggers threshold recalibration, and (ii) an online aggregation rule that promotes compact recommendation sets by dynamically reweighting candidate set predictors. We provide theoretical results establishing finite-sample guarantees for utility-based risk control, as well as bounds on the expected set size relative to the best constituent predictor. Experiments across multiple datasets and backbone models demonstrate that CARE improves robustness and maintains compact recommendation sets while preserving the desired statistical guarantees. Code is available at https://anonymous.4open.science/r/CARE-FCBD.