Poster in Workshop: The Many Facets of Preference-Based Learning
Differentially Private Reward Estimation from Preference Based Feedback
Sayak Ray Chowdhury · Xingyu Zhou
Abstract:
Preference-based reinforcement learning (RL) has gained attention as a promising approach to align learning algorithms with human interests in various domains. Instead of relying on numerical rewards, preference-based RL uses feedback from human labelers in the form of pairwise or $K$-wise comparisons between actions. In this paper, we focus on reward learning in preference-based RL and address the problem of estimating the unknown reward parameters while protecting the privacy of the labelers. We propose two estimators based on the Randomized Response strategy that ensure label differential privacy. The first estimator utilizes maximum likelihood estimation (MLE), while the second employs stochastic gradient descent (SGD). We demonstrate that both estimators achieve an estimation error of $\widetilde O(1/\sqrt{n})$ from $n$ samples. The additional cost of ensuring privacy for human labelers is proportional to $\frac{e^\epsilon + 1}{e^\epsilon - 1}$ in the best case, where $\epsilon > 0$ is the privacy budget.
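As a rough illustration of the Randomized Response strategy mentioned in the abstract, the following Python sketch privatizes binary pairwise-comparison labels with privacy budget $\epsilon$. The function name, the debiasing factor shown at the end, and the example data are illustrative assumptions; this is not the paper's exact MLE or SGD estimator.

```python
import numpy as np

def randomized_response(labels, epsilon, rng=None):
    """Privatize binary preference labels via Randomized Response.

    Each label in {0, 1} is kept with probability e^eps / (e^eps + 1)
    and flipped otherwise, which satisfies eps-label differential privacy.
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    keep_prob = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    keep = rng.random(labels.shape) < keep_prob
    return np.where(keep, labels, 1 - labels)

# Hypothetical usage: privatize n pairwise-comparison labels with eps = 1.
rng = np.random.default_rng(0)
true_labels = rng.integers(0, 2, size=1000)      # simulated human preferences
private_labels = randomized_response(true_labels, epsilon=1.0, rng=rng)

# The factor (e^eps + 1) / (e^eps - 1) debiases statistics computed from the
# flipped labels; it is the same quantity that appears as the extra cost of
# privacy in the abstract's estimation-error bound.
debias = (np.exp(1.0) + 1.0) / (np.exp(1.0) - 1.0)
```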