Poster in Workshop: The Many Facets of Preference-Based Learning
Differentially Private Reward Estimation from Preference Based Feedback
Sayak Ray Chowdhury · Xingyu Zhou
Abstract:
Preference-based reinforcement learning (RL) has gained attention as a promising approach to align learning algorithms with human interests in various domains. Instead of relying on numerical rewards, preference-based RL uses feedback from human labelers in the form of pairwise or $K$-wise comparisons between actions. In this paper, we focus on reward learning in preference-based RL and address the problem of estimating the unknown reward parameters while protecting the privacy of the labelers. We propose two estimators based on the Randomized Response strategy that ensure label differential privacy. The first estimator utilizes maximum likelihood estimation (MLE), while the second employs stochastic gradient descent (SGD). We demonstrate that both estimators achieve an estimation error of $\widetilde O(1/\sqrt{n})$ from $n$ samples. The additional cost of ensuring privacy for human labelers is proportional to $\frac{e^\epsilon + 1}{e^\epsilon - 1}$ in the best case, where $\epsilon > 0$ is the privacy budget.
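As a rough illustration of the Randomized Response strategy mentioned in the abstract, the following Python sketch privatizes binary pairwise-comparison labels with privacy budget $\epsilon$. The function name, the debiasing factor shown at the end, and the example data are illustrative assumptions; this is not the paper's exact MLE or SGD estimator.

```python
import numpy as np

def randomized_response(labels, epsilon, rng=None):
    """Privatize binary preference labels via Randomized Response.

    Each label in {0, 1} is kept with probability e^eps / (e^eps + 1)
    and flipped otherwise, which satisfies eps-label differential privacy.
    """
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    keep_prob = np.exp(epsilon) / (np.exp(epsilon) + 1.0)
    keep = rng.random(labels.shape) < keep_prob
    return np.where(keep, labels, 1 - labels)

# Hypothetical usage: privatize n pairwise-comparison labels with eps = 1.
rng = np.random.default_rng(0)
true_labels = rng.integers(0, 2, size=1000)      # simulated human preferences
private_labels = randomized_response(true_labels, epsilon=1.0, rng=rng)

# The factor (e^eps + 1) / (e^eps - 1) debiases statistics computed from the
# flipped labels; it is the same quantity that appears as the extra cost of
# privacy in the abstract's estimation-error bound.
debias = (np.exp(1.0) + 1.0) / (np.exp(1.0) - 1.0)
```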