Poster in Workshop: Differentiable Almost Everything: Differentiable Relaxations, Algorithms, Operators, and Simulators
Using gradients to check sensitivity of MCMC-based analyses to removing data
Tin Nguyen · Ryan Giordano · Rachael Meager · Tamara Broderick
Keywords: [ MCMC ] [ sensitivity ] [ data removal ] [ gradients ]
If the conclusion of a data analysis is sensitive to dropping very few data points, that conclusion might hinge on the particular data at hand rather than representing a more broadly applicable truth. To check for this sensitivity, one idea is to consider every small data subset, drop it, and re-run the analysis. But the number of re-runs needed is combinatorially large. Recent work proposes a differentiable relaxation to find the worst-case subset, but that work was developed for conclusions based on estimating equations, and it does not directly handle Bayesian posterior approximations computed via MCMC. We make two principal contributions. We adapt the existing data-dropping relaxation to estimators computed via MCMC; in particular, we re-use existing MCMC draws to estimate the necessary derivatives via a covariance relationship. Observing that Monte Carlo errors induce variability in these estimates, we use a variant of the bootstrap to quantify this uncertainty. Empirically, our method is accurate in simple models, such as linear regression. In models with more complex structure, such as hierarchies, the performance of our method is mixed.
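The sketch below illustrates the two ingredients described in the abstract under some assumptions: per-datum derivatives of a posterior expectation are estimated from existing MCMC draws via a covariance relationship (here, the standard local-sensitivity identity in which the derivative with respect to a per-datum likelihood weight equals a posterior covariance with that datum's log-likelihood), and the draws are then resampled to gauge the Monte Carlo variability of the resulting worst-case estimate. All function names are hypothetical, the simple i.i.d. resampling stands in for whatever bootstrap variant the paper uses, and the code assumes log-likelihood evaluations are available for every draw and datum; it is a minimal sketch, not the authors' implementation.

    # Minimal sketch (hypothetical names) of covariance-based sensitivity to data removal,
    # estimated from existing MCMC draws without re-running MCMC.
    import numpy as np

    def influence_scores(g_draws, loglik_draws):
        """Estimate d E[g(theta)] / d w_n for each datum n from posterior draws.

        g_draws      : (S,) array, g(theta_s) for each of S MCMC draws
        loglik_draws : (S, N) array, log p(x_n | theta_s) for each draw s and datum n
        Returns an (N,) array of covariance-based influence scores, one per datum.
        """
        g_centered = g_draws - g_draws.mean()
        ll_centered = loglik_draws - loglik_draws.mean(axis=0, keepdims=True)
        # Sample covariance between g(theta) and each per-datum log-likelihood.
        return g_centered @ ll_centered / (len(g_draws) - 1)

    def approx_change_from_dropping(scores, k):
        """First-order estimate of the largest decrease in E[g] from dropping k points.

        Dropping datum n sets its likelihood weight w_n from 1 to 0, contributing
        approximately -scores[n] to the linearized change, so the worst-case size-k
        subset is approximated by the k points with the largest positive scores.
        """
        worst = np.argsort(scores)[-k:]
        return worst, -scores[worst].sum()

    def bootstrap_se(g_draws, loglik_draws, k, n_boot=200, seed=None):
        """Quantify Monte Carlo error by resampling the MCMC draws with replacement
        and recomputing the approximate worst-case change each time."""
        rng = np.random.default_rng(seed)
        S = len(g_draws)
        changes = []
        for _ in range(n_boot):
            idx = rng.integers(0, S, size=S)
            scores_b = influence_scores(g_draws[idx], loglik_draws[idx])
            _, change_b = approx_change_from_dropping(scores_b, k)
            changes.append(change_b)
        return np.std(changes)

Note that resampling draws independently, as above, ignores Markov-chain autocorrelation, which is one reason a variant of the plain bootstrap may be needed in practice; the abstract does not specify the exact variant, so this step should be read only as a placeholder for it.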