Context-Aware Hierarchical Bayesian Modeling of IVF Laboratory Environmental Conditions
Abstract
IVF pregnancy rates are routinely modeled using patient-level variables, while high-resolution laboratory environmental data remain underutilized. We show that this is a missed opportunity. Rather than relying on raw sensor averages, we engineer 55 context-aware temporal features, including rolling thermal stability, simultaneous temperature-humidity adherence, peak stress duration, and post-stress recovery speed, that capture the dynamics of incubator microenvironments. On 61 weeks of data from an Asian IVF clinic, these features reduce cross-validated prediction error to 1.27%, compared to 3–5% for raw averages. We then train a hierarchical Bayesian Beta regression model that shares environmental effects across an Asian and a Northern European clinic via partial pooling, while preserving site-specific baselines. On held-out data from the Northern European clinic, the model achieves R² = 0.86 and a 64% error reduction for the 35–39 age group over a naive baseline, demonstrating that structured environmental monitoring contains clinically meaningful, transferable signal.
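To make the four named feature families concrete, the following is a minimal sketch of how such quantities might be computed from a sequence of incubator readings. It is illustrative only and is not the pipeline used in this work: the target bands, the window length, the function names, and the sample readings are all assumptions introduced here, and the exact feature definitions may differ.

```python
from statistics import pstdev

# Hypothetical target bands; the thresholds actually used are not stated here.
TEMP_BAND = (36.8, 37.2)      # degrees Celsius
HUMIDITY_BAND = (90.0, 98.0)  # percent relative humidity


def in_band(value, band):
    low, high = band
    return low <= value <= high


def rolling_stability(temps, window):
    """Population standard deviation over each full trailing window."""
    return [pstdev(temps[i - window + 1:i + 1])
            for i in range(window - 1, len(temps))]


def joint_adherence(temps, humidities):
    """Fraction of readings with temperature AND humidity both in band."""
    ok = [in_band(t, TEMP_BAND) and in_band(h, HUMIDITY_BAND)
          for t, h in zip(temps, humidities)]
    return sum(ok) / len(ok)


def longest_excursion(temps):
    """Length, in readings, of the longest consecutive out-of-band run."""
    longest = current = 0
    for t in temps:
        current = 0 if in_band(t, TEMP_BAND) else current + 1
        longest = max(longest, current)
    return longest


def recovery_time(temps):
    """Readings between the largest out-of-band deviation and the next
    in-band reading; None if there is no excursion or no recovery."""
    centre = sum(TEMP_BAND) / 2
    out = [i for i, t in enumerate(temps) if not in_band(t, TEMP_BAND)]
    if not out:
        return None
    peak = max(out, key=lambda i: abs(temps[i] - centre))
    for j in range(peak + 1, len(temps)):
        if in_band(temps[j], TEMP_BAND):
            return j - peak
    return None


# Made-up readings at a fixed sampling interval, for illustration only.
temps = [37.0, 37.1, 37.5, 37.9, 37.4, 37.1, 37.0, 37.0]
humidities = [95.0, 94.0, 93.0, 89.0, 91.0, 95.0, 95.0, 96.0]

stability = rolling_stability(temps, window=3)
adherence = joint_adherence(temps, humidities)
excursion = longest_excursion(temps)
recovery = recovery_time(temps)
```

Unlike a raw average, each of these quantities depends on the order and timing of the readings: two periods with the same mean temperature can differ sharply in excursion length or recovery time.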