Explicitly Modeling Censoring Produces Superior Survival Predictors
Abstract
Likelihood-based training is the dominant paradigm in survival prediction. Under independent censoring, the likelihood factorizes, and we can optimize only the terms related to event modeling, effectively treating the censoring mechanism as incidental. This is justified when censoring is non-informative, i.e., when the censoring process shares no parameters with the event-time model. In practice, however, this assumption may not hold, and ignoring the censoring contributions may discard signal that is useful for learning representations that support accurate estimation of event distributions. Motivated by this, we argue that explicitly modeling censoring can improve representation learning and time-to-event estimation, particularly when the event and censoring processes are coupled. We introduce a latent decomposition view that partitions covariates into four disjoint factors: those affecting only the event process, only the censoring process, both, or neither. We then learn decomposed representations for the first three categories to guide better estimation of the event distribution. We instantiate our method on four popular deep-learning survival models and evaluate on 10 datasets (2 semi-synthetic and 8 real-world), showing consistent gains over strong baselines and multiple state-of-the-art methods.
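For concreteness, the factorization alluded to above can be written in standard survival notation (the symbols here are illustrative, not necessarily those used later in the paper). With observed time $y_i = \min(t_i, c_i)$, event indicator $\delta_i = \mathbb{1}\{t_i \le c_i\}$, event density and survival functions $f_\theta, S_\theta$, and censoring density and survival functions $g_\phi, G_\phi$, independent censoring gives

```latex
\mathcal{L}(\theta, \phi)
  = \prod_{i} \bigl[ f_\theta(y_i \mid x_i)\, G_\phi(y_i \mid x_i) \bigr]^{\delta_i}
              \bigl[ S_\theta(y_i \mid x_i)\, g_\phi(y_i \mid x_i) \bigr]^{1-\delta_i}
  = \underbrace{\prod_{i} f_\theta(y_i \mid x_i)^{\delta_i}\, S_\theta(y_i \mid x_i)^{1-\delta_i}}_{\text{event terms}}
    \;\times\;
    \underbrace{\prod_{i} G_\phi(y_i \mid x_i)^{\delta_i}\, g_\phi(y_i \mid x_i)^{1-\delta_i}}_{\text{censoring terms}}
```

When $\theta$ and $\phi$ share no parameters (non-informative censoring), the censoring factor is constant in $\theta$ and is conventionally dropped from the training objective; the argument above is that the discarded factor can still carry useful learning signal when the two processes are coupled.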