A new class of Bayesian survival models and beyond

Date of Completion

January 2011


Biology, Biostatistics|Statistics




In this research we introduce a new class of Bayesian hierarchical models that incorporates spatial or spatio-temporal heterogeneity and cure rates to model spatially clustered time-to-event data. The models allow for various structures, including proportional odds (PO), proportional hazards (PH), and a generalization of the two, with semi-parametric and non-parametric baseline hazard functions.

We propose three methodologies that make different assumptions about the hazard function and cure rates. We first propose a semi-parametric PH model for spatially clustered survival data with a cure fraction. We next propose a general transformation class of semi-parametric spatio-temporal cure rate survival models. Finally, we present a unified power series transformation class of cure rate survival models for spatially clustered data.

The first methodology assumes a generalized PH structure with a covariance function. A flexible family of semi-parametric baseline hazard functions is proposed using a random partition grid defined by join-point parameters. We then integrate the cure rates and spatial frailties into the analysis. The second methodology proposes a general class of transformation models that makes no structural assumption about the hazard function. Instead, it uses a transformation model that includes the benchmark PH and PO models as special cases. We integrate the cure rates and the spatio-temporal frailties into the analysis, and the baseline functions are modeled non-parametrically. Finally, we develop a unified model that incorporates the surviving fraction and spatial effects under a general class of discrete probability distributions. This approach unifies popular cure models.

Our estimation methodology is Bayesian. When modeling censored time-to-event data with cure rates, spatial or spatio-temporal frailties, and semi-parametric or non-parametric baseline hazard curves, a high-dimensional parameter vector and covariance matrix must be estimated.
The Bayesian paradigm makes the problem tractable by enabling us to infer parameters by sampling from a posterior distribution.

We compare a broad collection of high-dimensional hierarchical models using the log pseudo-marginal likelihood (LPML) and the deviance information criterion (DIC). We obtain the usual posterior estimates and smoothed county-level maps of the spatial or spatio-temporal frailties and of the cure rates over space. We apply our methodologies to real survival data sets from cancer epidemiology, and the proposed work also contributes to general time-to-event analysis and methodology.
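To illustrate the LPML comparison mentioned above, the following is a minimal sketch of how LPML can be computed from posterior draws. It assumes the common Monte Carlo estimate of the conditional predictive ordinate (CPO), namely the harmonic mean of the per-draw likelihood values for each observation; the abstract does not state which estimator was used. The function names `log_cpo` and `lpml` and the input layout are hypothetical, chosen only for this example.

```python
import math


def log_cpo(loglik_draws):
    """Log CPO for one observation.

    loglik_draws holds log f(y_i | theta^(m)) for posterior draws m = 1..M.
    Assumed estimator: CPO_i ~= [ (1/M) * sum_m 1 / f(y_i | theta^(m)) ]^(-1),
    evaluated on the log scale for numerical stability.
    """
    neg = [-v for v in loglik_draws]
    peak = max(neg)
    # log-sum-exp of the negated log-likelihoods
    lse = peak + math.log(sum(math.exp(v - peak) for v in neg))
    return math.log(len(neg)) - lse


def lpml(loglik):
    """LPML = sum over observations of log CPO_i.

    loglik[m][i] is the log-likelihood of observation i under posterior draw m.
    """
    n_obs = len(loglik[0])
    return sum(log_cpo([row[i] for row in loglik]) for i in range(n_obs))
```

Under this sketch, a larger LPML indicates better predictive fit, so competing models fitted to the same data would be ranked by their `lpml` values. For censored observations, the per-draw log-likelihood passed in would be the log of the survival contribution rather than of the density.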