Date of Completion
Baseline Hazard; Cox Model; Online Updating; Proportional Hazards; Survival Analysis
Elizabeth D. Schifano
Field of Study
Doctor of Philosophy
While studies of the proportional hazards model for big survival data mainly focus on speeding up computation and selecting features from a huge number of covariates, verifying the crucial assumption of proportional hazards (PH) has not been tackled for big data when the data size exceeds a computer’s memory. This dissertation summarizes methodological developments in statistics that address diagnostics for the PH model, including the PH assumption, functional form, and outlying and/or influential observations. Specifically, an online updating approach with minimal storage requirements is proposed that updates the standard test statistic for the PH assumption as new data blocks arrive. The test and its variant based on the most recent data blocks maintain their sizes when the PH assumption holds, and have substantial power when it is violated in different ways. Attention is also paid to the baseline hazard function of the PH model. Nonparametric methods to compare cumulative baseline hazard curves using profile monitoring techniques, and their combination with parametric methods to detect heterogeneity in data blocks, are presented.
Xue, Yishu, "Diagnostic Methods for Big Survival Data" (2019). Doctoral Dissertations. 2184.