Date of Completion
4-9-2019
Embargo Period
4-21-2020
Keywords
Baseline Hazard; Cox Model; Online Updating; Proportional Hazards; Survival Analysis
Major Advisor
Elizabeth D. Schifano
Co-Major Advisor
Jun Yan
Associate Advisor
HaiYing Wang
Field of Study
Statistics
Degree
Doctor of Philosophy
Open Access
Open Access
Abstract
While studies of the proportional hazards (PH) model for big survival data mainly focus on speeding up computation and selecting features from a huge number of covariates, verifying the crucial PH assumption has not been tackled for big data whose size exceeds a computer’s memory. This dissertation summarizes methodological developments in statistics that address diagnostics of the PH model, including the PH assumption, functional form, and outlying and/or influential observations. Specifically, an online updating approach with minimal storage requirements is proposed for the standard test statistic for the PH assumption. The test and its variant based on the most recent data blocks maintain their sizes when the PH assumption holds, and have substantial power when it is violated in different ways. Attention is also given to the baseline hazard function of the PH model. Nonparametric methods to compare cumulative baseline hazard curves using profile monitoring techniques, and their combination with parametric methods to detect heterogeneity in data blocks, are presented.
Recommended Citation
Xue, Yishu, "Diagnostic Methods for Big Survival Data" (2019). Doctoral Dissertations. 2184.
https://digitalcommons.lib.uconn.edu/dissertations/2184