Statistical Inferences for Interval Censored Data

Date of Completion

January 2011

Interval censored data are often observed in studies where event processes are monitored only at certain discrete times. This special data structure presents challenges for statistical inference. We develop inference procedures for Cox models, along with computing algorithms, under two settings of interval censored data. In the first setting, where each subject experiences at most a single event, we extend the popular Cox proportional hazards model to allow time-varying coefficients that capture the temporal dynamics of covariate effects. Instead of being pre-specified, the degree of temporal dynamics for each covariate is estimated from the data. In a Bayesian framework, the missing event time information and the model parameters are sampled iteratively via an efficient reversible jump Markov chain Monte Carlo (MCMC) algorithm. The method is shown to outperform existing approaches for the Cox model with time-varying coefficients, and it reveals new findings in an analysis of time to tooth emergence in children. The second setting is panel count data, where each subject can experience multiple events and is monitored at multiple time points. We propose an augmented estimating equations (AEE) approach for a semiparametric mean regression model, which accommodates possible association between the event process and the observation scheme through an unspecified frailty variable. An Expectation-Solving (ES) algorithm, an analogue of the EM algorithm, is used to handle the missing event time information on a fine grid. Simulation studies demonstrate that AEE yields a more efficient and numerically stable estimator than existing estimators under a wide range of practical settings. The method in the first setting is applied to the cystic fibrosis disease registry data from a collaborative project to study the natural history of lung function deterioration among young cystic fibrosis patients. The time-varying effects of some variables had not been seen in existing analyses.
The methods for the two settings have been implemented in two separate, publicly available R packages.
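
As a small illustration of the two data structures described in the abstract, the following Python sketch shows how monitoring at discrete times turns exact event times into interval censored observations (first setting) and into panel counts (second setting). It is not taken from the dissertation or its R packages: the function names, the visit schedule, and the convention that follow-up starts at time 0 are all assumptions made for illustration.

```python
from bisect import bisect_left
import math


def censor_interval(event_time, visit_times):
    """Return the interval (left, right] known to contain a single event.

    visit_times is a sorted list of inspection times (hypothetical schedule).
    left is 0.0 if the event precedes the first visit; right is math.inf if
    the event has not occurred by the last visit (right censoring).
    """
    i = bisect_left(visit_times, event_time)  # first visit at or after the event
    left = visit_times[i - 1] if i > 0 else 0.0
    right = visit_times[i] if i < len(visit_times) else math.inf
    return left, right


def panel_counts(event_times, visit_times):
    """Return the number of events falling between consecutive visits.

    Events after the last visit are never observed and are not counted.
    """
    counts = []
    prev = 0.0  # assumed start of follow-up
    for visit in visit_times:
        counts.append(sum(prev < t <= visit for t in event_times))
        prev = visit
    return counts


visits = [1.0, 2.0, 3.0, 4.0]
interval = censor_interval(2.5, visits)            # single-event setting
counts = panel_counts([0.5, 1.5, 1.7, 3.2], visits)  # recurrent-event setting
```

In both cases the exact event times are lost and only the bracketing visits or the per-interval counts remain, which is the missing information that the MCMC sampler and the ES algorithm are described as handling.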