Estimation of time-dependent association for bivariate failure times in the presence of a competing risk.
ABSTRACT: This article targets the estimation of a time-dependent association measure for bivariate failure times, the conditional cause-specific hazards ratio (CCSHR), which is a generalization of the conditional hazards ratio (CHR) to accommodate competing risks data. We model the CCSHR as a parametric regression function of time and event causes and leave all other aspects of the joint distribution of the failure times unspecified. We develop a pseudo-likelihood estimation procedure for model fitting and inference and establish the asymptotic properties of the estimators. We assess the finite-sample properties of the proposed estimators against the estimators obtained from a moment-based estimating equation approach. Data from the Cache County study on dementia are used to illustrate the proposed methodology.
Project description:This article considers nonparametric methods for studying recurrent disease and death with competing risks. We first point out that comparisons based on the well-known cumulative incidence function can be confounded by different prevalence rates of the competing events, and that comparisons of the conditional distribution of the survival time given the failure event type are more relevant for investigating the prognosis of different patterns of recurrence disease. We then propose nonparametric estimators for the conditional cumulative incidence function as well as the conditional bivariate cumulative incidence function for the bivariate gap times, that is, the time to disease recurrence and the residual lifetime after recurrence. To quantify the association between the two gap times in the competing risks setting, a modified Kendall's tau statistic is proposed. The proposed estimators for the conditional bivariate cumulative incidence distribution and the association measure account for the induced dependent censoring for the second gap time. Uniform consistency and weak convergence of the proposed estimators are established. Hypothesis testing procedures for two-sample comparisons are discussed. Numerical simulation studies with practical sample sizes are conducted to evaluate the performance of the proposed nonparametric estimators and tests. An application to data from a pancreatic cancer study is presented to illustrate the methods developed in this article.
Project description:A population average regression model is proposed to assess the marginal effects of covariates on the cumulative incidence function when there is dependence across individuals within a cluster in the competing risks setting. This method extends the Fine-Gray proportional hazards model for the subdistribution to situations, where individuals within a cluster may be correlated due to unobserved shared factors. Estimators of the regression parameters in the marginal model are developed under an independence working assumption where the correlation across individuals within a cluster is completely unspecified. The estimators are consistent and asymptotically normal, and variance estimation may be achieved without specifying the form of the dependence across individuals. A simulation study evidences that the inferential procedures perform well with realistic sample sizes. The practical utility of the methods is illustrated with data from the European Bone Marrow Transplant Registry.
Project description:We develop time-varying association analyses for onset ages of two lung infections to address the statistical challenges in utilizing registry data where onset ages are left-truncated by ages of entry and competing-risk censored by deaths. Two types of association estimators are proposed based on conditional cause-specific hazard function and cumulative incidence function that are adapted from unconditional quantities to handle left truncation. Asymptotic properties of the estimators are established by using the empirical process techniques. Our simulation study shows that the estimators perform well with moderate sample sizes. We apply our methods to the Cystic Fibrosis Foundation Registry data to study the relationship between onset ages of Pseudomonas aeruginosa and Staphylococcus aureus infections.
Project description:The cross-odds ratio is defined as the ratio of the conditional odds of the occurrence of one cause-specific event for one subject given the occurrence of the same or a different cause-specific event for another subject in the same cluster over the unconditional odds of occurrence of the cause-specific event. It is a measure of the association between the correlated cause-specific failure times within a cluster. The joint cumulative incidence function can be expressed as a function of the marginal cumulative incidence functions and the cross-odds ratio. Assuming that the marginal cumulative incidence functions follow a generalized semiparametric model, this paper studies the parametric regression modeling of the cross-odds ratio. A set of estimating equations are proposed for the unknown parameters and the asymptotic properties of the estimators are explored. Non-parametric estimation of the cross-odds ratio is also discussed. The proposed procedures are applied to the Danish twin data to model the associations between twins in their times to natural menopause and to investigate whether the association differs among monozygotic and dizygotic twins and how these associations have changed over time.
Project description:The cause of failure in cohort studies that involve competing risks is frequently incompletely observed. To address this, several methods have been proposed for the semiparametric proportional cause-specific hazards model under a missing at random assumption. However, these proposals provide inference for the regression coefficients only, and do not consider the infinite dimensional parameters, such as the covariate-specific cumulative incidence function. Nevertheless, the latter quantity is essential for risk prediction in modern medicine. In this paper we propose a unified framework for inference about both the regression coefficients of the proportional cause-specific hazards model and the covariate-specific cumulative incidence functions under missing at random cause of failure. Our approach is based on a novel computationally efficient maximum pseudo-partial-likelihood estimation method for the semiparametric proportional cause-specific hazards model. Using modern empirical process theory we derive the asymptotic properties of the proposed estimators for the regression coefficients and the covariate-specific cumulative incidence functions, and provide methodology for constructing simultaneous confidence bands for the latter. Simulation studies show that our estimators perform well even in the presence of a large fraction of missing cause of failures, and that the regression coefficient estimator can be substantially more efficient compared to the previously proposed augmented inverse probability weighting estimator. The method is applied using data from an HIV cohort study and a bladder cancer clinical trial.
Project description:Correlated survival data naturally arise from many clinical and epidemiological studies. For the analysis of such data, the Gamma-frailty proportional hazards (PH) model is a popular choice because the regression parameters have marginal interpretations and the statistical association between the failure times can be explicitly quantified via Kendall's tau. Despite their popularity, Gamma-frailty PH models for correlated interval-censored data have not received as much attention as analogous models for right-censored data. In this work, a Gamma-frailty PH model for bivariate interval-censored data is presented and an easy to implement expectation-maximization (EM) algorithm for model fitting is developed. The proposed model adopts a monotone spline representation for the purposes of approximating the unknown conditional cumulative baseline hazard functions, significantly reducing the number of unknown parameters while retaining modeling flexibility. The EM algorithm was derived from a data augmentation procedure involving latent Poisson random variables. Extensive numerical studies illustrate that the proposed method can provide reliable estimation and valid inference, and is moreover robust to the misspecification of the frailty distribution. To further illustrate its use, the proposed method is used to analyze data from an epidemiological study of sexually transmitted infections.
Project description:Interval-censored competing risks data arise when each study subject may experience an event or failure from one of several causes and the failure time is not observed directly but rather is known to lie in an interval between two examinations. We formulate the effects of possibly time-varying (external) covariates on the cumulative incidence or sub-distribution function of competing risks (i.e., the marginal probability of failure from a specific cause) through a broad class of semiparametric regression models that captures both proportional and non-proportional hazards structures for the sub-distribution. We allow each subject to have an arbitrary number of examinations and accommodate missing information on the cause of failure. We consider nonparametric maximum likelihood estimation and devise a fast and stable EM-type algorithm for its computation. We then establish the consistency, asymptotic normality, and semiparametric efficiency of the resulting estimators for the regression parameters by appealing to modern empirical process theory. In addition, we show through extensive simulation studies that the proposed methods perform well in realistic situations. Finally, we provide an application to a study on HIV-1 infection with different viral subtypes.
Project description:The proportional hazards assumption in the commonly used Cox model for censored failure time data is often violated in scientific studies. Yang and Prentice (2005) proposed a novel semiparametric two-sample model that includes the proportional hazards model and the proportional odds model as sub-models, and accommodates crossing survival curves. The model leaves the baseline hazard unspecified and the two model parameters can be interpreted as the short-term and long-term hazard ratios. Inference procedures were developed based on a pseudo score approach. Although extension to accommodate covariates was mentioned, no formal procedures have been provided or proved. Furthermore, the pseudo score approach may not be asymptotically efficient. We study the extension of the short-term and long-term hazard ratio model of Yang and Prentice (2005) to accommodate potentially time-dependent covariates. We develop efficient likelihood-based estimation and inference procedures. The nonparametric maximum likelihood estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical settings. The proposed method successfully captured the phenomenon of crossing hazards in a cancer clinical trial and identified a genetic marker with significant long-term effect missed by using the proportional hazards model on age-at-onset of alcoholism in a genetic study.
Project description:Bandeen-Roche and Liang (2002, Modelling multivariate failure time associations in the presence of a competing risk. Biometrika 89, 299-314.) tailored Oakes (1989, Bivariate survival models induced by frailties. Journal of the American Statistical Association 84, 487-493.)'s conditional hazard ratio to evaluate cause-specific associations in bivariate competing risks data. In many population-based family studies, one observes complex multivariate competing risks data, where the family sizes may be > 2, certain marginals may be exchangeable, and there may be multiple correlated relative pairs having a given pairwise association. Methods for bivariate competing risks data are inadequate in these settings. We show that the rank correlation estimator of Bandeen-Roche and Liang (2002) extends naturally to general clustered family structures. Consistency, asymptotic normality, and variance estimation are easily obtained with U-statistic theories. A natural by-product is an easily implemented test for constancy of the association over different time regions. In the Cache County Study on Memory in Aging, familial associations in dementia onset are of interest, accounting for death prior to dementia. The proposed methods using all available data suggest attenuation in dementia associations at later ages, which had been somewhat obscured in earlier analyses.
Project description:Censored quantile regression models, which offer great flexibility in assessing covariate effects on event times, have attracted considerable research interest. In this study, we consider flexible estimation and inference procedures for competing risks quantile regression, which not only provides meaningful interpretations by using cumulative incidence quantiles but also extends the conventional accelerated failure time model by relaxing some of the stringent model assumptions, such as global linearity and unconditional independence. Current method for censored quantile regressions often involves the minimization of the L1 -type convex function or solving the nonsmoothed estimating equations. This approach could lead to multiple roots in practical settings, particularly with multiple covariates. Moreover, variance estimation involves an unknown error distribution and most methods rely on computationally intensive resampling techniques such as bootstrapping. We consider the induced smoothing procedure for censored quantile regressions to the competing risks setting. The proposed procedure permits the fast and accurate computation of quantile regression parameter estimates and standard variances by using conventional numerical methods such as the Newton-Raphson algorithm. Numerical studies show that the proposed estimators perform well and the resulting inference is reliable in practical settings. The method is finally applied to data from a soft tissue sarcoma study.