Nonproportional hazards and unobserved heterogeneity in clustered survival data: When can we tell the difference?
ABSTRACT: Multivariate survival data are frequently encountered in biomedical applications in the form of clustered failures (or recurrent events data). A popular way of analyzing such data is by using shared frailty models, which assume that the proportional hazards assumption holds conditional on an unobserved cluster-specific random effect. Such models are often incorporated in more complicated joint models in survival analysis. If the random effect distribution has finite expectation, then the conditional proportional hazards assumption does not carry over to the marginal models. It has been shown that, for univariate data, this makes it impossible to distinguish between the presence of unobserved heterogeneity (eg, due to missing covariates) and marginal nonproportional hazards. We show that time-dependent covariate effects may falsely appear as evidence in favor of a frailty model also in the case of clustered failures or recurrent events data, when the cluster size or number of recurrent events is small. When true unobserved heterogeneity is present, the presence of nonproportional hazards leads to overestimating the frailty effect. We show that this phenomenon is somewhat mitigated as the cluster size grows. We carry out a simulation study to assess the behavior of test statistics and estimators for frailty models in such contexts. The gamma, inverse Gaussian, and positive stable shared frailty models are contrasted using a novel software implementation for estimating semiparametric shared frailty models. Two main questions are addressed in the contexts of clustered failures and recurrent events: whether covariates with a time-dependent effect may appear as indication of unobserved heterogeneity and whether the additional presence of unobserved heterogeneity can be detected in this case. Finally, the practical implications are illustrated in a real-world data analysis example.
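The loss of marginal proportionality described above can be seen in a small numerical sketch. It assumes a gamma frailty with mean 1 and variance `theta`, a constant baseline hazard, and a single binary covariate; the function name and parameters are illustrative and are not taken from the software used in the paper. Under these assumptions the marginal hazard is λ0(t)·exp(βx) / (1 + θ·Λ0(t)·exp(βx)), so the marginal hazard ratio starts at exp(β) and shrinks toward 1 over time.

```python
import math


def marginal_hazard_ratio(t, beta, theta, base_rate=1.0):
    """Population-level hazard ratio at time t for x=1 versus x=0.

    Assumes (for illustration only) a gamma frailty with mean 1 and
    variance theta, and a constant baseline hazard base_rate.
    """
    cum_base = base_rate * t  # cumulative baseline hazard Lambda_0(t)
    numerator = math.exp(beta) * (1.0 + theta * cum_base)
    denominator = 1.0 + theta * cum_base * math.exp(beta)
    return numerator / denominator


if __name__ == "__main__":
    # Conditional hazard ratio is 2 at all times; the marginal one is not.
    for t in (0.0, 0.5, 2.0, 10.0):
        hr = marginal_hazard_ratio(t, beta=math.log(2.0), theta=1.0)
        print(f"t={t:5.1f}  marginal HR={hr:.3f}")
```

With `theta=0` (no heterogeneity) the ratio stays at exp(β), which is why a time-varying marginal effect and a frailty can be hard to tell apart from univariate data.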
Project description:Neglecting the presence of unobserved heterogeneity in survival analysis models has been shown to potentially lead to underestimating the effect of the covariates included in the analysis. This study aimed to investigate the role of unobserved heterogeneity of frailty on the estimation of mortality differentials from age 50 on by education level. Longitudinal mortality follow-up of the census-based Turin population linked with the city registry office. Italian North-Western city of Turin, observation window 1971-2007. 391 170 men and 456 216 women followed from age 50. Mortality rate ratios obtained from survival analysis regression. Models were estimated with and without the component of unobserved heterogeneity of frailty and controlling for mortality improvement over time from both cohort and period perspectives. In the majority of cases, the models without frailty estimated a smaller educational gradient than the models with frailty. The results draw attention to the potential underestimation of mortality inequalities by socioeconomic level in survival analysis models that do not control for unobserved heterogeneity of frailty.
Project description:OBJECTIVES: To investigate the association between recurrent AIDS-defining events and a semicompeting risk of death in patients with advanced, multidrug-resistant human immunodeficiency virus infection and to identify individuals at increased risk for these events using a joint frailty model. STUDY DESIGN AND SETTING: Three hundred sixty-eight patients with antiretroviral treatment failure in the Options in Management of Antiretrovirals Trial were randomized to two antiretroviral treatment strategies using a 2 × 2 factorial design, intensive vs. standard and interruption vs. continuation, and followed for development of AIDS-defining events and death. RESULTS: Participants were heterogeneous for risk of AIDS-defining events and death (P < 0.001), and AIDS-defining events were strongly associated with death (P < 0.001), irrespective of treatment. The frailty model was used to classify individuals into high- and low-risk groups based on unobserved heterogeneity. Low-risk individuals were unlikely to die (0%) or have an AIDS-defining event (<4%), whereas high-risk individuals had event rates approaching 70%. About one-third of high-risk individuals had accelerated mortality, all of whom died before experiencing an AIDS-defining event. High-risk status was associated with being immunocompromised and higher predicted 5-year mortality. CONCLUSION: The joint frailty model permits classification of individuals into risk groups based on unobserved heterogeneity that may be identifiable based on observed covariates, providing advantages over the traditional Cox model.
Project description:Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates a complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.
Project description:Theoretical models of mortality selection have great utility in explaining otherwise puzzling phenomena. The most famous example may be the Black-White mortality crossover: at old ages, Blacks outlive Whites, presumably because few frail Blacks survive to old ages while some frail Whites do. Yet theoretical models of unidimensional heterogeneity, or frailty, do not speak to the most common empirical situation for mortality researchers: the case in which some important population heterogeneity is observed and some is not. I show that, when one dimension of heterogeneity is observed and another is unobserved, neither the observed nor the unobserved dimension need behave as classic frailty models predict. For example, in a multidimensional model, mortality selection can increase the proportion of survivors who are disadvantaged, or "frail," and can lead Black survivors to be more frail than Whites, along some dimensions of disadvantage. Transferring theoretical results about unidimensional heterogeneity to settings with both observed and unobserved heterogeneity produces misleading inferences about mortality disparities. The unusually flexible behavior of individual dimensions of multidimensional heterogeneity creates previously unrecognized challenges for empirically testing selection models of disparities, such as models of mortality crossovers.
Project description:We propose semiparametrically efficient estimation of a broad class of transformation regression models for nonproportional hazards data. Classical transformation models are viewed from a frailty model paradigm, and the proposed method provides a unified approach that is valid for both continuous and discrete frailty models. The proposed models are shown to be flexible enough to model long-term follow-up survival data when the treatment effect diminishes over time, a case for which the PH or proportional odds assumption is violated, or a situation in which a substantial proportion of patients remains cured after treatment. Estimation of the link parameter in the frailty distribution, considered to be unknown and possibly dependent on time-independent covariates, is automatically included in the proposed methods. The observed information matrix is computed to evaluate the variances of all the parameter estimates. Our likelihood-based approach provides a natural way to construct simple statistics for testing the PH and proportional odds assumptions for usual survival data or testing the short- and long-term effects for survival data with a cure fraction. Simulation studies demonstrate that the proposed inference procedures perform well in realistic settings. Applications to two medical studies are provided.
Project description:We investigate different primary efficacy analysis approaches for a 2-armed randomized clinical trial when interest is focused on a time to event primary outcome that is subject to a competing risk. We extend the work of Freidlin and Korn (2005) by considering estimation as well as testing and by simulating the primary and competing events' times from both a cause-specific hazards model and a joint subdistribution-cause-specific hazards model. We show that the cumulative incidence function can provide useful prognostic information for a particular patient but is not advisable for the primary efficacy analysis. Instead, it is preferable to fit a Cox model for the primary event which treats the competing event as independent censoring. This is reasonably robust for controlling type I error and treatment effect bias with respect to the true primary and competing events' cause-specific hazards model, even when there is a shared, moderately prognostic, unobserved baseline frailty for the primary and competing events in that model. However, when it is plausible that a strongly prognostic frailty exists, combining the primary and competing events into a composite event should be considered. Finally, when there is an a priori interest in having both the primary and competing events in the primary analysis, we compare a bivariate approach for establishing overall treatment efficacy to the composite event approach. The ideas are illustrated by analyzing the Women's Health Initiative clinical trials sponsored by the National Heart, Lung, and Blood Institute.
Project description:Panel count data arise when the number of recurrent events experienced by each subject is observed intermittently at discrete examination times. The examination time process can be informative about the underlying recurrent event process even after conditioning on covariates. We consider a semiparametric accelerated mean model for the recurrent event process and allow the two processes to be correlated through a shared frailty. The regression parameters have a simple marginal interpretation of modifying the time scale of the cumulative mean function of the event process. A novel estimation procedure for the regression parameters and the baseline rate function is proposed based on a conditioning technique. In contrast to existing methods, the proposed method is robust in the sense that it requires neither the strong Poisson-type assumption for the underlying recurrent event process nor a parametric assumption on the distribution of the unobserved frailty. Moreover, the distribution of the examination time process is left unspecified, allowing for arbitrary dependence between the two processes. Asymptotic consistency of the estimator is established, and the variance of the estimator is estimated by a model-based smoothed bootstrap procedure. Numerical studies demonstrated that the proposed point estimator and variance estimator perform well with practical sample sizes. The methods are applied to data from a skin cancer chemoprevention trial.
Project description:Hospitalization events exact a substantial toll across the age spectrum. Frailty is associated with all-cause hospitalization among HIV-uninfected adults aged 65 years and older. Limited data exist on the frailty relationship to hospitalization among HIV-infected persons or those aged less than 65 years. Comparative investigation of the frailty relationship to specific classes of hospitalizations has rarely been reported among adults of any age. This study sought to determine the frailty relationship to three distinct classes of hospitalization events among HIV-infected persons and their uninfected counterparts. Frailty was ascertained semiannually among persons with prior injection drug use using the five Fried phenotypic domains. Hospitalization events were categorized using Agency for Healthcare Research and Quality clinical classification software into chronic, infectious, and nonchronic, noninfectious conditions. Cox proportional hazards models were used to examine the frailty relationship to time to first hospitalization event. Among 1,303 subjects, mean age was 48 years; 32% were HIV-infected. Adjusting for sociodemographics, comorbidity, substance use, and HIV disease stage, time-updated frailty status was associated with risk for all hospitalization classes. Baseline frailty was significantly associated with all-cause (hazard ratio [HR] 1.41; 95% confidence interval [CI], 1.06, 1.87), chronic (HR 2.13; 95% CI, 1.46, 3.11), and infectious disease hospitalization (HR 2.51; 95% CI, 1.60, 3.91) but not with nonchronic, noninfectious hospitalization risk (HR 1.09; 95% CI, 0.74, 1.61). The frailty phenotype predicts vulnerability to chronic and infectious disease-related hospitalization. Frailty-targeted interventions may mitigate the substantial burden of infectious and chronic disease-related morbidity and health care utilization in HIV-infected and uninfected populations.
Project description:BACKGROUND: Recurrent events data analysis is common in biomedicine. A literature review indicates that most statistical models used for such data are based on time to the first event or treat events within a subject as independent. Even when the non-independence of recurrent events within subjects is taken into account, data analyses are mostly done with continuous risk interval models, which may not be appropriate for treatments with sustained effects (e.g., drug treatments of malaria patients). Furthermore, results can be biased when a confounding factor implies different risk exposure, e.g., in malaria transmission, when subjects live in zones whose environmental factors imply different risk exposures. METHODS: This work aimed to compare four different approaches by analysing recurrent malaria episodes from a clinical trial assessing the effectiveness of three malaria treatments [artesunate + amodiaquine (AS + AQ), artesunate + sulphadoxine-pyrimethamine (AS + SP) or artemether-lumefantrine (AL)], with continuous and discontinuous risk intervals: Andersen-Gill counting process (AG-CP), Prentice-Williams-Peterson counting process (PWP-CP), a shared gamma frailty model, and a Generalized Estimating Equations (GEE) model using the Poisson distribution. Simulations were also performed to analyse the impact of adding a confounding factor on malaria recurrent episodes. RESULTS: Using the discontinuous interval analysis, the AG-CP and shared gamma frailty models provided similar estimates of the treatment effect on malaria recurrent episodes when adjusted for age category. Patients had a significantly decreased risk of recurrent malaria episodes in the AS + AQ and AS + SP arms compared with the AL arm; relative risks were 0.75 (95% confidence interval [CI]: 0.62-0.89) and 0.74 (95% CI: 0.62-0.88), respectively, for the AG-CP model and 0.76 (95% CI: 0.64-0.89) and 0.74 (95% CI: 0.62-0.87) for the shared gamma frailty model. With both discontinuous and continuous risk interval analyses, the GEE Poisson models failed to detect the effect of the AS + AQ arm compared with the AL arm when adjusted for age category. The discontinuous risk interval analysis was found to be the more appropriate approach. CONCLUSION: Repeated events in infectious diseases such as malaria can be analysed with existing models that account for the correlation between multiple events within subjects, using common statistical software packages, after properly setting up the data structures.
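The data set-up step mentioned in this abstract can be sketched as follows. The example builds counting-process rows `(start, stop, event)` for one subject, with either continuous or discontinuous risk intervals. The function name and the `protected_days` parameter are hypothetical, and the abstract does not state how the trial defined the period during which a treated patient was not at risk, so the 14-day value below is purely illustrative.

```python
def counting_process_rows(event_times, follow_up_end, protected_days=0.0):
    """Build (start, stop, event) risk intervals for one subject.

    protected_days=0 gives continuous risk intervals: the next interval
    starts at the previous event time. A positive value removes that many
    days after each event from the risk set (discontinuous intervals).
    Events that fall inside a not-at-risk window are ignored here, which is
    a simplifying assumption of this sketch.
    """
    rows = []
    start = 0.0
    for t in sorted(event_times):
        if t > follow_up_end:
            break
        if t <= start:
            continue  # subject was not at risk at this time
        rows.append((start, t, 1))
        start = t + protected_days
    if start < follow_up_end:
        rows.append((start, follow_up_end, 0))  # censored final interval
    return rows


if __name__ == "__main__":
    episodes = [30, 90]  # days of malaria episodes (made-up values)
    print(counting_process_rows(episodes, follow_up_end=180))
    print(counting_process_rows(episodes, follow_up_end=180, protected_days=14))
```

Rows of this shape are the usual input for counting-process fits, for example `Surv(start, stop, event)` in R's survival package.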
Project description:Survival analysis is used in the medical field to identify the effect of predictive variables on time to a specific event. Generally, not all variation of survival time can be explained by observed covariates. The effect of unobserved variables on the risk of a patient is called frailty. In multicenter studies, the unobserved center effect can induce frailty on its patients, which can lead to selection bias over time when ignored. For this reason, it is common practice in multicenter studies to include a random frailty term modeling center effect. In a more complex event structure, more than one type of event is possible. Independent frailty variables representing center effect can be incorporated in the model for each competing event. However, in the medical context, events representing disease progression are likely related and correlation is missed when assuming frailties to be independent. In this work, an additive gamma frailty model to account for correlation between frailties in a competing risks model is proposed, to model frailties at center level. Correlation indicates a common center effect on both events and measures how closely the risks are related. Estimation of the model using the expectation-maximization algorithm is illustrated. The model is applied to a data set from a multicenter clinical trial on breast cancer from the European Organisation for Research and Treatment of Cancer (EORTC trial 10854). Hospitals are compared by employing empirical Bayes estimates methodology together with corresponding confidence intervals.
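One common way to obtain correlated gamma frailties by addition is sketched below; the parametrization used in the paper may differ. Each center gets a shared gamma component plus one independent gamma component per event type, so both frailties have mean 1 and variance `theta`, and their correlation is `rho`. The function names and parameters are illustrative assumptions.

```python
import random


def draw_center_frailties(theta, rho, rng):
    """Draw one pair of correlated gamma frailties for a center.

    Assumed construction: Z_k = Y_0 + Y_k with independent gamma pieces
    sharing the scale theta, Y_0 with shape rho/theta and Y_k with shape
    (1 - rho)/theta. Requires theta > 0 and 0 < rho < 1.
    """
    shared = rng.gammavariate(rho / theta, theta)
    own_1 = rng.gammavariate((1.0 - rho) / theta, theta)
    own_2 = rng.gammavariate((1.0 - rho) / theta, theta)
    return shared + own_1, shared + own_2


def sample_correlation(pairs):
    """Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / (var_x * var_y) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [draw_center_frailties(theta=0.5, rho=0.6, rng=rng) for _ in range(20000)]
    print("sample correlation:", sample_correlation(draws))
```

In a competing risks model, the first frailty would multiply the hazard of one event type and the second the hazard of the other, so `rho` measures how closely the center effects on the two events are related.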