Bayesian inference in time-varying additive hazards models with applications to disease mapping.
ABSTRACT: Environmental health and disease mapping studies are often concerned with evaluating the combined effect of various socio-demographic and behavioral factors and environmental exposures on times to events of interest, such as the death of individuals, organisms, or plants. In such studies, estimation of the hazard function is often of interest. In addition to known explanatory variables, the hazard function may be subject to spatial/geographical variation, such that proximally located regions may experience hazards more similar to each other than to those of distantly located regions. A popular approach for handling this type of spatially correlated time-to-event data is the Cox proportional hazards (PH) regression model with spatial frailties. However, the PH assumption poses a major practical challenge, as it entails that the effects of the various explanatory variables remain constant over time. This assumption is often unrealistic, for instance, in studies with long follow-up where the effects of some exposures on the hazard may vary drastically over time. Our goal in this paper is to offer a flexible semiparametric additive hazards (AH) model with spatial frailties. Our proposed model allows both the frailties and the regression coefficients to be time-varying, thus relaxing the proportionality assumption. Our estimation framework is Bayesian, powered by carefully tailored posterior sampling strategies via Markov chain Monte Carlo techniques. We apply the model to a dataset on prostate cancer survival from the US state of Louisiana to illustrate its advantages.
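The time-varying additive hazards structure described above can be sketched numerically: with a hazard of the form lambda(t|x) = beta0(t) + beta1(t)x1 + beta2(t)x2, the survival function follows by integrating the hazard. The coefficient functions and covariate values below are illustrative assumptions, not quantities from the paper (which estimates them from data).

```python
import numpy as np

# Illustrative time-varying additive hazards model (a sketch, not the paper's fit):
# lambda(t|x) = beta0(t) + beta1(t)*x1 + beta2(t)*x2
def hazard(t, x,
           beta0=lambda t: 0.1 + 0.02 * t,          # rising baseline hazard
           beta1=lambda t: 0.05 * np.exp(-t),       # covariate effect that decays
           beta2=lambda t: 0.01 * np.ones_like(t)): # constant covariate effect
    return beta0(t) + beta1(t) * x[0] + beta2(t) * x[1]

def survival(t_max, x, n_grid=2000):
    """S(t) = exp(-integral_0^t lambda(u|x) du), via trapezoidal quadrature."""
    t = np.linspace(0.0, t_max, n_grid)
    lam = hazard(t, x)
    cum_haz = np.concatenate([[0.0],
                              np.cumsum((lam[1:] + lam[:-1]) / 2 * np.diff(t))])
    return t, np.exp(-cum_haz)

t, S = survival(5.0, x=np.array([1.0, 2.0]))
```

A time-varying regional frailty would enter the same way, as an additional time-dependent intercept added to beta0(t).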
Project description:This article develops a Bayesian semiparametric approach to the extended hazard model, with generalization to high-dimensional spatially grouped data. County-level spatial correlation is accommodated marginally through the normal transformation model of Li and Lin (2006, Journal of the American Statistical Association 101, 591-603), using a correlation structure implied by an intrinsic conditionally autoregressive prior. Efficient Markov chain Monte Carlo algorithms are developed, especially applicable to fitting very large, highly censored areal survival data sets. Per-variable tests for proportional hazards, accelerated failure time, and accelerated hazards are efficiently carried out with and without spatial correlation through Bayes factors. The resulting reduced, interpretable spatial models can fit significantly better than a standard additive Cox model with spatial frailties.
Project description:Meta-analysis of time-to-event outcomes using the hazard ratio as a treatment effect measure has an underlying assumption that hazards are proportional. The between-arm difference in the restricted mean survival time is a measure that avoids this assumption and allows the treatment effect to vary with time. We describe and evaluate meta-analysis based on the restricted mean survival time for dealing with non-proportional hazards, and we present a diagnostic method for the overall proportional hazards assumption. The methods are illustrated with application to two individual participant data meta-analyses in cancer. The examples were chosen because they differ in disease severity and in the patterns of follow-up, in order to understand the potential impacts on the hazards and the overall effect estimates. We further investigate estimation methods for the restricted mean survival time in a simulation study.
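The restricted mean survival time (RMST) underlying the approach above is the area under the survival curve up to a horizon tau; for a Kaplan-Meier-type step function the integral reduces to a finite sum. The curve values below are made up for illustration, and the function name is ours:

```python
import numpy as np

def rmst(times, surv, tau):
    """Area under a right-continuous step survival curve up to tau.

    times: increasing event times where S(t) drops;
    surv:  value of S(t) just after each drop.
    """
    t = np.concatenate([[0.0], times, [tau]])
    s = np.concatenate([[1.0], surv])       # S(t) on each interval [t_k, t_{k+1})
    t = np.clip(t, 0.0, tau)                # intervals beyond tau get zero width
    return float(np.sum(s * np.diff(t)))

# Illustrative curve: drops at t = 1, 2, 4 down to S = 0.8, 0.6, 0.3.
times = np.array([1.0, 2.0, 4.0])
surv = np.array([0.8, 0.6, 0.3])
area = rmst(times, surv, tau=5.0)
```

The between-arm treatment effect is then the difference of two such areas, computed with a common tau, with no proportional hazards assumption involved.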
Project description:The proportional hazards assumption in the commonly used Cox model for censored failure time data is often violated in scientific studies. Yang and Prentice (2005) proposed a novel semiparametric two-sample model that includes the proportional hazards model and the proportional odds model as sub-models and accommodates crossing survival curves. The model leaves the baseline hazard unspecified, and the two model parameters can be interpreted as the short-term and long-term hazard ratios. Inference procedures were developed based on a pseudo-score approach. Although extension to accommodate covariates was mentioned, no formal procedures have been provided or proved. Furthermore, the pseudo-score approach may not be asymptotically efficient. We study the extension of the short-term and long-term hazard ratio model of Yang and Prentice (2005) to accommodate potentially time-dependent covariates. We develop efficient likelihood-based estimation and inference procedures. The nonparametric maximum likelihood estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Extensive simulation studies demonstrate that the proposed methods perform well in practical settings. The proposed method successfully captured the phenomenon of crossing hazards in a cancer clinical trial and, in a genetic study of age at onset of alcoholism, identified a genetic marker with a significant long-term effect that was missed by the proportional hazards model.
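The short-term/long-term hazard ratio structure can be illustrated numerically: the hazard ratio moves from theta1 near t = 0 to theta2 as t grows, interpolated through the baseline distribution function. The interpolating form and exponential baseline below are our illustrative reading of that structure, not necessarily the exact Yang-Prentice parameterization:

```python
import numpy as np

# Sketch: a hazard ratio that equals theta1 at t = 0 (F0 = 0) and tends to
# theta2 as t -> infinity (F0 -> 1).  Illustrative only.
def hazard_ratio(t, theta1, theta2, baseline_cdf):
    F0 = baseline_cdf(t)
    return theta1 * theta2 / (theta2 * (1.0 - F0) + theta1 * F0)

exp_cdf = lambda t: 1.0 - np.exp(-t)   # illustrative exponential baseline
t = np.linspace(0.0, 10.0, 101)
hr = hazard_ratio(t, theta1=0.5, theta2=2.0, baseline_cdf=exp_cdf)
```

With theta1 < 1 < theta2 the hazard ratio crosses 1, which is exactly the crossing-hazards phenomenon a single Cox hazard ratio cannot represent.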
Project description:BACKGROUND: Randomized controlled trials almost invariably use the hazard ratio (HR) calculated with a Cox proportional hazards model as a treatment efficacy measure. Despite the widespread adoption of HRs, these provide a limited understanding of the treatment effect and may even provide a biased estimate when the assumption of proportional hazards in the Cox model is not verified by the trial data. Additional treatment effect measures on the survival probability or the time scale may be used to supplement HRs, but a framework for the simultaneous generation of these measures is lacking. METHODS: By splitting follow-up time at the nodes of a Gauss-Lobatto numerical quadrature rule, techniques for Poisson Generalized Additive Models (PGAM) can be adopted for flexible hazard modeling. Straightforward simulation post-estimation transforms PGAM estimates for the log hazard into estimates of the survival function. These in turn can be used to calculate relative and absolute risks, or even differences in restricted mean survival time, between treatment arms. We illustrate our approach with extensive simulations and in two trials: IPASS (in which the proportionality of hazards was violated) and HEMO (a long-duration study conducted under evolving standards of care on a heterogeneous patient population). FINDINGS: PGAMs can generate estimates of the survival function and the hazard ratio that are essentially identical to those obtained by Kaplan-Meier curve analysis and the Cox model. PGAMs can simultaneously provide multiple measures of treatment efficacy after a single data pass. Furthermore, they supported not only unadjusted (overall treatment effect) analyses but also subgroup and adjusted analyses, while incorporating multiple time scales and accounting for non-proportional hazards in survival data.
CONCLUSIONS:By augmenting the HR conventionally reported, PGAMs have the potential to support the inferential goals of multiple stakeholders involved in the evaluation and appraisal of clinical trial results under proportional and non-proportional hazards.
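The follow-up-splitting step behind the PGAM approach can be illustrated in a few lines: each subject's follow-up is cut at a set of nodes, yielding Poisson pseudo-observations with an event indicator and a log-exposure-time offset, to which a Poisson GAM for the log hazard can then be fitted. This sketch uses caller-supplied cut points rather than the Gauss-Lobatto quadrature nodes of the paper, and the function name is ours:

```python
import numpy as np

def split_followup(time, event, nodes):
    """Split one subject's follow-up at the given nodes.

    Returns (start, stop, died, offset): one pseudo-observation per interval,
    with the event (if any) placed in the last interval and the Poisson
    offset log(stop - start) giving the exposure time.
    """
    cuts = [n for n in nodes if n < time] + [time]
    start = np.array([0.0] + cuts[:-1])
    stop = np.array(cuts)
    died = np.zeros(len(cuts))
    died[-1] = float(event)
    return start, stop, died, np.log(stop - start)

# Subject observed for 2.5 time units with an event, split at nodes 1, 2, 3.
start, stop, died, offset = split_followup(time=2.5, event=1, nodes=[1.0, 2.0, 3.0])
```

Stacking these rows over subjects, a Poisson regression of `died` on smooth functions of `stop` (and covariates) with `offset` as the offset estimates the log hazard, which is the device the abstract describes.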
Project description:BACKGROUND: When boosting algorithms are used for building survival models from high-dimensional data, it is common to fit a Cox proportional hazards model or to use least squares techniques for fitting semiparametric accelerated failure time models. There are cases, however, where fitting a fully parametric accelerated failure time model is a good alternative to these methods, especially when the proportional hazards assumption is not justified. Boosting algorithms for the estimation of parametric accelerated failure time models have not been developed so far, since these models require the estimation of a model-specific scale parameter which traditional boosting algorithms are not able to deal with. RESULTS: We introduce a new boosting algorithm for censored time-to-event data which is suitable for fitting parametric accelerated failure time models. Estimation of the predictor function is carried out simultaneously with the estimation of the scale parameter, so that the negative log likelihood of the survival distribution can be used as a loss function for the boosting algorithm. The estimation of the scale parameter does not affect the favorable properties of boosting with respect to variable selection. CONCLUSION: The analysis of a high-dimensional set of microarray data demonstrates that the new algorithm is able to outperform boosting with the Cox partial likelihood when the proportional hazards assumption is questionable. In low-dimensional settings, i.e., when classical likelihood estimation of a parametric accelerated failure time model is possible, simulations show that the new boosting algorithm closely approximates the estimates obtained from the maximum likelihood method.
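The central idea above, updating the predictor and the scale parameter together against the censored likelihood, can be sketched in a toy form: component-wise least-squares boosting of the predictor f for a log-normal AFT model alternates with a crude grid update of the scale sigma, both driven by the censored negative log-likelihood. The simulated data, variable names, and the simple scale search are our illustrative choices, not the authors' algorithm:

```python
import math
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 5
X = rng.standard_normal((n, p))
beta_true = np.array([1.0, -0.5, 0.0, 0.0, 0.0])   # sparse truth
logT = X @ beta_true + 0.5 * rng.standard_normal(n)
logC = X @ beta_true + rng.standard_normal(n) + 0.5
y = np.minimum(logT, logC)                         # observed log time
delta = (logT <= logC).astype(float)               # 1 = event, 0 = censored

SQRT2PI = math.sqrt(2.0 * math.pi)
def norm_sf(z):                                    # standard normal P(Z > z)
    return 0.5 * np.array([math.erfc(v / math.sqrt(2.0)) for v in z])

def nll(f, sigma):                                 # censored neg. log-likelihood
    z = (y - f) / sigma
    unc = delta * (math.log(sigma) + 0.5 * z ** 2)
    cen = (1.0 - delta) * (-np.log(np.clip(norm_sf(z), 1e-300, None)))
    return float(np.sum(unc + cen))

f, sigma, nu = np.zeros(n), 1.0, 0.1
start_nll = nll(f, sigma)
for _ in range(150):
    z = (y - f) / sigma
    sf = np.clip(norm_sf(z), 1e-300, None)
    phi = np.exp(-0.5 * z ** 2) / SQRT2PI
    u = delta * z / sigma + (1.0 - delta) * phi / (sigma * sf)  # -grad wrt f
    coefs = (X.T @ u) / np.sum(X ** 2, axis=0)     # per-covariate LS fits
    j = int(np.argmin([np.sum((u - coefs[k] * X[:, k]) ** 2) for k in range(p)]))
    f = f + nu * coefs[j] * X[:, j]                # best single-covariate update
    grid = sigma * np.array([0.9, 0.95, 1.0, 1.05, 1.1])
    sigma = float(grid[np.argmin([nll(f, s) for s in grid])])
```

The component-wise step preserves the variable-selection behavior of boosting (each iteration touches one covariate), while the interleaved sigma update is a stand-in for the model-specific scale estimation the abstract refers to.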
Project description:Right-truncated data arise when observations are ascertained retrospectively, and only subjects who experience the event of interest by the time of sampling are selected. Such a selection scheme, without adjustment, leads to biased estimation of covariate effects in the Cox proportional hazards model. The existing methods for fitting the Cox model to right-truncated data, which are based on the maximization of the likelihood or solving estimating equations with respect to both the baseline hazard function and the covariate effects, are numerically challenging. We consider two alternative simple methods based on inverse probability weighting (IPW) estimating equations, which allow consistent estimation of covariate effects under a positivity assumption and avoid estimation of baseline hazards. We discuss problems of identifiability and consistency that arise when positivity does not hold and show that although the partial tests for null effects based on these IPW methods can be used in some settings even in the absence of positivity, they are not valid in general. We propose adjusted estimating equations that incorporate the probability of observation when it is known from external sources, which results in consistent estimation. We compare the methods in simulations and apply them to the analyses of human immunodeficiency virus latency.
Project description:BACKGROUND: Most clinical trials with time-to-event primary outcomes are designed assuming constant event rates and proportional hazards over time. Non-constant event rates and non-proportional hazards are seen increasingly frequently in trials. The objectives of this review were, firstly, to identify whether non-constant event rates and time-dependent treatment effects were allowed for in the sample size calculations of trials, and, secondly, to assess the methods used for the analysis and reporting of time-to-event outcomes, including how researchers accounted for non-proportional treatment effects. METHODS: We reviewed all original reports published between January and June 2017 in four high-impact medical journals for trials in which the primary outcome involved time-to-event analysis. We recorded the methods used to analyse and present the main outcomes of the trial and assessed the reporting of the assumptions underlying these methods. The sample size calculation was reviewed to see whether the effect of either non-constant hazard rates or anticipated non-proportionality of the treatment effect was allowed for during trial design. RESULTS: From 446 original reports we identified 66 trials with a time-to-event primary outcome, encompassing trial start dates from July 1995 to November 2014. The majority of these trials (73%) had sample size calculations that used standard formulae, with a minority of trials (11%) using simulation for anticipated changing event rates and/or non-proportional hazards. Well-established analytical methods, Kaplan-Meier curves (98%), the log-rank test (88%), and the Cox proportional hazards model (97%), were used almost exclusively for the main outcome. Parametric regression models were considered in 11% of the reports.
Of the trials reporting inference from the Cox model, only 11% reported any results of testing the proportional hazards assumption. CONCLUSIONS: Our review confirmed that, when designing trials with time-to-event primary outcomes, methodologies assuming constant event rates and proportional hazards predominated, despite the potential savings in required sample size, or gains in power, achievable with alternative methods. The Cox proportional hazards model was used almost exclusively to present inferential results, yet testing and reporting of the pivotal assumption underpinning this estimation method was lacking.
Project description:The effect of an exposure on survival can be biased when the regression model is misspecified. The hazard difference is easier to use in risk assessment than the hazard ratio and has a clearer interpretation in the assessment of effect modification. We proposed two doubly robust additive hazards models to estimate the causal hazard difference of a continuous exposure on survival. The first model is an inverse probability-weighted additive hazards regression. The second model is an extension of the doubly robust estimator for binary exposures, obtained by categorizing the continuous exposure. We compared these with the marginal structural model and outcome regression under correct and incorrect model specifications using simulations. We applied the doubly robust additive hazards models to estimate the hazard difference of long-term exposure to PM2.5 (particulate matter with an aerodynamic diameter less than or equal to 2.5 microns) on survival using a large cohort of 13 million older adults residing in seven states of the Southeastern United States. We showed that the proposed approaches are doubly robust. We found that each 1 μg/m³ increase in annual PM2.5 exposure was associated with a causal hazard difference in mortality of 8.0 × 10 (95% confidence interval 7.4 × 10, 8.7 × 10), which was modified by age, medical history, socioeconomic status, and urbanicity. The overall hazard difference translates to approximately 5.5 (5.1, 6.0) thousand deaths per year in the study population. The proposed approaches improve the robustness of the additive hazards model and produce a novel additive causal estimate of the effect of PM2.5 on survival, together with several additive effect modifications, including social inequality.
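The hazard-difference target can be made concrete with the unweighted least-squares (Lin-Ying-type) estimator of a constant beta in lambda(t|x) = lambda0(t) + beta*x, which is the kind of building block the proposed IPW and doubly robust estimators modify. The simulation and variable names are ours:

```python
import numpy as np

rng = np.random.default_rng(2)
n, beta_true, lam0, tau = 5000, 0.4, 0.5, 2.0
x = rng.integers(0, 2, n).astype(float)            # binary exposure
T = rng.exponential(1.0 / (lam0 + beta_true * x))  # additive hazard
time = np.minimum(T, tau)                          # administrative censoring
status = (T <= tau).astype(float)

order = np.argsort(time)
t, d, xs = time[order], status[order], x[order]

# Suffix sums over the risk set {k, ..., n-1} just before each exit time.
s1 = np.cumsum(xs[::-1])[::-1]                     # sum of x over risk set
s2 = np.cumsum((xs ** 2)[::-1])[::-1]              # sum of x^2 over risk set
nk = np.arange(n, 0, -1).astype(float)             # risk-set sizes

dt = np.diff(np.concatenate([[0.0], t]))
A = np.sum(dt * (s2 - s1 ** 2 / nk))               # integrated covariate variance
B = np.sum(d * (xs - s1 / nk))                     # score summed over event times
beta_hat = float(B / A)                            # estimated hazard difference
```

The estimate is on the hazard scale directly (extra events per unit time per unit exposure), which is what makes the attributable-deaths translation in the abstract possible.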
Project description:We propose a Bayesian semiparametric joint regression model for a recurrent event process and survival time. Assuming independent latent subject frailties, we define marginal models for the recurrent event process intensity and survival distribution as functions of the subject's frailty and baseline covariates. A robust Bayesian model, called Joint-DP, is obtained by assuming a Dirichlet process for the frailty distribution. We present a simulation study that compares posterior estimates under the Joint-DP model to a Bayesian joint model with lognormal frailties, a frequentist joint model, and marginal models for either the recurrent event process or survival time. The simulations show that the Joint-DP model does a good job of correcting for treatment assignment bias, and has favorable estimation reliability and accuracy compared with the alternative models. The Joint-DP model is applied to analyze an observational dataset from esophageal cancer patients treated with chemo-radiation, including the times of recurrent effusions of fluid to the heart or lungs, survival time, prognostic covariates, and radiation therapy modality.
Project description:Survival analysis is used in the medical field to identify the effect of predictive variables on the time to a specific event. Generally, not all variation in survival time can be explained by observed covariates. The effect of unobserved variables on the risk of a patient is called frailty. In multicenter studies, the unobserved center effect can induce frailty on its patients, which can lead to selection bias over time when ignored. For this reason, it is common practice in multicenter studies to include a random frailty term modeling the center effect. In a more complex event structure, more than one type of event is possible. Independent frailty variables representing the center effect can be incorporated in the model for each competing event. However, in the medical context, events representing disease progression are likely related, and correlation is missed when frailties are assumed to be independent. In this work, an additive gamma frailty model is proposed to account for correlation between frailties at the center level in a competing risks model. Correlation indicates a common center effect on both events and measures how closely the risks are related. Estimation of the model using the expectation-maximization algorithm is illustrated. The model is applied to a data set from a multicenter clinical trial on breast cancer from the European Organisation for Research and Treatment of Cancer (EORTC trial 10854). Hospitals are compared by employing empirical Bayes estimation methodology together with corresponding confidence intervals.
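The additive gamma construction behind correlated frailties can be sketched directly: the two competing-event frailties share a common gamma component, and the shared shape parameter controls the correlation. The shape and scale values below are illustrative choices, not the fitted EORTC values:

```python
import numpy as np

rng = np.random.default_rng(3)

# Additive gamma frailties: Z1 = Y0 + Y1, Z2 = Y0 + Y2, with Y0, Y1, Y2
# independent gammas on a common scale.  With shared shape k0 and
# event-specific shape k, corr(Z1, Z2) = k0 / (k0 + k).
k0, k, scale, n = 2.0, 1.0, 0.5, 500_000
Y0 = rng.gamma(k0, scale, n)                 # common center component
Z1 = Y0 + rng.gamma(k, scale, n)             # frailty for event 1
Z2 = Y0 + rng.gamma(k, scale, n)             # frailty for event 2

emp_corr = float(np.corrcoef(Z1, Z2)[0, 1])
theo_corr = k0 / (k0 + k)                    # = 2/3 with these shapes
```

Setting k0 = 0 recovers independent frailties, so the shared-component shape directly quantifies how strongly the center affects both competing events.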