Project description:Composite endpoints are commonly used in clinical trials, and time-to-first-event analysis has been the usual standard. Time-to-first-event analysis treats all components of the composite endpoint as having equal severity and is heavily influenced by short-term components. Over the last decade, novel statistical approaches have been introduced to overcome these limitations. We reviewed win ratio analysis, competing risk regression, negative binomial regression, Andersen-Gill regression, and weighted composite endpoint (WCE) analysis. Each method has both advantages and limitations. The advantage of win ratio and WCE analyses is that they take event severity into account by assigning weights to each component of the composite endpoint. These weights should be pre-specified because they strongly influence treatment effect estimates. Negative binomial regression and Andersen-Gill analyses consider all events for each patient, rather than only the first event, and tend to have more statistical power than time-to-first-event analysis. Pre-specified novel statistical methods may enhance our understanding of novel therapies when components vary substantially in severity and timing. These methods consider the specific types of patients, drugs, devices, events, and follow-up duration.
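As an illustration of how a win ratio analysis takes event severity into account, the following is a minimal sketch of an unmatched win ratio for a two-level hierarchy (death first, then a non-fatal event). The data layout, the names Patient, compare_component and win_ratio, and the simplified handling of censoring (a pair is decided on a component only if an event occurs within the pair's shared follow-up) are assumptions made for this example, not details taken from the review.

```python
from typing import List, NamedTuple, Optional, Tuple


class Patient(NamedTuple):
    follow_up: float            # observed follow-up time
    death: Optional[float]      # time of death, None if not observed
    nonfatal: Optional[float]   # time of first non-fatal event, None if not observed


def compare_component(t_a: Optional[float], t_b: Optional[float], horizon: float) -> int:
    """Return +1 if patient A wins, -1 if patient B wins, 0 if undecided.

    Only events within the shared follow-up window `horizon` are counted.
    The patient whose event comes later (or not at all) wins.
    """
    a = t_a if t_a is not None and t_a <= horizon else None
    b = t_b if t_b is not None and t_b <= horizon else None
    if a is None and b is None:
        return 0
    if a is None:
        return 1
    if b is None:
        return -1
    if a == b:
        return 0
    return 1 if a > b else -1


def win_ratio(treatment: List[Patient], control: List[Patient]) -> Tuple[int, int, float]:
    """Compare every treated patient with every control patient.

    Each pair is decided on death first; the non-fatal event is used
    only if death does not decide the pair.
    """
    wins = losses = 0
    for t in treatment:
        for c in control:
            horizon = min(t.follow_up, c.follow_up)
            result = compare_component(t.death, c.death, horizon)
            if result == 0:
                result = compare_component(t.nonfatal, c.nonfatal, horizon)
            if result > 0:
                wins += 1
            elif result < 0:
                losses += 1
    ratio = wins / losses if losses else float("inf")
    return wins, losses, ratio


# Hypothetical toy data: times in months, 12 months of follow-up each.
treatment = [Patient(12, None, 8), Patient(12, 10, None)]
control = [Patient(12, 6, 3), Patient(12, None, 5)]
print(win_ratio(treatment, control))  # (3, 1, 3.0)
```

In this toy data set three of the four pairs favour treatment and one favours control, so the win ratio is 3. A real analysis would also need a variance estimate and a pre-specified hierarchy.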
Project description:Background: In clinical trials, interest often lies in comparing a treatment with a control on a time-to-event endpoint. A composite endpoint allows several time-to-event endpoints to be considered at once. Usually, only the time to the first event occurring in a patient is analyzed. However, an individual may experience more than one non-fatal event. Including all observed events in the analysis can increase power and provide a more complete picture of the disease. Thus, analytical methods for recurrent events are required. A challenge is that the different event types belonging to the composite are often of different clinical relevance. In this case, weighting the event types according to their clinical relevance is an option. Different weight-based methods for composite time-to-event endpoints have been proposed. So far, no systematic comparison of these methods exists. Methods: In this work we provide a systematic comparison of three methods proposed for weighted composite endpoints in a recurrent event setting combining non-fatal and fatal events of different clinical relevance. We consider an extension of an approach proposed by Wei and Lachin, an approach by Rauch et al., and an approach by Bakal et al. The comparison is based on a simulation study and on a clinical study example. Results: For all three approaches, closed-form test statistics are available. The Wei-Lachin approach and the approach by Rauch et al. show similar results in terms of mean squared error. For the approach by Wei and Lachin, confidence intervals are provided. The approach by Bakal et al. is not related to a quantifiable estimand. The relevance weights of the different approaches work on different levels, i.e. either on cause-specific hazard ratios or on event counts. Conclusion: The provided comparison and simulations can help guide applied researchers in choosing an adequate method for the analysis of composite endpoints combining (recurrent) events of different clinical relevance. The approaches by Wei and Lachin and by Rauch et al. can be recommended in scenarios where the composite effect is time-independent. The approach by Bakal et al. should be applied carefully.
Project description:Composite endpoints are frequently used in clinical trials, but simple approaches, such as the time to first event, do not reflect any ordering among the endpoints. However, some endpoints, such as mortality, are worse than others. A variety of procedures have been proposed to reflect the severity of the individual endpoints such as pairwise ranking approaches, the win ratio, and the desirability of outcome ranking. When patients have different lengths of follow-up, however, ranking can be difficult and proposed methods do not naturally lead to regression approaches and require specialized software. This paper defines an ordering score O to operationalize the patient ranking implied by hierarchical endpoints. We show how differential right censoring of follow-up corresponds to multiple interval censoring of the ordering score allowing standard software for survival models to be used to calculate the nonparametric maximum likelihood estimators (NPMLEs) of different measures. Additionally, if one assumes that the ordering score is transformable to an exponential random variable, a semiparametric regression is obtained, which is equivalent to the proportional hazards model subject to multiple interval censoring. Standard software can be used for estimation. We show that the NPMLE can be poorly behaved compared to the simple estimators in staggered entry trials. We also show that the semiparametric estimator can be more efficient than simple estimators and explore how standard Cox regression maneuvers can be used to assess model fit, allow for flexible generalizations, and assess interactions of covariates with treatment. We analyze a trial of short versus long-term antiplatelet therapy using our methods.
Project description:Background: When assessing the efficacy of a treatment in a clinical trial, the International Conference on Harmonisation recommends selecting a single meaningful endpoint. However, a single endpoint is often not sufficient to reflect the full clinical benefit of a treatment in multifaceted diseases, as is often the case in rare diseases. Therefore, a combination of several clinically meaningful outcomes is preferred. Many methodologies that allow outcomes to be combined in a so-called composite endpoint are, however, limited in a number of ways, not least in the number and type of outcomes that can be combined and in their poor small-sample properties. Moreover, patient-reported outcomes, such as quality of life, often cannot be integrated in a composite analysis, in spite of their intrinsic value. Results: Recently, a class of non-parametric generalized pairwise comparisons tests has been proposed whose members allow for any number and type of outcomes, including patient-reported outcomes. The class enjoys good small-sample properties. Moreover, this very flexible class of methods allows for prioritizing the outcomes by clinical severity, for matched designs, and for adding a threshold of clinical relevance. Our aim is to introduce the ideas and concepts of generalized pairwise comparisons for rare disease clinical trial analysis and to demonstrate their benefit in a post-hoc analysis of a small-sample trial in epidermolysis bullosa. More precisely, we include a patient-relevant outcome (quality of life) in a composite endpoint. This publication is part of the European Joint Programme on Rare Diseases (EJP RD) series on innovative methodologies for rare diseases clinical trials, which is based on the webinars presented within the educational activity of EJP RD. This publication covers the webinar topic on composite endpoints in rare diseases and includes participants' responses to a questionnaire on this topic. Conclusions: Generalized pairwise comparisons is a promising statistical methodology for evaluating any type of composite endpoint in rare disease trials and may allow a better evaluation of therapy efficacy, including patient-reported outcomes in addition to outcomes related to the disease's signs and symptoms.
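The prioritization of outcomes and the threshold of clinical relevance described above can be sketched as follows. This is a minimal illustration, assuming that every outcome is numeric with higher values being better, that there are no missing or censored values, and that the summary of interest is the net benefit (wins minus losses, divided by the number of pairs). The function names, the example outcomes and the threshold of 5 quality-of-life points are hypothetical.

```python
from typing import Sequence


def pair_score(x: Sequence[float], y: Sequence[float], thresholds: Sequence[float]) -> int:
    """Compare a treated patient x with a control patient y.

    Outcomes are listed in order of priority. A pair is decided by the
    first outcome whose difference exceeds its threshold; lower-priority
    outcomes are consulted only while the pair remains tied.
    """
    for xv, yv, tau in zip(x, y, thresholds):
        diff = xv - yv
        if diff > tau:
            return 1    # win for treatment
        if diff < -tau:
            return -1   # loss for treatment
    return 0            # tie on all outcomes


def net_benefit(treated, control, thresholds) -> float:
    """Return (wins - losses) / number of pairs over all treated-control pairs."""
    total = sum(pair_score(x, y, thresholds) for x in treated for y in control)
    return total / (len(treated) * len(control))


# Hypothetical data: (responder yes/no, quality-of-life score).
treated = [(1, 60), (0, 70)]
control = [(0, 65), (0, 62)]
# Threshold 0 for the binary outcome, 5 points for quality of life.
print(net_benefit(treated, control, (0, 5)))  # 0.75
```

Here the pair (0, 70) versus (0, 65) stays tied because the quality-of-life difference of 5 points does not exceed the threshold, which is how a threshold of clinical relevance keeps negligible differences from counting as wins.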
Project description:Goal Attainment Scaling is an assessment instrument to evaluate interventions on the basis of individual, patient-specific goals. The attainment of these goals is mapped in a pre-specified way to attainment levels on an ordinal scale, which is common to all goals. This approach is patient-centred and allows one to integrate the outcomes of patients with very heterogeneous symptoms. The latter is of particular importance in clinical trials in rare diseases because it enables larger sample sizes by including a broader patient population. In this paper, we focus on the statistical analysis of Goal Attainment Scaling outcomes for the comparison of two treatments in randomised clinical trials. Building on a general statistical model, we investigate the properties of different hypothesis testing approaches. Additionally, we propose a latent variable approach to generate Goal Attainment Scaling data in a simulation study, to assess the impact of model parameters such as the number of goals per patient and their correlation, the choice of discretisation thresholds and the type of design (parallel group or cross-over). Based on our findings, we give recommendations for the design of clinical trials with a Goal Attainment Scaling endpoint. Furthermore, we discuss an application of Goal Attainment Scaling in a clinical trial in mastocytosis.
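One way to picture a latent variable approach to generating Goal Attainment Scaling data is sketched below. It is a simplified illustration and not necessarily the model used in the paper: it assumes a standard normal latent variable per goal, an exchangeable correlation rho between a patient's goals induced by a shared patient-level term, a treatment effect that shifts the latent mean, and four hypothetical discretisation thresholds that map the latent value to a five-level scale from -2 to +2.

```python
import bisect
import math
import random
from typing import List, Sequence


def simulate_gas_patient(
    n_goals: int,
    rho: float,
    effect: float,
    thresholds: Sequence[float],
    rng: random.Random,
) -> List[int]:
    """Simulate ordinal attainment levels for one patient's goals.

    Each goal has a latent value
        effect + sqrt(rho) * shared + sqrt(1 - rho) * noise,
    so any two goals of the same patient have correlation rho.
    The latent value is then cut at the sorted thresholds.
    """
    shared = rng.gauss(0.0, 1.0)  # patient-level component shared by all goals
    levels = []
    for _ in range(n_goals):
        noise = rng.gauss(0.0, 1.0)  # goal-specific component
        latent = effect + math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * noise
        # Number of thresholds at or below the latent value, shifted so that
        # four thresholds give levels -2, -1, 0, +1, +2.
        levels.append(bisect.bisect_right(thresholds, latent) - 2)
    return levels


# Hypothetical settings: 3 goals, correlation 0.5, treatment shift 0.3.
rng = random.Random(2024)
thresholds = (-1.5, -0.5, 0.5, 1.5)
print(simulate_gas_patient(3, 0.5, 0.3, thresholds, rng))
```

Varying n_goals, rho and the thresholds in such a generator is one way to study how those parameters affect the operating characteristics of a test, in the spirit of the simulation study described above.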
Project description:Background: Composite endpoints are recommended in rare diseases to increase power and/or to sufficiently capture complexity. Often, they take the form of responder indices that contain a mixture of continuous and binary components. Analyses of these outcomes typically treat them as binary, thus using only the dichotomisations of continuous components. The augmented binary method offers a more efficient alternative and is therefore especially useful for rare diseases. Previous work has indicated the method may have poorer statistical properties when the sample size is small. Here we investigate small sample properties and implement small sample corrections. Methods: We re-sample from a previous trial with sample sizes varying from 30 to 80. We apply the standard binary and augmented binary methods and determine the power, type I error rate, coverage and average confidence interval width for each of the estimators. We implement Firth's adjustment for the binary component models and a small sample variance correction for the generalized estimating equations, applying the small sample adjusted methods to each sub-sample as before for comparison. Results: For the log-odds treatment effect, the power of the augmented binary method is 20-55% compared to 12-20% for the standard binary method. Both methods have approximately nominal type I error rates. The difference in response probabilities exhibits similar power, but both unadjusted methods demonstrate type I error rates of 6-8%. The small sample corrected methods have approximately nominal type I error rates. On both scales, the reduction in average confidence interval width when using the adjusted augmented binary method is 17-18%. This is equivalent to requiring a 32% smaller sample size to achieve the same statistical power. Conclusions: The augmented binary method with small sample corrections provides a substantial improvement for rare disease trials using composite endpoints. We recommend the use of the method for the primary analysis in relevant rare disease trials. We emphasise that the method should be used alongside other efforts to improve the quality of evidence generated from rare disease trials rather than replace them.
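The link between the 17-18% reduction in confidence interval width and the 32% smaller sample size can be checked with a short calculation. It assumes the usual large-sample scaling in which the interval width w is proportional to 1/sqrt(n); the abstract does not state how the figure was derived, so this is a plausibility check rather than the authors' derivation. The symbols n_aug and n_bin (sample sizes giving the augmented and standard binary methods the same width) and w_aug and w_bin (their widths at a common sample size) are introduced here for illustration.

```latex
w \propto \frac{1}{\sqrt{n}}
\quad\Longrightarrow\quad
\frac{n_{\text{aug}}}{n_{\text{bin}}}
  = \left(\frac{w_{\text{aug}}}{w_{\text{bin}}}\right)^{2}

\frac{w_{\text{aug}}}{w_{\text{bin}}} = 0.83 \;\Rightarrow\; 0.83^{2} \approx 0.689
\qquad
\frac{w_{\text{aug}}}{w_{\text{bin}}} = 0.82 \;\Rightarrow\; 0.82^{2} \approx 0.672
```

Under this assumption a 17-18% narrower interval corresponds to a sample size roughly 31-33% smaller, which agrees with the reported 32%.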
Project description:The measurement of outcomes in kidney transplantation has been more accurately documented than almost any other surgical procedure result in recent decades. With significant improvements in short- and long-term outcomes related to optimized immunosuppression, outcomes have gradually shifted away from conventional clinical endpoints (ie, patient and graft survival) to surrogate and composite endpoints. This article reviews how outcomes measurements have evolved in the past 2 decades in the setting of increased data collection and summarizes recent advances in outcomes measurements pertaining to clinical, histopathological, and immune outcomes. Finally, we discuss the use of composite endpoints and Bayesian concepts, specifically focusing on the integrative box risk prediction score, in conjunction with machine learning to refine prognostication.
Project description:Background: Assessments of lung function, exacerbations and health status are common measures of chronic obstructive pulmonary disease (COPD) progression and treatment response in clinical trials. We hypothesised that a composite endpoint could more holistically assess clinically important deterioration (CID) in a COPD clinical trial setting. Methods: A composite endpoint was tested in a post hoc analysis of 5652 patients with Global Initiative for Chronic Obstructive Lung Disease (GOLD) 2-4 COPD from the 4-year UPLIFT study. Patients received tiotropium 18 μg or placebo. Results: The composite endpoint included time to first confirmed decrease in trough forced expiratory volume in 1 s (FEV1) ≥100 mL, confirmed increase in St. George's Respiratory Questionnaire (SGRQ) total score ≥4 units, or moderate/severe exacerbation. Most patients (>80%) experienced CID, with similar incidence among GOLD subgroups. Most confirmed trough FEV1 (74.6-81.6%) and SGRQ (72.3-78.1%) deteriorations were sustained across the study and in all GOLD subgroups. Patients with CID more frequently experienced subsequent exacerbation (hazard ratio [HR] 1.79; 95% confidence interval [CI] 1.67, 1.92) or death (HR 1.21; 95% CI 1.06, 1.39) by Month 6. CID was responsive to bronchodilator treatment. Conclusions: Composite endpoints provide additional information on COPD progression and treatment effects in clinical trials. Trial registration: ClinicalTrials.gov NCT00144339.
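The time-to-first-event construction of the CID composite can be sketched as the earliest of the three component times. The sketch below assumes that the day of the first confirmed event for each component has already been derived (the confirmation rules are not given in the abstract) and that a component without an event is recorded as None. The function name and the component keys are hypothetical.

```python
from typing import Dict, Optional, Tuple


def time_to_first_cid(component_days: Dict[str, Optional[float]]) -> Optional[Tuple[float, str]]:
    """Return (day, component) of the earliest component event.

    Returns None when no component event was observed, in which case the
    patient would be censored at the end of follow-up.
    """
    observed = {name: day for name, day in component_days.items() if day is not None}
    if not observed:
        return None
    first = min(observed, key=observed.get)
    return observed[first], first


# Hypothetical patient: exacerbation on day 95, confirmed FEV1 decline on day 180.
patient = {"fev1_decline": 180, "sgrq_increase": None, "exacerbation": 95}
print(time_to_first_cid(patient))  # (95, 'exacerbation')
```

Because only the earliest component counts, the composite is driven by whichever deterioration tends to occur first in a given population.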
Project description:Objectives: FDA-approved treatments for platinum-sensitive recurrent ovarian cancer (PSROC) include bevacizumab and PARP inhibitors (PARPi); clinical decisions regarding therapy must be made prior to initiating chemotherapy. Using the American Society of Clinical Oncology (ASCO) and European Society of Medical Oncology (ESMO) value frameworks, we assessed relative values of concurrent/maintenance biologic therapies in PSROC. Methods: Value scores were calculated for key maintenance therapies based on randomized controlled trials: bevacizumab (OCEANS, GOG 213); olaparib (Study 19, SOLO2); niraparib (NOVA); rucaparib (ARIEL3). Personalized value scorecards were constructed for patients with germline/somatic-BRCA mutations, homologous recombination deficiency (HRD), and wild-type BRCA (wBRCA). ASCO value scores assess clinical benefit, toxicity, long-term survival, symptom palliation, treatment-free interval, and quality of life (QOL). ESMO value scores assess clinical benefit, toxicity, and QOL. Results: ASCO scores were highest for maintenance PARPi in germline/somatic-BRCA mutation cohorts: olaparib (SOLO2) = 47, (Study 19) = 62; niraparib = 50; rucaparib = 54. HRD cohorts had slightly lower scores: niraparib = 46; rucaparib = 37. wBRCA cohorts had the lowest scores: niraparib = 26; rucaparib = 26; and olaparib (Study 19) = 32, as did patients receiving bevacizumab (OCEANS) = 35, (GOG 213) = 26. ESMO scores demonstrated high-value for maintenance PARPi in germline/somatic-BRCA mutation cohorts and low-value for bevacizumab and PARPi in wBRCA cohorts. Conclusions: The value of maintenance PARPi therapy depends heavily on BRCA status, with the highest value scores in germline/somatic-BRCA mutation cohorts. Personalized value scorecards provide a visual aid to assess the harm-benefit balance of maintenance PARPi for PSROC.