Sample size calculations for stepped wedge and cluster randomised trials: a unified approach.
ABSTRACT: To clarify and illustrate sample size calculations for the cross-sectional stepped wedge cluster randomized trial (SW-CRT) and to present a simple approach for comparing the efficiencies of competing designs within a unified framework. We summarize design effects for the SW-CRT, the parallel cluster randomized trial (CRT), and the parallel cluster randomized trial with before and after observations (CRT-BA), assuming cross-sectional samples are selected over time. We present new formulas that enable trialists to determine the required cluster size for a given number of clusters. We illustrate by example how to implement the presented design effects and give practical guidance on the design of stepped wedge studies. For a fixed total cluster size, the choice of study design that provides the greatest power depends on the intracluster correlation coefficient (ICC) and the cluster size. When the ICC is small, the CRT tends to be more efficient; when the ICC is large, the SW-CRT tends to be more efficient and can serve as an alternative design when the CRT is an infeasible design. Our unified approach allows trialists to easily compare the efficiencies of three competing designs to inform the decision about the most efficient design in a given scenario.
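As a rough illustration of the kind of calculation this abstract refers to, the sketch below uses the standard parallel-CRT design effect, 1 + (m - 1) x ICC, to find the smallest cluster size that reaches a target for a fixed number of clusters per arm. It is not the paper's new formula for the SW-CRT or CRT-BA; the function names and example inputs are hypothetical.

```python
import math


def crt_design_effect(m, icc):
    """Standard design effect for a parallel CRT with equal cluster size m."""
    return 1.0 + (m - 1) * icc


def required_cluster_size(n_individual, clusters_per_arm, icc):
    """Smallest m with clusters_per_arm * m / design_effect(m) >= n_individual.

    n_individual is the per-arm sample size under individual randomisation.
    Returns None when no cluster size can reach the target, which happens
    when clusters_per_arm <= n_individual * icc.
    """
    denom = clusters_per_arm - n_individual * icc
    if denom <= 0:
        return None
    return max(1, math.ceil(n_individual * (1.0 - icc) / denom))


if __name__ == "__main__":
    # Hypothetical inputs: 64 per arm if individually randomised,
    # 8 clusters per arm, ICC of 0.02.
    print(required_cluster_size(64, 8, 0.02))  # 10
    # Too few clusters for this ICC: no cluster size is large enough.
    print(required_cluster_size(64, 1, 0.02))  # None
```

The `None` branch reflects a well-known limit of cluster designs: once the number of clusters is fixed, increasing the cluster size yields diminishing returns, so some targets cannot be met at any cluster size.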
Project description:The ability to accurately estimate the sample size required by a stepped-wedge (SW) cluster randomized trial (CRT) routinely depends upon the specification of several nuisance parameters. If these parameters are misspecified, the trial could be overpowered, leading to increased cost, or underpowered, enhancing the likelihood of a false negative. We address this issue here for cross-sectional SW-CRTs, analyzed with a particular linear-mixed model, by proposing methods for blinded and unblinded sample size reestimation (SSRE). First, blinded estimators for the variance parameters of a SW-CRT analyzed using the Hussey and Hughes model are derived. Following this, procedures for blinded and unblinded SSRE after any time period in a SW-CRT are detailed. The performance of these procedures is then examined and contrasted using two example trial design scenarios. We find that if the two key variance parameters were underspecified by 50%, the SSRE procedures were able to increase power over the conventional SW-CRT design by up to 41%, resulting in an empirical power above the desired level. Thus, though there are practical issues to consider, the performance of the procedures means researchers should consider incorporating SSRE into future SW-CRTs.
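To make the dependence on the variance parameters concrete, here is a minimal sketch of the closed-form variance of the treatment effect under the Hussey and Hughes model, with a normal-approximation power calculation. It does not implement the blinded or unblinded SSRE procedures themselves. The function names, the one-cluster-per-sequence layout, and all numeric inputs are assumptions made for illustration.

```python
from statistics import NormalDist


def hussey_hughes_variance(X, sigma2, tau2):
    """Variance of the treatment effect estimate in the Hussey and Hughes model.

    X[i][j] is 1 if cluster i is under intervention in period j, else 0.
    sigma2 is the variance of a cluster-period mean (individual-level
    variance divided by the number of individuals per cluster-period).
    tau2 is the between-cluster variance.
    """
    I, T = len(X), len(X[0])
    U = sum(sum(row) for row in X)
    W = sum(sum(X[i][j] for i in range(I)) ** 2 for j in range(T))
    V = sum(sum(row) ** 2 for row in X)
    numerator = I * sigma2 * (sigma2 + T * tau2)
    denominator = (I * U - W) * sigma2 + (U ** 2 + I * T * U - T * W - I * V) * tau2
    return numerator / denominator


def power(theta, variance, alpha=0.05):
    """Approximate power of a two-sided Wald test for effect size theta."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(theta) / variance ** 0.5 - z)


def standard_design(n_sequences):
    """One cluster per sequence, one all-control baseline period."""
    return [[1 if j > i else 0 for j in range(n_sequences + 1)]
            for i in range(n_sequences)]


if __name__ == "__main__":
    X = standard_design(6)
    # Hypothetical planning values, then the same design with both
    # variance parameters 50% larger than planned.
    for scale in (1.0, 1.5):
        v = hussey_hughes_variance(X, sigma2=0.05 * scale, tau2=0.02 * scale)
        print(scale, round(power(0.5, v), 3))
```

Running the loop shows how power falls when the true variances exceed the planning values, which is the situation sample size reestimation is meant to address.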
Project description:Standard sample size calculation formulas for stepped wedge cluster randomized trials (SW-CRTs) assume that cluster sizes are equal. When cluster sizes vary substantially, ignoring this variation may lead to an under-powered study. We investigate the relative efficiency of a SW-CRT with varying cluster sizes to equal cluster sizes, and derive variance estimators for the intervention effect that account for this variation under a mixed effects model, a commonly used approach for analyzing data from cluster randomized trials. When cluster sizes vary, the power of a SW-CRT depends on the order in which clusters receive the intervention, which is determined through randomization. We first derive a variance formula that corresponds to any particular realization of the randomized sequence and propose efficient algorithms to identify upper and lower bounds of the power. We then obtain an "expected" power based on a first-order approximation to the variance formula, where the expectation is taken with respect to all possible randomization sequences. Finally, we provide a variance formula for more general settings where only the cluster size arithmetic mean and coefficient of variation, instead of exact cluster sizes, are known in the design stage. We evaluate our methods through simulations and illustrate that the average power of a SW-CRT decreases as the variation in cluster sizes increases, and the impact is largest when the number of clusters is small.
Project description:BACKGROUND:Stepped-wedge cluster randomised trials (SW-CRTs) are a pragmatic trial design, providing an unprecedented opportunity to increase the robustness of evidence underpinning implementation and quality improvement interventions. Given the complexity of the SW-CRT, the likelihood of trials not delivering on their objectives will be mitigated if a feasibility study precedes the definitive trial. It is not currently known if feasibility studies are being conducted for SW-CRTs nor what the objectives of these studies are. METHODS:Searches were conducted of several databases to identify published feasibility studies which were designed to inform a future SW-CRT. For each eligible study, data were extracted on the characteristics of and rationale for the feasibility study; the process for determining progression to the main trial; how the feasibility study informed the main trial; and whether the main trial went ahead. A narrative synthesis and descriptive analysis are presented. RESULTS:Eleven feasibility studies were identified, which included eight completed study reports and three protocols. Three studies used a stepped-wedge design and these were the only studies to be randomised. Studies were predominantly of a mixed-methods design. Only one study assessed specific features related to the feasibility of using a SW-CRT and one investigated the time taken to complete the study procedures. The other studies were mostly assessing the feasibility and acceptability of the intervention. CONCLUSION:Published feasibility studies for SW-CRTs are scarce and those that are being reported do not investigate issues specific to the complexities of the trial design. When conducting feasibility studies in advance of a definitive SW-CRT, researchers should consider assessing the feasibility of study procedures, particularly those specific to the SW-CRT design, and ensure that the findings are published for the benefit of other researchers.
Project description:BACKGROUND:Stepped-wedge cluster randomized trials (SW-CRTs) are increasingly popular in health-related research in both high- and low-resource settings. There may be specific ethical issues that researchers face when designing and conducting SW-CRTs in low-resource settings. Knowledge of these issues can help to improve the ethical conduct of SW-CRTs in a global health context. METHODS:We performed an ethical analysis of two studies using SW-CRT designs in low-resource settings: the Que Vivan Las Madres study conducted from 2014 to 2017 in Guatemala and the Atmiyata study conducted from 2017 to 2018 in rural parts of India. For both case studies, we identified and evaluated the classification of the study as research or nonresearch and the ethical issues regarding the justification of the design, including the delayed rollout of an intervention that had a promising effect. RESULTS:In our case studies, some minor ethical issues surfaced about the registration and stakeholder pressure on the order of randomization, but both included good justification for the design and delayed rollout. Our analysis did, however, demonstrate that careful consideration of the role of randomization and registration of the trials is important. DISCUSSION:SW-CRTs can provide an opportunity for rigorous evaluation of interventions destined to be rolled out on the basis of limited evidence. Furthermore, in SW-CRTs, the underlying objective is often to provide a robust evaluation of the effectiveness for generalized dissemination, and this makes the SW-CRT no less a research study than any other form of cluster randomized trial. 
CONCLUSION:The design and conduct of stepped-wedge cluster randomized trials raises at least two ethical issues that need special consideration in both high- and low-resource settings: the justification for using the design, specifically the delayed rollout of the intervention to the control group, and the classification of the study as research or nonresearch. In our case studies, these issues did not seem to raise special ethical scrutiny in low-resource settings. Further ethical evaluation will hopefully result in specific ethical guidelines for the use of SW-CRTs in both high- and low-resource settings to contribute to responsible functioning of these trials and adequate protection of participants.
Project description:OBJECTIVES:To investigate the extent to which cluster sizes vary in stepped-wedge cluster randomised trials (SW-CRT) and whether any variability is accounted for during the sample size calculation and analysis of these trials. SETTING:Any, not limited to healthcare settings. PARTICIPANTS:Any taking part in an SW-CRT published up to March 2016. PRIMARY AND SECONDARY OUTCOME MEASURES:The primary outcome is the variability in cluster sizes, measured by the coefficient of variation (CV) in cluster size. Secondary outcomes include the difference between the cluster sizes assumed during the sample size calculation and those observed during the trial, any reported variability in cluster sizes and whether the methods of sample size calculation and methods of analysis accounted for any variability in cluster sizes. RESULTS:Of the 101 included SW-CRTs, 48% mentioned that the included clusters were known to vary in size, yet only 13% of these accounted for this during the calculation of the sample size. However, 69% of the trials did use a method of analysis appropriate for when clusters vary in size. Full trial reports were available for 53 trials. The CV was calculated for 23 of these: the median CV was 0.41 (IQR: 0.22-0.52). Actual cluster sizes could be compared with those assumed during the sample size calculation for 14 (26%) of the trial reports; the cluster sizes were between 29% and 480% of that which had been assumed. CONCLUSIONS:Cluster sizes often vary in SW-CRTs. Reporting of SW-CRTs also remains suboptimal. The effect of unequal cluster sizes on the statistical power of SW-CRTs needs further exploration and methods appropriate to studies with unequal cluster sizes need to be employed.
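The two summary measures used in this review are simple to compute. The sketch below assumes the CV is the sample standard deviation divided by the mean (the abstract does not say which standard deviation was used); the function names and cluster sizes are hypothetical.

```python
from statistics import mean, stdev


def cluster_size_cv(sizes):
    """Coefficient of variation of cluster sizes (sample SD / mean)."""
    return stdev(sizes) / mean(sizes)


def observed_as_percent_of_assumed(observed_size, assumed_size):
    """Observed cluster size expressed as a percentage of the planned size."""
    return 100.0 * observed_size / assumed_size


if __name__ == "__main__":
    sizes = [10, 20, 30]  # hypothetical cluster sizes
    print(cluster_size_cv(sizes))                  # 0.5
    print(observed_as_percent_of_assumed(48, 10))  # 480.0
```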
Project description:Background/Aims We sought to optimise the design of stepped wedge trials with an equal allocation of clusters to sequences and explored sample size comparisons with alternative trial designs. Methods We developed a new expression for the design effect for a stepped wedge trial, assuming that observations are equally correlated within clusters and an equal number of observations in each period between sequences switching to the intervention. We minimised the design effect with respect to (1) the fraction of observations before the first and after the final sequence switches (the periods with all clusters in the control or intervention condition, respectively) and (2) the number of sequences. We compared the design effect of this optimised stepped wedge trial to the design effects of a parallel cluster-randomised trial, a cluster-randomised trial with baseline observations, and a hybrid trial design (a mixture of cluster-randomised trial and stepped wedge trial) with the same total cluster size for all designs. Results We found that a stepped wedge trial with an equal allocation to sequences is optimised by obtaining all observations after the first sequence switches and before the final sequence switches to the intervention; this means that the first sequence remains in the control condition and the last sequence remains in the intervention condition for the duration of the trial. With this design, the optimal number of sequences is [Formula: see text], where [Formula: see text] is the cluster-mean correlation, [Formula: see text] is the intracluster correlation coefficient, and m is the total cluster size. The optimal number of sequences is small when the intracluster correlation coefficient and cluster size are small and large when the intracluster correlation coefficient or cluster size is large. 
A cluster-randomised trial remains more efficient than the optimised stepped wedge trial when the intracluster correlation coefficient or cluster size is small. A cluster-randomised trial with baseline observations always requires a larger sample size than the optimised stepped wedge trial. The hybrid design can always give an equally or more efficient design, but will be at most 5% more efficient. We provide a strategy for selecting a design if the optimal number of sequences is unfeasible. For a non-optimal number of sequences, the sample size may be reduced by allowing a proportion of observations before the first or after the final sequence has switched. Conclusion The standard stepped wedge trial is inefficient. To reduce sample sizes when a hybrid design is unfeasible, stepped wedge trial designs should have no observations before the first sequence switches or after the final sequence switches.
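The abstract's formulas are elided above and are not reconstructed here. As background only, the cluster-mean correlation is usually defined as R = m x ICC / (1 + (m - 1) x ICC), and the sketch below assumes that standard definition. It shows why "small ICC or small cluster size" and "large ICC or large cluster size" behave as two regimes: R is small in the first and approaches 1 in the second. The function name and input grid are hypothetical.

```python
def cluster_mean_correlation(m, icc):
    """Correlation between two cluster means of size m (standard definition)."""
    return m * icc / (1.0 + (m - 1) * icc)


if __name__ == "__main__":
    # Hypothetical grid of total cluster sizes and ICCs.
    for m in (10, 100, 1000):
        for icc in (0.001, 0.01, 0.1):
            print(m, icc, round(cluster_mean_correlation(m, icc), 3))
```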
Project description:Previous reviews have focussed on the rationale for employing the stepped wedge design (SWD), the areas of research to which the design has been applied and the general characteristics of the design. However, these did not focus on the statistical methods or address the appropriateness of the sample size methods used. This was a review of the literature of the statistical methodology used in stepped wedge cluster randomised trials. Literature Review. The Medline, Embase, PsycINFO, CINAHL and Cochrane databases were searched for methodological guides and RCTs which employed the stepped wedge design. This review identified 102 trials which employed the stepped wedge design compared to 37 from the most recent review by Beard et al. 2015. Forty six trials were cohort designs and 45% (n = 46) had fewer than 10 clusters. Of the 42 articles discussing the design methodology 10 covered analysis and seven covered sample size. For cohort stepped wedge designs there was only one paper considering analysis and one considering sample size methods. Most trials employed either a GEE or mixed model approach to analysis (n = 77) but only 22 trials (22%) estimated sample size in a way which accounted for the stepped wedge design that was subsequently used. Many studies which employ the stepped wedge design have few clusters but use methods of analysis which may require more clusters for unbiased and efficient intervention effect estimates. There is the need for research on the minimum number of clusters required for both types of stepped wedge design. Researchers should distinguish in the sample size calculation between cohort and cross sectional stepped wedge designs. Further research is needed on the effect of adjusting for the potential confounding of time on the study power.
Project description:BACKGROUND:Stepped wedge cluster randomized trials (SW-CRT) are increasingly used to evaluate new clinical programs, yet there is limited guidance on practical aspects of applying this design. We report our early experiences conducting a SW-CRT to examine an inpatient mobility program (STRIDE) in the Veterans Health Administration (VHA). We provide recommendations for future research using this design to evaluate clinical programs. METHODS:Based on data from study records and reflections from the investigator team, we describe and assess the design and initial stages of a SW-CRT, from site recruitment to program launch in 8 VHA hospitals. RESULTS:Site recruitment consisted of thirty 1-h conference calls with representatives from 22 individual VAs who expressed interest in implementing STRIDE. Of these, 8 hospitals were enrolled and randomly assigned in two stratified blocks (4 hospitals per block) to a STRIDE launch date. Block 1 randomization occurred in July 2017 with first STRIDE launch in December 2017; block 2 randomization occurred in April 2018 with first STRIDE launch in January 2019. The primary study outcome of discharge destination will be assessed using routinely collected data in the electronic health record (EHR). Within randomized blocks, two hospitals per sequence launched STRIDE approximately every 3 months with primary outcome assessment paused during the 3-month time period of program launch. All sites received 6-8 implementation support calls, according to a pre-specified schedule, from the time of recruitment to program launch, and all 8 sites successfully launched within their assigned 3-month window. Seven of the eight sites initially started with a limited roll out (for example on one ward) or modified version of STRIDE (for example, using existing staff to conduct walks until new positions were filled). 
CONCLUSIONS:Future studies should incorporate sufficient time for site recruitment and carefully consider the following to inform design of SW-CRTs to evaluate rollout of a new clinical program: (1) whether a blocked randomization fits study needs, (2) the amount of time and implementation support sites will need to start their programs, and (3) whether clinical programs are likely to include a "ramp-up" period. Successful execution of SW-CRT designs requires both adherence to rigorous design principles and also careful consideration of logistical requirements for timing of program roll out. TRIAL REGISTRATION:ClinicalTrials.gov NCT03300336. Prospectively registered on 3 October 2017.
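A blocked allocation of the kind described (4 hospitals per block, 2 hospitals per launch sequence) can be sketched as below. This is an illustrative sketch only, not the study's actual randomization procedure: the site labels, seed, and function name are hypothetical, and any stratification variables are ignored.

```python
import random


def assign_launch_sequences(sites, sites_per_sequence, rng):
    """Randomly allocate the sites in one block to launch sequences.

    Returns a dict mapping each site to a sequence number (1 = earliest
    launch), with exactly sites_per_sequence sites in every sequence.
    """
    if len(sites) % sites_per_sequence != 0:
        raise ValueError("block size must be a multiple of sites_per_sequence")
    shuffled = list(sites)
    rng.shuffle(shuffled)
    return {site: idx // sites_per_sequence + 1
            for idx, site in enumerate(shuffled)}


if __name__ == "__main__":
    rng = random.Random(2017)  # fixed seed so the allocation is reproducible
    blocks = {
        "block 1": ["site A", "site B", "site C", "site D"],
        "block 2": ["site E", "site F", "site G", "site H"],
    }
    # Each block is randomized separately, as blocks may be recruited
    # and randomized at different calendar times.
    for name, sites in blocks.items():
        print(name, assign_launch_sequences(sites, 2, rng))
```

Randomizing each block on its own is what allows a second block to be recruited and allocated after the first block has already started launching.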
Project description:BACKGROUND:Cluster randomised trials with unequal sized clusters often have lower precision than with clusters of equal size. To allow for this, sample sizes are inflated by a modified version of the design effect for clustering. These inflation factors are valid under the assumption that randomisation is stratified by cluster size. We investigate the impact of unequal cluster size when that constraint is relaxed, with particular focus on the stepped-wedge cluster randomised trial, where this is more difficult to achieve. METHODS:Assuming a multi-level mixed effect model with exchangeable correlation structure for a cross-sectional design, we use simulation methods to compare the precision for a trial with clusters of unequal size to a trial with clusters of equal size (relative efficiency). For a range of scenarios we illustrate the impact of various design features (the cluster-mean correlation - a function of the intracluster correlation and the cluster size, the number of clusters, number of randomisation sequences) on the average and distribution of the relative efficiency. RESULTS:Simulations confirm that the average reduction in precision, due to varying cluster sizes, is smaller in a stepped-wedge trial compared to the parallel trial. However, the variance of the distribution of the relative efficiency is large; and is larger under the stepped-wedge design compared to the parallel design. This can result in large variations in actual power, depending on the allocation of clusters to sequences. Designs with larger variations in cluster sizes, smaller number of clusters and studies with smaller cluster-mean correlations (smaller cluster sizes or smaller intra-cluster correlation) are particularly at risk. CONCLUSION:The actual realised power in a stepped-wedge trial might be substantially higher or lower than that estimated. 
This is particularly important when there are a small number of clusters or the variability in cluster sizes is large. Constraining the randomisation on cluster size, where feasible, might mitigate this effect.
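The abstract does not state which modified design effect it means. One widely used version for parallel CRTs, assumed here, replaces the cluster size in 1 + (m - 1) x ICC with (1 + CV^2) x mean size, where CV is the coefficient of variation of cluster sizes. The sketch below shows that inflation factor; the function name and inputs are hypothetical, and it does not reproduce the paper's simulation of relative efficiency.

```python
def design_effect_unequal(mean_size, cv, icc):
    """Parallel-CRT design effect allowing for varying cluster sizes.

    Assumed form: 1 + ((1 + cv**2) * mean_size - 1) * icc.
    With cv = 0 this reduces to the usual 1 + (m - 1) * icc.
    """
    return 1.0 + ((1.0 + cv ** 2) * mean_size - 1.0) * icc


if __name__ == "__main__":
    # Hypothetical values: mean cluster size 20, ICC 0.05.
    equal = design_effect_unequal(20, 0.0, 0.05)
    unequal = design_effect_unequal(20, 0.5, 0.05)
    # Ratio below 1 is the approximate relative efficiency of the
    # unequal-size trial compared with the equal-size trial.
    print(equal, unequal, equal / unequal)
```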
Project description:INTRODUCTION:The stepped-wedge cluster randomised trial (SW-CRT) is a complex design, for which many decisions about key design parameters must be made during the planning. These include the number of steps and the duration of time needed to embed the intervention. Feasibility studies are likely to be useful for informing these decisions and increasing the likelihood of the main trial's success. However, the number of feasibility studies being conducted for SW-CRTs is currently unknown. This review aims to establish the number of feasibility studies being conducted for SW-CRTs and determine which feasibility issues are commonly investigated. METHODS AND ANALYSIS:Fully published feasibility studies for SW-CRTs will be identified, according to predefined inclusion criteria, from searches conducted in Ovid MEDLINE, Scopus, Embase and PsycINFO. To also identify and gain information on unpublished feasibility studies the following will be contacted: authors of published SW-CRTs (identified from the most recent systematic reviews); contacts for registered SW-CRTs (identified from clinical trials registries); lead statisticians of UK registered clinical trials units and researchers known to work in the area of SW-CRTs. Data extraction will be conducted independently by two reviewers. For the fully published feasibility studies, data will be extracted on the study characteristics, the rationale for the study, the process for determining progression to a main trial, how the study informed the main trial and whether the main trial went ahead. The researchers involved in the unpublished feasibility studies will be contacted to elicit the same information. A narrative synthesis will be conducted and provided alongside a descriptive analysis of the study characteristics. ETHICS AND DISSEMINATION:This review does not require ethical approval, as no individual patient data will be used. 
The results of this review will be published in an open-access peer-reviewed journal.