In a stepped wedge group-randomized trial (SWGRT), also called a stepped wedge cluster-randomized trial, groups or clusters begin the study in the control condition, are randomly assigned to treatment sequences, and cross over to the intervention condition at predetermined time points in a sequential, staggered fashion until all groups or clusters receive the intervention.
Special methods are needed for analysis and sample size estimation for these studies, as detailed below and in the SWGRT sample size calculator.
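The staggered layout described above can be pictured as a 0/1 matrix with one row per treatment sequence and one column per time period. The following is a minimal sketch, assuming a standard layout with one all-control baseline period and one sequence crossing over at each later period; the function names and the balanced assignment rule are illustrative and are not taken from the SWGRT sample size calculator.

```python
import random


def stepped_wedge_sequences(n_sequences):
    """Return one 0/1 row per treatment sequence.

    Columns are time periods. The first period is all-control (0) and one
    additional sequence switches to the intervention (1) at each later
    period, so there are n_sequences + 1 periods in total (an assumption
    of this sketch).
    """
    n_periods = n_sequences + 1
    return [
        [1 if period > seq else 0 for period in range(n_periods)]
        for seq in range(n_sequences)
    ]


def randomize_groups(group_ids, n_sequences, seed=None):
    """Randomly assign groups to sequences in (near-)equal numbers."""
    rng = random.Random(seed)
    shuffled = list(group_ids)
    rng.shuffle(shuffled)
    # Deal the shuffled groups out to the sequences in turn.
    return {group: i % n_sequences for i, group in enumerate(shuffled)}


if __name__ == "__main__":
    for row in stepped_wedge_sequences(3):
        print(row)
    print(randomize_groups(["A", "B", "C", "D", "E", "F"], 3, seed=1))
```

Each row of the matrix starts at 0 and ends at 1, which reflects the fact that every group is observed under both conditions.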
Features and Uses
Staggered, Sequential Cross-Over
A SWGRT is a trial in which groups cross over to the intervention condition at predetermined time points in a sequential, staggered fashion until all groups receive the intervention. The design has been used when limited resources or a large geographical area prevent the use of a conventional parallel GRT. It has also been used when staff training requirements in a clinical care intervention necessitated phased implementation, and to improve power when only a limited number of groups was available.
NIH Webinars
- Methods: Mind the Gap Webinar: Power Calculations for Stepped Wedge Designs with Binary Outcomes: Methods and Software
- Methods: Mind the Gap Webinar: Does it Decay? Decaying Correlations in the Design and Analysis of Stepped Wedge Trials
- Methods: Mind the Gap Webinar: Overview of Statistical Models for the Design and Analysis of Stepped Wedge Cluster Randomized Trials
- Methods: Mind the Gap Webinar: Stepped Wedge Cluster Randomized Designs for Disease Prevention Research
Nested or Hierarchical Design
In a SWGRT, members are nested within groups or clusters so that each member appears in only one group or cluster. In cross-sectional SWGRTs, different members are observed in each group at each measurement occasion; in closed cohort SWGRTs, members are observed repeatedly so that measurements are nested within members; in open cohort SWGRTs, some members are observed in only one time period and others are observed during multiple time periods.
Appropriate Use
SWGRTs have become more popular over time, but the design carries a greater risk of bias than conventional parallel GRTs. Therefore, the use of a SWGRT over more conventional alternatives must have strong justification. Hemming and Taljaard (2020) provide a non-exhaustive list of broad justifications, indicating that it may be appropriate to use a SWGRT if:
- it provides a randomized trial when the only alternative is a staggered non-randomized trial and stakeholders can be convinced to randomly assign treatment order,
- it increases the likelihood that gatekeepers and stakeholders will enroll groups in the study due to receiving perceived benefits of the intervention while the trial is ongoing,
- staggered, sequential delivery of the intervention is the only logistically feasible design, or
- limited groups or resources are available and a SWGRT can attain the desired statistical power when a parallel GRT cannot.
Potential for Confounding
While a SWGRT tends to involve a limited number of groups, the impact of chance imbalances may be minimal because each group is exposed to both the control and intervention conditions. Chance imbalances may still occur. Stratified or constrained randomization on important group-level characteristics has been shown to improve power and maintain type I error rates in parallel GRTs, and we might expect similar effects in a SWGRT. In the SWGRT, the method for restricted randomization would be applied when the groups or clusters are randomized to treatment sequences.
As time progresses, more groups implement the intervention condition. Therefore, time always has the potential to be a confounder in the relationship between the outcome and the intervention condition. To guard against this, time must be accounted for in the SWGRT design and analysis.
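One common way to account for time in the analysis is to include a fixed effect for each time period alongside the intervention indicator. The model below is a minimal sketch, assuming a cross-sectional design, a continuous outcome, a single random group effect, and an immediate and constant intervention effect; the notation is illustrative and is not taken from this page.

```latex
Y_{ijk} = \mu + \beta_j + \theta X_{ij} + C_i + e_{ijk},
\qquad C_i \sim N(0, \sigma_C^2),
\quad e_{ijk} \sim N(0, \sigma_e^2)
```

Here $Y_{ijk}$ is the outcome for member $k$ of group $i$ in period $j$, $\beta_j$ is the period effect (with $\beta_1 = 0$ for identifiability), $X_{ij}$ equals 1 if group $i$ is in the intervention condition in period $j$ and 0 otherwise, and $\theta$ is the intervention effect. Because the $\beta_j$ terms absorb a secular trend common to all groups, $\theta$ is estimated from comparisons made within periods and within groups rather than from the raw before-and-after difference.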
Within- and Between-Period Correlations
Important factors in determining the sample size for a SWGRT are the intraclass correlation (ICC), the cluster autocorrelation (CAC), and the individual autocorrelation (IAC). These quantities provide information on the similarity among outcome values due to correlation within groups or clusters at the same time and to repeated measurements on the same groups or clusters or on the same members.
The ICC measures the similarity among values on the outcome variable for different members of the same group or cluster within a given time period. It is often described as the average correlation among members within the same group or cluster and within the same time period or as the proportion of variance due to group or cluster membership. The CAC is the correlation between the population means from the same group or cluster at two different time periods; it is sometimes called over-time correlation at the group level. The CAC is present in cross-sectional, closed cohort, and open cohort designs. The IAC is the correlation on the outcome variable for the same individual at two different time periods; it is sometimes called over-time correlation at the member level. The IAC is present only in closed and open cohort designs.
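These three quantities can be written in terms of variance components. The expressions below are a sketch under one commonly used parameterization, assuming a closed cohort design with random effects for group ($\sigma_C^2$), group-by-period ($\sigma_{CP}^2$), and member ($\sigma_S^2$), plus residual error ($\sigma_e^2$). The symbols are illustrative, and in this parameterization the IAC describes the member-level part of the outcome after the group-level effects have been removed.

```latex
\mathrm{ICC} = \frac{\sigma_C^2 + \sigma_{CP}^2}{\sigma_C^2 + \sigma_{CP}^2 + \sigma_S^2 + \sigma_e^2},
\qquad
\mathrm{CAC} = \frac{\sigma_C^2}{\sigma_C^2 + \sigma_{CP}^2},
\qquad
\mathrm{IAC} = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_e^2}
```

In a cross-sectional design there is no member-level random effect, so $\sigma_S^2 = 0$ and the IAC does not apply. A CAC of 1 corresponds to $\sigma_{CP}^2 = 0$, meaning that the group means do not drift relative to one another over time.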
A characteristic of longitudinal GRTs such as SWGRTs is that the CAC and IAC can be considered as functions of the time periods being compared, with values that decay as the periods move further apart. Failing to account for such decay in SWGRTs can result in increased type I error rates. There are many possible correlation structures, such as the block-exchangeable and discrete-time decay structures.
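The difference between the two structures named above can be illustrated by tabulating the correlation between two different members of the same group who are measured in periods j and k. This is a minimal sketch for a cross-sectional design; the function and parameter names are hypothetical, and `decay` is assumed to be the factor by which the correlation shrinks for each additional period of separation.

```python
def block_exchangeable(n_periods, icc, cac):
    """Between-member correlation by pair of periods.

    Same period: icc. Different periods: icc * cac, regardless of how
    far apart the two periods are.
    """
    return [
        [icc if j == k else icc * cac for k in range(n_periods)]
        for j in range(n_periods)
    ]


def discrete_time_decay(n_periods, icc, decay):
    """Between-member correlation that shrinks with period separation.

    Periods j and k have correlation icc * decay ** |j - k|, so the
    same-period value is icc and each extra period multiplies by decay.
    """
    return [
        [icc * decay ** abs(j - k) for k in range(n_periods)]
        for j in range(n_periods)
    ]


if __name__ == "__main__":
    for name, matrix in [
        ("block-exchangeable", block_exchangeable(4, icc=0.05, cac=0.8)),
        ("discrete-time decay", discrete_time_decay(4, icc=0.05, decay=0.8)),
    ]:
        print(name)
        for row in matrix:
            print([round(value, 4) for value in row])
```

With the same inputs, the two structures agree for adjacent periods but diverge as the separation grows, which is why an analysis that assumes the wrong structure can misstate the precision of the intervention effect.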
Solutions
The recommended solutions to these challenges are to 1) employ stratified or constrained randomization techniques to balance important cluster-level covariates when assigning groups to treatment sequences, 2) account for time in the study design and analysis, and 3) estimate the sample size for SWGRTs based on realistic and data-based estimates of within- and between-period correlations and other parameters indicated by the analytic plan. Extra variation and limited df always reduce power, so it is essential to consider these factors while the study is being planned, and particularly as part of the estimation of sample size.
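The first recommendation can be sketched in code. The example below is one simple form of constrained randomization, assuming a single numeric group-level covariate and equal numbers of groups per treatment sequence: it draws many candidate allocations, scores each by how different the sequence-level covariate means are, keeps the best-balanced fraction, and picks one of those at random. The balance metric, the function names, and the default settings are all illustrative choices rather than a prescribed method.

```python
import random


def imbalance(allocation, covariate, n_sequences):
    """Range of the sequence-level covariate means (illustrative metric)."""
    means = []
    for seq in range(n_sequences):
        values = [covariate[g] for g, s in enumerate(allocation) if s == seq]
        means.append(sum(values) / len(values))
    return max(means) - min(means)


def constrained_randomization(covariate, n_sequences, n_candidates=2000,
                              keep_fraction=0.1, seed=0):
    """Pick one allocation at random from the best-balanced candidates.

    Returns (allocation, its imbalance score, the acceptance cutoff),
    where allocation[g] is the sequence assigned to group g.
    """
    n_groups = len(covariate)
    if n_groups % n_sequences != 0:
        raise ValueError("this sketch assumes equal groups per sequence")
    rng = random.Random(seed)
    base = [g % n_sequences for g in range(n_groups)]
    candidates = []
    for _ in range(n_candidates):
        allocation = base[:]
        rng.shuffle(allocation)
        score = imbalance(allocation, covariate, n_sequences)
        candidates.append((score, allocation))
    # Keep only the best-balanced share of the candidate allocations.
    candidates.sort(key=lambda pair: pair[0])
    n_keep = max(1, int(keep_fraction * n_candidates))
    acceptable = candidates[:n_keep]
    score, chosen = rng.choice(acceptable)
    return chosen, score, acceptable[-1][0]


if __name__ == "__main__":
    sizes = [10, 12, 30, 31, 50, 55, 70, 72]  # made-up group sizes
    allocation, score, cutoff = constrained_randomization(sizes, n_sequences=4)
    print(allocation, score, cutoff)
```

Choosing at random from a set of acceptable allocations, rather than always taking the single best one, keeps the assignment random while still ruling out badly imbalanced allocations.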
SWGRTs should only be used when all efforts to implement a more conventional parallel GRT have been exhausted. Compared to parallel GRTs, SWGRTs are at greater risk of bias. Given these risks, strong justifications must be given for the use of SWGRTs.
There are no textbooks dedicated to SWGRTs, but some provide overviews of design and analysis. Several papers provide further information.
Z-scores and t-scores will give similar results if the df available for the test of the intervention effect are more than about 30. As the df decline below 30, it becomes increasingly important to use t-scores rather than z-scores. Unfortunately, the precise df to use for t-scores when calculating power or sample size for SWGRTs is unsettled, though it is a subject of ongoing research. One approach is to use the number of groups or clusters minus the number of time periods minus one, but other approaches are possible.
Standard sources assume that each group or cluster has the same number of observations, but that is almost never true in practice. In GRTs, power decreases as the variation in group or cluster size increases. This is true for SWGRTs as well, but the power decrease is less pronounced. If the distribution of group sizes within each treatment sequence is the same, expressions for design effects assuming a block-exchangeable correlation structure are available that inflate the average cluster size relative to the corresponding equal-cluster design.
In SWGRTs, all groups eventually experience both study conditions. However, if groups transition to the treatment condition too late or too early, within-cluster contamination will arise that may produce biased results. Several strategies have been suggested to mitigate the impact of this contamination.
Power in a SWGRT is a function of several factors. These include the treatment effect, the number of time periods, the number of groups, the number of members per group, ICC, CAC, IAC, and the correlation decay structure.
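To show how several of these factors combine, the sketch below computes approximate power from the closed-form variance of the intervention effect given by Hussey and Hughes (2007). It rests on strong simplifying assumptions: a cross-sectional design, a continuous outcome, a single random group effect (so no CAC or IAC decay), equal numbers of members per group-period, and a z-based test. The function and parameter names are hypothetical, and the result is not a substitute for the SWGRT sample size calculator.

```python
from statistics import NormalDist


def hh_variance(design, n_per_cell, icc, total_var=1.0):
    """Variance of the intervention effect estimate (Hussey-Hughes form).

    design: one 0/1 row per group, one column per period.
    n_per_cell: members measured per group per period.
    """
    n_groups = len(design)
    n_periods = len(design[0])
    tau2 = icc * total_var                       # between-group variance
    sigma2 = (1 - icc) * total_var / n_per_cell  # residual variance of a cell mean
    u = sum(sum(row) for row in design)
    w = sum(sum(row[j] for row in design) ** 2 for j in range(n_periods))
    v = sum(sum(row) ** 2 for row in design)
    numerator = n_groups * sigma2 * (sigma2 + n_periods * tau2)
    denominator = ((n_groups * u - w) * sigma2
                   + (u ** 2 + n_groups * n_periods * u
                      - n_periods * w - n_groups * v) * tau2)
    return numerator / denominator


def hh_power(design, n_per_cell, icc, effect, total_var=1.0, alpha=0.05):
    """Approximate two-sided power using z-scores (ignores the far tail)."""
    se = hh_variance(design, n_per_cell, icc, total_var) ** 0.5
    z = NormalDist()
    return z.cdf(abs(effect) / se - z.inv_cdf(1 - alpha / 2))


if __name__ == "__main__":
    design = [[0, 1, 1, 1], [0, 0, 1, 1], [0, 0, 0, 1]]
    for n in (10, 50, 100):
        print(n, round(hh_power(design, n, icc=0.05, effect=0.3), 3))
```

Because this example uses z-scores, it will tend to overstate power when few groups are available; with limited df a t-based calculation is more appropriate.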
CAC and IAC estimates from a previous study that was analyzed with a block-exchangeable correlation structure can be used to plan a study with a discrete-time decay analysis, but the estimates should first be adjusted. Expressions for this adjustment are available. Note that this adjustment should only be applied when using CAC and IAC estimates from an analysis that incorrectly assumed a block-exchangeable correlation structure when discrete-time decay was present. In addition, the proposed study must have the same number of periods and period length as the previous study. If this is not the case, or if you are unsure of the previous study's decay mechanism, number of time periods, or period length, then you should not make this adjustment.
There are numerous publications on sample size and power for SWGRTs. Detailed information is also available in the SWGRT Sample Size Calculator section of this website. That calculator supports sample size estimation for the three main types of SWGRTs: cross-sectional, open cohort, and closed cohort.
The original approach assumed a common secular trend and an immediate and constant intervention effect. Further work allowed treatment effects to vary across groups. In addition, methods that model the intervention effect as a trend over time have been offered. Recently, a general model for SWGRTs that accommodates various forms for the intervention effect has been provided.
In the parallel GRT, the groups or clusters in the control condition remain in that condition throughout the trial. As such, if external events occur that affect the outcome, their effect will be seen in the control condition and it may be possible to adjust for it. In the SWGRT, the groups or clusters gradually cross over from the control condition to the intervention condition, so that there are fewer and fewer groups or clusters in the control condition as the study progresses. That can make it difficult to observe or adjust for the effect of an external event that may affect the outcome.