In a stepped wedge group-randomized trial (SWGRT), also called a stepped wedge cluster-randomized trial, groups or clusters begin the study in the control condition, are randomly assigned to sequences, and cross over to the intervention condition at predetermined time points in a sequential, staggered fashion until all groups or clusters receive the intervention.

Special methods are needed for analysis and sample size estimation for these studies, as detailed below and in the SWGRT sample size calculator.
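The staggered rollout described above can be pictured as a design matrix with one row per group and one column per time period. The following is a minimal sketch, assuming equal numbers of groups per sequence, a single all-control baseline period, and one cross-over step per sequence; the function name `stepped_wedge_design` and its parameters are hypothetical and not part of any SWGRT software.

```python
import random


def stepped_wedge_design(n_groups, n_sequences, seed=None):
    """Return (assignment, design) for a simple stepped wedge layout.

    assignment[g] is the 0-based sequence index for group g.
    design[g][t] is 0 (control) or 1 (intervention) in period t.
    Assumption: n_sequences + 1 periods, i.e. one all-control baseline
    period followed by one cross-over step per sequence.
    """
    if n_groups % n_sequences != 0:
        raise ValueError("this sketch assumes equal numbers of groups per sequence")
    rng = random.Random(seed)
    per_sequence = n_groups // n_sequences
    # Balanced list of sequence labels, shuffled: this is the random
    # assignment of groups to sequences.
    assignment = [s for s in range(n_sequences) for _ in range(per_sequence)]
    rng.shuffle(assignment)
    n_periods = n_sequences + 1
    # A group in sequence s is in the control condition through period s
    # and in the intervention condition from period s + 1 onward.
    design = [[1 if t > s else 0 for t in range(n_periods)] for s in assignment]
    return assignment, design


assignment, design = stepped_wedge_design(n_groups=6, n_sequences=3, seed=1)
for g, row in enumerate(design):
    print(f"group {g} (sequence {assignment[g]}): {row}")
```

Every row starts at 0 and ends at 1, and no row ever switches back from 1 to 0, which is the defining one-directional cross-over of the design.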

### Features and Uses

#### Staggered, Sequential Cross-Over

A SWGRT is a trial in which groups cross over to the intervention condition at predetermined time points in a sequential, staggered fashion until all groups receive the intervention. The design has been used when limited resources or a large geographical area prevent the use of a conventional parallel GRT. It has also been used when staff training requirements in a clinical care intervention necessitated phased implementation, and to improve power when only a limited number of groups was available.

#### NIH Webinars

- Methods: Mind the Gap Webinar: Power Calculations for Stepped Wedge Designs with Binary Outcomes: Methods and Software
- Methods: Mind the Gap Webinar: Does it Decay? Decaying Correlations in the Design and Analysis of Stepped Wedge Trials
- Methods: Mind the Gap Webinar: Overview of Statistical Models for the Design and Analysis of Stepped Wedge Cluster Randomized Trials
- Methods: Mind the Gap Webinar: Stepped Wedge Cluster Randomized Designs for Disease Prevention Research

#### Nested or Hierarchical Design

In a SWGRT, members are nested within groups or clusters so that each member appears in only one group or cluster. In cross-sectional SWGRTs, different members are observed in each group at each measurement occasion; in closed cohort SWGRTs, members are observed repeatedly so that measurements are nested within members; in open cohort SWGRTs, some members are observed in only one time period and others are observed during multiple time periods.

#### Appropriate Use

SWGRTs have become more popular over time, but the design carries a greater risk of bias than conventional parallel GRTs. Choosing a SWGRT over more conventional alternatives therefore requires strong justification. Hemming and Taljaard (2020) provide a non-exhaustive list of broad justifications, indicating that it may be appropriate to use a SWGRT if:

- it provides a randomized trial when the only alternative is a staggered non-randomized trial and stakeholders can be convinced to randomly assign treatment order,
- it increases the likelihood that gatekeepers and stakeholders will enroll groups in the study due to receiving perceived benefits of the intervention while the trial is ongoing,
- staggered, sequential delivery of the intervention is the only logistically feasible design, or
- limited groups or resources are available and a SWGRT can attain the desired statistical power when a parallel GRT cannot.

#### Potential for Confounding

While a SWGRT tends to involve a limited number of groups, the impact of chance imbalances may be minimal because each group is exposed to both the control and intervention conditions. Chance imbalances may still occur. Stratified or constrained randomization on important group-level characteristics has been shown to improve power and maintain type I error rates in parallel GRTs, and we might expect similar effects in a SWGRT. In a SWGRT, restricted randomization would be applied when the groups or clusters are randomized to sequences.

As time progresses, more groups implement the intervention condition. Therefore, time always has the potential to be a confounder in the relationship between the outcome and the intervention condition. To guard against this, time must be accounted for in the SWGRT design and analysis.
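As an illustration of accounting for time in the analysis, one widely used formulation for a cross-sectional SWGRT with a continuous outcome is a linear mixed model with a fixed effect for each period, often attributed to Hussey and Hughes (2007). The text above does not prescribe this model; the notation below is an assumption introduced for illustration only.

$$
Y_{ijk} = \mu + \beta_j + \theta X_{ij} + \alpha_i + e_{ijk}, \qquad
\alpha_i \sim N(0, \tau^2), \quad e_{ijk} \sim N(0, \sigma^2)
$$

Here $Y_{ijk}$ is the outcome for member $k$ of group $i$ in period $j$, $\beta_j$ is the fixed effect of period $j$, $X_{ij}$ equals 1 if group $i$ is in the intervention condition in period $j$ and 0 otherwise, $\theta$ is the intervention effect, and $\alpha_i$ is a random group effect. Because $\beta_j$ absorbs secular trends, $\theta$ is estimated from comparisons made within periods and within groups. A model that omitted $\beta_j$ would attribute any change over time to the intervention.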

#### Within- and Between-Period Correlations

Important factors in determining the sample size for a SWGRT are the intraclass correlation (ICC), the cluster autocorrelation (CAC), and the individual autocorrelation (IAC). These quantities describe the similarity among outcome values that arises from correlation within groups or clusters at the same time and from repeated measurements on the same groups or clusters or on the same members.

The ICC measures the similarity among values on the outcome variable for different members of the same group or cluster within a given time period. It is often described as the average correlation among members within the same group or cluster and within the same time period or as the proportion of variance due to group or cluster membership. The CAC is the correlation between the population means from the same group or cluster at two different time periods; it is sometimes called over-time correlation at the group level. The CAC is present in cross-sectional, closed cohort, and open cohort designs. The IAC is the correlation on the outcome variable for the same individual at two different time periods; it is sometimes called over-time correlation at the member level. The IAC is present only in closed and open cohort designs.
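The way these three quantities combine can be sketched for a pair of measurements taken in the same group. The sketch below assumes one commonly used parameterization in which the CAC scales the group-level share of the variance and the IAC scales the member-level share; that decomposition, the function name `outcome_correlation`, and the example values are assumptions, not definitions taken from the text above.

```python
def outcome_correlation(icc, cac, iac=0.0, same_period=True, same_member=False):
    """Correlation between two outcome measurements from the same group.

    Assumed parameterization (not specified in the source text):
      - different members, same period:       icc
      - different members, different periods: icc * cac
      - same member, different periods:       icc * cac + (1 - icc) * iac
    """
    if same_member and same_period:
        return 1.0  # a measurement compared with itself
    if same_period:
        return icc
    if not same_member:
        # Only the group-level component is shared across periods.
        return icc * cac
    # Closed or open cohort: the member-level component also persists over time.
    return icc * cac + (1.0 - icc) * iac


# Hypothetical planning values, chosen only for illustration.
print(outcome_correlation(0.05, 0.8))
print(outcome_correlation(0.05, 0.8, same_period=False))
print(outcome_correlation(0.05, 0.8, iac=0.6, same_period=False, same_member=True))
```

In a cross-sectional design the last case never arises, which is why the IAC is relevant only to cohort designs.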

A characteristic of longitudinal GRTs such as SWGRTs is that the CAC and IAC can be considered as functions of the time periods being compared, with values that decay as the periods grow further apart. Failing to account for such decay in SWGRTs can result in increased type I error rates. There are many possible decay structures, such as discrete-time decay and block-exchangeable structures.
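The difference between the two structures named above can be sketched by tabulating the correlation between two different members of the same group measured in periods `s` and `t`. This is a minimal sketch under assumed forms: a constant between-period correlation of `icc * cac` for the block-exchangeable structure, and `icc * decay ** |s - t|` for discrete-time decay. The function name and parameters are hypothetical.

```python
def between_period_correlation(icc, n_periods, structure, cac=1.0, decay=1.0):
    """Return matrix R with R[s][t] = correlation between two different
    members of the same group measured in periods s and t.

    structure: "block_exchangeable" (constant icc * cac across periods) or
               "discrete_time_decay" (icc * decay ** |s - t|).
    """
    if structure not in ("block_exchangeable", "discrete_time_decay"):
        raise ValueError(f"unknown structure: {structure}")
    matrix = []
    for s in range(n_periods):
        row = []
        for t in range(n_periods):
            lag = abs(s - t)
            if lag == 0:
                row.append(icc)  # within-period correlation is the ICC
            elif structure == "block_exchangeable":
                row.append(icc * cac)  # same value at every nonzero lag
            else:
                row.append(icc * decay ** lag)  # shrinks as the lag grows
        matrix.append(row)
    return matrix


for name in ("block_exchangeable", "discrete_time_decay"):
    print(name)
    for row in between_period_correlation(0.1, 4, name, cac=0.8, decay=0.8):
        print(["%.3f" % value for value in row])
```

With the same ICC and the same value for `cac` and `decay`, the two structures agree at a lag of one period and diverge at longer lags, which is where ignoring decay can mislead a sample size calculation.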

#### Solutions

The recommended solutions to these challenges are to 1) employ stratified or constrained randomization techniques to balance important group-level covariates when assigning groups to sequences, 2) account for time in the study design and analysis, and 3) estimate the sample size for SWGRTs based on realistic and data-based estimates of within- and between-period correlations and other parameters indicated by the analytic plan. Extra variation and limited degrees of freedom (df) always reduce power, so it is essential to consider these factors while the study is being planned, particularly when estimating sample size.
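The first recommendation can be illustrated with a simplified constrained randomization. The sketch below assumes a single group-level covariate, equal numbers of groups per sequence, and a balance score equal to the sum of squared differences between each sequence mean and the overall mean. Real trials typically balance several covariates with a metric chosen in advance, so the function name, the score, and `keep_fraction` are all hypothetical choices.

```python
import itertools
import random


def constrained_randomization(covariate, n_sequences, keep_fraction=0.25, seed=None):
    """Randomly pick one allocation from the best-balanced candidate set.

    covariate: one group-level covariate value per group.
    Returns a list giving the 0-based sequence index for each group.
    Enumerates every allocation, so it is feasible only for small trials.
    """
    n_groups = len(covariate)
    if n_groups % n_sequences != 0:
        raise ValueError("this sketch assumes equal numbers of groups per sequence")
    per_sequence = n_groups // n_sequences
    labels = [s for s in range(n_sequences) for _ in range(per_sequence)]
    # A set removes duplicate orderings of the repeated sequence labels.
    candidates = set(itertools.permutations(labels))
    overall_mean = sum(covariate) / n_groups

    def imbalance(allocation):
        score = 0.0
        for s in range(n_sequences):
            values = [covariate[g] for g in range(n_groups) if allocation[g] == s]
            score += (sum(values) / per_sequence - overall_mean) ** 2
        return score

    # Rank by imbalance; ties are broken by the allocation tuple so the
    # ranking is reproducible.
    ranked = sorted(candidates, key=lambda a: (imbalance(a), a))
    n_keep = max(1, int(len(ranked) * keep_fraction))
    rng = random.Random(seed)
    # The final choice is still random, but restricted to balanced allocations.
    return list(rng.choice(ranked[:n_keep]))


# Hypothetical covariate, e.g. group size, for six groups.
sizes = [10, 12, 30, 32, 50, 52]
print(constrained_randomization(sizes, n_sequences=3, seed=7))
```

Keeping a fraction of the allocations rather than only the single best one preserves a genuine random element in the assignment while still excluding badly imbalanced allocations.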