Group or Cluster Regression Discontinuity Designs

In a group or cluster regression discontinuity design (GRDD), a threshold or cutoff value of an assignment score variable measured at a group or cluster level is used to assign groups or clusters to the intervention or control conditions (Pennell et al., 2011; Schochet, 2009). As such, the GRDD is a group- or cluster-level analog to the more familiar individual-level regression discontinuity design (RDD) (Bor et al., 2014; Cappelleri and Trochim, 2015; Cattaneo et al., 2020; Smith et al., 2017).

Special methods are needed for analysis and sample size estimation for GRDDs. This page provides guidance for these designs, as detailed below and in the GRDD sample size calculator. It discusses a variety of issues and cites many references based on work done for RDDs that is relevant for GRDDs, including work on sample size estimation.

Features and Uses

Assignment to Conditions based on a Threshold Value

The signature trait of RDDs is assignment to conditions based on a threshold or cutoff value of a score, also referred to as a running variable, forcing variable, or index. This approach can work because individuals or groups close to the cutoff value are likely to be similar on most other characteristics, so any estimated difference on either side of the cutoff is likely due to the assignment rule (Bor et al., 2014; O'Keeffe et al., 2014; Smith et al., 2017). If an intervention effect is present, then a scatterplot of the outcome by assignment score will reveal a “jump” or discontinuity at the cutoff value or a change in slope beginning at the cutoff value. The difference in estimated outcome values or in slopes at the cutoff score is the estimated RDD intervention effect.
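
The jump and change in slope described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis recommended for a GRDD: it fits an ordinary least squares line on each side of the cutoff and compares the two fits at the cutoff, ignoring any hierarchical structure. The function names, the toy data, and the rule that units at or above the cutoff receive the intervention are all hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

def rdd_jump(score, outcome, cutoff):
    """Return (jump, slope_change) at the cutoff, at-or-above minus below.

    Assumes at least two distinct score values on each side of the cutoff.
    """
    centered = [s - cutoff for s in score]  # cutoff becomes zero
    below = [(c, y) for c, y in zip(centered, outcome) if c < 0]
    above = [(c, y) for c, y in zip(centered, outcome) if c >= 0]
    a_below, b_below = fit_line(*zip(*below))
    a_above, b_above = fit_line(*zip(*above))
    # With a centered score, each intercept is the fitted value at the cutoff.
    return a_above - a_below, b_above - b_below

# Hypothetical noise-free data: linear trend plus a jump of 2 at a cutoff of 5.
scores = list(range(10))
outcomes = [1 + 0.5 * s + (2.0 if s >= 5 else 0.0) for s in scores]
jump, slope_change = rdd_jump(scores, outcomes, cutoff=5)
```

Centering the score at the cutoff is the design choice that makes each intercept directly interpretable as the fitted outcome at the cutoff.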

The RDD has been used to examine the effect of immediate versus deferred antiretroviral therapy (ART) on retention in HIV care, where ART was provided if the CD4 count was below 350 cells per µl (Bor et al., 2017). The design has also been used to assess the association between screening for prostate cancer and mortality, where biopsy-based screening was provided if a participant’s prostate-specific antigen level was at least 4.0 µg/l (Shoag et al., 2015).

Most methods development for RDDs has been in education and econometrics, but their use in public health, epidemiology, and health care research has been suggested because observational data are commonplace in these settings (Bor et al., 2014; Cattaneo et al., 2023; Moscoe et al., 2015; Venkataramani et al., 2016).

Nested or Hierarchical Design

GRDDs in which groups or clusters are assigned to conditions based on a group-level summary of a variable at pretest have a hierarchical structure similar to group- or cluster-randomized trials (GRTs). However, all other things being equal, the number of groups required for a GRDD can be two to three times greater than that of a GRT (Pennell et al., 2011; Schochet, 2009). Even so, when randomization is not possible, the GRDD can be a good alternative that supports strong causal inference.

In GRDDs, participants are nested within groups and measurements are nested within members. If the assignment score is based on group-level summaries of the outcome at pretest, then repeated observations on the outcome are possible (Pennell et al., 2011). In cohort designs the same participants are measured at pretest and post-test, while in cross-sectional designs different participants are observed at each measurement occasion.

Appropriate Use

GRDDs can be employed in a wide variety of settings and populations to address a wide variety of research questions. They are an appropriate design if group randomization is not possible and the investigator wants to evaluate an intervention for which:

groups are assigned to conditions based on a threshold value of a score variable,
groups close to the threshold are not expected to differ in the absence of the intervention, and
the only source of the discontinuity at the threshold is the score variable.

Bias-Variance Trade-Off

RDD analysis is valid for individuals or groups close to the cutoff value. However, the number of observations in a narrow band surrounding this value is usually limited, yielding an estimate of the intervention effect with a large variance. To address this, more observations may be included by increasing the width of the band surrounding the cutoff value. However, increasing this bandwidth to include more observations may yield biased estimates, because assumptions of the trend on either side of the cutoff may not hold (Bor et al., 2015). To address this bias-variance trade-off, an optimal bandwidth can be chosen on the basis of some criterion to minimize the mean square error of the intervention effect estimate (Cattaneo et al., 2020).
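
The bias side of this trade-off can be illustrated with a small Python sketch. It is not a bandwidth selector and does not minimize mean square error; it only refits a local linear model inside two hypothetical bandwidths on noise-free data with a curved trend, so any departure from the true jump is bias from the linear approximation. The function names, the data, and the bandwidth values are assumptions made for the example.

```python
from statistics import mean

def intercept_at_zero(x, y):
    """OLS intercept of y on x, i.e. the fitted value at x = 0."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx

def local_linear_jump(centered_score, outcome, bandwidth):
    """Jump at the cutoff using only units with |score - cutoff| <= bandwidth.

    Returns (estimated jump, number of units used).
    """
    kept = [(s, y) for s, y in zip(centered_score, outcome)
            if abs(s) <= bandwidth]
    left = [(s, y) for s, y in kept if s < 0]
    right = [(s, y) for s, y in kept if s >= 0]
    jump = intercept_at_zero(*zip(*right)) - intercept_at_zero(*zip(*left))
    return jump, len(kept)

# Hypothetical data: cubic trend in the centered score, true jump of 2.
score = [i / 100 for i in range(-100, 101)]
outcome = [5 * s ** 3 + (2.0 if s >= 0 else 0.0) for s in score]

narrow, n_narrow = local_linear_jump(score, outcome, bandwidth=0.2)
wide, n_wide = local_linear_jump(score, outcome, bandwidth=1.0)
```

In this toy setting the narrow bandwidth stays close to the true jump but uses far fewer units, while the wide bandwidth uses every unit and is badly biased. With noisy data, the smaller count in the narrow band is what drives up the variance.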

Intervention Assignment

RDDs are a form of observational study because assignment to conditions is not random. As such, one serious concern is the potential for participants or groups to manipulate their assignment score so as to obtain the intervention (Cappelleri and Trochim, 2015; Cattaneo et al., 2023). However, strong evidence for causal inference can be provided by an RDD for individuals or groups near the cutoff if there is no expectation for outcomes to differ in the absence of the intervention (Bor et al., 2014).

Treatment Compliance

If there is perfect compliance with the assigned condition, the design is said to be a “sharp” RDD. Designs in which perfect compliance is not achieved are said to be “fuzzy” RDDs.

Multiple interpretations of the intervention effect are possible with RDDs. Cappelleri and Trochim (2015) indicate that the distinction between sharp and fuzzy RDDs is analogous to the distinction between “intention-to-treat” and “treatment-on-the-treated” in randomized settings, respectively. Cattaneo et al. (2023) describe similar interpretations, but add the effect of assigning the intervention for all participants as a possibility for fuzzy RDDs.

Intraclass Correlation

One challenging feature of GRDDs is that members of the same group usually share some physical, geographic, social, or other connection. Those connections create the expectation for a positive intraclass correlation (ICC) among observations taken on members of the same group, as members of the same group tend to be more like one another than like members of other groups. Positive ICC reduces the variation among members of the same group but increases variation among the groups, which in turn increases the variance of group-level statistics. Complicating matters further, the degrees of freedom (df) available to conduct inference for the intervention effect are based on the number of groups and so are often limited. As with GRTs, any GRDD analysis that ignores the extra variation (or positive ICC) or the limited df will have an inflated type I error rate.
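
To give a sense of how much a small positive ICC can matter, the sketch below uses the standard design effect for equal-sized groups, 1 + (m - 1) x ICC, where m is the number of members per group. This formula is not stated on this page and covers only the clustering component of variance inflation; the function names and the example values are hypothetical.

```python
def design_effect(members_per_group, icc):
    """Variance inflation from clustering, assuming equal group sizes."""
    return 1 + (members_per_group - 1) * icc

def effective_sample_size(n_groups, members_per_group, icc):
    """Number of independent observations carrying the same information."""
    total = n_groups * members_per_group
    return total / design_effect(members_per_group, icc)

# Hypothetical study: 20 groups of 50 members with an ICC of 0.02.
deff = design_effect(50, 0.02)
n_eff = effective_sample_size(20, 50, 0.02)
```

Even with an ICC this small, the variance of a group-level mean is nearly double what an analysis ignoring the grouping would assume, which is why such an analysis overstates precision.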

Solutions

The recommended solutions to these challenges are to

reflect the hierarchical structure of the design in the analytic plan,
assess RDD assumptions using established tests, and
estimate the sample size for the GRDD based on realistic and data-based estimates of the ICC and the other parameters indicated by the analytic plan.

FAQs

When do I need to use a GRDD?

Use a GRDD if you cannot randomize groups to conditions and assignment to conditions is based on a threshold value of a quantitative score. However, if it is possible to conduct a group- or cluster-randomized trial, that will generally be more powerful than a GRDD.

CONSORT statements provide guidelines for the reporting of results from randomized interventions. Is there a resource providing similar guidelines for reporting the results of GRDDs, which are nonrandomized interventions?

Yes. The Centers for Disease Control and Prevention currently host the Transparent Reporting of Evaluations with Nonrandomized Designs (TREND) statement. The TREND statement offers a 22-item checklist to ensure standardized reporting of nonrandomized or quasi-experimental interventions.

What are some important references on the design and analysis of GRDDs?

Good overviews of the design and analysis of individual-level RDDs include Cattaneo et al. (2020), Cattaneo et al. (2023), and Smith et al. (2017). These sources can also be helpful for GRDDs.

What tests should be performed to assess the assumptions for RDDs?

Work on this question has been limited to individual RDDs; even so, that work is generally applicable to GRDDs. To test whether or not participants are manipulating score values so as to obtain the intervention, a density plot of the score variable can be generated (Cattaneo et al., 2023; McCrary, 2008; Smith et al., 2017). If score manipulation is present, the density curve at the cutoff may be distorted, with increased density to the side corresponding to the intervention. To assess the degree to which the distribution of covariates is the same on either side of the cutoff, plots of the distribution of baseline covariates or RDD analyses using covariates as outcomes can be generated (Cattaneo et al., 2023; Smith et al., 2017). Neither approach should indicate an association between the covariate and condition indicator.
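
As a rough numerical companion to the density plot, the sketch below counts units in a narrow window on each side of the cutoff and computes an exact two-sided binomial p-value for an even split. This is a crude stand-in and not the McCrary (2008) density test: it assumes the score density is roughly flat within the window, and in a GRDD the units are groups, so counts near the cutoff may be too small for the check to be informative. The function names, window width, and data are hypothetical.

```python
from math import comb

def window_counts(scores, cutoff, half_width):
    """Count units just below and just at-or-above the cutoff."""
    below = sum(1 for s in scores if cutoff - half_width <= s < cutoff)
    above = sum(1 for s in scores if cutoff <= s < cutoff + half_width)
    return below, above

def binomial_two_sided_p(k, n):
    """Exact two-sided p-value for k of n units on one side, under p = 0.5."""
    probs = [comb(n, i) / 2 ** n for i in range(n + 1)]
    # Sum the probabilities of outcomes no more likely than the observed one.
    tail = sum(p for p in probs if p <= probs[k] * (1 + 1e-9))
    return min(1.0, tail)

# Hypothetical group-level scores with a cutoff of 5.
group_scores = [3, 4, 4, 5, 5, 5, 6, 7]
below, above = window_counts(group_scores, cutoff=5, half_width=2)
p_value = binomial_two_sided_p(above, below + above)
```

A small p-value with the excess on the intervention side would be consistent with score manipulation, but a large p-value from a handful of groups is weak evidence against it.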

When using t-scores to calculate power or sample size for GRDDs, what degrees of freedom should I use?

Assuming a single continuous score variable, the number of groups or clusters minus three has been used (Pennell et al., 2011).
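
The practical effect of the limited df can be seen by computing the t critical values directly. The sketch below takes df as the number of groups minus three, as stated above, and sums the critical values for a two-sided type I error rate and a target power; multiplying that sum by the standard error of the intervention effect gives a detectable difference under a commonly used approximation, which is an assumption here rather than a formula from this page. To stay within the Python standard library, the t quantile is obtained by numerical integration and bisection; all function names are hypothetical.

```python
from math import exp, lgamma, log, pi

def t_pdf(x, df):
    """Density of Student's t distribution."""
    log_const = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_const - (df + 1) / 2 * log(1 + x * x / df))

def t_cdf(x, df, steps=2000):
    """Student t CDF via Simpson's rule on [0, |x|] plus symmetry."""
    upper = abs(x)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Quantile for 0.5 <= p < 1, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def grdd_df(n_groups):
    """Groups minus three, assuming a single continuous score variable."""
    df = n_groups - 3
    if df < 1:
        raise ValueError("need at least four groups")
    return df

def effect_multiplier(n_groups, alpha=0.05, power=0.80):
    """Sum of t critical values for the type I error rate and the power."""
    df = grdd_df(n_groups)
    return t_quantile(1 - alpha / 2, df) + t_quantile(power, df)

multiplier_20_groups = effect_multiplier(20)  # df = 17
```

The corresponding sum of normal critical values is about 2.80, so using t values with few groups gives a noticeably larger multiplier and therefore a larger detectable difference.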

All other things being equal, how do the sample sizes of GRDDs compare to a GRT?

GRDDs may require two to four times as many groups as the corresponding GRT (Deke and Dragoset, 2012; Schochet, 2009). In a GRT, the correlation among observations on participants within the same group means that each participant provides less information than if the ICC were zero, reducing the effective sample size. In an RDD, the intervention indicator and assignment score variable are also correlated, which reduces the amount of information provided by the intervention indicator and reduces the effective sample size further (Schochet, 2009).
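
The second source of lost information can be quantified with the usual variance inflation factor for a regression with two correlated predictors, 1 / (1 - rho^2), where rho is the correlation between the intervention indicator and the score. The sketch below applies it to two hypothetical score distributions with the cutoff at the center. It assumes a linear model with the score as the only covariate and is meant as an illustration, not as the calculation used in the cited papers; the function names and the quantile grids are assumptions.

```python
from math import sqrt
from statistics import NormalDist, mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def score_design_effect(scores, cutoff):
    """Variance inflation from indicator-score correlation: 1 / (1 - rho^2)."""
    indicator = [1.0 if s >= cutoff else 0.0 for s in scores]
    rho = pearson(indicator, scores)
    return 1 / (1 - rho ** 2)

# Deterministic quantile grids standing in for large samples of scores.
n = 2000
probs = [(i + 0.5) / n for i in range(n)]
uniform_scores = probs
normal_scores = [NormalDist().inv_cdf(p) for p in probs]

deff_uniform = score_design_effect(uniform_scores, cutoff=0.5)
deff_normal = score_design_effect(normal_scores, cutoff=0.0)
```

Analytically, rho^2 is 0.75 for a uniform score and 2/pi for a normal score with the cutoff at the center, giving factors of 4 and about 2.75, which is in line with the two-to-four-fold range quoted above.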

TREND Statement

Des Jarlais DC, Lyles C, Crepaz N, Trend Group. Improving the reporting quality of nonrandomized evaluations of behavioral and public health interventions: the TREND statement. Am J Public Health. 2004;94(3):361-6.

PMID: 14998794.

Key References

RDDs and GRDDs

Cattaneo MD, Keele L, Titiunik R. A guide to regression discontinuity designs in medical applications. Stat Med. 2023;42(24):4484-4513. Epub 2023/08/01.

PMID: 37528626.

Maciejewski ML, Basu A. Regression discontinuity design. JAMA. 2020;324(4):381-382.

PMID: 32614409.

Cattaneo MD, Idrobo N, Titiunik R, Alvarez M, Beck N. A practical introduction to regression discontinuity designs: Foundations. 2020 Cambridge University Press.

Calonico S, Cattaneo MD, Farrell MH, Titiunik R. Regression discontinuity designs using covariates. Rev Econ Stat. 2019;101(3):442-451.

Bor J, Fox MP, Rosen S, Venkataramani A, Tanser F, Pillay D, Bärnighausen T. Treatment eligibility and retention in clinical HIV care: A regression discontinuity study in South Africa. PLoS Med. 2017;14(11):e1002463. Epub 2017/11/28.

PMID: 29182641.

Smith LM, Lévesque LE, Kaufman JS, Strumpf EC. Strategies for evaluating the assumptions of the regression discontinuity design: A case study using a human papillomavirus vaccination programme. Int J Epidemiol. 2017;46(3):939-949. Epub 2016/10/08.

PMID: 28338752.

Thistlethwaite DL, Campbell DT. Regression-discontinuity analysis: An alternative to the ex-post facto experiment. Obs Stud. 2017;3(2):119-128.

Venkataramani AS, Bor J, Jena AB. Regression discontinuity designs in healthcare research. BMJ. 2016;352:i1216. Epub 2016/03/14.

PMID: 26977086.

Bor J, Moscoe E, Bärnighausen T. Three approaches to causal inference in regression discontinuity designs. Epidemiology. 2015;26(2):e28-30; discussion e30.

PMID: 25643120.

Cappelleri JC, Trochim WM. Regression discontinuity design. In International encyclopedia of the social & behavioral sciences (Second edition). 2015 (pp. 152-159) Oxford: Elsevier.

Bor J, Moscoe E, Mutevedzi P, Newell ML, Bärnighausen T. Regression discontinuity designs in epidemiology: Causal inference without randomized trials. Epidemiology. 2014;25(5):729-37.

PMID: 25061922.

O'Keeffe AG, Geneletti S, Baio G, Sharples LD, Nazareth I, Petersen I. Regression discontinuity designs: An approach to the evaluation of treatment efficacy in primary care using observational data. BMJ. 2014;349:g5293. Epub 2014/09/08.

PMID: 25199521.

State of the Practice Reviews for RDDs and GRDDs

Villamizar-Villegas M, Pinzon-Puerto FA, Ruiz-Sanchez MA. A comprehensive history of regression discontinuity designs: An empirical survey of the last 60 years. J Econ Surv. 2022;36(4):1130-78.

Hilton Boon M, Craig P, Thomson H, Campbell M, Moore L. Regression discontinuity designs in health: A systematic review. Epidemiology. 2021;32(1):87-93. Epub 2020/11/30. Erratum.

PMID: 33196561.

Moscoe E, Bor J, Bärnighausen T. Regression discontinuity designs are underutilized in medicine, epidemiology, and public health: A review of current and best practice. J Clin Epidemiol. 2015;68(2):122-33.

PMID: 25579639.

Sample Size Estimation for RDDs and GRDDs

Cattaneo MD, Titiunik R, Vazquez-Bare G. Power calculations for regression-discontinuity designs. Stata J. 2019;19(1):210-245.

Deke J, Dragoset L. Statistical power for regression discontinuity designs in education: Empirical estimates of design effects relative to randomized controlled trials. 2012 Princeton, NJ: Mathematica Policy Research.

Pennell ML, Hade EM, Murray DM, Rhoda DA. Cutoff designs for community-based intervention studies. Stat Med. 2011;30(15):1865-82. Epub 2011/04/17.

PMID: 21500240.

Schochet PZ. Statistical power for regression discontinuity designs in education evaluations. J Educ Behav Stat. 2009;34(2):238-266.

McCrary J. Manipulation of the running variable in the regression discontinuity design: A density test. J Econometrics. 2008;142(2):698-714.
