
Analyses sample attrition (dropout / non-response) in experiments or panel studies. Tests whether attrition is differential (correlated with treatment), bounds the treatment effect under attrition using Lee (2009) trimming bounds, and produces diagnostic plots.

Usage

attrition_analysis(
  data,
  outcome,
  treatment,
  covariates = NULL,
  alpha = 0.05,
  plot = TRUE
)

Arguments

data

A data frame.

outcome

Character. Name of the outcome variable. Missing values (NA) mark attriters.

treatment

Character. Name of the binary treatment indicator (0/1).

covariates

Character vector. Pre-treatment covariates to check for differential attrition. Default NULL, which uses all numeric columns except outcome and treatment.

alpha

Numeric. Significance level. Default 0.05.

plot

Logical. Produce diagnostic plots. Default TRUE.

Value

A list with:

rates

Data frame of overall, treatment-group, and control-group attrition rates, with a chi-square test of differential attrition.

differential_test

Results of a logistic regression of attrition on treatment (and covariates). Tests \(H_0\): treatment coefficient = 0.

covariate_balance

Table of standardized mean differences (SMDs) comparing attriters vs completers within each treatment arm.

lee_bounds

Lee (2009) trimming bounds on the treatment effect (lower bound, upper bound, and 95% confidence intervals).

plot

A ggplot2 object (if plot = TRUE).
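
The diagnostics described above can be approximated in base R. The following is a minimal sketch under stated assumptions, not the function's actual implementation: the simulated data, the object names (`d`, `rates`, `chi`, `fit`, `balance`), and the helper `smd()` are hypothetical, and the SMD uses a simple pooled standard deviation, which may differ from the package's definition.

```r
# Hypothetical data: names and parameters are illustrative only
set.seed(1)
n <- 400
d <- data.frame(
  treat  = rbinom(n, 1, 0.5),
  age    = runif(n, 18, 65),
  female = rbinom(n, 1, 0.5)
)
d$y <- ifelse(rbinom(n, 1, 0.8) == 1, d$treat * 2 + rnorm(n), NA_real_)
d$attrited <- as.integer(is.na(d$y))  # attriter = unit with a missing outcome

# 1. Attrition rates overall and by arm, plus a chi-square test of
#    independence between treatment arm and attrition
rates <- data.frame(
  group = c("Overall", "Treated", "Control"),
  n = c(n, sum(d$treat == 1), sum(d$treat == 0)),
  attrition_rate = c(mean(d$attrited),
                     mean(d$attrited[d$treat == 1]),
                     mean(d$attrited[d$treat == 0]))
)
chi <- chisq.test(table(d$treat, d$attrited))

# 2. Logistic regression of attrition on treatment and covariates;
#    the "treat" row holds the test of H0: treatment coefficient = 0
fit <- glm(attrited ~ treat + age + female, data = d, family = binomial())
treat_test <- coef(summary(fit))["treat", ]

# 3. SMD of each covariate, attriters vs completers, within each arm
#    (assumed definition: difference in means over a pooled SD)
smd <- function(x, g) {
  (mean(x[g]) - mean(x[!g])) / sqrt((var(x[g]) + var(x[!g])) / 2)
}
covs <- c("age", "female")
balance <- do.call(rbind, lapply(split(d, d$treat), function(arm) {
  data.frame(treat = arm$treat[1], covariate = covs,
             smd = sapply(covs, function(v) smd(arm[[v]], arm$attrited == 1)))
}))
rownames(balance) <- NULL
```

A small p-value in `chi` or in the `treat` row of the regression points to differential attrition, and large absolute SMDs in `balance` indicate that attriters differ from completers on that covariate.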

References

Lee, D. S. (2009). Training, wages, and sample selection: Estimating sharp bounds on treatment effects. Review of Economic Studies, 76(3), 1071-1102.
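
As a rough illustration of the trimming idea in Lee (2009), the sketch below trims the arm with the higher retention rate until the retained shares match, then compares means. `lee_bounds_sketch()` is a hypothetical helper written for this illustration, not the package's implementation: it trims whole observations, omits confidence intervals, and ignores covariates.

```r
lee_bounds_sketch <- function(y, treat) {
  obs <- !is.na(y)
  q1 <- mean(obs[treat == 1])  # retention rate, treated arm
  q0 <- mean(obs[treat == 0])  # retention rate, control arm
  y1 <- sort(y[obs & treat == 1])
  y0 <- sort(y[obs & treat == 0])

  if (q1 >= q0) {
    # Treated arm retains more: trim share p of observed treated outcomes
    p <- (q1 - q0) / q1
    k <- floor(p * length(y1))                          # observations to trim
    lower <- mean(head(y1, length(y1) - k)) - mean(y0)  # drop the k largest
    upper <- mean(tail(y1, length(y1) - k)) - mean(y0)  # drop the k smallest
  } else {
    # Control arm retains more: trim observed control outcomes instead
    p <- (q0 - q1) / q0
    k <- floor(p * length(y0))
    lower <- mean(y1) - mean(tail(y0, length(y0) - k))  # drop the k smallest
    upper <- mean(y1) - mean(head(y0, length(y0) - k))  # drop the k largest
  }
  c(lower = lower, upper = upper, trim_share = p)
}

# Toy input: all treated units observed, half of the controls missing
y     <- c(1, 2, 3, 4, 0, 0, NA, NA)
treat <- c(1, 1, 1, 1, 0, 0, 0, 0)
lee_bounds_sketch(y, treat)
# lower = 1.5, upper = 3.5, trim_share = 0.5
```

With no attrition in either arm the trimming share is zero, so both bounds reduce to the plain difference in means.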

Examples

set.seed(7)
n <- 400
df <- data.frame(
  treat  = rbinom(n, 1, 0.5),
  age    = runif(n, 18, 65),
  female = rbinom(n, 1, 0.5)
)
# Differential attrition: control units with low age more likely to drop
retain <- rbinom(n, 1, ifelse(df$treat == 1, 0.85, 0.75 + 0.002 * df$age))
df$y   <- ifelse(retain == 1, df$treat * 2 + rnorm(n), NA_real_)

res <- attrition_analysis(df, outcome = "y", treatment = "treat",
                           covariates = c("age", "female"))
res$rates
#>     group   n attrition_rate pct_retained
#> 1 Overall 400      0.1575000     84.25000
#> 2 Treated 206      0.1747573     82.52427
#> 3 Control 194      0.1391753     86.08247
res$lee_bounds
#>       lower    upper   lower_ci upper_ci
#> 1 -0.178159 3.923602 -0.3246193 4.070063