
Attrition Analysis for Experimental & Panel Data
attrition_analysis.Rd
Analyses sample attrition (dropout / non-response) in experiments or panel studies. Tests whether attrition is differential (correlated with treatment), estimates attrition bias using Lee (2009) trimming bounds, and produces diagnostic plots.
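The idea of a differential-attrition test can be illustrated in base R. This is a minimal sketch on simulated data, not the function's internal code: the column names (`treat`, `age`, `y`, `attrit`) and the retention probabilities are assumptions made for illustration.

```r
# Simulated data; all names and parameters here are illustrative only.
set.seed(1)
n <- 500
d <- data.frame(
  treat = rbinom(n, 1, 0.5),
  age   = runif(n, 18, 65)
)
# Treated units are retained with probability 0.9, controls with 0.7.
retained <- rbinom(n, 1, ifelse(d$treat == 1, 0.9, 0.7))
d$y <- ifelse(retained == 1, 2 * d$treat + rnorm(n), NA_real_)

# Attrition indicator: 1 if the outcome is missing.
d$attrit <- as.integer(is.na(d$y))

# Attrition rate by treatment arm (names "0" and "1").
rate_by_arm <- tapply(d$attrit, d$treat, mean)

# Chi-square test of independence between attrition and treatment
# (chisq.test applies a continuity correction to 2x2 tables by default).
chi <- chisq.test(table(d$treat, d$attrit))

# Logistic regression of attrition on treatment and a covariate;
# the p-value on `treat` tests H0: treatment coefficient = 0.
fit <- glm(attrit ~ treat + age, data = d, family = binomial())
treat_p <- summary(fit)$coefficients["treat", "Pr(>|z|)"]
```

A small `treat_p` suggests that dropout depends on treatment assignment, which is the situation in which trimming bounds become relevant.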
Arguments
- data
A data frame.
- outcome
Character. Name of the outcome variable (can have NA for attriters).
- treatment
Character. Name of the binary treatment indicator (0/1).
- covariates
Character vector. Pre-treatment covariates to check for differential attrition. Default: all numeric columns except outcome and treatment.
- alpha
Numeric. Significance level. Default 0.05.
- plot
Logical. Produce diagnostic plots. Default TRUE.
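The default for covariates described above (all numeric columns except the outcome and treatment) could be derived along these lines. `default_covariates` is a hypothetical helper written for illustration; it is not part of the package and the actual implementation may differ.

```r
# Hypothetical helper: numeric columns other than outcome and treatment.
default_covariates <- function(data, outcome, treatment) {
  is_num <- vapply(data, is.numeric, logical(1))
  setdiff(names(data)[is_num], c(outcome, treatment))
}

# Toy data frame with one non-numeric column.
toy <- data.frame(
  y     = c(1.2, NA, 0.4),
  treat = c(1, 0, 1),
  age   = c(30, 41, 52),
  site  = c("a", "b", "a")
)
default_covariates(toy, outcome = "y", treatment = "treat")
#> [1] "age"
```

Note that under this rule any numeric ID column would also be picked up, so passing covariates explicitly is usually safer.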
Value
A list with:
- rates
Data frame: overall, treatment-group, and control-group attrition rates, plus a chi-square test of differential attrition.
- differential_test
Results of a logistic regression of attrition on treatment (and covariates). Tests \(H_0\): treatment coefficient = 0.
- covariate_balance
Standardized mean difference (SMD) table comparing attriters vs completers within each treatment arm.
- lee_bounds
Lee (2009) trimming bounds on the treatment effect (lower bound, upper bound, and 95 percent CIs).
- plot
A ggplot2 object (if plot = TRUE).
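To make the trimming idea behind the Lee (2009) bounds concrete, here is a minimal point-estimate sketch in base R. It assumes monotonicity (treatment shifts retention in one direction only), trims the arm with the higher retention rate, and omits confidence intervals and covariates. `lee_bounds_sketch` is a hypothetical name, and the package's own computation may differ in details such as how the trimming quantile is handled.

```r
# Hypothetical sketch of Lee (2009) trimming bounds (point estimates only).
lee_bounds_sketch <- function(y, treat) {
  observed <- !is.na(y)
  q1 <- mean(observed[treat == 1])  # retention rate, treated
  q0 <- mean(observed[treat == 0])  # retention rate, control
  y1 <- sort(y[treat == 1 & observed])
  y0 <- sort(y[treat == 0 & observed])

  # Mean of sorted x after dropping a share p from the top or bottom.
  trimmed_mean <- function(x, p, drop = c("top", "bottom")) {
    drop <- match.arg(drop)
    k <- floor(p * length(x))  # number of observations to drop
    if (k == 0) return(mean(x))
    if (drop == "top") mean(head(x, -k)) else mean(tail(x, -k))
  }

  if (q1 >= q0) {
    # Treated arm retains more: trim treated outcomes.
    p <- (q1 - q0) / q1
    lower <- trimmed_mean(y1, p, "top") - mean(y0)
    upper <- trimmed_mean(y1, p, "bottom") - mean(y0)
  } else {
    # Control arm retains more: trim control outcomes.
    p <- (q0 - q1) / q0
    lower <- mean(y1) - trimmed_mean(y0, p, "bottom")
    upper <- mean(y1) - trimmed_mean(y0, p, "top")
  }
  c(lower = lower, upper = upper, trim_share = p)
}

# Tiny worked case: all 4 treated observed, 2 of 4 controls observed,
# so half of the treated outcomes are trimmed.
lee_bounds_sketch(c(1, 2, 3, 4, 0, 0, NA, NA), c(1, 1, 1, 1, 0, 0, 0, 0))
#>      lower      upper trim_share 
#>        1.5        3.5        0.5 
```

The gap between the two bounds grows with the difference in retention rates, so strongly differential attrition leads to wide, less informative bounds.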
References
Lee, D. S. (2009). Training, wages, and sample selection: Estimating sharp bounds on treatment effects. Review of Economic Studies, 76(3), 1071-1102.
Examples
set.seed(7)
n <- 400
df <- data.frame(
  treat = rbinom(n, 1, 0.5),
  age = runif(n, 18, 65),
  female = rbinom(n, 1, 0.5)
)
# Differential attrition: control units with low age more likely to drop
retain <- rbinom(n, 1, ifelse(df$treat == 1, 0.85, 0.75 + 0.002 * df$age))
df$y <- ifelse(retain == 1, df$treat * 2 + rnorm(n), NA_real_)
res <- attrition_analysis(df, outcome = "y", treatment = "treat",
                          covariates = c("age", "female"))
res$rates
#>     group   n attrition_rate pct_retained
#> 1 Overall 400      0.1575000     84.25000
#> 2 Treated 206      0.1747573     82.52427
#> 3 Control 194      0.1391753     86.08247
res$lee_bounds
#>       lower    upper   lower_ci upper_ci
#> 1 -0.178159 3.923602 -0.3246193 4.070063