Power Analysis for Difference-in-Differences Designs — did_power

Computes statistical power, required sample size, or minimum detectable effect (MDE) for two-period and staggered DiD designs. Supports clustered standard errors, pre-trend variance correction, and autocorrelation adjustment.

Usage

did_power_analysis(
  n_treated = NULL,
  n_control = NULL,
  effect_size = NULL,
  sd_outcome = 1,
  n_periods = 2,
  pre_periods = 1,
  rho = 0,
  icc = 0,
  cluster_size = 1,
  alpha = 0.05,
  two_sided = TRUE,
  solve_for = c("power", "n", "mde"),
  plot = TRUE
)

Arguments

n_treated: Integer. Number of treated units. Required unless solve_for = "n".
n_control: Integer. Number of control units. Defaults to n_treated (1:1 ratio).
effect_size: Numeric. True average treatment effect on the treated (ATT). Required unless solve_for = "mde".
sd_outcome: Numeric. Standard deviation of the outcome. Default 1 (standardized).
n_periods: Integer. Total number of time periods. Default 2 (classic two-period DiD).
pre_periods: Integer. Number of pre-treatment periods. Default 1.
rho: Numeric. Within-unit serial autocorrelation coefficient. Default 0 (no autocorrelation).
icc: Numeric. Intraclass correlation if units are clustered within groups. Default 0 (no clustering).
cluster_size: Integer. Average cluster size (units per cluster). Must be set when icc > 0. Default 1 (no clustering).
alpha: Numeric. Significance level. Default 0.05.
two_sided: Logical. Whether to use a two-sided test. Default TRUE.
solve_for: Character. What to solve for: "power" (default), "n" (minimum per-arm sample size), or "mde" (minimum detectable effect).
plot: Logical. If TRUE (default), a power curve plot is included in the returned list.

Value

If solve_for = "power": a list with:

power: Numeric. Estimated statistical power.
se_did: Numeric. Estimated standard error of the DiD estimator.
design_effect: Numeric. Design effect from clustering/autocorrelation.
plot: A ggplot2 power curve (power vs. effect size), or NULL if plot = FALSE.

If solve_for = "n": a list with n_per_arm, n_total. If solve_for = "mde": a list with mde.
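As a rough illustration of what solve_for = "n" has to do, the variance approximation given under Details can be inverted in closed form for a 1:1 allocation. The sketch below is not the package's implementation: did_n_sketch and its target_power argument are hypothetical (no target-power argument appears in the usage above), and it uses a two-sided normal approximation that ignores the far rejection tail.

```r
# Hypothetical helper, not part of the package: per-arm sample size
# under a 1:1 allocation, using the variance approximation from Details.
# target_power is an assumed input; it is not in the documented signature.
did_n_sketch <- function(effect_size, sd_outcome = 1, n_periods = 2,
                         deff = 1, alpha = 0.05, target_power = 0.8) {
  # Sum of the critical value and the quantile for the target power
  z_sum <- qnorm(1 - alpha / 2) + qnorm(target_power)
  # With N_t = N_c = n: Var = 4 * sd^2 * deff / (n * T).
  # Setting sqrt(Var) * z_sum = effect_size and solving for n:
  n_per_arm <- ceiling(4 * sd_outcome^2 * deff * z_sum^2 /
                         (n_periods * effect_size^2))
  list(n_per_arm = n_per_arm, n_total = 2 * n_per_arm)
}

n_res <- did_n_sketch(effect_size = 0.3)
n_res$n_per_arm  # 175
n_res$n_total    # 350
```

Rounding up with ceiling() keeps the achieved power at or above the target under this approximation.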

Details

For $N_t$ treated and $N_c$ control units observed over $T$ total periods ($T = 2$ in the classic two-period case) with outcome variance $\sigma^2$, the variance of the DiD estimator is approximated by:

$$\mathrm{Var}(\hat{\tau}_{DiD}) \approx \sigma^2 \cdot \text{DEFF} \cdot \left(\frac{1}{N_t \cdot T/2} + \frac{1}{N_c \cdot T/2}\right)$$

where DEFF is the design effect incorporating autocorrelation and intraclass correlation. The $T/2$ terms correspond to an even split between pre- and post-treatment periods.

Serial autocorrelation with parameter $\rho$ (argument rho) over $T$ periods contributes a factor of approximately $1 + (T-1)\rho$. Clustering with ICC $\lambda$ (argument icc) and cluster size $m$ (argument cluster_size) contributes a factor of $1 + (m-1)\lambda$.
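The variance approximation above can be evaluated directly. The following is a minimal sketch, not the package's code: did_power_sketch is a hypothetical helper, it assumes the two design-effect factors combine multiplicatively (the text does not say how they are combined), and it converts the standard error to power with a normal approximation.

```r
# Hypothetical helper, not part of the package: evaluates the variance
# approximation and converts it to power with a normal approximation.
did_power_sketch <- function(n_treated, n_control = n_treated, effect_size,
                             sd_outcome = 1, n_periods = 2, rho = 0,
                             icc = 0, cluster_size = 1,
                             alpha = 0.05, two_sided = TRUE) {
  # Assumption: autocorrelation and clustering factors multiply.
  deff <- (1 + (n_periods - 1) * rho) * (1 + (cluster_size - 1) * icc)
  half <- n_periods / 2  # periods per phase under an even pre/post split
  var_did <- sd_outcome^2 * deff *
    (1 / (n_treated * half) + 1 / (n_control * half))
  se_did <- sqrt(var_did)

  # Standardized effect and critical value
  ncp <- abs(effect_size) / se_did
  z_crit <- if (two_sided) qnorm(1 - alpha / 2) else qnorm(1 - alpha)
  power <- if (two_sided) {
    pnorm(ncp - z_crit) + pnorm(-ncp - z_crit)  # both rejection tails
  } else {
    pnorm(ncp - z_crit)
  }
  list(power = power, se_did = se_did, design_effect = deff)
}

res <- did_power_sketch(n_treated = 50, effect_size = 0.3)
res$se_did  # 0.2, i.e. sqrt(1/50 + 1/50)

clustered <- did_power_sketch(n_treated = 200, effect_size = 0.2,
                              icc = 0.05, cluster_size = 20)
clustered$design_effect  # 1 + 19 * 0.05 = 1.95
```

The returned names mirror the power, se_did, and design_effect elements described under Value, but the numbers may differ from the function's output if it applies corrections not captured by this formula.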

Examples

# Two-period DiD power
did_power_analysis(n_treated = 50, effect_size = 0.3, sd_outcome = 1)

# With clustering
did_power_analysis(
  n_treated = 200, effect_size = 0.2, sd_outcome = 1,
  icc = 0.05, cluster_size = 20
)

# Solve for MDE
did_power_analysis(
  n_treated = 100, n_periods = 4, pre_periods = 2,
  solve_for = "mde"
)
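As a hand check on the MDE example, the variance approximation from Details can be combined with the usual normal-approximation rule MDE = (z_{1 - alpha/2} + z_{power}) * SE. This is only a sketch under stated assumptions: the target power of 0.8 is assumed (the usage shows no argument for it), n_control is taken equal to n_treated, and no autocorrelation or clustering is applied.

```r
# Hand check of the MDE example via the variance approximation in Details.
# Assumptions: target power 0.8, 1:1 allocation, DEFF = 1, sd_outcome = 1.
n_t <- 100
n_c <- 100
n_periods <- 4
target_power <- 0.8  # assumed; not an argument in the documented usage
alpha <- 0.05

half <- n_periods / 2  # 2 pre and 2 post periods, matching pre_periods = 2
se_did <- sqrt(1 / (n_t * half) + 1 / (n_c * half))
mde <- (qnorm(1 - alpha / 2) + qnorm(target_power)) * se_did

se_did  # 0.1
mde     # about 0.28 in units of sd_outcome
```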