Doubly-Robust Augmented IPW Estimator for the ATE and ATT

Estimates the Average Treatment Effect (ATE) or the Average Treatment Effect on the Treated (ATT) using the Augmented Inverse Probability Weighting (AIPW) estimator. AIPW is doubly robust: the estimate is consistent if at least one of the propensity score model and the outcome model is correctly specified.

Usage

dr_ate(
  data,
  outcome,
  treatment,
  covariates,
  estimand = c("ATE", "ATT"),
  ps_formula = NULL,
  out_formula = NULL,
  ps_trim = c(0.01, 0.99),
  boot_se = FALSE,
  n_boot = 500,
  seed = 42,
  conf_level = 0.95
)

Arguments

data: A data frame.
outcome: Character. Name of the outcome variable.
treatment: Character. Name of the binary treatment variable (0/1).
covariates: Character vector. Names of covariates to use in both the propensity score and outcome models.
estimand: Character. Either "ATE" (default) or "ATT".
ps_formula: Formula or NULL. Custom formula for the propensity score model (logistic regression). If NULL, a main-effects logistic regression on covariates is used.
out_formula: Formula or NULL. Custom formula for the outcome model (linear regression). If NULL, a main-effects linear regression on treatment and covariates is used.
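ps_trim: Numeric vector of length 2 giving the lower and upper bounds for the estimated propensity score. Observations whose propensity score falls outside this range are trimmed; the count is returned as n_trimmed. Default c(0.01, 0.99).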
boot_se: Logical. If TRUE, compute bootstrap standard errors. Default FALSE.
n_boot: Integer. Number of bootstrap replications. Default 500.
seed: Integer. Random seed for reproducibility. Default 42.
conf_level: Numeric. Confidence level for the interval (ci_lower, ci_upper). Default 0.95.
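
To make the role of ps_trim concrete, here is a minimal sketch on simulated data. The data, the variable names (x1, x2, d), and the rule of dropping observations rather than clipping their scores are assumptions for illustration; this page does not show how dr_ate() trims internally.

```r
# Hypothetical simulated data; x1, x2 and d are illustrative names only.
set.seed(42)
n  <- 1000
x1 <- rnorm(n)
x2 <- rnorm(n)
d  <- rbinom(n, 1, plogis(1.5 * x1 - 1.0 * x2))
dat <- data.frame(d, x1, x2)

# Main-effects logistic regression for the propensity score.
ps_fit <- glm(d ~ x1 + x2, data = dat, family = binomial())
ps     <- fitted(ps_fit)

# Assumed trimming rule: keep observations with ps inside [lower, upper].
ps_trim <- c(0.01, 0.99)
keep    <- ps >= ps_trim[1] & ps <= ps_trim[2]

n_trimmed  <- sum(!keep)    # observations dropped for extreme scores
dat_trim   <- dat[keep, ]   # analysis sample after trimming
ps_summary <- summary(ps)   # Min., 1st Qu., Median, Mean, 3rd Qu., Max.
```

Widening the interval keeps more observations but lets weights of the form 1 / e(X) or 1 / (1 - e(X)) grow large, which tends to inflate the variance of the estimate.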

Value

A list with:

estimate: Numeric. AIPW point estimate.
se: Numeric. Influence-function standard error, or the bootstrap standard error if boot_se = TRUE.
ci_lower: Numeric. Lower confidence bound.
ci_upper: Numeric. Upper confidence bound.
t_stat: Numeric. Test statistic, estimate / se.
p_value: Numeric. Two-sided p-value.
estimand: Character. "ATE" or "ATT".
n_trimmed: Integer. Number of observations trimmed because of extreme propensity scores.
n_total: Integer. Total number of observations in data, before trimming.
ps_summary: Named vector. Summary statistics of propensity scores.
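
The inferential components relate to each other in the usual Wald fashion. The sketch below recomputes them from the estimate and se printed in the Examples section. The standard normal reference distribution is an assumption; it is consistent with the printed interval and p-value, but the source of dr_ate() is not shown here.

```r
# Values taken from the printed example output on this page.
estimate   <- 887.1517
se         <- 937.3389
conf_level <- 0.95

# Assumption: standard normal reference distribution.
z_crit   <- qnorm(1 - (1 - conf_level) / 2)
t_stat   <- estimate / se
ci_lower <- estimate - z_crit * se
ci_upper <- estimate + z_crit * se
p_value  <- 2 * pnorm(-abs(t_stat))

round(c(t_stat = t_stat, ci_lower = ci_lower,
        ci_upper = ci_upper, p_value = p_value), 4)
```

Because the inputs are rounded to the printed precision, the recomputed values agree with the example output only up to rounding in the last digits.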

Details

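The AIPW estimator for the ATE is:

$$\hat{\tau}_{AIPW} = \frac{1}{n}\sum_{i=1}^{n}\left[ \mu_1(X_i) - \mu_0(X_i) + \frac{D_i(Y_i - \mu_1(X_i))}{e(X_i)} - \frac{(1-D_i)(Y_i - \mu_0(X_i))}{1-e(X_i)} \right]$$

where $D_i$ is the binary treatment indicator, $Y_i$ is the outcome, $e(X) = P(D=1|X)$ is the propensity score, and $\mu_d(X) = E[Y|D=d,X]$ is the conditional outcome mean. In practice, $e$ and $\mu_d$ are replaced by fitted values from the propensity score model (logistic regression) and the outcome model (linear regression).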
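
The ATE formula above can be sketched in base R as follows. This is an illustration on simulated data with a true effect of 2, not the implementation of dr_ate(). The variable names, the data-generating process, and the standard error formula sd(psi) / sqrt(n) are assumptions.

```r
# Hypothetical simulated data with a known ATE of 2; names are illustrative.
set.seed(42)
n  <- 2000
x1 <- rnorm(n)
x2 <- rnorm(n)
d  <- rbinom(n, 1, plogis(0.5 * x1 - 0.5 * x2))
y  <- 1 + 2 * d + x1 + 0.5 * x2 + rnorm(n)
dat <- data.frame(y, d, x1, x2)

# Propensity score e(X): main-effects logistic regression.
e_hat <- fitted(glm(d ~ x1 + x2, data = dat, family = binomial()))

# Outcome model: main-effects linear regression on treatment and covariates.
out_fit <- lm(y ~ d + x1 + x2, data = dat)
mu1 <- predict(out_fit, newdata = transform(dat, d = 1))  # mu_1(X_i)
mu0 <- predict(out_fit, newdata = transform(dat, d = 0))  # mu_0(X_i)

# Per-observation AIPW score: the bracketed term in the ATE formula.
psi <- mu1 - mu0 +
  dat$d * (dat$y - mu1) / e_hat -
  (1 - dat$d) * (dat$y - mu0) / (1 - e_hat)

tau_hat <- mean(psi)          # AIPW point estimate
se_hat  <- sd(psi) / sqrt(n)  # influence-function standard error (assumed form)
c(estimate = tau_hat, se = se_hat)
```

The two correction terms average the outcome-model residuals, reweighted by the inverse propensity score. If the outcome model is correct the residual terms have mean zero; if only the propensity score model is correct, the reweighted residuals offset the bias in $\mu_1 - \mu_0$. This is the source of the double robustness.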

References

Robins, J. M., Rotnitzky, A., & Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427), 846–866.

Examples

data(lalonde, package = "MatchIt")
result <- dr_ate(
  data       = lalonde,
  outcome    = "re78",
  treatment  = "treat",
  covariates = c("age", "educ", "race", "married", "nodegree", "re74", "re75")
)
result
#> $estimate
#> [1] 887.1517
#> 
#> $se
#> [1] 937.3389
#> 
#> $ci_lower
#> [1] -949.9987
#> 
#> $ci_upper
#> [1] 2724.302
#> 
#> $t_stat
#> [1] 0.9464578
#> 
#> $p_value
#> [1] 0.3439151
#> 
#> $estimand
#> [1] "ATE"
#> 
#> $n_trimmed
#> [1] 1
#> 
#> $n_total
#> [1] 614
#> 
#> $ps_summary
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#> 0.00908 0.04854 0.12068 0.30130 0.63872 0.85315 
#>