
Estimates the Average Treatment Effect (ATE) or the Average Treatment Effect on the Treated (ATT) using the Augmented Inverse Probability Weighting (AIPW) estimator. AIPW is doubly robust: the estimate is consistent if either the propensity score model or the outcome model is correctly specified.

Usage

dr_ate(
  data,
  outcome,
  treatment,
  covariates,
  estimand = c("ATE", "ATT"),
  ps_formula = NULL,
  out_formula = NULL,
  ps_trim = c(0.01, 0.99),
  boot_se = FALSE,
  n_boot = 500,
  seed = 42,
  conf_level = 0.95
)

Arguments

data

A data frame.

outcome

Character. Name of the outcome variable.

treatment

Character. Name of the binary treatment variable (0/1).

covariates

Character vector. Names of covariates to use in both the propensity score and outcome models.

estimand

Character. Either "ATE" (default) or "ATT".

ps_formula

Formula or NULL. Custom formula for the propensity score model (logistic regression). If NULL, a main-effects logistic regression of treatment on covariates is used.

out_formula

Formula or NULL. Custom formula for the outcome model (linear regression). If NULL, a main-effects linear regression of outcome on treatment and covariates is used.

ps_trim

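Numeric vector of length 2. Lower and upper bounds on the estimated propensity score. Observations whose propensity score falls outside this range are trimmed and counted in n_trimmed. Default c(0.01, 0.99).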

boot_se

Logical. If TRUE, compute bootstrap standard errors. Default FALSE.

n_boot

Integer. Number of bootstrap replications. Default 500.

seed

Integer. Random seed for reproducibility. Default 42.

conf_level

Numeric. Confidence level for the interval reported in ci_lower and ci_upper. Default 0.95.
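
The effect of ps_trim can be illustrated with a small sketch. It assumes that "trimmed" means observations outside the bounds are dropped before estimation; the page does not say whether dr_ate drops them or clips their scores, so treat this as one plausible reading. The vector e_hat is made-up data for illustration only.

```r
# Hypothetical estimated propensity scores (illustrative values only)
e_hat   <- c(0.005, 0.020, 0.500, 0.970, 0.995)
ps_trim <- c(0.01, 0.99)

# Flag observations whose score lies inside the closed trimming range
keep <- e_hat >= ps_trim[1] & e_hat <= ps_trim[2]

# Assumption: trimmed observations are dropped, not clipped to the bounds
n_trimmed <- sum(!keep)
e_kept    <- e_hat[keep]
```

Under this reading, two of the five scores fall outside c(0.01, 0.99), so n_trimmed would be 2 and three observations would remain.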

Value

A list with:

estimate

Numeric. AIPW point estimate.

se

Numeric. Influence-function standard error, or the bootstrap standard error if boot_se = TRUE.

ci_lower

Numeric. Lower confidence bound.

ci_upper

Numeric. Upper confidence bound.

t_stat

Numeric. t-statistic, computed as estimate / se.

p_value

Numeric. Two-sided p-value for the null hypothesis that the effect is zero.

estimand

Character. "ATE" or "ATT".

n_trimmed

Integer. Number of observations trimmed because their propensity score fell outside ps_trim.

n_total

Integer. Total number of observations in data.

ps_summary

Named numeric vector. Summary statistics (minimum, quartiles, mean, maximum) of the estimated propensity scores.
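
The inferential fields can be related to one another with a short sketch. The exact computation inside dr_ate is not documented here, so the normal (Wald) critical value below is an assumption. It does reproduce, up to rounding, the ci_lower, ci_upper, t_stat and p_value printed in the Examples output on this page.

```r
# Values copied from the Examples output on this page
estimate   <- 887.1517
se         <- 937.3389
conf_level <- 0.95

# Test statistic: point estimate divided by its standard error
t_stat <- estimate / se

# Assumption: normal critical value, i.e. a Wald-type interval
z_crit <- qnorm(1 - (1 - conf_level) / 2)
ci     <- estimate + c(-1, 1) * z_crit * se

# Two-sided p-value under the same normal approximation
p_value <- 2 * pnorm(-abs(t_stat))
```

Here ci[1] and ci[2] correspond to ci_lower and ci_upper.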

Details

The AIPW estimator for the ATE is: $$\hat{\tau}_{AIPW} = \frac{1}{n}\sum_{i=1}^{n}\left[ \mu_1(X_i) - \mu_0(X_i) + \frac{D_i(Y_i - \mu_1(X_i))}{e(X_i)} - \frac{(1-D_i)(Y_i - \mu_0(X_i))}{1-e(X_i)} \right]$$ where \(e(X) = P(D=1|X)\) is the propensity score, and \(\mu_d(X) = E[Y|D=d,X]\) is the conditional outcome mean.
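
The ATE formula above can be sketched by hand in base R. This is a minimal illustration on simulated data, not the internals of dr_ate: the data-generating process, the variable names (sim, e_hat, mu1, mu0, phi) and the standard error formula sd(phi) / sqrt(n) are all assumptions chosen for the example, and no trimming is applied.

```r
# Simulated data with a known ATE of 2 (illustrative only)
set.seed(1)
n   <- 2000
x   <- rnorm(n)
d   <- rbinom(n, 1, plogis(0.5 * x))  # treatment depends on x
y   <- 1 + 2 * d + x + rnorm(n)       # outcome; true treatment effect is 2
sim <- data.frame(y = y, d = d, x = x)

# Propensity score e(X): main-effects logistic regression
ps_fit <- glm(d ~ x, data = sim, family = binomial())
e_hat  <- predict(ps_fit, type = "response")

# Outcome model: main-effects linear regression on treatment and covariates
out_fit <- lm(y ~ d + x, data = sim)
mu1 <- predict(out_fit, newdata = transform(sim, d = 1))  # mu_1(X)
mu0 <- predict(out_fit, newdata = transform(sim, d = 0))  # mu_0(X)

# Per-observation AIPW term: the bracketed expression in the formula
phi <- mu1 - mu0 +
  sim$d * (sim$y - mu1) / e_hat -
  (1 - sim$d) * (sim$y - mu0) / (1 - e_hat)

aipw_est <- mean(phi)           # AIPW point estimate of the ATE
aipw_se  <- sd(phi) / sqrt(n)   # one common influence-function-style SE
```

Because both working models are correctly specified in this simulation, aipw_est should land close to the true effect of 2. Misspecifying one of the two models while keeping the other correct is a simple way to see the double robustness property in action.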

References

Robins, J. M., Rotnitzky, A., & Zhao, L. P. (1994). Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427), 846–866.

Examples

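# Load the lalonde data shipped with the MatchIt package
data(lalonde, package = "MatchIt")

# Estimate the ATE (the default estimand) of treatment on 1978 earnings
result <- dr_ate(
  data       = lalonde,
  outcome    = "re78",
  treatment  = "treat",
  covariates = c("age", "educ", "race", "married", "nodegree", "re74", "re75")
)
result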
#> $estimate
#> [1] 887.1517
#> 
#> $se
#> [1] 937.3389
#> 
#> $ci_lower
#> [1] -949.9987
#> 
#> $ci_upper
#> [1] 2724.302
#> 
#> $t_stat
#> [1] 0.9464578
#> 
#> $p_value
#> [1] 0.3439151
#> 
#> $estimand
#> [1] "ATE"
#> 
#> $n_trimmed
#> [1] 1
#> 
#> $n_total
#> [1] 614
#> 
#> $ps_summary
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#> 0.00908 0.04854 0.12068 0.30130 0.63872 0.85315 
#>