Best Linear Predictor (BLP) Analysis for Conditional Average Treatment Effects

Runs the Best Linear Predictor (BLP) analysis proposed by Chernozhukov, Demirer, Duflo, and Fernandez-Val (2020) to summarize heterogeneous treatment effects (HTEs) estimated by a causal forest or any other CATE estimator. The analysis tests whether the estimated CATEs capture statistically significant treatment effect heterogeneity.

Usage

blp_analysis(
  cate_hat,
  Y,
  W,
  Y_hat = NULL,
  W_hat = NULL,
  data = NULL,
  run_gates = TRUE,
  n_groups = 5,
  debiased = TRUE,
  conf_level = 0.95
)

Arguments

cate_hat: Numeric vector of estimated CATEs (e.g., from grf::causal_forest()).
Y: Numeric vector. Observed outcomes.
W: Numeric vector. Binary treatment assignment (0/1).
Y_hat: Numeric vector. Cross-fitted estimates of $E[Y \mid X]$. Required if debiased = TRUE.
W_hat: Numeric vector. Cross-fitted estimates of $E[W \mid X]$ (propensity score). Required if debiased = TRUE.
data: Data frame or matrix of raw covariates. Used for GATES subgroup analysis if run_gates = TRUE.
run_gates: Logical. Whether to also run GATES analysis. Default TRUE.
n_groups: Integer. Number of quantile groups for GATES. Default 5.
debiased: Logical. Use debiased estimates with cross-fitting if TRUE (default). Requires Y_hat and W_hat.
conf_level: Numeric. Confidence level. Default 0.95.

Value

A list with:

blp: Data frame with BLP coefficients $\beta_1$ (average treatment effect) and $\beta_2$ (HTE loading). A $\beta_2$ significantly different from zero indicates that cate_hat captures real heterogeneity in treatment effects.
gates: Data frame of Group Average Treatment Effects by quantile group, or NULL if run_gates = FALSE.
blp_plot: ggplot2 coefficient plot of BLP results.
gates_plot: ggplot2 GATES plot, or NULL.
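
To make the quantile groups behind gates concrete, here is a minimal base R sketch. It is not the internal code of blp_analysis(): it assumes a randomized design and uses a plain difference in means within each group, whereas Chernozhukov et al. estimate GATES by regression. All data are simulated and the variable names are illustrative.

```r
set.seed(2)
n <- 2000
X1 <- rnorm(n)
W <- rbinom(n, 1, 0.5)                 # randomized treatment
tau <- pmax(X1, 0)                     # true CATE (simulation only)
Y <- tau * W + rnorm(n)
cate_hat <- tau + rnorm(n, sd = 0.2)   # noisy stand-in for an estimated CATE
n_groups <- 5

# Assign each unit to a quantile group of the estimated CATE
breaks <- quantile(cate_hat, probs = seq(0, 1, length.out = n_groups + 1))
group <- cut(cate_hat, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Treated-minus-control difference in mean outcomes within each group
gates <- sapply(seq_len(n_groups), function(g) {
  in_g <- group == g
  mean(Y[in_g & W == 1]) - mean(Y[in_g & W == 0])
})
gates
```

If cate_hat ranks units well, the group effects should increase from the lowest to the highest quantile group.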

Details

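The BLP is estimated from the regression:
$$Y_i - \hat{Y}_i = \beta_1 (W_i - \hat{W}_i) + \beta_2 (\hat{\tau}_i - \bar{\hat{\tau}})(W_i - \hat{W}_i) + \varepsilon_i$$
where $\hat{Y}_i$ is Y_hat, $\hat{W}_i$ is W_hat, $\hat{\tau}_i$ is cate_hat, and $\bar{\hat{\tau}}$ is the sample mean of cate_hat. Key tests: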

$\beta_1 \neq 0$: Overall ATE different from zero.
$\beta_2 \neq 0$: CATE varies with $\hat{\tau}$ (HTE present).
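
The regression can be sketched directly with lm(). This is an illustration under stated assumptions, not the code blp_analysis() runs: the treatment residual is taken to be W minus W_hat, the nuisance estimates are crude stand-ins rather than cross-fitted predictions, and ordinary lm() standard errors are used for brevity. All variable names below are hypothetical.

```r
set.seed(1)
n <- 2000
X1 <- rnorm(n)
W <- rbinom(n, 1, 0.5)                 # randomized treatment
tau <- pmax(X1, 0)                     # true CATE (simulation only)
Y <- tau * W + rnorm(n)

# Stand-ins for the inputs (assumptions for illustration)
W_hat <- rep(0.5, n)                   # known propensity in this design
Y_hat <- rep(mean(Y), n)               # crude proxy for E[Y | X]
cate_hat <- tau + rnorm(n, sd = 0.2)   # noisy stand-in for an estimated CATE

# Residualized outcome and the two regressors of the BLP regression
resid_Y <- Y - Y_hat
ate_term <- W - W_hat
hte_term <- (cate_hat - mean(cate_hat)) * (W - W_hat)

# No intercept: beta_1 loads on ate_term, beta_2 on hte_term
fit <- lm(resid_Y ~ 0 + ate_term + hte_term)
blp_coef <- coef(summary(fit))
blp_coef
```

The row for ate_term estimates $\beta_1$ and the row for hte_term estimates $\beta_2$. Each row's t statistic corresponds to one of the key tests above.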

References

Chernozhukov, V., Demirer, M., Duflo, E., & Fernandez-Val, I. (2020). Generic machine learning inference on heterogeneous treatment effects in randomized experiments. NBER Working Paper 24678.

Examples

if (FALSE) { # \dontrun{
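library(grf)

# Simulate a randomized experiment in which the effect depends on X[, 1]
set.seed(42)
n <- 2000
p <- 10
X <- matrix(rnorm(n * p), n, p)
W <- rbinom(n, 1, 0.5)
tau <- pmax(X[, 1], 0)
Y <- tau * W + rnorm(n)

# Fit a causal forest; predict() without new data returns out-of-bag CATEs
cf <- causal_forest(X, Y, W)
tau_hat <- predict(cf)$predictions

# Y.hat and W.hat are the nuisance estimates stored on the forest object
blp_analysis(
  cate_hat = tau_hat, Y = Y, W = W,
  Y_hat = cf$Y.hat, W_hat = cf$W.hat
)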
} # }