This function adapts the synthetic difference-in-differences (SynthDiD) method of Arkhangelsky et al. (2021) to estimate average treatment effects under staggered adoption. It combines cohort-level ATT estimates, similar to the approach of Ben-Michael et al. (2022) for synthetic controls with staggered adoption. The function supports multiple cohorts, lags and leads, placebo tests, and pooled analyses.
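As a rough illustration of what combining cohort-level ATT estimates can mean, the sketch below averages made-up cohort estimates in two ways: a simple mean across cohorts and a mean weighted by the number of treated units per cohort. The matrix layout, the names, and the numbers are assumptions for illustration and are not taken from the package internals.

```r
# Made-up cohort-level ATT estimates: rows are adoption cohorts, columns are
# periods relative to adoption (this layout is an assumption).
att <- rbind(
  cohort_5 = c(1.0, 1.4),
  cohort_6 = c(0.6, 1.0),
  cohort_7 = c(2.0, 3.0)
)
colnames(att) <- c("t0", "t1")

# Hypothetical number of treated units in each cohort.
n_treated <- c(cohort_5 = 10, cohort_6 = 30, cohort_7 = 10)

# Simple average: every cohort counts equally.
simple_att <- colMeans(att)                                # t0 = 1.20, t1 = 1.80

# Weighted average: cohorts count in proportion to their treated units.
# n_treated has one entry per row, so it is recycled down each column.
weighted_att <- colSums(att * n_treated) / sum(n_treated)  # t0 = 0.96, t1 = 1.48
```

The two averages differ whenever cohort sizes and cohort effects both vary, which is why the return value reports unweighted and weighted summaries separately.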

synthdid_est_ate(
  data,
  adoption_cohorts,
  lags,
  leads,
  time_var,
  unit_id_var,
  treated_period_var,
  treat_stat_var,
  outcome_var,
  placebo = FALSE,
  pooled = FALSE,
  subgroup = NULL,
  conf_level = 0.95,
  seed = 1,
  method = "synthdid"
)

Arguments

data

A data frame in long format to be analyzed.

adoption_cohorts

Vector of adoption times that identify the cohorts to include.

lags

Integer, number of lags of adoption time to analyze.

leads

Integer, number of leads of adoption time to analyze.

time_var

String, column name of the time variable.

unit_id_var

String, ID column of units.

treated_period_var

String, column with adoption time of each unit.

treat_stat_var

String, column name indicating treatment status.

outcome_var

String, column of outcome to analyze.

placebo

Logical, whether to run placebo analysis.

pooled

Logical, whether to run pooled analysis of all treated units.

subgroup

Vector, IDs for subgroup analysis.

conf_level

Numeric, confidence level for interval estimation. Default is 0.95.

seed

Numeric, random seed used for the placebo standard errors and the placebo analysis. Default is 1.

method

The estimation method to be used. Methods include:

  • 'did': Difference-in-Differences.

  • 'sc': Synthetic Control Method.

  • 'sc_ridge': Synthetic Control Method with Ridge Penalty. It adds a ridge penalty when estimating the synthetic control weights.

  • 'difp': De-meaned Synthetic Control Method, as proposed by Doudchenko and Imbens (2016) and Ferman and Pinto (2021).

  • 'difp_ridge': De-meaned Synthetic Control with Ridge Penalty. It adds a ridge penalty when estimating the synthetic control weights.

  • 'synthdid': Synthetic Difference-in-Differences, developed by Arkhangelsky et al. (2021). This is the default.
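The column arguments above (time_var, unit_id_var, treated_period_var, treat_stat_var) describe a long-format panel with one row per unit and period. A minimal sketch of such a layout in base R follows. The column names, the toy values, and the use of Inf for never-treated units are illustrative assumptions, not requirements confirmed by the package.

```r
# Toy long-format panel: 6 units observed over 8 periods (names are illustrative).
panel <- expand.grid(id = 1:6, year = 1:8)

# Adoption period per unit; Inf marks never-treated units (an assumed encoding).
adoption <- c(`1` = 5, `2` = 5, `3` = 6, `4` = 6, `5` = Inf, `6` = Inf)
panel$year_treated <- unname(adoption[as.character(panel$id)])

# Treatment status: 1 from the adoption period onward, 0 otherwise.
panel$treatvar <- as.integer(panel$year >= panel$year_treated)

# Candidate values for adoption_cohorts: the distinct finite adoption periods.
cohorts <- sort(unique(panel$year_treated[is.finite(panel$year_treated)]))
```

With this layout, the string arguments would be "year", "id", "year_treated", and "treatvar", and cohorts would supply adoption_cohorts.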

Value

A list containing the following elements:

  • time: Vector of time periods used in estimation from -lags to leads (relative to the adoption period)

  • TE_mean: Vector of ATT estimates in each time period

  • SE_mean: Vector of standard errors of the ATT in each time period

  • TE_mean_lower: Vector of lower confidence bounds for the ATT in each time period

  • TE_mean_upper: Vector of upper confidence bounds for the ATT in each time period

  • TE_mean_w, SE_mean_w, TE_mean_w_lower, TE_mean_w_upper: Versions of the above metrics weighted by the number of treated units in each time period

  • Ntr: Number of treated units

  • Nco: Number of control units

  • TE: Treatment effect for each cohort in each time period

  • SE: Standard error of TE for each cohort in each time period

  • y_obs: Observed outcomes of treated units

  • y_pred: Predicted outcomes of treated units

  • col_names: Column names for TE and SE matrices (times and ATTs)
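To work with the returned list, the per-period vectors can be gathered into one data frame. The sketch below uses a hand-made stand-in for the return value (all numbers are invented) and assumes a normal-approximation interval to show how bounds of this kind relate to TE_mean, SE_mean, and conf_level. The package may compute its intervals differently.

```r
# Hypothetical stand-in for the list returned by synthdid_est_ate();
# the values are invented for illustration only.
res <- list(
  time    = -2:2,
  TE_mean = c(0.10, -0.05, 1.20, 1.50, 1.90),
  SE_mean = c(0.20, 0.20, 0.30, 0.30, 0.40)
)

# Two-sided normal critical value for the chosen confidence level
# (normal approximation is an assumption of this sketch).
conf_level <- 0.95
z <- qnorm(1 - (1 - conf_level) / 2)

# One row per relative time period, ready for an event-study table or plot.
event_study <- data.frame(
  time  = res$time,
  att   = res$TE_mean,
  se    = res$SE_mean,
  lower = res$TE_mean - z * res$SE_mean,
  upper = res$TE_mean + z * res$SE_mean
)
```

With a real result, the lower and upper columns would normally be taken from TE_mean_lower and TE_mean_upper rather than recomputed.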

References

Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). Synthetic difference-in-differences. American Economic Review, 111(12), 4088-4118.

Ben-Michael, E., Feller, A., & Rothstein, J. (2022). Synthetic controls with staggered adoption. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(2), 351-381.

Ferman, B., & Pinto, C. (2021). Synthetic controls with imperfect pretreatment fit. Quantitative Economics, 12(4), 1197-1221.

Doudchenko, N., & Imbens, G. W. (2016). Balancing, regression, difference-in-differences and synthetic control methods: A synthesis. NBER Working Paper 22791.

Examples

if (FALSE) {
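  library(tidyverse)

  # Build a 0/1 treatment indicator from fixest's staggered-adoption example data.
  # Units whose adoption period is later than 7 (5 + 2) are recoded as untreated.
  data <- fixest::base_stagg |>
    mutate(treatvar = if_else(time_to_treatment >= 0, 1, 0)) |>
    mutate(treatvar = as.integer(if_else(year_treated > (5 + 2), 0, treatvar)))

  # Estimate ATTs for the cohorts adopting in periods 5 to 7,
  # from 2 periods before to 2 periods after adoption.
  synthdid_est_ate(
    data = data,
    adoption_cohorts = 5:7,
    lags = 2,
    leads = 2,
    time_var = "year",
    unit_id_var = "id",
    treated_period_var = "year_treated",
    treat_stat_var = "treatvar",
    pooled = FALSE,
    outcome_var = "y"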
  )
}