synthdid_est_ate.Rd
This function uses an adapted SynthDiD method (Arkhangelsky et al., 2021) to estimate the average treatment effect in staggered adoption settings. It combines cohort-level ATT estimates, similar to the approach of Ben-Michael et al. (2022) for synthetic controls with staggered adoption. The function supports multiple adoption cohorts, lags, leads, placebo tests, and pooled analyses.
synthdid_est_ate(
data,
adoption_cohorts,
lags,
leads,
time_var,
unit_id_var,
treated_period_var,
treat_stat_var,
outcome_var,
placebo = F,
pooled = F,
subgroup = NULL,
conf_level = 0.95,
seed = 1,
method = "synthdid"
)
A data frame in long format to be analyzed.
Vector of cohorts to use for adoption times.
Integer, number of lags of adoption time to analyze.
Integer, number of leads of adoption time to analyze.
String, name of the column containing the time variable.
String, ID column of units.
String, column with adoption time of each unit.
String, column name indicating treatment status.
String, column of outcome to analyze.
Logical, whether to run placebo analysis.
Logical, whether to run pooled analysis of all treated units.
Vector, IDs for subgroup analysis.
Numeric, confidence level for interval estimation. Default is 0.95.
A numeric value for setting the random seed (for placebo SE and placebo analysis). Default is 1.
The estimation method to be used. Methods include:
'did': Difference-in-Differences.
'sc': Synthetic Control Method.
'sc_ridge': Synthetic Control Method with Ridge Penalty. It adds a ridge regularization to the synthetic control method when estimating the synthetic control weights.
'difp': De-meaned Synthetic Control Method, as proposed by Doudchenko and Imbens (2016) and Ferman and Pinto (2021).
'difp_ridge': De-meaned Synthetic Control with Ridge Penalty. It adds a ridge regularization when estimating the synthetic control weights.
'synthdid': Synthetic Difference-in-Differences, a method developed by Arkhangelsky et al. (2021). Defaults to 'synthdid'.
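The arguments above assume a long-format panel with one row per unit and time period. The following is a minimal sketch of such a data frame. The column names, the adoption periods, and the use of a far-future value to mark never-treated units are illustrative assumptions, not requirements confirmed by the package.

```r
# Toy long-format panel: 4 units observed over 6 periods.
# All names and values below are hypothetical and for illustration only.
panel <- expand.grid(id = 1:4, year = 1:6)

# Assumed adoption period per unit; 10000 is an arbitrary far-future value
# standing in for "never treated" (check how your own data codes this).
adopt <- c(3, 4, 10000, 10000)
panel$year_treated <- adopt[panel$id]

# Treatment status: 1 from the adoption period onward, 0 otherwise.
panel$treatvar <- as.integer(panel$year >= panel$year_treated)

# Simulated outcome with an arbitrary effect of 2 for treated observations.
set.seed(1)
panel$y <- rnorm(nrow(panel)) + 2 * panel$treatvar

# The columns would then map onto the string arguments, for example:
# time_var = "year", unit_id_var = "id",
# treated_period_var = "year_treated", treat_stat_var = "treatvar",
# outcome_var = "y"
```

Passing column names as strings keeps the function independent of any particular naming scheme in the input data.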
A list containing the following elements:
time: Vector of time periods used in estimation from -lags to leads (relative to the adoption period)
TE_mean: Vector of ATT in each time period
SE_mean: Vector of standard errors of the ATT in each time period
TE_mean_lower: Vector of Lower C.I. for ATT per period
TE_mean_upper: Vector of Upper C.I. for ATT per period
TE_mean_w, SE_mean_w, TE_mean_w_lower, TE_mean_w_upper: Versions of the above metrics weighted by the number of treated units in each time period
Ntr: Number of treated units
Nco: Number of control units
TE: Treatment effect for each cohort in each time period
SE: Standard error of TE of each cohort in each time period
y_obs: Observed outcomes of treated units
y_pred: Predicted outcomes of treated units
col_names: Column names for TE and SE matrices (times and ATTs)
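To make the relation between the unweighted and weighted summaries concrete, here is a small sketch for a single relative time period. It assumes the weighted estimate is a treated-unit-weighted average of cohort effects and that the interval uses a normal approximation. Both are assumptions for illustration, and all numbers are made up rather than produced by the function.

```r
# Hypothetical cohort-level inputs for one relative time period.
te  <- c(1.2, 0.8, 1.0)  # made-up cohort treatment effects
ntr <- c(10, 20, 10)     # made-up numbers of treated units per cohort

# Unweighted mean across cohorts versus a mean weighted by treated units.
te_mean   <- mean(te)
te_mean_w <- sum(ntr * te) / sum(ntr)

# Normal-approximation confidence interval, given an assumed standard error.
conf_level <- 0.95
se_mean    <- 0.25       # made-up standard error
z          <- qnorm(1 - (1 - conf_level) / 2)
te_lower   <- te_mean - z * se_mean
te_upper   <- te_mean + z * se_mean
```

With these inputs the weighted mean is pulled toward the largest cohort, which is why the two summaries can differ when cohort sizes are unbalanced.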
Arkhangelsky, D., Athey, S., Hirshberg, D. A., Imbens, G. W., & Wager, S. (2021). Synthetic difference-in-differences. American Economic Review, 111(12), 4088-4118.
Ben-Michael, E., Feller, A., & Rothstein, J. (2022). Synthetic controls with staggered adoption. Journal of the Royal Statistical Society Series B: Statistical Methodology, 84(2), 351-381. Oxford University Press.
Ferman, B., & Pinto, C. (2021). Synthetic controls with imperfect pretreatment fit. Quantitative Economics, 12(4), 1197-1221.
Doudchenko, N., & Imbens, G. W. (2016). Balancing, regression, difference-in-differences and synthetic control methods: A synthesis. NBER Working Paper 22791.
if (FALSE) {
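library(tidyverse)
# Staggered-adoption example data shipped with the fixest package
data <- fixest::base_stagg |>
# Treatment indicator: 1 from the adoption period onward
mutate(treatvar = if_else(time_to_treatment >= 0, 1, 0)) |>
# Units adopting after period 7 (5 + 2) are kept as untreated controls
mutate(treatvar = as.integer(if_else(year_treated > (5 + 2), 0, treatvar)))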
synthdid_est_ate(
data = data,
adoption_cohorts = 5:7,
lags = 2,
leads = 2,
time_var = "year",
unit_id_var = "id",
treated_period_var = "year_treated",
treat_stat_var = "treatvar",
pooled = FALSE,
outcome_var = "y"
)
}