
Stacked Data for Staggered DiD Analysis
stack_data processes datasets used in staggered Difference-in-Differences (DiD) designs.
Staggered DiD designs arise when different units (e.g., firms, regions, countries)
are treated in different time periods. This function creates cohorts based on the provided
treatment period variable and stacks them into a single long-format dataset
suitable for staggered DiD analyses.
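To make the idea of "creating cohorts and stacking them" concrete, here is a minimal base-R sketch. It is not the implementation of stack_data. The toy panel, the helper name stack_sketch, and the rule used to pick controls (never-treated units plus units first treated after the end of the window) are all assumptions made for illustration.

```r
# Hypothetical toy panel: 4 units observed 2000-2009.
# treat_year = 10000 marks never-treated units.
panel <- expand.grid(id = 1:4, year = 2000:2009)
panel$treat_year <- c(2003, 2006, 10000, 10000)[panel$id]

# Hypothetical helper (an illustration, not the package's code).
stack_sketch <- function(df, pre, post) {
  # One cohort per distinct treatment period, ignoring the never-treated code
  cohorts <- sort(unique(df$treat_year[df$treat_year != 10000]))
  pieces <- lapply(cohorts, function(g) {
    in_window <- df$year >= g - pre & df$year <= g + post
    treated   <- df$treat_year == g
    # Assumed control rule: never treated, or first treated after the window ends
    control   <- df$treat_year > g + post
    sub <- df[in_window & (treated | control), ]
    sub$cohort     <- g            # identifies the sub-experiment
    sub$rel_period <- sub$year - g # time relative to the cohort's treatment
    sub
  })
  # Stack the cohort-specific datasets on top of each other
  do.call(rbind, pieces)
}

stacked <- stack_sketch(panel, pre = 2, post = 2)
table(stacked$cohort, stacked$rel_period)
```

Note that a control unit can appear in several cohorts, which is why regressions on stacked data typically interact the fixed effects with a cohort identifier.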
Usage
stack_data(
treated_period_var,
time_var,
pre_window,
post_window,
data,
control_type = c("both", "never-treated", "not-yet-treated")
)
Arguments
- treated_period_var
A character string indicating the column name of the treatment period variable.
- time_var
A character string indicating the column name for time.
- pre_window
An integer indicating the number of periods before the treatment to consider (i.e., leads).
- post_window
An integer indicating the number of periods after the treatment to consider (i.e., lags).
- data
A data frame containing the dataset to be processed.
- control_type
A character string indicating which control type to use. One of "both", "never-treated", or "not-yet-treated".
Value
A data frame with the stacked data, augmented with relative period dummy variables, suitable for staggered DiD analysis.
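The Examples section refers to the dummy columns by names such as rel_period_-3, which are not syntactic R names and therefore need backticks in formulas. The following sketch shows one way such columns could be built and accessed. The toy vector and the construction are assumptions for illustration, not the function's internals.

```r
# Hypothetical relative periods for six observations
rel_period <- c(-2, -1, 0, 1, -1, 0)
levels_k <- sort(unique(rel_period))

# One 0/1 indicator column per relative period
cols <- lapply(levels_k, function(k) as.integer(rel_period == k))
names(cols) <- paste0("rel_period_", levels_k)

# check.names = FALSE keeps names like "rel_period_-1" unchanged
dummies <- data.frame(cols, check.names = FALSE)

names(dummies)
# Non-syntactic names need backticks (or [[ ]]) when accessed
dummies$`rel_period_-1`
```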
Details
The function expects the dataset to include a control group. Never-treated control units should be
coded with the value 10000 in the treated_period_var column of the provided dataset. The output data
are augmented with relative period dummy variables for ease of subsequent analysis.
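If a dataset marks never-treated units differently, for example with a missing treatment period, it can be recoded before calling the function. The sketch below assumes NA is the original marker; the data frame and the column name first_treated are hypothetical.

```r
# Hypothetical raw data: units 2 and 3 are never treated (NA treatment year)
raw <- data.frame(
  id            = rep(1:3, each = 2),
  year          = rep(2001:2002, times = 3),
  first_treated = rep(c(2002, NA, NA), each = 2)
)

# Recode never-treated units to the value the function expects
raw$first_treated[is.na(raw$first_treated)] <- 10000

# Quick check that a control group is present before stacking
stopifnot(any(raw$first_treated == 10000))
```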
Examples
if (FALSE) { # \dontrun{
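library(did)
library(tidyverse)
library(fixest) # provides feols() and the base_stagg dataset
data(base_stagg)
# Stack cohorts using 3 pre-treatment and 3 post-treatment periods
stacked_data <- stack_data("year_treated", "year", 3, 3, base_stagg, control_type = "both")
# Event-study regression; relative period -1 is omitted as the reference period
feols_result <- feols(as.formula(paste0(
  "y ~ ",
  paste(paste0("`rel_period_", c(-3:-2, 0:3), "`"), collapse = " + "),
  " | id ^ df + year ^ df"
)), data = stacked_data)
print(feols_result)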
} # }