stack_data.Rd
stack_data processes datasets used in staggered Difference-in-Differences (DiD) designs.
Staggered DiD designs arise when different units (e.g., firms, regions, countries)
are treated in different time periods. This function builds one cohort per treatment period from
the supplied treatment period variable and stacks the cohorts into a single long-format dataset
suitable for staggered DiD analyses.
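The stacking idea described above can be sketched in base R on a toy panel. This is only an illustration under stated assumptions, not the function's actual implementation: the column names df, treated and rel_period, and the rule that controls are units not treated by the end of the event window, are assumptions made for this example.

```r
# Toy panel: unit 1 treated in 2003, unit 2 in 2005, unit 3 never treated (coded 10000).
panel <- expand.grid(id = 1:3, year = 2000:2008)
panel$year_treated <- c(2003, 2005, 10000)[panel$id]

pre_window  <- 2
post_window <- 2

# One sub-dataset per treatment cohort, restricted to that cohort's event window.
cohorts <- sort(unique(panel$year_treated[panel$year_treated != 10000]))
stacked <- do.call(rbind, lapply(cohorts, function(g) {
  in_window <- panel$year >= g - pre_window & panel$year <= g + post_window
  # Assumed control rule: keep the cohort itself plus units untreated through the window.
  keep <- panel$year_treated == g | panel$year_treated > g + post_window
  sub <- panel[in_window & keep, ]
  sub$df <- g                                      # cohort (sub-dataset) identifier
  sub$treated <- as.integer(sub$year_treated == g) # 1 for the cohort, 0 for its controls
  sub$rel_period <- sub$year - g                   # time relative to treatment
  sub
}))
```

Each cohort contributes its own copy of the control units, which is why the same unit can appear in several sub-datasets and why a cohort identifier is needed downstream.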
stack_data(
  treated_period_var,
  time_var,
  pre_window,
  post_window,
  data,
  control_type = c("both", "never-treated", "not-yet-treated")
)
treated_period_var: A character string indicating the column name of the treatment period variable.
time_var: A character string indicating the column name for time.
pre_window: An integer indicating the number of periods before the treatment to consider (i.e., leads).
post_window: An integer indicating the number of periods after the treatment to consider (i.e., lags).
data: A data frame containing the dataset to be processed.
control_type: A character string indicating which control type to use. One of "both", "never-treated", or "not-yet-treated".
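To make the three control_type choices concrete, here is a minimal sketch of how control units for a single cohort could be selected. The helper control_pool, its arguments, and the exact definition of "not yet treated" (treated only after the end of the event window) are hypothetical and may differ from what stack_data does internally. The use of match.arg() is standard R: with a vector default like the one in the usage above, it picks the first element, "both", when the argument is not supplied.

```r
# Hypothetical helper: which units may serve as controls for cohort g?
control_pool <- function(treated_period, g, post_window,
                         control_type = c("both", "never-treated", "not-yet-treated"),
                         never_code = 10000) {
  control_type <- match.arg(control_type)   # defaults to "both"
  never   <- treated_period == never_code   # never-treated units
  # Assumption: "not yet treated" means treated only after the event window ends.
  not_yet <- treated_period > g + post_window & !never
  switch(control_type,
         "never-treated"   = never,
         "not-yet-treated" = not_yet,
         "both"            = never | not_yet)
}

# Four units: treated in 2003, 2005, 2009, and never treated.
tp <- c(2003, 2005, 2009, 10000)
both_pool  <- control_pool(tp, g = 2003, post_window = 2)
never_pool <- control_pool(tp, g = 2003, post_window = 2, control_type = "never-treated")
later_pool <- control_pool(tp, g = 2003, post_window = 2, control_type = "not-yet-treated")
```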
A data frame with the stacked data, augmented with relative period dummy variables, suitable for staggered DiD analysis.
The function expects the data to contain a control group, represented by the value 10000 in the
treated_period_var column of the provided dataset. The output data are augmented with
relative period dummy variables for ease of subsequent analysis.
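If control units are stored differently in your data, for example as NA, they can be recoded before the call. The sketch below shows that step and a hand-built version of relative period dummies. The toy data frame raw, the NA coding, and the rel_period helper column are assumptions for illustration. The dummy names follow the rel_period_<k> pattern that appears in the example formula on this page.

```r
# Toy data: unit 1 is treated in 2001, unit 2 is never treated (stored as NA here).
raw <- data.frame(
  id           = c(1, 1, 1, 2, 2, 2),
  year         = c(2000, 2001, 2002, 2000, 2001, 2002),
  year_treated = c(2001, 2001, 2001, NA, NA, NA)
)

# Recode never-treated units to the value 10000 described above.
raw$year_treated[is.na(raw$year_treated)] <- 10000

# Hand-built relative period dummies for the treated unit only.
treated <- raw[raw$year_treated != 10000, ]
treated$rel_period <- treated$year - treated$year_treated
for (k in sort(unique(treated$rel_period))) {
  treated[[paste0("rel_period_", k)]] <- as.integer(treated$rel_period == k)
}
```

Column names such as rel_period_-1 are not syntactic R names, which is why the example formula wraps them in backticks.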
if (FALSE) {
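  library(did)
  library(tidyverse)
  library(fixest)

  # base_stagg is an example dataset shipped with fixest
  data(base_stagg)

  # Stack the cohorts using a window of 3 periods before and 3 periods after treatment
  stacked_data <- stack_data("year_treated", "year", 3, 3, base_stagg, control_type = "both")

  # Event-study regression on the stacked data; relative period -1 is left out of the formula
  feols_result <- feols(as.formula(paste0(
    "y ~ ",
    paste(paste0("`rel_period_", c(-3:-2, 0:3), "`"), collapse = " + "),
    " | id ^ df + year ^ df"
  )), data = stacked_data)
  print(feols_result)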
}
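The formula in the example is assembled from strings, which can be hard to read at a glance. The snippet below rebuilds it step by step using base R only, so it runs without stack_data or fixest. The intermediate names rel_terms, rhs and fml are introduced here for illustration.

```r
# Backtick-quoted dummy names for relative periods -3, -2, 0, 1, 2, 3.
# Period -1 is skipped, so it presumably acts as the reference period.
rel_terms <- paste0("`rel_period_", c(-3:-2, 0:3), "`")

# Join the terms into the right-hand side of the regression.
rhs <- paste(rel_terms, collapse = " + ")

# Full formula: outcome, event-study dummies, then the fixed-effects part after "|".
fml <- as.formula(paste0("y ~ ", rhs, " | id ^ df + year ^ df"))
print(fml)
```

In fixest, the ^ operator in the fixed-effects part combines two variables into a single fixed effect, so id ^ df and year ^ df give unit and time effects that are specific to each value of df in the stacked data.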