
Multiverse / Specification Curve Analysis
Runs a model across all combinations of analytical choices (model specification, sample restrictions, functional forms, etc.) and visualizes the resulting distribution of estimates. This reveals how sensitive conclusions are to reasonable but arbitrary researcher decisions.
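The idea of "all combinations of analytical choices" can be sketched by hand in base R. This is not the implementation of multiverse_analysis, only a minimal illustration under stated assumptions: the names controls, filters, grid, fit_one and specs are hypothetical, only two dimensions are crossed, and each specification is a plain lm() fit.

```r
# Illustrative sketch only: cross two analytical dimensions and fit one
# OLS model per combination. All object names here are hypothetical.
set.seed(1)
n <- 200
df <- data.frame(
  y     = rnorm(n),
  treat = rbinom(n, 1, 0.5),
  age   = runif(n, 18, 65)
)

controls <- list(none = character(0), age = "age")
filters  <- c(full = "TRUE", age30plus = "age >= 30")

# One row per combination of choices (2 x 2 = 4 specifications).
grid <- expand.grid(
  controls = names(controls),
  filter   = names(filters),
  stringsAsFactors = FALSE
)

fit_one <- function(i) {
  # Evaluate the filter string inside the data frame; "TRUE" yields a
  # single value, so recycle it to one flag per row.
  keep <- eval(parse(text = filters[[grid$filter[i]]]), envir = df)
  keep <- rep_len(keep, nrow(df))
  rhs  <- c("treat", controls[[grid$controls[i]]])
  fit  <- lm(reformulate(rhs, response = "y"), data = df[keep, , drop = FALSE])
  est  <- summary(fit)$coefficients["treat", ]
  data.frame(grid[i, ], estimate = est[["Estimate"]], p_value = est[["Pr(>|t|)"]])
}

specs <- do.call(rbind, lapply(seq_len(nrow(grid)), fit_one))
specs
```

Each row of specs is one "universe": the treatment estimate under one particular set of choices. The number of rows is the product of the number of alternatives in each dimension, so the grid grows quickly as dimensions are added.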
Arguments
- data
A data frame.
- outcome
Character. Name of the outcome variable.
- treatment
Character. Name of the treatment variable.
- choices
A named list. Each element holds the alternative choices for one analytical dimension. Supported keys:
  - controls
    List of covariate sets to add. Each element is a character vector of variable names, e.g. list(c(), c("age"), c("age","female")).
  - sample_filters
    Named character vector of filter expressions (as strings), e.g. c(full="TRUE", adults="age>=18").
  - outcome_transforms
    Named character vector of transformations applied to the outcome, e.g. c(levels="y", log="log(y+1)").
  - se_types
    Character vector of SE types: "OLS", "HC1", "HC3", "cluster".
  - cluster_var
    Character. Variable to cluster on (used when se_types includes "cluster").
- family
Character. Model family: "gaussian" (default), "binomial", or "poisson".
- alpha
Numeric. Significance level for highlighting. Default 0.05.
- sort_by
Character. Sort specifications by "estimate" (default) or "p_value".
- plot
Logical. Produce the multiverse plot. Default TRUE.
- parallel
Logical. Run specifications in parallel via parallel::mclapply. Default FALSE.
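To make the se_types options more concrete, the sketch below computes classical, HC1 and HC3 standard errors for a single lm() fit using only base R. How multiverse_analysis computes them internally is not stated on this page, so this is an assumption-laden illustration of the textbook sandwich formulas, not the package's code. The helper sandwich_vcov and the simulated data are hypothetical.

```r
# Illustrative sketch only: textbook heteroskedasticity-robust SEs by hand.
set.seed(2)
n <- 150
d <- data.frame(treat = rbinom(n, 1, 0.5), age = runif(n, 18, 65))
d$y <- 0.3 * d$treat + rnorm(n, sd = 1 + d$treat)  # noise variance depends on treat

fit <- lm(y ~ treat + age, data = d)
X <- model.matrix(fit)   # n x k design matrix
e <- residuals(fit)
h <- hatvalues(fit)      # leverage of each observation
k <- ncol(X)
j <- match("treat", colnames(X))

bread <- solve(crossprod(X))  # (X'X)^{-1}
# Sandwich estimator with per-observation weights w: the "meat" is
# sum_i w_i^2 * x_i x_i'.
sandwich_vcov <- function(w) bread %*% crossprod(X * w) %*% bread

se <- c(
  OLS = sqrt(diag(vcov(fit)))[[j]],
  HC1 = sqrt(diag(n / (n - k) * sandwich_vcov(e)))[[j]],  # df-corrected HC0
  HC3 = sqrt(diag(sandwich_vcov(e / (1 - h))))[[j]]       # leverage-adjusted
)
se
```

The point estimate is identical across these options; only the standard error, and therefore the p-value and confidence interval, changes. That is why se_types counts as its own analytical dimension.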
Value
A list with:
- results
Data frame of all specification results: spec_id, estimate, std_error, t_stat, p_value, ci_lo, ci_hi, significant, plus one column per analytical dimension.
- summary
Summary statistics across all specifications: median estimate, % significant, % positive, IQR.
- plot
A ggplot2 object (if plot = TRUE).
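The summary statistics listed above are simple aggregates over the results data frame. The sketch below shows one way they could be computed from estimate and p_value columns. The toy values are made up for illustration, the object summary_row is hypothetical, and treating p_value < alpha as "significant" is an assumption (the function may use a different rule).

```r
# Illustrative sketch only: aggregate a toy results table into one summary row.
results <- data.frame(
  spec_id  = 1:4,
  estimate = c(0.10, 0.25, -0.05, 0.30),  # made-up values
  p_value  = c(0.20, 0.03, 0.60, 0.01)    # made-up values
)
alpha <- 0.05

summary_row <- data.frame(
  n_specs         = nrow(results),
  median_est      = median(results$estimate),
  iqr_est         = IQR(results$estimate),
  pct_positive    = 100 * mean(results$estimate > 0),
  pct_significant = 100 * mean(results$p_value < alpha),  # assumed rule
  min_est         = min(results$estimate),
  max_est         = max(results$estimate)
)
summary_row
```

Reading the median together with pct_positive and pct_significant gives a quick sense of whether the sign and significance of the effect hold across specifications or depend on a few particular choices.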
References
Simonsohn, U., Simmons, J. P., & Nelson, L. D. (2020). Specification curve analysis. Nature Human Behaviour, 4, 1208-1214.
Steegen, S., Tuerlinckx, F., Gelman, A., & Vanpaemel, W. (2016). Increasing transparency through a multiverse analysis. Perspectives on Psychological Science, 11(5), 702-712.
Examples
set.seed(42)
n <- 300
df <- data.frame(
  y = rnorm(n),
  treat = rbinom(n, 1, 0.5),
  age = runif(n, 18, 65),
  female = rbinom(n, 1, 0.5),
  income = rnorm(n)
)
mv <- multiverse_analysis(
  data = df,
  outcome = "y",
  treatment = "treat",
  choices = list(
    controls = list(c(), c("age"), c("age", "female"), c("age", "female", "income")),
    sample_filters = c(full = "TRUE", age30plus = "age >= 30"),
    outcome_transforms = c(levels = "y")
  )
)
mv$summary
#> n_specs median_est iqr_est pct_positive pct_significant min_est max_est
#> 1 8 0.1891403 0.0610175 100 0 0.1542434 0.2336293