Instrumental Variables Diagnostics — iv_diagnostics • causalverse

Accepts a fitted IV / 2SLS model and returns a list of diagnostic statistics, each flagged as passing or failing a conventional threshold. The result also includes a bar chart that compares each observed test statistic with its critical value.

Usage

iv_diagnostics(model, data = NULL, alpha = 0.05)

Arguments

model

A fitted IV model. Supported classes:

ivreg from the ivreg package.
fixest estimated via feols(..., iv = ...) from fixest.

data

Data frame used to fit the model. Required for some fixest diagnostics; can be NULL for ivreg models.

alpha

Numeric. Significance level for pass/fail thresholds. Default 0.05.
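As a point of reference, the chi-squared critical value implied by alpha can be computed in base R. The sketch below assumes that the one-degree-of-freedom thresholds are the upper-alpha quantiles of a chi-squared distribution. This is inferred from the threshold column in the Examples output (3.841459 at the default alpha) and is not a documented guarantee of how iv_diagnostics works internally.

```r
# Assumption: thresholds are upper-alpha chi-squared critical values
# with one degree of freedom (inferred from the Examples output).
alpha <- 0.05
crit  <- qchisq(1 - alpha, df = 1)
round(crit, 6)
#> [1] 3.841459

# A stricter significance level gives a larger critical value.
crit_strict <- qchisq(1 - 0.01, df = 1)
crit_strict > crit
#> [1] TRUE
```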

Value

A named list:

first_stage: Named numeric vector with the first-stage F-statistic and its degrees of freedom.
overid: Named numeric vector with the Sargan statistic and p-value, or NULL if the model is exactly identified.
endogeneity: Named numeric vector with Wu-Hausman statistic and p-value.
summary_df: Data frame summarising all diagnostics with columns test, statistic, p_value, threshold, pass.
plot: A ggplot2 object visualising the diagnostics.
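Because summary_df is an ordinary data frame, it can be filtered with base R. The sketch below does not call iv_diagnostics; it builds a stand-in data frame whose values are copied (rounded) from the Examples output, purely to illustrate the column layout described above.

```r
# Illustrative stand-in for the summary_df element; values are copied
# (rounded) from the Examples output, not recomputed.
summary_df <- data.frame(
  test      = c("First-Stage F (Weak Instruments)",
                "Wu-Hausman (Endogeneity)",
                "Sargan (Overidentification)"),
  statistic = c(276.0700, 0.4989, NA),
  p_value   = c(1.200217e-49, 0.4803159, NA),
  threshold = c(10, 3.841459, 3.841459),
  pass      = c(TRUE, FALSE, FALSE)
)

# Keep only the tests that were actually computed (statistic is not NA).
computed <- summary_df[!is.na(summary_df$statistic), ]

# Among those, list the tests flagged as not passing.
flagged <- computed[!computed$pass, "test"]
flagged
```

In the Examples output, a test that could not be computed (the Sargan row) has an NA statistic and pass set to FALSE, so filtering on NA first avoids reading a missing test as a failed one.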

Details

Comprehensive IV / 2SLS Diagnostics

Runs a standard battery of instrumental variables diagnostics on a fitted ivreg or fixest 2SLS model: first-stage strength (F-statistic), weak-instrument tests (Cragg-Donald / Kleibergen-Paap), overidentification (Sargan-Hansen), and endogeneity (Wu-Hausman). Returns all results in a tidy list, including a ggplot2 summary bar chart.
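To make two of these diagnostics concrete, the following base-R sketch computes a first-stage F-statistic and a regression-based Wu-Hausman test by hand for a single instrument and a single endogenous regressor. It illustrates the textbook definitions on simulated data and is not necessarily how iv_diagnostics computes them internally.

```r
set.seed(42)
n <- 500
z <- rnorm(n)              # instrument
x <- 0.8 * z + rnorm(n)    # regressor to be instrumented
y <- 1.5 * x + rnorm(n)    # outcome

# First-stage strength: F-test that the instrument can be excluded
# from the first-stage regression of x on z.
fs_full <- lm(x ~ z)
fs_null <- lm(x ~ 1)
fs_F    <- anova(fs_null, fs_full)$F[2]

# Regression-based Wu-Hausman: add the first-stage residuals to the
# structural equation and test whether their coefficient is zero.
v_hat <- resid(fs_full)
aug   <- lm(y ~ x + v_hat)
coefs <- coef(summary(aug))
wh_F  <- coefs["v_hat", "t value"]^2   # with one regressor, F equals t^2
wh_p  <- coefs["v_hat", "Pr(>|t|)"]

c(first_stage_F = fs_F, wu_hausman_F = wh_F, wu_hausman_p = wh_p)
```

A first-stage F well above the conventional rule-of-thumb value of 10 suggests the instrument is not weak. A large Wu-Hausman p-value means the data do not reject exogeneity of x.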

References

Stock, J. H., & Yogo, M. (2005). Testing for weak instruments in linear IV regression. In D. W. K. Andrews & J. H. Stock (Eds.), Identification and Inference for Econometric Models. Cambridge University Press.

Kleibergen, F., & Paap, R. (2006). Generalized reduced rank tests using the singular value decomposition. Journal of Econometrics, 133(1), 97-126.

Examples

set.seed(42)
n  <- 500
z  <- rnorm(n)            # instrument
x  <- 0.8 * z + rnorm(n) # regressor treated as endogenous (exogenous by construction here)
y  <- 1.5 * x + rnorm(n) # outcome
df <- data.frame(y = y, x = x, z = z)

if (requireNamespace("ivreg", quietly = TRUE)) {
  m <- ivreg::ivreg(y ~ x | z, data = df)
  res <- iv_diagnostics(m)  # named `res` to avoid masking base::diag()
  print(res$summary_df)
}
#>                               test   statistic      p_value threshold  pass
#> 1 First-Stage F (Weak Instruments) 276.0700187 1.200217e-49 10.000000  TRUE
#> 2         Wu-Hausman (Endogeneity)   0.4988977 4.803159e-01  3.841459 FALSE
#> 3      Sargan (Overidentification)          NA           NA  3.841459 FALSE
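The Sargan row is NA here, most likely because the example uses one instrument for one endogenous regressor and is therefore exactly identified. As a minimal base-R sketch of what an overidentification test looks like once there are two instruments, the code below computes the Sargan statistic by hand. The second instrument z2 and its coefficient are hypothetical additions for illustration, and the code does not call iv_diagnostics.

```r
set.seed(42)
n  <- 500
z1 <- rnorm(n)                          # first instrument
z2 <- rnorm(n)                          # second instrument (hypothetical)
x  <- 0.8 * z1 + 0.5 * z2 + rnorm(n)    # regressor to be instrumented
y  <- 1.5 * x + rnorm(n)                # outcome

# 2SLS by hand: first stage, then second stage on the fitted values.
x_hat <- fitted(lm(x ~ z1 + z2))
b     <- unname(coef(lm(y ~ x_hat)))

# Structural residuals use the observed x, not x_hat.
u <- y - b[1] - b[2] * x

# Sargan statistic: n * R^2 from regressing the residuals on all instruments.
r2       <- summary(lm(u ~ z1 + z2))$r.squared
sargan   <- n * r2
df_over  <- 2 - 1   # number of instruments minus endogenous regressors
sargan_p <- pchisq(sargan, df = df_over, lower.tail = FALSE)

c(sargan = sargan, df = df_over, p_value = sargan_p)
```

Both simulated instruments are valid by construction, so a large p-value (no rejection of the overidentifying restriction) would be the expected outcome in most draws.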