Panel Data Diagnostic Tests — panel_diagnostics • causalverse

Runs a battery of diagnostic tests on panel data: unit roots, cross-sectional dependence, serial correlation, and heteroskedasticity.

Usage

panel_diagnostics(
  data,
  outcome,
  unit_var,
  time_var,
  tests = c("unit_root", "serial_corr", "cross_dep", "heterosked"),
  plot = TRUE
)

Arguments

data: A data frame in long format.
outcome: Character. Name of the outcome variable.
unit_var: Character. Name of the unit (panel) identifier.
time_var: Character. Name of the time variable.
tests: Character vector. Which tests to run. Options: "unit_root", "serial_corr", "cross_dep", "heterosked". Default: all four.
plot: Logical. Produce a 4-panel diagnostic plot. Default TRUE.

Value

An object of class "panel_diagnostics": a named list with one element per requested test. Each element is a list with:

statistic: Main test statistic.
p_value: Corresponding p-value.
decision: Character: "Reject H0" or "Fail to reject H0".
details: Additional per-unit or supplementary results.

When plot = TRUE the list also contains a plot element (a ggplot2 object).

Details

Unit root (IPS approximation): ADF tests are run unit by unit via tseries::adf.test. The Im-Pesaran-Shin (2003) panel statistic is approximated by standardising the mean of the unit-level ADF t-statistics. KPSS tests (tseries::kpss.test) are also run as a confirmatory check. Note that the two nulls are reversed: the ADF null is a unit root, whereas the KPSS null is stationarity.
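The mean-t-statistic idea can be sketched in base R as follows. This is a simplified illustration, not the package's internal code: it assumes a plain Dickey-Fuller regression with an intercept and no lag augmentation (rather than tseries::adf.test), and the helper names df_t and ips_z are hypothetical. The moments mu and sigma2 used for standardisation are left as arguments because they must be taken from the tables in Im, Pesaran and Shin (2003).

```r
# Simulated balanced panel (white noise, so every series is stationary)
set.seed(1)
N <- 20; TT <- 30
panel <- data.frame(
  unit = rep(seq_len(N), each = TT),
  time = rep(seq_len(TT), times = N),
  y    = rnorm(N * TT)
)

# Hypothetical helper: Dickey-Fuller t-statistic for one series,
# from the regression diff(y) ~ lagged y (intercept, no extra lags)
df_t <- function(y) {
  dy   <- diff(y)
  ylag <- y[-length(y)]
  coef(summary(lm(dy ~ ylag)))["ylag", "t value"]
}

# One t-statistic per unit, then the cross-unit mean (t-bar)
t_stats <- vapply(split(panel$y, panel$unit), df_t, numeric(1))
t_bar   <- mean(t_stats)

# Hypothetical helper: standardise t-bar given tabulated moments.
# mu and sigma2 are the mean and variance of the unit-level statistic
# under the null; they are NOT computed here.
ips_z <- function(t_stats, mu, sigma2) {
  sqrt(length(t_stats)) * (mean(t_stats) - mu) / sqrt(sigma2)
}
```

Under the unit-root null the standardised statistic is compared to a standard normal, with large negative values counting against the null.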

Serial correlation (Wooldridge 2002): The outcome is first-differenced within each unit, and the differenced outcome is regressed on unit dummies by OLS. The residuals from that regression are then regressed on their own first lag. If the errors in levels are serially uncorrelated, first-differencing induces a first-order autocorrelation of \(-0.5\), so the coefficient on the lagged residual should equal \(-0.5\); a t-test of that restriction is reported.
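A minimal base-R sketch of these steps is given below. It is an illustration under stated assumptions rather than the package's implementation: it uses ordinary OLS standard errors (applications of the Wooldridge test often use cluster-robust ones), and all object names are chosen for the example.

```r
# Simulated balanced panel with serially uncorrelated errors
set.seed(1)
N <- 30; TT <- 12
panel <- data.frame(
  unit = rep(seq_len(N), each = TT),
  time = rep(seq_len(TT), times = N),
  y    = rnorm(N * TT)
)
panel <- panel[order(panel$unit, panel$time), ]

# Step 1: first-difference the outcome within each unit
panel$dy <- ave(panel$y, panel$unit, FUN = function(v) c(NA, diff(v)))
d <- panel[!is.na(panel$dy), ]

# Step 2: residuals from OLS of the differenced outcome on unit dummies
d$e <- resid(lm(dy ~ factor(unit), data = d))

# Step 3: lag the residuals within unit and regress e on its lag
d$e_lag <- ave(d$e, d$unit, FUN = function(v) c(NA, v[-length(v)]))
fit <- lm(e ~ e_lag, data = d)  # rows with NA lags are dropped by lm

# Step 4: t-test of H0: coefficient on lagged residual = -0.5
est     <- coef(summary(fit))["e_lag", ]
t_stat  <- unname((est["Estimate"] + 0.5) / est["Std. Error"])
p_value <- 2 * pt(abs(t_stat), df = fit$df.residual, lower.tail = FALSE)
```

Rejecting the restriction suggests serial correlation in the level errors.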

Cross-sectional dependence (Pesaran 2004): Pairwise correlations \(\hat{\rho}_{ij}\) of de-meaned residuals are computed for every pair of units \(i < j\). With \(N\) units and \(T\) time periods, the CD statistic \(\sqrt{2T/(N(N-1))} \sum_{i<j} \hat{\rho}_{ij}\) is compared to a standard normal under the null of no cross-sectional dependence.
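The CD formula above can be evaluated directly for a balanced panel. The sketch below assumes the residuals are already arranged as a T-by-N matrix with one column per unit; the matrix E and the simulated data are illustrative only.

```r
# Simulated residuals: TT time periods (rows) by N units (columns),
# drawn independently so there is no cross-sectional dependence
set.seed(2)
N <- 15; TT <- 25
E <- matrix(rnorm(N * TT), nrow = TT, ncol = N)

# De-mean each unit's series, then take pairwise correlations
E <- sweep(E, 2, colMeans(E))
R <- cor(E)

# Sum the N(N-1)/2 correlations above the diagonal (pairs with i < j)
rho_sum <- sum(R[upper.tri(R)])

# Pesaran CD statistic and two-sided p-value from the standard normal
cd      <- sqrt(2 * TT / (N * (N - 1))) * rho_sum
p_value <- 2 * pnorm(abs(cd), lower.tail = FALSE)
```

For an unbalanced panel the pairwise correlations would have to be computed over each pair's common time periods, which this sketch does not handle.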

Heteroskedasticity (Breusch-Pagan): A pooled OLS of the outcome on unit and time dummies is estimated; the Breusch-Pagan test is applied via lmtest::bptest when available, otherwise a manual chi-squared approximation is used.
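One common way to compute a Breusch-Pagan statistic without lmtest is the studentised (Koenker) form: regress the squared residuals on the same regressors and refer n times the auxiliary R-squared to a chi-squared distribution. The sketch below assumes this form; whether the package's manual fallback matches it exactly is not stated above.

```r
# Simulated balanced panel with homoskedastic errors
set.seed(3)
N <- 10; TT <- 8
panel <- data.frame(
  unit = rep(seq_len(N), each = TT),
  time = rep(seq_len(TT), times = N),
  y    = rnorm(N * TT)
)

# Pooled OLS of the outcome on unit and time dummies
fit <- lm(y ~ factor(unit) + factor(time), data = panel)

# Auxiliary regression of squared residuals on the same dummies
panel$u2 <- resid(fit)^2
aux <- lm(u2 ~ factor(unit) + factor(time), data = panel)

# LM statistic: n * R^2, chi-squared with (number of regressors
# excluding the intercept) degrees of freedom
lm_stat <- nrow(panel) * summary(aux)$r.squared
df      <- aux$rank - 1
p_value <- pchisq(lm_stat, df = df, lower.tail = FALSE)
```

A small p-value indicates that the residual variance varies with the unit or time dummies.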

References

Im, K. S., Pesaran, M. H., and Shin, Y. (2003). "Testing for unit roots in heterogeneous panels." Journal of Econometrics, 115(1), 53-74.

Pesaran, M. H. (2004). "General diagnostic tests for cross section dependence in panels." Cambridge Working Papers in Economics 0435.

Wooldridge, J. M. (2002). Econometric Analysis of Cross Section and Panel Data. MIT Press.

Examples

if (FALSE) { # \dontrun{
# Simulate a balanced panel
set.seed(42)
N <- 20; TT <- 10
panel <- data.frame(
  unit = rep(1:N, each = TT),
  time = rep(1:TT, times = N),
  y    = rnorm(N * TT)
)
result <- panel_diagnostics(panel, outcome = "y",
                             unit_var = "unit", time_var = "time")
print(result)
} # }