38  Market Integration and Segmentation

Note

In this chapter, we measure the degree to which the Vietnamese equity market is integrated with or segmented from global capital markets. We construct correlation-based, factor-based, and pricing-error-based integration metrics, trace their evolution through Vietnam’s liberalization timeline, and quantify the cost of segmentation for Vietnamese firms.

A market is fully integrated when its assets are priced by a global stochastic discount factor: risk premia reflect only exposures to global risk factors, and identical cash flow streams command the same expected return regardless of where the issuer is domiciled. A market is fully segmented when domestic risk factors alone determine prices, and the country’s risk-return trade-off is independent of the rest of the world. Reality sits somewhere between these poles, and the location shifts over time.

Vietnam is a particularly interesting case. It opened its stock exchange in July 2000 with heavy restrictions on foreign participation. Foreign ownership limits have been relaxed gradually: the cap was initially 20%, rose to 30% in 2003 and to 49% in 2005, and has been selectively removed for some firms since 2015. Vietnam joined the WTO in 2007. In 2018, FTSE Russell placed Vietnam on its watch list for possible reclassification from frontier to secondary emerging status and has been evaluating an upgrade since. Each of these events has potentially shifted the degree of integration.

Bekaert and Harvey (1995) and Bekaert and Harvey (2002) establish the modern framework for measuring time-varying integration. Errunza and Losq (1985) develop the “mild segmentation” model in which foreign investors face barriers but can partially replicate emerging market returns through global securities. Pukthuanthong and Roll (2009) propose a factor-model-based measure that avoids the pitfalls of simple correlation analysis. This chapter implements all three approaches and applies them to Vietnam’s integration trajectory.

38.1 Integration in Theory

38.1.1 The Integrated and Segmented Benchmarks

Under full integration, the expected excess return of Vietnamese stock \(i\) is:

\[ E[R_{i,t} - R_{f,t}] = \beta_{i,\text{world}} \cdot \lambda_{\text{world},t} \tag{38.1}\]

where \(\beta_{i,\text{world}}\) is the stock’s loading on the global market factor and \(\lambda_{\text{world},t}\) is the global risk premium. Only global systematic risk is priced; country-specific risk is diversifiable and commands no premium.

Under full segmentation:

\[ E[R_{i,t} - R_{f,t}] = \beta_{i,\text{local}} \cdot \lambda_{\text{local},t} \tag{38.2}\]

where \(\beta_{i,\text{local}}\) is the stock’s loading on the Vietnamese market and \(\lambda_{\text{local},t}\) is the domestic risk premium. The domestic market is effectively a closed economy for pricing purposes.

Bekaert and Harvey (1995) model the transition between these states as a regime-switching process where the mixing weight \(\omega_t \in [0, 1]\) evolves over time:

\[ E[R_{i,t} - R_{f,t}] = \omega_t \cdot \beta_{i,\text{world}} \lambda_{\text{world},t} + (1 - \omega_t) \cdot \beta_{i,\text{local}} \lambda_{\text{local},t} \tag{38.3}\]

The weight \(\omega_t\) is the degree of integration: \(\omega_t = 1\) is full integration, \(\omega_t = 0\) is full segmentation.
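
As a numerical illustration of Equation 38.3, the sketch below blends the two benchmark premia for a single stock. The function name and all input values are hypothetical, chosen only to show how \(\omega_t\) moves the expected return between the segmented and integrated cases.

```python
def blended_expected_excess_return(omega, beta_world, lam_world,
                                   beta_local, lam_local):
    """Expected excess return under partial integration (Eq. 38.3)."""
    if not 0.0 <= omega <= 1.0:
        raise ValueError("omega must lie in [0, 1]")
    integrated = beta_world * lam_world   # Eq. 38.1: global pricing
    segmented = beta_local * lam_local    # Eq. 38.2: local pricing
    return omega * integrated + (1.0 - omega) * segmented


# Hypothetical inputs (annualized, decimal units)
params = dict(beta_world=0.8, lam_world=0.05, beta_local=1.1, lam_local=0.10)

for omega in (0.0, 0.5, 1.0):
    er = blended_expected_excess_return(omega, **params)
    print(f"omega = {omega:.1f}: expected excess return = {er:.4f}")
```

With these made-up inputs the segmented premium exceeds the integrated one, so the expected return falls as \(\omega_t\) rises.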

38.1.2 The Segmentation Premium

When a market transitions from segmented to integrated, its cost of capital falls because the relevant risk for pricing narrows from total domestic risk to only the portion correlated with the global market (Henry 2000; Bekaert, Harvey, and Lundblad 2005). The segmentation premium is the excess expected return that investors in a segmented market require:

\[ \text{Segmentation premium}_t = (1 - \omega_t) \cdot (\lambda_{\text{local},t} - \beta_{\text{local,world}} \cdot \lambda_{\text{world},t}) \tag{38.4}\]

where \(\beta_{\text{local,world}}\) is the beta of the Vietnamese market with respect to the world market. This premium represents a deadweight cost: it raises the cost of capital for Vietnamese firms, reduces investment, and lowers welfare relative to the integrated benchmark.
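
To make Equation 38.4 concrete, the following sketch evaluates the premium at several integration levels. The helper name and the risk-premium and beta inputs are hypothetical placeholders, not estimates for Vietnam.

```python
def segmentation_premium(omega, lam_local, lam_world, beta_local_world):
    """Segmentation premium from Eq. 38.4, in the same units as the risk premia."""
    return (1.0 - omega) * (lam_local - beta_local_world * lam_world)


# Hypothetical annualized inputs (decimal units)
lam_local, lam_world, beta_lw = 0.10, 0.05, 0.6

for omega in (0.0, 0.3, 0.7, 1.0):
    prem = segmentation_premium(omega, lam_local, lam_world, beta_lw)
    print(f"omega = {omega:.1f}: premium = {prem * 100:.2f}% per year")
```

The premium shrinks linearly in \(\omega_t\) and vanishes at full integration, whatever values the risk premia take.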

38.2 Data Construction

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
from scipy import stats, optimize
from arch import arch_model
import warnings
warnings.filterwarnings('ignore')

plt.rcParams.update({
    'figure.figsize': (12, 6),
    'figure.dpi': 150,
    'font.size': 11,
    'axes.spines.top': False,
    'axes.spines.right': False
})
from datacore import DataCoreClient

client = DataCoreClient()

# Vietnamese market returns
vn_index = client.get_index_returns(
    index='VNINDEX',
    start_date='2000-07-01',
    end_date='2024-12-31',
    frequency='monthly',
    fields=['date', 'return', 'total_return_index']
)
vn_index['date'] = pd.to_datetime(vn_index['date'])
vn_index = vn_index.set_index('date')

# Global and regional indices (USD-denominated for comparability)
global_indices = client.get_global_indices(
    indices=[
        'MSCI_WORLD', 'MSCI_EM', 'MSCI_ASIA_PAC_EX_JP',
        'MSCI_FM',  # Frontier markets
        'SP500', 'STOXX600',
        'MSCI_CHINA', 'MSCI_THAILAND', 'MSCI_INDONESIA',
        'MSCI_PHILIPPINES', 'MSCI_MALAYSIA'
    ],
    start_date='2000-07-01',
    end_date='2024-12-31',
    frequency='monthly',
    currency='USD'
)
global_indices['date'] = pd.to_datetime(global_indices['date'])
global_indices = global_indices.pivot(index='date', columns='index', values='return')

# Vietnam returns in USD for apples-to-apples comparison
vn_usd = client.get_index_returns(
    index='VNINDEX',
    start_date='2000-07-01',
    end_date='2024-12-31',
    frequency='monthly',
    currency='USD'
)
vn_usd['date'] = pd.to_datetime(vn_usd['date'])
global_indices['VIETNAM'] = vn_usd.set_index('date')['return']

# VND/USD exchange rate
fx = client.get_exchange_rates(
    pair='USD_VND',
    start_date='2000-07-01',
    end_date='2024-12-31',
    frequency='monthly'
)
fx['date'] = pd.to_datetime(fx['date'])
fx = fx.set_index('date')

# Global factor returns (Fama-French global)
global_factors = client.get_global_factor_returns(
    start_date='2000-07-01',
    end_date='2024-12-31',
    frequency='monthly',
    factors=['mkt_excess_world', 'smb_world', 'hml_world',
             'rmw_world', 'cma_world', 'wml_world']
)
global_factors['date'] = pd.to_datetime(global_factors['date'])
global_factors = global_factors.set_index('date')

# Vietnamese factor returns (local)
local_factors = client.get_factor_returns(
    market='vietnam',
    start_date='2008-01-01',
    end_date='2024-12-31',
    factors=['mkt_excess', 'smb', 'hml', 'rmw', 'cma', 'wml']
)
local_factors['date'] = pd.to_datetime(local_factors['date'])
local_factors = local_factors.set_index('date')

print(f"Vietnam index: {len(vn_index)} months")
print(f"Global indices: {global_indices.shape}")
print(f"Global factors: {len(global_factors)} months")

38.2.1 Vietnam’s Liberalization Timeline

liberalization_events = pd.DataFrame([
    ('2000-07-28', 'HOSE opens', 'Institutional'),
    ('2003-07-01', 'FOL raised to 30%', 'Ownership'),
    ('2005-03-01', 'HNX opens', 'Institutional'),
    ('2005-09-01', 'FOL raised to 49%', 'Ownership'),
    ('2006-01-01', 'Securities Law enacted', 'Legal'),
    ('2007-01-11', 'WTO accession', 'Trade'),
    ('2009-06-24', 'UPCoM opens', 'Institutional'),
    ('2012-01-01', 'SSC restructuring', 'Regulatory'),
    ('2015-09-01', 'FOL selectively removed', 'Ownership'),
    ('2017-08-01', 'Derivatives market opens', 'Institutional'),
    ('2018-09-01', 'FTSE watch list for upgrade', 'Index'),
    ('2021-01-01', 'New Securities Law', 'Legal'),
    ('2023-06-01', 'KRX trading system', 'Infrastructure'),
], columns=['date', 'event', 'category'])
liberalization_events['date'] = pd.to_datetime(liberalization_events['date'])

print("Vietnam Liberalization Timeline:")
for _, row in liberalization_events.iterrows():
    print(f"  {row['date'].strftime('%Y-%m')}: {row['event']} [{row['category']}]")

38.3 Correlation-Based Integration Measures

38.3.1 Rolling Correlations

The simplest integration metric is the correlation between Vietnamese and global market returns. Higher correlation suggests more integration, since returns are then driven by the same global factors. However, simple correlation is confounded by volatility changes: measured correlations rise mechanically in high-volatility periods even when the underlying linkage is unchanged (Longin and Solnik 2001).
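
The volatility confound can be seen in a small simulation. In the sketch below, the true linkage between a simulated local market and a simulated global market is held constant, yet the sample correlation differs sharply between calm and turbulent subsamples. All parameters (the beta of 0.5, the 75th-percentile threshold, the seed) are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
beta = 0.5                                   # constant true linkage

world = rng.normal(0.0, 1.0, n)              # simulated global return
local = beta * world + rng.normal(0.0, 1.0, n)  # local return = global part + noise

# Split the sample by the size of the global shock
threshold = np.quantile(np.abs(world), 0.75)
turbulent = np.abs(world) >= threshold
calm = ~turbulent

corr_all = np.corrcoef(world, local)[0, 1]
corr_calm = np.corrcoef(world[calm], local[calm])[0, 1]
corr_turbulent = np.corrcoef(world[turbulent], local[turbulent])[0, 1]

print(f"Full sample: {corr_all:.3f}")
print(f"Calm:        {corr_calm:.3f}")
print(f"Turbulent:   {corr_turbulent:.3f}")
```

Because beta never changes here, any gap between the subsample correlations is purely an artifact of conditioning on volatility, which is why rolling correlations alone can overstate integration around crises.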

# Align all series
indices_aligned = global_indices.dropna(subset=['VIETNAM', 'MSCI_WORLD']).copy()

# Rolling 36-month correlations
rolling_window = 36

corr_series = {}
for idx in ['MSCI_WORLD', 'MSCI_EM', 'MSCI_ASIA_PAC_EX_JP',
            'SP500', 'MSCI_CHINA', 'MSCI_THAILAND']:
    if idx in indices_aligned.columns:
        # Pairwise rolling correlation between Vietnam and index `idx`
        corr_series[idx] = (
            indices_aligned['VIETNAM']
            .rolling(rolling_window)
            .corr(indices_aligned[idx])
        )

corr_df = pd.DataFrame(corr_series)
fig, ax = plt.subplots(figsize=(14, 6))

colors_idx = {
    'MSCI_WORLD': '#2C5F8A', 'MSCI_EM': '#C0392B',
    'MSCI_ASIA_PAC_EX_JP': '#27AE60', 'SP500': '#8E44AD',
    'MSCI_CHINA': '#E67E22', 'MSCI_THAILAND': '#1ABC9C'
}

for idx, color in colors_idx.items():
    if idx in corr_df.columns:
        ax.plot(corr_df.index, corr_df[idx], color=color,
                linewidth=1.5, label=idx.replace('MSCI_', '').replace('_', ' '),
                alpha=0.85)

# Add liberalization events
for _, event in liberalization_events.iterrows():
    if event['date'] >= corr_df.index.min():
        ax.axvline(x=event['date'], color='gray', linewidth=0.5,
                   linestyle=':', alpha=0.6)

ax.axhline(y=0, color='black', linewidth=0.5)
ax.set_ylabel('Correlation with Vietnam')
ax.set_title('Rolling 36-Month Correlation: Vietnam vs Global Markets')
ax.legend(fontsize=8, ncol=3)
ax.set_ylim([-0.3, 0.8])

plt.tight_layout()
plt.show()
Figure 38.1

38.3.2 DCC-GARCH Dynamic Correlations

To separate changes in correlation from changes in volatility, we estimate a Dynamic Conditional Correlation (DCC) model (Engle 2002). The DCC decomposes the time-varying covariance matrix into time-varying volatilities and a time-varying correlation matrix:

\[ H_t = D_t R_t D_t \tag{38.5}\]

where \(D_t = \text{diag}(\sigma_{1,t}, \ldots, \sigma_{n,t})\) and \(R_t\) is the conditional correlation matrix that evolves according to:

\[ Q_t = (1 - a - b) \bar{Q} + a \epsilon_{t-1} \epsilon_{t-1}' + b Q_{t-1} \tag{38.6}\]

\[ R_t = \text{diag}(Q_t)^{-1/2} Q_t \text{diag}(Q_t)^{-1/2} \tag{38.7}\]

Here \(\epsilon_t\) is the vector of GARCH-standardized residuals, \(\bar{Q}\) is their unconditional covariance matrix, and the scalars \(a, b \geq 0\) with \(a + b < 1\) govern how strongly the correlation reacts to news and how persistent it is.

def estimate_dcc(y1, y2, p=1, q=1):
    """
    Two-step DCC-GARCH estimation.
    Step 1: Univariate GARCH for each series.
    Step 2: DCC parameters from standardized residuals.
    """
    # Step 1: Univariate GARCH(1,1) for each series
    models = []
    std_resids = []
    cond_vols = []
    
    for y in [y1, y2]:
        am = arch_model(y * 100, vol='GARCH', p=p, q=q,
                          mean='Constant', dist='normal')
        res = am.fit(disp='off')
        models.append(res)
        std_resids.append(res.std_resid)
        cond_vols.append(res.conditional_volatility / 100)
    
    # Align residuals
    e1 = std_resids[0]
    e2 = std_resids[1]
    common = e1.dropna().index.intersection(e2.dropna().index)
    e1 = e1[common].values
    e2 = e2[common].values
    T = len(e1)
    
    # Step 2: DCC estimation
    # Q_bar = unconditional correlation of standardized residuals
    Q_bar = np.corrcoef(e1, e2)
    
    def dcc_loglik(params):
        a, b = params
        if a < 0 or b < 0 or a + b >= 1:
            return 1e10
        
        Q = np.zeros((T, 2, 2))
        R = np.zeros((T, 2, 2))
        Q[0] = Q_bar.copy()
        
        ll = 0
        for t in range(T):
            if t > 0:
                et = np.array([[e1[t-1]], [e2[t-1]]])
                Q[t] = (1 - a - b) * Q_bar + a * (et @ et.T) + b * Q[t-1]
            
            # Normalize
            d = np.sqrt(np.diag(Q[t]))
            if d[0] > 0 and d[1] > 0:
                R[t] = Q[t] / np.outer(d, d)
            else:
                R[t] = np.eye(2)
            
            # Clip correlation
            R[t, 0, 1] = np.clip(R[t, 0, 1], -0.999, 0.999)
            R[t, 1, 0] = R[t, 0, 1]
            
            # Log-likelihood contribution
            det_R = 1 - R[t, 0, 1] ** 2
            if det_R > 0:
                et_vec = np.array([e1[t], e2[t]])
                ll += -0.5 * (np.log(det_R) +
                              et_vec @ np.linalg.inv(R[t]) @ et_vec -
                              et_vec @ et_vec)
        
        return -ll
    
    result = optimize.minimize(dcc_loglik, [0.05, 0.90],
                                method='Nelder-Mead',
                                options={'maxiter': 5000})
    a_hat, b_hat = result.x
    
    # Reconstruct dynamic correlations
    Q = np.zeros((T, 2, 2))
    dcc_corr = np.zeros(T)
    Q[0] = Q_bar.copy()
    
    for t in range(T):
        if t > 0:
            et = np.array([[e1[t-1]], [e2[t-1]]])
            Q[t] = (1 - a_hat - b_hat) * Q_bar + a_hat * (et @ et.T) + b_hat * Q[t-1]
        
        d = np.sqrt(np.diag(Q[t]))
        if d[0] > 0 and d[1] > 0:
            dcc_corr[t] = Q[t, 0, 1] / (d[0] * d[1])
        else:
            dcc_corr[t] = 0
    
    return {
        'a': a_hat, 'b': b_hat,
        'persistence': a_hat + b_hat,
        'dcc_corr': pd.Series(dcc_corr, index=common),
        'cond_vol_1': cond_vols[0],
        'cond_vol_2': cond_vols[1]
    }

# Estimate DCC: Vietnam vs MSCI World
vn_ret = indices_aligned['VIETNAM'].dropna()
world_ret = indices_aligned['MSCI_WORLD'].dropna()
common_dates = vn_ret.index.intersection(world_ret.index)

dcc_result = estimate_dcc(vn_ret[common_dates], world_ret[common_dates])

print(f"DCC Parameters:")
print(f"  a (news): {dcc_result['a']:.4f}")
print(f"  b (persistence): {dcc_result['b']:.4f}")
print(f"  a + b: {dcc_result['persistence']:.4f}")
print(f"\nDCC Correlation with MSCI World:")
print(f"  Mean: {dcc_result['dcc_corr'].mean():.3f}")
print(f"  Min:  {dcc_result['dcc_corr'].min():.3f}")
print(f"  Max:  {dcc_result['dcc_corr'].max():.3f}")
fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True,
                          gridspec_kw={'height_ratios': [2, 1]})

# Panel A: DCC correlation
axes[0].plot(dcc_result['dcc_corr'].index, dcc_result['dcc_corr'].values,
             color='#2C5F8A', linewidth=1.5)
axes[0].fill_between(dcc_result['dcc_corr'].index,
                       dcc_result['dcc_corr'].values, 0,
                       alpha=0.2, color='#2C5F8A')

# Liberalization events
event_colors = {'Ownership': '#C0392B', 'Trade': '#27AE60',
                'Institutional': '#E67E22', 'Legal': '#8E44AD',
                'Index': '#1ABC9C', 'Regulatory': '#F1C40F',
                'Infrastructure': '#3498DB'}

for _, event in liberalization_events.iterrows():
    if event['date'] >= dcc_result['dcc_corr'].index.min():
        color = event_colors.get(event['category'], 'gray')
        axes[0].axvline(x=event['date'], color=color, linewidth=1.5,
                         linestyle='--', alpha=0.7)
        axes[0].text(event['date'], axes[0].get_ylim()[1] * 0.95,
                      event['event'][:15], rotation=90, fontsize=6,
                      va='top', color=color)

axes[0].set_ylabel('Dynamic Conditional Correlation')
axes[0].set_title('Panel A: DCC-GARCH Correlation (Vietnam–World)')
axes[0].axhline(y=0, color='black', linewidth=0.5)

# Panel B: Conditional volatilities
vol1 = dcc_result['cond_vol_1'] * np.sqrt(12)  # Annualized
vol2 = dcc_result['cond_vol_2'] * np.sqrt(12)
common_vol = vol1.index.intersection(vol2.index)

axes[1].plot(common_vol, vol1[common_vol], color='#C0392B',
             linewidth=1, label='Vietnam', alpha=0.8)
axes[1].plot(common_vol, vol2[common_vol], color='#2C5F8A',
             linewidth=1, label='World', alpha=0.8)
axes[1].set_ylabel('Annualized Cond. Vol')
axes[1].set_title('Panel B: Conditional Volatility')
axes[1].legend(fontsize=9)

plt.tight_layout()
plt.show()
Figure 38.2

38.3.3 Asymmetric Integration

Integration may be state-dependent: co-movement often increases during global crises (contagion) but not during local booms. Longin and Solnik (2001) and Ang and Chen (2002) show that equity correlations are higher during market downturns. We test for asymmetry:

# Classify global market regimes
world_ret_aligned = world_ret[common_dates]
vn_ret_aligned = vn_ret[common_dates]

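# Bear: world return in the bottom quartile
# Bull: world return in the top quartile
# Caveat: conditioning on the world return truncates its variance, which
# biases within-regime correlations. Compare regimes with each other,
# not with the full-sample correlation.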
q25 = world_ret_aligned.quantile(0.25)
q75 = world_ret_aligned.quantile(0.75)

bear = world_ret_aligned <= q25
bull = world_ret_aligned >= q75
normal = ~bear & ~bull

regimes = {
    'Bear (bottom 25%)': bear,
    'Normal (middle 50%)': normal,
    'Bull (top 25%)': bull,
    'All': pd.Series(True, index=world_ret_aligned.index)
}

print("Asymmetric Correlation:")
print(f"{'Regime':<25} {'Correlation':>12} {'N months':>10}")
print("-" * 47)

for name, mask in regimes.items():
    r_vn = vn_ret_aligned[mask]
    r_w = world_ret_aligned[mask]
    corr = r_vn.corr(r_w)
    print(f"{name:<25} {corr:>12.3f} {mask.sum():>10}")

# Test: is bear correlation > bull correlation?
r_bear_vn = vn_ret_aligned[bear]
r_bear_w = world_ret_aligned[bear]
r_bull_vn = vn_ret_aligned[bull]
r_bull_w = world_ret_aligned[bull]

# Fisher z-transformation test
def fisher_z_test(r1, n1, r2, n2):
    z1 = np.arctanh(r1)
    z2 = np.arctanh(r2)
    se = np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z_stat = (z1 - z2) / se
    p_val = 2 * (1 - stats.norm.cdf(abs(z_stat)))
    return z_stat, p_val

z, p = fisher_z_test(
    r_bear_vn.corr(r_bear_w), len(r_bear_vn),
    r_bull_vn.corr(r_bull_w), len(r_bull_vn)
)
print(f"\nFisher z-test (bear vs bull): z = {z:.2f}, p = {p:.4f}")

38.4 Factor-Based Integration Measures

38.4.1 The Pukthuanthong-Roll R² Measure

Pukthuanthong and Roll (2009) propose measuring integration as the \(R^2\) from regressing a country’s returns on a set of global principal components. The intuition: if a market is fully integrated, global factors should explain all of its systematic return variation.

def pukthuanthong_roll_integration(country_returns, global_returns_matrix,
                                     n_components=10, rolling_window=36):
    """
    Pukthuanthong-Roll (2009) R²-based integration measure.
    
    1. Extract principal components from global returns.
    2. Regress country returns on these PCs.
    3. R² = degree of integration.
    """
    common = country_returns.dropna().index.intersection(
        global_returns_matrix.dropna().index
    )
    
    dates = sorted(common)
    T = len(dates)
    
    integration = []
    
    for t in range(rolling_window, T):
        window = dates[t - rolling_window:t]
        
        # Global returns in window
        G = global_returns_matrix.loc[window].dropna(axis=1)
        if G.shape[1] < n_components:
            continue
        
        # Standardize
        G_std = (G - G.mean()) / G.std()
        
        # PCA
        cov = G_std.T @ G_std / len(G_std)
        eigenvalues, eigenvectors = np.linalg.eigh(cov.values)
        
        # Sort descending
        idx = np.argsort(-eigenvalues)
        eigenvalues = eigenvalues[idx]
        eigenvectors = eigenvectors[:, idx]
        
        # Project onto top K PCs
        PCs = G_std.values @ eigenvectors[:, :n_components]
        
        # Regress country returns on PCs
        y = country_returns.loc[window].values
        X = sm.add_constant(PCs)
        
        try:
            model = sm.OLS(y, X).fit()
            integration.append({
                'date': dates[t],
                'r_squared': model.rsquared,
                'adj_r_squared': model.rsquared_adj,
                'var_explained_pc1': eigenvalues[0] / eigenvalues.sum(),
                'n_countries': G.shape[1]
            })
        except Exception:
            pass
    
    return pd.DataFrame(integration)

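# Build global returns matrix from multiple country and regional indices.
# Drop Vietnam itself and MSCI_FM: Vietnam is a constituent of the frontier
# index, so keeping it could inflate the R² mechanically.
global_matrix = global_indices.drop(columns=['VIETNAM', 'MSCI_FM'], errors='ignore')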

pr_result = pukthuanthong_roll_integration(
    indices_aligned['VIETNAM'],
    global_matrix,
    n_components=5,
    rolling_window=36
)

if len(pr_result) > 0:
    pr_result['date'] = pd.to_datetime(pr_result['date'])
    print(f"Pukthuanthong-Roll Integration (R²):")
    print(f"  Mean: {pr_result['r_squared'].mean():.3f}")
    print(f"  2008-2012: {pr_result[(pr_result['date'] >= '2008') & (pr_result['date'] < '2013')]['r_squared'].mean():.3f}")
    print(f"  2013-2018: {pr_result[(pr_result['date'] >= '2013') & (pr_result['date'] < '2019')]['r_squared'].mean():.3f}")
    print(f"  2019-2024: {pr_result[pr_result['date'] >= '2019']['r_squared'].mean():.3f}")

38.4.2 Global vs. Local Factor Pricing

Griffin (2002) tests whether global or local versions of the Fama-French factors better explain country-level returns. We implement this horse race for Vietnam:

# Align global and local factors
common_factor_dates = (
    global_factors.index
    .intersection(local_factors.index)
    .intersection(vn_index.index)
)

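# Note: this is the raw local-currency index return, used as a stand-in for
# the excess return because no risk-free series is loaded above. The alphas
# below therefore absorb the average risk-free rate.
vn_excess = vn_index.loc[common_factor_dates, 'return']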

# Model 1: Global FF5
X_global = sm.add_constant(
    global_factors.loc[common_factor_dates,
                        ['mkt_excess_world', 'smb_world', 'hml_world',
                         'rmw_world', 'cma_world']]
)
model_global = sm.OLS(vn_excess, X_global).fit(
    cov_type='HAC', cov_kwds={'maxlags': 6}
)

# Model 2: Local FF5
X_local = sm.add_constant(
    local_factors.loc[common_factor_dates,
                       ['mkt_excess', 'smb', 'hml', 'rmw', 'cma']]
)
model_local = sm.OLS(vn_excess, X_local).fit(
    cov_type='HAC', cov_kwds={'maxlags': 6}
)

# Model 3: Both global and local
X_both = sm.add_constant(pd.concat([
    global_factors.loc[common_factor_dates,
                        ['mkt_excess_world', 'smb_world', 'hml_world']],
    local_factors.loc[common_factor_dates,
                       ['mkt_excess', 'smb', 'hml']]
], axis=1))
model_both = sm.OLS(vn_excess, X_both).fit(
    cov_type='HAC', cov_kwds={'maxlags': 6}
)

print("Global vs Local Factor Models for VN-Index:")
print(f"{'Model':<20} {'R²':>8} {'Adj R²':>8} {'α (ann)':>10} {'α t-stat':>10}")
print("-" * 56)
for name, mod in [('Global FF5', model_global),
                    ('Local FF5', model_local),
                    ('Global + Local', model_both)]:
    print(f"{name:<20} {mod.rsquared:>8.3f} {mod.rsquared_adj:>8.3f} "
          f"{mod.params['const']*12:>10.4f} {mod.tvalues['const']:>10.2f}")
rolling_r2 = []
rw = 36

for t in range(rw, len(common_factor_dates)):
    window = common_factor_dates[t - rw:t]
    y = vn_excess[window]
    
    # Global
    X_g = sm.add_constant(global_factors.loc[window,
                           ['mkt_excess_world', 'smb_world', 'hml_world',
                            'rmw_world', 'cma_world']])
    try:
        r2_g = sm.OLS(y, X_g).fit().rsquared
    except Exception:
        r2_g = np.nan
    
    # Local
    X_l = sm.add_constant(local_factors.loc[window,
                           ['mkt_excess', 'smb', 'hml', 'rmw', 'cma']])
    try:
        r2_l = sm.OLS(y, X_l).fit().rsquared
    except Exception:
        r2_l = np.nan
    
    rolling_r2.append({
        'date': common_factor_dates[t],
        'r2_global': r2_g,
        'r2_local': r2_l,
        'ratio': r2_g / r2_l if r2_l > 0 else np.nan
    })

r2_df = pd.DataFrame(rolling_r2)

fig, axes = plt.subplots(2, 1, figsize=(14, 8), sharex=True)

axes[0].plot(r2_df['date'], r2_df['r2_global'], color='#2C5F8A',
             linewidth=1.5, label='Global FF5')
axes[0].plot(r2_df['date'], r2_df['r2_local'], color='#C0392B',
             linewidth=1.5, label='Local FF5')
axes[0].set_ylabel('R²')
axes[0].set_title('Panel A: Global vs Local Factor R²')
axes[0].legend()

axes[1].plot(r2_df['date'], r2_df['ratio'], color='#27AE60', linewidth=1.5)
axes[1].axhline(y=1, color='gray', linewidth=1, linestyle='--',
                label='Full integration (ratio = 1)')
axes[1].set_ylabel('Global R² / Local R²')
axes[1].set_title('Panel B: Integration Ratio')
axes[1].legend()
axes[1].set_ylim([0, 1.5])

plt.tight_layout()
plt.show()
Figure 38.3

38.5 Pricing-Error-Based Integration

38.5.1 The Bekaert-Harvey Approach

Bekaert and Harvey (1995) measure integration by asking how well a global CAPM prices local assets. We use a simplified rolling-regression diagnostic in that spirit rather than their full regime-switching model. Under integration, the local market alpha (the intercept in a regression on the global market) should be zero, and the global risk premium should explain the local expected return. Under segmentation, the alpha captures the segmentation premium.

def bekaert_harvey_integration(local_return, global_return,
                                 rolling_window=36):
    """
    Rolling alpha from regressing local on global market.
    Under integration, alpha -> 0.
    """
    common = local_return.dropna().index.intersection(global_return.dropna().index)
    
    results = []
    for t in range(rolling_window, len(common)):
        window = common[t - rolling_window:t]
        y = local_return[window]
        X = sm.add_constant(global_return[window])
        
        model = sm.OLS(y, X).fit()
        
        results.append({
            'date': common[t],
            'alpha': model.params['const'],
            'alpha_t': model.tvalues['const'],
            'beta_global': model.params.iloc[1],
            'r_squared': model.rsquared
        })
    
    return pd.DataFrame(results)

bh_result = bekaert_harvey_integration(
    indices_aligned['VIETNAM'],
    indices_aligned['MSCI_WORLD'],
    rolling_window=36
)

# The absolute alpha (pricing-error magnitude) is a proxy for the segmentation premium
bh_result['abs_alpha_ann'] = bh_result['alpha'].abs() * 12

print("Bekaert-Harvey Integration Diagnostic:")
print(f"  Mean |α| (ann.): {bh_result['abs_alpha_ann'].mean():.4f}")
print(f"  Mean β_world: {bh_result['beta_global'].mean():.3f}")
print(f"  Mean R²: {bh_result['r_squared'].mean():.3f}")

38.5.2 Composite Integration Index

We combine all measures into a single composite index of Vietnamese market integration:

# Standardize each measure to [0, 1] using historical percentile ranks
measures = pd.DataFrame(index=bh_result['date'])

# 1. DCC correlation (higher = more integrated)
dcc_aligned = dcc_result['dcc_corr'].reindex(measures.index).interpolate()
measures['dcc_corr'] = dcc_aligned

# 2. PR R² (higher = more integrated)
pr_aligned = pr_result.set_index('date')['r_squared'].reindex(measures.index).interpolate()
measures['pr_r2'] = pr_aligned

# 3. Global/Local R² ratio (higher = more integrated)
r2_aligned = r2_df.set_index('date')['ratio'].reindex(measures.index).interpolate()
measures['gl_ratio'] = r2_aligned

# 4. |Alpha| from global CAPM (lower = more integrated)
# Invert: 1 - percentile rank of |alpha|
measures['inv_alpha'] = bh_result.set_index('date')['abs_alpha_ann']
measures['inv_alpha'] = 1 - measures['inv_alpha'].rank(pct=True)

# 5. Global beta (higher = more integrated, up to a point)
measures['global_beta'] = bh_result.set_index('date')['beta_global']

# Standardize to percentile ranks
for col in ['dcc_corr', 'pr_r2', 'gl_ratio', 'inv_alpha', 'global_beta']:
    measures[f'{col}_rank'] = measures[col].rank(pct=True)

# Composite = equal-weighted average of ranks
rank_cols = [c for c in measures.columns if c.endswith('_rank')]
measures['composite'] = measures[rank_cols].mean(axis=1)

# Smooth with 6-month moving average
measures['composite_smooth'] = measures['composite'].rolling(6).mean()
fig, ax = plt.subplots(figsize=(14, 6))

ax.fill_between(measures.index, measures['composite_smooth'],
                 alpha=0.3, color='#2C5F8A')
ax.plot(measures.index, measures['composite_smooth'],
        color='#2C5F8A', linewidth=2)

# Add event markers
for _, event in liberalization_events.iterrows():
    if event['date'] >= measures.index.min():
        color = event_colors.get(event['category'], 'gray')
        ax.axvline(x=event['date'], color=color,
                   linewidth=1.5, linestyle='--', alpha=0.6)

ax.set_ylabel('Integration Index (0 = segmented, 1 = integrated)')
ax.set_title('Vietnam Equity Market Integration: Composite Index')
ax.set_ylim([0, 1])

# Legend for event categories
from matplotlib.patches import Patch
legend_patches = [Patch(facecolor=c, label=cat)
                   for cat, c in event_colors.items() if cat in
                   liberalization_events['category'].values]
ax.legend(handles=legend_patches, fontsize=7, loc='lower right', ncol=2)

plt.tight_layout()
plt.show()
Figure 38.4

38.6 Structural Break Detection

38.6.1 Bai-Perron Tests for Integration Regime Shifts

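We test whether Vietnam’s integration trajectory contains discrete structural breaks (i.e., sudden shifts in the integration level) rather than a smooth trend. A full Bai-Perron procedure estimates multiple break dates jointly; as simpler diagnostics, we use a CUSUM test for a break at an unknown date and Chow tests at the known liberalization dates: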

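def detect_breaks_cusum(series, significance=0.05):
    """
    OLS-based CUSUM test for a shift in the mean of `series`.

    Returns the scaled CUSUM path and the positions where it exceeds the
    critical boundary. The most likely break date is where |CUSUM| peaks.
    """
    y = series.dropna().values
    T = len(y)
    
    # Scaled cumulative sum of deviations from the full-sample mean.
    # Under a constant mean with i.i.d. errors this behaves like a Brownian
    # bridge. The smoothed composite is highly persistent, so y.std()
    # understates the long-run variance and the test tends to over-reject.
    cumsum = np.cumsum(y - y.mean()) / (y.std() * np.sqrt(T))
    
    # Critical values for the supremum of the absolute Brownian bridge
    crit_table = {0.10: 1.224, 0.05: 1.358, 0.01: 1.628}
    if significance not in crit_table:
        raise ValueError("significance must be one of 0.10, 0.05, 0.01")
    critical = crit_table[significance]
    
    breaks = [t for t in range(1, T - 1) if abs(cumsum[t]) > critical]
    
    return cumsum, breaks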

# Apply to composite index
composite_clean = measures['composite_smooth'].dropna()
cusum, break_points = detect_breaks_cusum(composite_clean)

# Alternative: Chow test at key liberalization dates
def chow_test(y, breakpoint_idx):
    """Simple Chow test for structural break."""
    T = len(y)
    y1 = y[:breakpoint_idx]
    y2 = y[breakpoint_idx:]
    
    # Full sample regression (on constant)
    rss_full = np.sum((y - y.mean()) ** 2)
    
    # Split samples
    rss1 = np.sum((y1 - y1.mean()) ** 2)
    rss2 = np.sum((y2 - y2.mean()) ** 2)
    rss_split = rss1 + rss2
    
    k = 1  # Number of parameters
    f_stat = ((rss_full - rss_split) / k) / (rss_split / (T - 2 * k))
    p_val = 1 - stats.f.cdf(f_stat, k, T - 2 * k)
    
    return f_stat, p_val

print("Chow Tests for Structural Breaks at Key Dates:")
print(f"{'Event':<30} {'Date':>12} {'F-stat':>10} {'p-value':>10}")
print("-" * 62)

for _, event in liberalization_events.iterrows():
    if event['date'] < composite_clean.index.min():
        continue
    # Find nearest date
    nearest = composite_clean.index.searchsorted(event['date'])
    if nearest < 12 or nearest > len(composite_clean) - 12:
        continue
    
    f_stat, p_val = chow_test(composite_clean.values, nearest)
    sig = '***' if p_val < 0.01 else '**' if p_val < 0.05 else '*' if p_val < 0.1 else ''
    print(f"{event['event']:<30} {event['date'].strftime('%Y-%m'):>12} "
          f"{f_stat:>10.2f} {p_val:>10.4f} {sig}")
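
The Chow tests above treat each break date as known. When the date is unknown, the least-squares estimator that forms the building block of Bai-Perron picks the split that minimizes the pooled residual sum of squares. The sketch below handles a single break in the mean on simulated data. The function name, the 15% trimming fraction, and the simulated series are illustrative assumptions, and the sketch neither supplies critical values nor handles multiple breaks.

```python
import numpy as np

def estimate_single_break(y, trim=0.15):
    """
    Least-squares estimate of one break in the mean of y.
    Returns (break_index, candidates, rss); break_index is the first
    observation of the second regime.
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    lo = max(int(np.floor(trim * T)), 2)   # keep a minimum segment length
    hi = T - lo
    if hi <= lo:
        raise ValueError("series too short for the chosen trimming")

    candidates = np.arange(lo, hi + 1)
    rss = np.empty(len(candidates))
    for i, k in enumerate(candidates):
        left, right = y[:k], y[k:]
        # Pooled RSS from fitting a separate mean to each segment
        rss[i] = (((left - left.mean()) ** 2).sum()
                  + ((right - right.mean()) ** 2).sum())

    break_index = int(candidates[np.argmin(rss)])
    return break_index, candidates, rss


# Simulated integration index with a level shift at observation 120
rng = np.random.default_rng(2)
simulated = np.concatenate([
    rng.normal(0.3, 0.05, 120),
    rng.normal(0.6, 0.05, 80),
])

break_idx, candidates, rss = estimate_single_break(simulated)
print(f"Estimated break at observation {break_idx} (true break: 120)")
```

Applied to `composite_clean.values`, the returned index could be mapped back to a date with `composite_clean.index[break_idx]` and compared against the liberalization timeline.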

38.7 The Segmentation Premium for Vietnam

38.7.1 Cross-Sectional Evidence

Under partial segmentation, stocks with higher foreign ownership should have lower expected returns (because foreign investors can diversify away local risk). This yields a testable prediction: foreign ownership should be negatively associated with expected returns, controlling for global risk exposure.

# Get monthly stock returns with foreign ownership
stock_data = client.get_monthly_returns(
    exchanges=['HOSE', 'HNX'],
    start_date='2008-01-01',
    end_date='2024-12-31',
    fields=['ticker', 'month_end', 'monthly_return', 'market_cap',
            'foreign_ownership_pct']
)
stock_data['month_end'] = pd.to_datetime(stock_data['month_end'])

# Fama-MacBeth: regress returns on lagged foreign ownership
stock_data = stock_data.sort_values(['ticker', 'month_end'])
stock_data['fol_lag'] = (
    stock_data.groupby('ticker')['foreign_ownership_pct'].shift(1)
)
stock_data['log_mcap'] = np.log(stock_data['market_cap'].clip(lower=1))

# Monthly cross-sectional regressions
gamma_fol = []
for month, group in stock_data.dropna(subset=['fol_lag', 'monthly_return', 'log_mcap']).groupby('month_end'):
    if len(group) < 100:
        continue
    
    y = group['monthly_return'].values
    X = sm.add_constant(group[['fol_lag', 'log_mcap']].values)
    
    try:
        model = sm.OLS(y, X).fit()
        gamma_fol.append({
            'month': month,
            'gamma_fol': model.params[1],
            'gamma_size': model.params[2],
            'n': len(group)
        })
    except Exception:
        # Skip months where the cross-sectional regression cannot be estimated
        continue

gamma_fol_df = pd.DataFrame(gamma_fol)

mean_gamma = gamma_fol_df['gamma_fol'].mean()
se_gamma = gamma_fol_df['gamma_fol'].std() / np.sqrt(len(gamma_fol_df))
t_gamma = mean_gamma / se_gamma

print(f"Fama-MacBeth: Foreign Ownership and Expected Returns")
print(f"  γ_FOL (monthly):  {mean_gamma:.6f}")
print(f"  γ_FOL (ann.):     {mean_gamma * 12:.4f}")
print(f"  t-statistic:      {t_gamma:.2f}")
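# monthly_return is in decimals and foreign_ownership_pct is assumed to be in
# percentage points, so multiply by 100 to report the effect in pp of return
print(f"  Interpretation:   A 10pp increase in foreign ownership is "
      f"associated with a {mean_gamma * 12 * 10 * 100:.2f}pp change in "
      f"annual expected returns")

fig, axes = plt.subplots(1, 2, figsize=(14, 5))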

# Panel A: Quintile returns by foreign ownership
fol_quintiles = stock_data.dropna(subset=['fol_lag', 'monthly_return']).copy()
fol_quintiles['fol_q'] = (
    fol_quintiles.groupby('month_end')['fol_lag']
    .transform(lambda x: pd.qcut(x.rank(method='first'), 5,
                                    labels=['Q1\n(Low FOL)', 'Q2', 'Q3', 'Q4',
                                            'Q5\n(High FOL)']))
)

q_returns = (
    fol_quintiles.groupby('fol_q')['monthly_return']
    .mean() * 12 * 100
)

colors_q = plt.cm.RdYlGn_r(np.linspace(0.2, 0.8, 5))
axes[0].bar(range(5), q_returns.values, color=colors_q,
            edgecolor='white', alpha=0.85)
axes[0].set_xticks(range(5))
axes[0].set_xticklabels(q_returns.index)
axes[0].set_ylabel('Ann. Return (%)')
axes[0].set_title('Panel A: Returns by Foreign Ownership Quintile')
axes[0].axhline(y=0, color='black', linewidth=0.5)

# Panel B: Rolling FM coefficient
gamma_fol_df['date'] = pd.to_datetime(gamma_fol_df['month'])
rolling_gamma = gamma_fol_df.set_index('date')['gamma_fol'].rolling(24).mean() * 12

axes[1].plot(rolling_gamma.index, rolling_gamma.values,
             color='#2C5F8A', linewidth=1.5)
axes[1].axhline(y=0, color='black', linewidth=0.5)
axes[1].set_ylabel('γ_FOL (annualized)')
axes[1].set_title('Panel B: Rolling Segmentation Premium')
axes[1].fill_between(rolling_gamma.index, rolling_gamma.values, 0,
                      where=rolling_gamma.values < 0,
                      alpha=0.3, color='#27AE60', label='Negative (integration)')
axes[1].fill_between(rolling_gamma.index, rolling_gamma.values, 0,
                      where=rolling_gamma.values >= 0,
                      alpha=0.3, color='#C0392B', label='Positive (segmentation)')
axes[1].legend(fontsize=8)

plt.tight_layout()
plt.show()
Figure 38.5

38.8 Exchange Rate Risk and Integration

38.8.1 Is Currency Risk Priced?

In a partially integrated market, exchange rate risk may carry a separate premium. Jorion (1991) and Dumas and Solnik (1995) test whether currency exposure is priced beyond global equity risk. For Vietnam, the VND/USD exchange rate is managed (a crawling peg with occasional step devaluations), creating a specific risk that is neither fully diversifiable nor fully priced by global equity factors.

# VND depreciation
fx_return = fx['rate'].pct_change()
fx_return.name = 'fx_return'

# Merge with stock data
stock_fx = stock_data.merge(
    fx_return.to_frame().reset_index().rename(columns={'date': 'month_end'}),
    on='month_end', how='left'
)

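# Estimate a full-sample FX beta for each stock (at least 48 valid months).
# Note: a rolling 60-month beta would avoid using future data in the sort.
# Then test with a quintile sort whether FX beta is priced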
fx_betas = {}
for ticker, group in stock_fx.groupby('ticker'):
    if len(group) < 60:
        continue
    group = group.sort_values('month_end')
    y = group['monthly_return']
    x = group[['fx_return']].dropna()
    common = y.dropna().index.intersection(x.index)
    if len(common) < 48:
        continue
    model = sm.OLS(y[common], sm.add_constant(x.loc[common])).fit()
    fx_betas[ticker] = model.params.get('fx_return', np.nan)

fx_beta_series = pd.Series(fx_betas, name='fx_beta')

# Cross-sectional test: do stocks with higher FX beta earn different returns?
stock_fx_beta = stock_fx.merge(
    fx_beta_series.to_frame().reset_index().rename(columns={'index': 'ticker'}),
    on='ticker', how='left'
)

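# Quintile sort on FX beta, assigned once per ticker so that all of a
# stock's months fall into the same quintile
fx_quintile_map = pd.qcut(fx_beta_series.dropna().rank(method='first'),
                          5, labels=False)
fx_q = stock_fx_beta.dropna(subset=['fx_beta', 'monthly_return']).copy()
fx_q['fx_quintile'] = fx_q['ticker'].map(fx_quintile_map)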

fx_premium = fx_q.groupby('fx_quintile')['monthly_return'].mean() * 12
print("Returns by FX Beta Quintile:")
for q, ret in fx_premium.items():
    print(f"  Q{q+1}: {ret*100:.2f}% ann.")
print(f"  Q5-Q1: {(fx_premium.iloc[-1] - fx_premium.iloc[0])*100:.2f}% ann.")
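
The FX betas above are estimated once per stock over the full sample, so the sort uses information that was not available at the time. A rolling estimate avoids this. The sketch below shows one way to compute a rolling 60-month beta as rolling covariance over rolling variance, on simulated data for a single stock; the return series, the true beta of -1.5, and the variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Simulated monthly data for one stock (assumed values)
rng = np.random.default_rng(7)
n = 120
months = pd.date_range('2010-01-01', periods=n, freq='MS')
fx = pd.Series(rng.normal(0.002, 0.01, n), index=months, name='fx_return')
stock = pd.Series(0.01 - 1.5 * fx.values + rng.normal(0, 0.05, n),
                  index=months, name='monthly_return')

window, min_obs = 60, 48

# Rolling OLS slope of stock returns on FX returns: cov(r, fx) / var(fx)
rolling_beta = (
    stock.rolling(window, min_periods=min_obs).cov(fx)
    / fx.rolling(window, min_periods=min_obs).var()
)

# Lag by one month so the beta used in month t only uses data through t-1
beta_for_sort = rolling_beta.shift(1)

print(rolling_beta.dropna().describe())
```

In a panel, the same calculation would be applied within each ticker (for example through `groupby('ticker')`), and quintiles would then be formed each month on the lagged beta rather than once for the whole sample.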

38.9 Contagion vs. Interdependence

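During global crises, correlations between Vietnam and world markets spike. The question is whether this represents contagion (a structural change in the transmission mechanism) or simply interdependence (normal co-movement amplified by higher volatility). Longin and Solnik (2001) show that correlation increases mechanically with volatility even without any change in the underlying dependence structure. Forbes and Rigobon (2002) propose a correction for this bias: the crisis-period correlation \(\rho\) is deflated to \(\rho^{*} = \rho / \sqrt{1 + \delta (1 - \rho^{2})}\), where \(\delta\) is the relative increase in the variance of the global market during the crisis. Contagion is then inferred only if \(\rho^{*}\) is significantly higher than the tranquil-period correlation.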

def forbes_rigobon_test(r_local, r_global, crisis_dates, tranquil_dates):
    """
    Forbes-Rigobon (2002) contagion test.
    Adjusts for heteroskedasticity-induced bias in correlation.
    
    H0: No contagion (correlation increase is explained by volatility)
    H1: Contagion (correlation increase exceeds what volatility explains)
    """
    r_l_crisis = r_local[crisis_dates]
    r_g_crisis = r_global[crisis_dates]
    r_l_tranquil = r_local[tranquil_dates]
    r_g_tranquil = r_global[tranquil_dates]
    
    # Unadjusted correlations
    rho_crisis = r_l_crisis.corr(r_g_crisis)
    rho_tranquil = r_l_tranquil.corr(r_g_tranquil)
    
    # Relative increase in global-market variance during the crisis
    delta = r_g_crisis.var() / r_g_tranquil.var() - 1
    
    # Adjusted correlation
    rho_adj = rho_crisis / np.sqrt(1 + delta * (1 - rho_crisis ** 2))
    
    # Fisher z-test on adjusted vs tranquil (one-sided, matching H1: contagion
    # requires the adjusted crisis correlation to exceed the tranquil one)
    z_adj = np.arctanh(rho_adj)
    z_tranquil = np.arctanh(rho_tranquil)
    se = np.sqrt(1 / (len(r_l_crisis) - 3) + 1 / (len(r_l_tranquil) - 3))
    z_stat = (z_adj - z_tranquil) / se
    p_val = 1 - stats.norm.cdf(z_stat)
    
    return {
        'rho_crisis_raw': rho_crisis,
        'rho_crisis_adj': rho_adj,
        'rho_tranquil': rho_tranquil,
        'delta': delta,
        'z_stat': z_stat,
        'p_value': p_val,
        'contagion': p_val < 0.05
    }

# Define crisis and tranquil periods
crises = {
    'GFC (2008-09)': (pd.Timestamp('2008-09-01'), pd.Timestamp('2009-03-31')),
    'European Debt (2011-12)': (pd.Timestamp('2011-06-01'), pd.Timestamp('2012-06-30')),
    'COVID (2020)': (pd.Timestamp('2020-02-01'), pd.Timestamp('2020-06-30')),
    'Fed Tightening (2022)': (pd.Timestamp('2022-01-01'), pd.Timestamp('2022-12-31')),
}

# Tranquil = 24 months before each crisis
vn_aligned = indices_aligned['VIETNAM']
world_aligned = indices_aligned['MSCI_WORLD']

print("Contagion Tests (Forbes-Rigobon):")
print(f"{'Crisis':<28} {'ρ(raw)':>8} {'ρ(adj)':>8} {'ρ(calm)':>8} "
      f"{'z-stat':>8} {'p-val':>8} {'Result':>12}")
print("-" * 80)

for name, (start, end) in crises.items():
    crisis_mask = (vn_aligned.index >= start) & (vn_aligned.index <= end)
    tranquil_start = start - pd.DateOffset(months=24)
    tranquil_mask = ((vn_aligned.index >= tranquil_start) &
                      (vn_aligned.index < start))
    
    crisis_dates = vn_aligned.index[crisis_mask]
    tranquil_dates = vn_aligned.index[tranquil_mask]
    
    # The Fisher z standard error uses n - 3, so require at least 4 crisis months
    if len(crisis_dates) < 4 or len(tranquil_dates) < 12:
        continue
    
    result = forbes_rigobon_test(vn_aligned, world_aligned,
                                  crisis_dates, tranquil_dates)
    
    verdict = 'CONTAGION' if result['contagion'] else 'Interdependence'
    print(f"{name:<28} {result['rho_crisis_raw']:>8.3f} "
          f"{result['rho_crisis_adj']:>8.3f} {result['rho_tranquil']:>8.3f} "
          f"{result['z_stat']:>8.2f} {result['p_value']:>8.3f} "
          f"{verdict:>12}")
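
The heteroskedasticity bias that the adjustment corrects can be seen in a small simulation. In the sketch below, local returns load on global returns with the same beta in both regimes, so by construction there is no contagion; only the global volatility changes. All parameter values (beta of 0.5, a tripling of global volatility, sample sizes) are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(123)
n = 50_000                      # large sample so estimates are close to theory
beta, sigma_e = 0.5, 0.04       # same transmission mechanism in both regimes

# Global returns: crisis volatility is three times tranquil volatility
r_g_tranquil = rng.normal(0, 0.03, n)
r_g_crisis = rng.normal(0, 0.09, n)

# Local returns: identical beta and idiosyncratic volatility in both regimes
r_l_tranquil = beta * r_g_tranquil + rng.normal(0, sigma_e, n)
r_l_crisis = beta * r_g_crisis + rng.normal(0, sigma_e, n)

rho_tranquil = np.corrcoef(r_l_tranquil, r_g_tranquil)[0, 1]
rho_crisis = np.corrcoef(r_l_crisis, r_g_crisis)[0, 1]

# Forbes-Rigobon adjustment using the relative increase in global variance
delta = r_g_crisis.var() / r_g_tranquil.var() - 1
rho_adj = rho_crisis / np.sqrt(1 + delta * (1 - rho_crisis ** 2))

print(f"Tranquil correlation:        {rho_tranquil:.3f}")
print(f"Crisis correlation (raw):    {rho_crisis:.3f}")
print(f"Crisis correlation (adj.):   {rho_adj:.3f}")
```

The raw crisis correlation is far above the tranquil one even though the dependence structure is unchanged, while the adjusted correlation is close to the tranquil value. The adjustment relies on the global market being the source of the shock and on no omitted common factors, which is worth keeping in mind when interpreting the results for Vietnam.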

38.10 ASEAN Peer Comparison

Vietnam’s integration trajectory is best understood in the context of its ASEAN peers, which share similar starting conditions but have followed different liberalization paths:

fig, ax = plt.subplots(figsize=(14, 6))

asean_markets = {
    'VIETNAM': '#C0392B',
    'MSCI_THAILAND': '#2C5F8A',
    'MSCI_INDONESIA': '#27AE60',
    'MSCI_PHILIPPINES': '#E67E22',
    'MSCI_MALAYSIA': '#8E44AD'
}

for market, color in asean_markets.items():
    if market not in global_indices.columns:
        continue
    
    corr = (
        global_indices[['MSCI_WORLD', market]]
        .rolling(36)
        .corr()
        .unstack()['MSCI_WORLD'][market]
    )
    
    label = market.replace('MSCI_', '').replace('_', ' ').title()
    if market == 'VIETNAM':
        ax.plot(corr.index, corr.values, color=color,
                linewidth=2.5, label=label, zorder=5)
    else:
        ax.plot(corr.index, corr.values, color=color,
                linewidth=1.5, label=label, alpha=0.7)

ax.set_ylabel('Correlation with MSCI World')
ax.set_title('ASEAN Market Integration: Rolling 36-Month Correlation')
ax.legend(fontsize=9)
ax.set_ylim([-0.2, 0.9])
ax.axhline(y=0, color='black', linewidth=0.5)

plt.tight_layout()
plt.show()
Figure 38.6

38.11 Practical Implications

The degree of integration determines which asset pricing model is appropriate for Vietnamese equities. The evidence in this chapter supports several practical conclusions:

Vietnam is partially integrated and trending toward integration. The composite index shows a clear upward trajectory, with the post-2015 period representing the highest sustained integration in the market’s history. However, Vietnam remains less integrated than Thailand or Malaysia, and far from fully integrated with global markets.

Local factors dominate global factors for pricing Vietnamese stocks. The rolling \(R^2\) comparison shows that local Vietnamese factors consistently explain more return variation than global factors. This means that researchers studying Vietnamese cross-sectional returns should use local factor models (Vietnamese FF5) rather than global factors. Global factors are useful primarily for international investors assessing co-movement risk.

The segmentation premium is shrinking but not zero. The Fama-MacBeth evidence shows that stocks with higher foreign ownership earn lower returns, consistent with partial segmentation. The magnitude has declined over time as foreign ownership limits have been relaxed, but a residual premium persists, likely driven by remaining ownership caps in banking and strategic sectors.

Crisis-period co-movement is mostly interdependence, not contagion. The Forbes-Rigobon adjusted correlations show that the spike in Vietnam-World correlation during crises is largely explained by increased global volatility, not a structural change in the transmission mechanism. This is reassuring for diversification: Vietnam continues to offer meaningful diversification benefits even during global stress.

The FTSE/MSCI upgrade path matters. Vietnam’s potential upgrade from frontier to emerging market status would trigger mandatory index rebalancing by passive funds, increasing foreign flows and likely accelerating integration. Researchers and investors should monitor upgrade criteria and their implications for the cost of capital.

38.12 Summary

Table 38.1: Summary of integration measures for the Vietnamese equity market.
| Measure | What It Captures | Vietnam Range | Current Level | Trend |
|---|---|---|---|---|
| DCC correlation (World) | Co-movement | 0.0–0.5 | ~0.35–0.45 | Rising |
| PR R² (global PCs) | Global factor exposure | 0.05–0.50 | ~0.30–0.40 | Rising |
| Global/Local R² ratio | Relative pricing power | 0.1–0.8 | ~0.5–0.6 | Rising |
| Global CAPM α | Pricing error | 0–15% ann. | ~3–5% ann. | Falling |
| FOL premium (γ) | Segmentation cost | -5% to +2% | ~-1% to 0% | Shrinking |