30  Disclosure Quality and Timing

Corporate disclosure is the primary mechanism through which firms communicate with capital markets. The quality, quantity, and timing of disclosures shape the information environment in which investors form expectations, price securities, and allocate capital. A large theoretical and empirical literature, surveyed by Healy and Palepu (2001) and Beyer et al. (2010), demonstrates that disclosure decisions have first-order effects on the cost of capital, liquidity, and investment efficiency.

This chapter brings two decades of disclosure research to the Vietnamese market, where several institutional features create a distinctive setting. First, Vietnam’s regulatory framework, anchored by Circular 155/2015/TT-BTC (amended by Circular 96/2020/TT-BTC) and enforced by the State Securities Commission (SSC), mandates periodic and event-driven disclosures with specific deadlines that differ from U.S. and European norms. Second, the dominance of retail investors and relatively thin analyst coverage means that corporate disclosures are often the primary source of firm-specific information, amplifying their economic importance. Third, the ongoing transition from Vietnamese Accounting Standards (VAS) toward IFRS convergence introduces time-varying changes in disclosure requirements that create natural variation for empirical analysis.

30.1 Theoretical Foundations

30.1.1 Voluntary Disclosure Theory

The foundational model of voluntary disclosure is due to Verrecchia (1983). When investors know that a manager possesses private information, silence is interpreted as bad news, which pushes managers toward full disclosure (the unraveling result). Proprietary costs stop the unraveling: managers disclose only when the benefit exceeds the proprietary cost of disclosure, so sufficiently favourable news is released and the rest is withheld. The key insight is that non-disclosure remains informative because investors rationally infer that withheld information is, on average, unfavourable.
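The interaction between unraveling and proprietary costs can be illustrated with a deliberately stylised example. It is not Verrecchia's model: we assume firm value v is uniform on [0, 1], disclosure costs a fixed amount, and a silent firm is priced at the expected value of firms that stay silent. The function name `disclosure_threshold` and all parameter values are illustrative assumptions.

```python
def disclosure_threshold(cost, tol=1e-12, max_iter=10_000):
    """Threshold x* above which a manager discloses, for v ~ U(0, 1).

    Stylised assumption: a silent firm is priced at E[v | v < x*] = x* / 2,
    and a manager with value v discloses only if v - cost exceeds that price.
    """
    threshold = 1.0  # start from the conjecture that nobody discloses
    for _ in range(max_iter):
        silent_price = threshold / 2.0            # market price of silence
        updated = min(1.0, silent_price + cost)   # marginal discloser
        converged = abs(updated - threshold) < tol
        threshold = updated
        if converged:
            break
    return threshold


for c in (0.0, 0.1, 0.3, 0.6):
    print(f"cost={c:.1f} -> disclose if v > {disclosure_threshold(c):.3f}")
```

Under these assumptions the fixed point is x* = 2c for costs below 0.5. A zero cost drives the threshold to zero (full unraveling), while a sufficiently high cost leaves every firm silent.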

Diamond (1985) extends the analysis to a multi-period setting where the firm’s disclosure policy affects the precision of public information and hence the incentives for private information acquisition. The central trade-off is between reducing information asymmetry (which lowers the cost of capital) and reducing the rents that informed traders earn (which may discourage monitoring). Diamond and Verrecchia (1991) formalize the link between disclosure and liquidity: by reducing adverse selection, voluntary disclosure narrows bid-ask spreads and increases the willingness of uninformed investors to trade.

The empirical prediction is that firms with higher-quality disclosure should enjoy:

  1. Lower cost of equity capital (Botosan 1997; Botosan and Plumlee 2002)
  2. Lower cost of debt (Sengupta 1998)
  3. Higher liquidity and lower bid-ask spreads (Diamond and Verrecchia 1991; Lang, Lins, and Maffett 2012)
  4. More efficient investment decisions (Biddle, Hilary, and Verdi 2009)

30.1.2 Strategic Disclosure Timing

Not all disclosure is voluntary in timing, but managers retain discretion over when, within permissible windows, to release information. Patell and Wolfson (1982) document that firms tend to release good news during trading hours and bad news after market close. DellaVigna and Pollet (2009) show that earnings announced on Fridays (i.e., when investor attention is lower) generate smaller immediate reactions and larger post-announcement drift, consistent with limited attention. Hirshleifer, Lim, and Teoh (2009) generalize this finding: extraneous events that distract investors (such as a large number of concurrent announcements) reduce the immediate price response to earnings news.

In Vietnam, several features make strategic timing particularly relevant. The concentrated disclosure calendar, where many firms file near regulatory deadlines, creates natural variation in announcement congestion. The retail-dominated investor base may be more susceptible to attention effects than institutional investors. The regulatory structure, which imposes penalties for late filing but allows discretion within the permissible window, creates a setting in which the choice of filing date is informative.

30.1.3 Disclosure Quality in Emerging Markets

Ball, Robin, and Wu (2003) argue that accounting quality is shaped more by reporting incentives than by accounting standards. In institutional environments with weak enforcement, concentrated ownership, and close alignment between financial and tax reporting, firms may produce lower-quality disclosures even under nominally rigorous standards. Leuz, Nanda, and Wysocki (2003) confirm this pattern internationally: earnings management (an inverse proxy for disclosure quality) is highest in countries with weak investor protection, concentrated ownership, and less developed capital markets.

Vietnam exhibits several of these features. Bushman et al. (2004) classify determinants of transparency into governance factors (legal origin, judicial efficiency, minority protection) and political factors (state ownership, government intervention). Vietnam’s civil-law tradition, significant state ownership in listed firms, and evolving enforcement capacity suggest that disclosure quality may be lower on average than in developed markets, but with substantial cross-sectional variation driven by firm-level governance and ownership structures.

30.2 Regulatory Framework

30.2.1 Mandatory Disclosure Requirements

Vietnamese disclosure regulation operates through a hierarchy of legal instruments:

  • Securities Law (2019): Establishes the general obligation of listed firms to disclose information truthfully, accurately, completely, and on time (Article 118).
  • Circular 155/2015/TT-BTC (amended by Circular 96/2020/TT-BTC): Specifies the content, format, and deadlines for periodic and event-driven disclosures.
  • SSC decisions and guidance: Provide implementation details and sector-specific requirements.

The key periodic reporting deadlines are summarized in Table 30.1.

Table 30.1: Periodic disclosure deadlines under Vietnamese securities regulation.

| Report Type | Deadline | Audit Requirement |
|---|---|---|
| Annual financial statements | 90 days after fiscal year-end | Audited |
| Semi-annual financial statements | 45 days after period-end | Reviewed |
| Quarterly financial statements | 20 days after quarter-end | Unaudited |
| Annual report | 110 days after fiscal year-end | N/A (narrative) |

Event-driven (ad hoc) disclosures must be filed within 24 hours for material events, including changes in ownership exceeding 1% by major shareholders, board resolutions on dividends or capital increases, and any event that may materially affect the share price.
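As a minimal sketch of how the 24-hour rule could be checked in data, the snippet below flags ad hoc disclosures published more than 24 clock hours after the underlying event. The column names `event_time` and `disclosure_time`, the function name, and the toy records are hypothetical. The sketch counts elapsed clock time only and does not model how the regulation treats weekends or holidays.

```python
import pandas as pd


def flag_late_ad_hoc(events, limit_hours=24):
    """Flag event-driven disclosures published more than `limit_hours` after the event."""
    out = events.copy()
    out['event_time'] = pd.to_datetime(out['event_time'])
    out['disclosure_time'] = pd.to_datetime(out['disclosure_time'])
    # Elapsed clock time between the event and its public disclosure, in hours
    out['delay_hours'] = (
        (out['disclosure_time'] - out['event_time']).dt.total_seconds() / 3600
    )
    out['late_ad_hoc'] = out['delay_hours'] > limit_hours
    return out


# Toy records with placeholder firm identifiers and timestamps
events = pd.DataFrame({
    'ticker': ['FIRM_A', 'FIRM_B'],
    'event_time': ['2023-05-10 09:00', '2023-05-10 14:00'],
    'disclosure_time': ['2023-05-10 16:30', '2023-05-12 08:00'],
})
flagged = flag_late_ad_hoc(events)
print(flagged[['ticker', 'delay_hours', 'late_ad_hoc']])
```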

30.2.2 Penalties for Non-Compliance

The SSC may impose administrative fines for late or incomplete disclosure, typically ranging from VND 50–100 million for minor violations and up to VND 500 million for material omissions. While these amounts are modest relative to firm size for large-cap companies, the reputational cost and the risk of trading suspension provide additional deterrence.

30.3 Data Construction

30.3.1 Loading Required Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.api as sm
import statsmodels.formula.api as smf
from datetime import datetime, timedelta
from scipy import stats
from sklearn.preprocessing import StandardScaler
from linearmodels.panel import PanelOLS
import warnings
warnings.filterwarnings('ignore')

# Plotting defaults
plt.rcParams.update({
    'figure.figsize': (10, 6),
    'figure.dpi': 150,
    'font.size': 11,
    'axes.spines.top': False,
    'axes.spines.right': False
})

30.3.2 Retrieving Disclosure Data

We assume that we have structured data on filing dates, announcement timestamps, and the textual content of corporate disclosures for all firms.

from datacore import DataCoreClient

client = DataCoreClient()

# Filing metadata: announcement dates, filing dates, report types
filings = client.get_filings(
    exchanges=['HOSE', 'HNX'],
    report_types=['annual', 'semi_annual', 'quarterly'],
    start_date='2012-01-01',
    end_date='2024-12-31',
    fields=[
        'ticker', 'report_type', 'fiscal_year', 'fiscal_quarter',
        'fiscal_year_end', 'filing_date', 'announcement_date',
        'auditor', 'audit_opinion', 'file_url'
    ]
)

# Financial statement data
financials = client.get_fundamentals(
    exchanges=['HOSE', 'HNX'],
    start_date='2012-01-01',
    end_date='2024-12-31',
    fields=[
        'ticker', 'fiscal_year', 'fiscal_quarter',
        'total_assets', 'total_equity', 'revenue', 'net_income',
        'operating_cash_flow', 'total_accruals',
        'market_cap', 'book_to_market'
    ]
)

# Daily trading data for event studies
trading = client.get_daily_prices(
    exchanges=['HOSE', 'HNX'],
    start_date='2012-01-01',
    end_date='2024-12-31',
    fields=[
        'ticker', 'date', 'close', 'volume', 'turnover',
        'bid_ask_spread', 'market_return'
    ]
)

# Ownership and governance
governance = client.get_governance(
    exchanges=['HOSE', 'HNX'],
    fields=[
        'ticker', 'fiscal_year', 'state_ownership_pct',
        'foreign_ownership_pct', 'board_size',
        'board_independence_pct', 'big4_auditor',
        'dual_listing'
    ]
)

print(f"Filings: {filings.shape[0]:,} records")
print(f"Financials: {financials.shape[0]:,} records")
print(f"Trading: {trading.shape[0]:,} records")
print(f"Governance: {governance.shape[0]:,} records")

30.3.3 Computing Filing Timeliness

We define reporting lag as the number of calendar days between the fiscal period-end and the date the firm’s financial statements are made publicly available. For annual reports, the regulatory maximum is 90 days; firms that file earlier than the deadline reveal information sooner, while firms that file late face potential penalties and signal possible difficulties with their accounts.

filings['fiscal_year_end'] = pd.to_datetime(filings['fiscal_year_end'])
filings['filing_date'] = pd.to_datetime(filings['filing_date'])
filings['announcement_date'] = pd.to_datetime(filings['announcement_date'])

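# Reporting lag = filing date - fiscal period end.
# Caveat: fiscal_year_end equals the period end only for annual reports. For
# interim filings, substitute the quarter-end or half-year-end date before
# comparing the lag with the 45- or 20-day deadline.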
filings['reporting_lag'] = (
    filings['filing_date'] - filings['fiscal_year_end']
).dt.days

# Regulatory deadline based on report type
deadline_map = {
    'annual': 90,
    'semi_annual': 45,
    'quarterly': 20
}
filings['deadline_days'] = filings['report_type'].map(deadline_map)

# Late filing indicator
filings['late_filing'] = (
    filings['reporting_lag'] > filings['deadline_days']
).astype(int)

# Days relative to deadline (negative = early, positive = late)
filings['days_relative_deadline'] = (
    filings['reporting_lag'] - filings['deadline_days']
)

# Summary statistics
annual_filings = filings[filings['report_type'] == 'annual'].copy()
print("Annual Report Filing Lag (calendar days):")
print(annual_filings['reporting_lag'].describe().round(1))
print(f"\nLate filing rate: {annual_filings['late_filing'].mean():.1%}")
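Filing databases often contain amended or re-filed reports and occasional date errors, both of which distort lag statistics. The following is a hedged sketch of one possible cleaning step, not a description of the chapter's data pipeline: it keeps the earliest filing per firm-year and drops non-positive or missing lags. The function name and the toy records are assumptions.

```python
import pandas as pd


def first_filing_per_firm_year(annual):
    """Keep the earliest filing per ticker and fiscal year; drop lags that are not positive."""
    out = (
        annual
        .sort_values(['ticker', 'fiscal_year', 'filing_date'])
        .drop_duplicates(subset=['ticker', 'fiscal_year'], keep='first')
    )
    # A filing dated on or before the period end signals a data error;
    # the comparison also removes rows with a missing lag
    return out[out['reporting_lag'] > 0].reset_index(drop=True)


# Toy data: FIRM_A has an original and an amended filing, FIRM_B has a bad date
toy = pd.DataFrame({
    'ticker': ['FIRM_A', 'FIRM_A', 'FIRM_B'],
    'fiscal_year': [2020, 2020, 2020],
    'filing_date': pd.to_datetime(['2021-03-21', '2021-04-30', '2020-12-28']),
    'reporting_lag': [80, 120, -3],
})
clean = first_filing_per_firm_year(toy)
print(clean)
```

Whether the first or the final version of a filing is the relevant one depends on the research question; timeliness studies usually focus on the first public release.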

30.3.4 Distribution of Filing Lags

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Histogram
axes[0].hist(
    annual_filings['reporting_lag'].dropna(),
    bins=60, range=(20, 150),
    color='#2C5F8A', edgecolor='white', alpha=0.85
)
axes[0].axvline(x=90, color='#C0392B', linestyle='--', linewidth=2,
                label='90-day deadline')
axes[0].set_xlabel('Reporting Lag (Calendar Days)')
axes[0].set_ylabel('Number of Filings')
axes[0].set_title('Distribution of Annual Report Filing Lags')
axes[0].legend()

# Time trend: median lag by year
median_lag = (
    annual_filings
    .groupby('fiscal_year')['reporting_lag']
    .agg(['median', lambda x: x.quantile(0.25),
          lambda x: x.quantile(0.75)])
)
median_lag.columns = ['median', 'p25', 'p75']

axes[1].fill_between(
    median_lag.index, median_lag['p25'], median_lag['p75'],
    alpha=0.3, color='#2C5F8A', label='IQR'
)
axes[1].plot(
    median_lag.index, median_lag['median'],
    color='#2C5F8A', linewidth=2, marker='o', label='Median'
)
axes[1].axhline(y=90, color='#C0392B', linestyle='--',
                linewidth=1.5, label='Deadline')
axes[1].set_xlabel('Fiscal Year')
axes[1].set_ylabel('Filing Lag (Calendar Days)')
axes[1].set_title('Median Annual Filing Lag Over Time')
axes[1].legend()

plt.tight_layout()
plt.show()
Figure 30.1: Distribution of annual report filing lags (left) and median annual filing lag over time (right).

30.4 Measuring Disclosure Quality

Disclosure quality is inherently multidimensional. Following Dechow, Ge, and Schrand (2010) and Beyer et al. (2010), we construct proxies along four dimensions: (i) timeliness, (ii) textual properties, (iii) accounting quality, and (iv) voluntary disclosure breadth.

30.4.1 Timeliness as a Quality Dimension

Timely disclosure reduces the duration of information asymmetry between insiders and outside investors. Chambers and Penman (1984) and Givoly and Palmon (1982) establish that early reporters tend to announce good news, while late reporters more often deliver bad news. We test this pattern in the Vietnamese context below.

We operationalize timeliness through two measures:

  1. Reporting lag (continuous): Calendar days from fiscal period-end to filing date, as constructed in Section 30.3.3.
  2. Early/late classification (categorical): We classify firm-years into terciles based on reporting lag within each fiscal year. This controls for secular trends in filing speed (e.g., driven by regulatory changes or COVID-19 disruptions).

# Rank first so that ties (many firms file on the same day near the deadline)
# cannot produce duplicate bin edges in qcut
annual_filings['lag_tercile'] = (
    annual_filings
    .groupby('fiscal_year')['reporting_lag']
    .transform(lambda x: pd.qcut(x.rank(method='first'), 3,
                                 labels=['Early', 'Middle', 'Late']))
)

# Tabulate
tercile_stats = (
    annual_filings
    .groupby('lag_tercile')['reporting_lag']
    .agg(['count', 'mean', 'median', 'std'])
    .round(1)
)
tercile_stats.columns = ['N', 'Mean Lag', 'Median Lag', 'SD']
print(tercile_stats)
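One simple way to take the good-news-early, bad-news-late pattern to the data is to compare the share of earnings declines across lag terciles. The sketch below assumes a column `earnings_change` (the year-over-year change in scaled earnings) has already been merged in; that column name, the function name, and the toy values are hypothetical.

```python
import pandas as pd


def bad_news_share_by_tercile(df):
    """Share of firm-years with an earnings decline, by filing-lag tercile."""
    d = df.dropna(subset=['lag_tercile', 'earnings_change'])
    return (
        d.assign(bad_news=d['earnings_change'] < 0)
         .groupby('lag_tercile', observed=True)['bad_news']
         .agg(share_bad='mean', n='size')
    )


# Toy data constructed to show the pattern; not actual results
toy = pd.DataFrame({
    'lag_tercile': ['Early', 'Early', 'Middle', 'Middle', 'Late', 'Late'],
    'earnings_change': [0.02, 0.01, 0.01, -0.01, -0.02, -0.03],
})
toy['lag_tercile'] = pd.Categorical(
    toy['lag_tercile'], categories=['Early', 'Middle', 'Late'], ordered=True
)
shares = bad_news_share_by_tercile(toy)
print(shares)
```

A share of bad news that rises from the early to the late tercile would be consistent with Chambers and Penman (1984); a formal test would add controls and firm-clustered standard errors.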

30.4.2 Textual Quality Measures

The textual properties of corporate disclosures convey information about quality beyond what is captured by accounting numbers alone. Li (2008) demonstrates that annual reports with lower readability are associated with lower earnings persistence, suggesting that complex language may obscure unfavourable information. Loughran and McDonald (2014) critique the application of general readability formulas (Fog index, Flesch-Kincaid) to financial text, arguing that these metrics confound complexity with technical terminology.

We construct three textual quality measures adapted for Vietnamese corporate disclosures:

30.4.2.1 Document Length and Specificity

Longer disclosures are not inherently better: length may reflect boilerplate or obfuscation. However, Dyer, Lang, and Stice-Lawrence (2017) show that the informative component of disclosure (as opposed to standard legal language) has increased over time in U.S. 10-K filings. We measure:

  • Total word count: Raw length of the annual report narrative sections (MD&A equivalent).
  • Numerical density: Proportion of tokens that are numbers, percentages, or currency amounts, a proxy for specificity.

import re
from underthesea import word_tokenize

def compute_textual_metrics(text):
    """Compute textual quality metrics for Vietnamese corporate text."""
    # Guard against missing values (None/NaN) as well as empty strings
    if not isinstance(text, str) or not text.strip():
        return {
            'word_count': 0, 'sentence_count': 0,
            'numerical_density': 0, 'avg_sentence_length': 0,
            'unique_word_ratio': 0, 'forward_looking_density': 0
        }

    # Vietnamese word segmentation
    tokens = word_tokenize(text)
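    # Split on sentence-final punctuation followed by whitespace or end of text,
    # so decimal points and thousands separators (e.g. 1.000.000) do not split
    sentences = re.split(r'[.!?]+(?:\s+|$)', text)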
    sentences = [s.strip() for s in sentences if len(s.strip()) > 5]

    word_count = len(tokens)
    sentence_count = max(len(sentences), 1)

    # Numerical density: proportion of tokens that are numeric.
    # Require at least one digit so bare punctuation ('.', ',', '%') is not counted.
    num_pattern = re.compile(r'^(?=.*\d)[\d,.%]+$')
    numeric_tokens = sum(1 for t in tokens if num_pattern.match(t))
    numerical_density = numeric_tokens / max(word_count, 1)

    # Lexical diversity: unique words / total words
    unique_words = len(set(t.lower() for t in tokens))
    unique_word_ratio = unique_words / max(word_count, 1)

    # Forward-looking statement density
    forward_keywords = [
        'dự kiến', 'kế hoạch', 'mục tiêu', 'triển vọng',
        'định hướng', 'chiến lược', 'tương lai', 'sẽ',
        'dự báo', 'phấn đấu', 'cam kết', 'hướng tới'
    ]
    text_lower = text.lower()
    forward_count = sum(text_lower.count(kw) for kw in forward_keywords)
    forward_looking_density = forward_count / max(sentence_count, 1)

    return {
        'word_count': word_count,
        'sentence_count': sentence_count,
        'numerical_density': numerical_density,
        'avg_sentence_length': word_count / sentence_count,
        'unique_word_ratio': unique_word_ratio,
        'forward_looking_density': forward_looking_density
    }

# Retrieve annual report text from DataCore
annual_text = client.get_annual_report_text(
    exchanges=['HOSE', 'HNX'],
    start_date='2012-01-01',
    end_date='2024-12-31',
    sections=['mda', 'business_overview', 'risk_factors']
)

# Apply textual metrics
textual_metrics = annual_text.apply(
    lambda row: compute_textual_metrics(row['text']),
    axis=1, result_type='expand'
)
annual_text = pd.concat([annual_text, textual_metrics], axis=1)

print("Textual Quality Summary Statistics:")
print(annual_text[['word_count', 'numerical_density',
                    'avg_sentence_length', 'unique_word_ratio',
                    'forward_looking_density']].describe().round(3))

30.4.2.2 Forward-Looking Statement Density

Forward-looking statements reveal management’s expectations about future performance and are considered a higher-quality form of disclosure because they expose the manager to ex-post evaluation. In Vietnamese reports, forward-looking language typically appears in the form of phrases like dự kiến (expected), kế hoạch (plan), mục tiêu (target), and triển vọng (outlook).

Guay, Samuels, and Taylor (2016) show that managers use voluntary disclosure to “guide through the fog” when financial statements are complex. We operationalize forward-looking density as the number of forward-looking phrases per sentence, following the keyword approach in our compute_textual_metrics function above.
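Keyword matching on Vietnamese text is sensitive to Unicode normalisation: text extracted from PDFs is sometimes stored with decomposed diacritics (NFD), in which case a plain substring search for a precomposed keyword finds nothing. The sketch below is one possible refinement, not the chapter's implementation. It normalises both text and keywords to NFC and matches on word boundaries. The function name and the sample sentence are illustrative assumptions.

```python
import re
import unicodedata


def count_forward_looking(text, keywords):
    """Count keyword hits on word boundaries after Unicode NFC normalisation."""
    norm = unicodedata.normalize('NFC', text).lower()
    total = 0
    for kw in keywords:
        kw_norm = unicodedata.normalize('NFC', kw).lower()
        # Lookarounds keep the keyword from matching inside a longer word
        pattern = re.compile(r'(?<!\w)' + re.escape(kw_norm) + r'(?!\w)')
        total += len(pattern.findall(norm))
    return total


keywords = ['dự kiến', 'sẽ']
sample = 'Công ty dự kiến doanh thu tăng. Công ty sẽ mở rộng nhà máy.'

# Simulate text stored with decomposed diacritics
decomposed = unicodedata.normalize('NFD', sample)
print(count_forward_looking(decomposed, keywords))
```

Dividing this count by the number of sentences gives a density comparable to `forward_looking_density`.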

30.4.3 Accounting-Based Quality Proxies

We complement textual measures with accounting-based proxies that capture the reliability of reported financial information.

30.4.3.1 Accruals Quality

Following Dechow and Dichev (2002), as implemented by Francis et al. (2005), we measure accruals quality as the standard deviation of residuals from a regression of working capital accruals on past, current, and future operating cash flows (the DD model):

\[ \frac{WC_{i,t}}{A_{i,t-1}} = \alpha + \beta_1 \frac{CFO_{i,t-1}}{A_{i,t-1}} + \beta_2 \frac{CFO_{i,t}}{A_{i,t-1}} + \beta_3 \frac{CFO_{i,t+1}}{A_{i,t-1}} + \varepsilon_{i,t} \tag{30.1}\]

where \(WC_{i,t}\) is working capital accruals, \(CFO_{i,t}\) is operating cash flow, and \(A_{i,t-1}\) is lagged total assets. The firm-level standard deviation of \(\hat{\varepsilon}_{i,t}\) over a rolling window (typically 5 years) is the accruals quality measure, with higher values indicating lower quality.

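def estimate_accruals_quality(df, min_obs=5):
    """
    Estimate accruals quality as the std dev of Dechow-Dichev (DD) residuals
    over a rolling 5-year window for each firm.

    Notes: the regression is fit firm by firm within each window, so with
    min_obs=5 and four parameters only one degree of freedom remains.
    total_accruals stands in for working capital accruals.
    """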
    results = []

    for ticker, group in df.groupby('ticker'):
        group = group.sort_values('fiscal_year')

        # Construct leads/lags of CFO
        group['cfo_lag1'] = group['operating_cash_flow'].shift(1)
        group['cfo_lead1'] = group['operating_cash_flow'].shift(-1)

        # Scale by lagged assets
        group['lag_assets'] = group['total_assets'].shift(1)
        for col in ['total_accruals', 'operating_cash_flow',
                     'cfo_lag1', 'cfo_lead1']:
            group[f'{col}_scaled'] = group[col] / group['lag_assets']

        # Rolling 5-year residual std dev
        for idx in range(len(group)):
            window = group.iloc[max(0, idx - 4):idx + 1]
            window = window.dropna(subset=[
                'total_accruals_scaled', 'operating_cash_flow_scaled',
                'cfo_lag1_scaled', 'cfo_lead1_scaled'
            ])

            if len(window) >= min_obs:
                y = window['total_accruals_scaled']
                X = sm.add_constant(window[[
                    'cfo_lag1_scaled',
                    'operating_cash_flow_scaled',
                    'cfo_lead1_scaled'
                ]])
                try:
                    model = sm.OLS(y, X).fit()
                    results.append({
                        'ticker': ticker,
                        'fiscal_year': group.iloc[idx]['fiscal_year'],
                        'accruals_quality': model.resid.std()
                    })
                except Exception:
                    pass

    return pd.DataFrame(results)

aq_df = estimate_accruals_quality(financials)
print(f"Accruals quality computed for {aq_df['ticker'].nunique()} firms")
print(aq_df['accruals_quality'].describe().round(4))
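A common alternative in the literature estimates equation (30.1) cross-sectionally and then takes the standard deviation of each firm's residuals over time, which avoids fitting four parameters to five observations. The sketch below is an assumption-laden illustration rather than the chapter's method. It runs one regression per fiscal year (not per industry-year, since no industry field is assumed), expects pre-scaled columns with the hypothetical names `acc`, `cfo_lag`, `cfo`, and `cfo_lead`, and is exercised on simulated data.

```python
import numpy as np
import pandas as pd


def cross_sectional_aq(df, window=5, min_periods=3):
    """Year-by-year DD regressions, then rolling std of each firm's residuals."""
    regressors = ['cfo_lag', 'cfo', 'cfo_lead']
    d = df.dropna(subset=['acc'] + regressors).reset_index(drop=True)

    # Step 1: one cross-sectional OLS per fiscal year
    d['resid'] = np.nan
    for _, g in d.groupby('fiscal_year'):
        X = np.column_stack([np.ones(len(g)), g[regressors].to_numpy()])
        y = g['acc'].to_numpy()
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        d.loc[g.index, 'resid'] = y - X @ beta

    # Step 2: rolling std within firm (the window counts rows, not calendar years)
    d = d.sort_values(['ticker', 'fiscal_year'])
    d['aq'] = (
        d.groupby('ticker')['resid']
         .transform(lambda s: s.rolling(window, min_periods=min_periods).std())
    )
    return d[['ticker', 'fiscal_year', 'resid', 'aq']]


# Simulated panel: the second half of the firms has noisier accruals
rng = np.random.default_rng(0)
rows = []
for firm in range(40):
    noise_sd = 0.01 if firm < 20 else 0.10
    for year in range(2015, 2023):
        cfo_lag, cfo, cfo_lead = rng.normal(0.08, 0.05, size=3)
        acc = (0.01 + 0.2 * cfo_lag - 0.5 * cfo + 0.15 * cfo_lead
               + rng.normal(0, noise_sd))
        rows.append((f'F{firm:02d}', year, acc, cfo_lag, cfo, cfo_lead, noise_sd))
panel = pd.DataFrame(rows, columns=['ticker', 'fiscal_year', 'acc', 'cfo_lag',
                                    'cfo', 'cfo_lead', 'noise_sd'])

aq = cross_sectional_aq(panel).merge(
    panel[['ticker', 'fiscal_year', 'noise_sd']], on=['ticker', 'fiscal_year']
)
print(aq.groupby('noise_sd')['aq'].mean())
```

In the simulated data the noisier group should show the larger average value, matching the interpretation that higher residual volatility indicates lower accruals quality.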

30.4.3.2 Earnings Persistence and Predictability

Persistent earnings are more useful for valuation. We estimate earnings persistence as the slope coefficient \(\phi_1\) from a first-order autoregression:

\[ \frac{E_{i,t}}{A_{i,t-1}} = \phi_0 + \phi_1 \frac{E_{i,t-1}}{A_{i,t-2}} + \nu_{i,t} \tag{30.2}\]

Higher \(\hat{\phi}_1\) indicates more persistent (and arguably higher-quality) earnings.

def estimate_persistence(df, min_obs=5):
    """Estimate earnings persistence via AR(1) model."""
    results = []

    for ticker, group in df.groupby('ticker'):
        group = group.sort_values('fiscal_year')
        group['earnings_scaled'] = group['net_income'] / group['total_assets'].shift(1)
        group['earnings_lag'] = group['earnings_scaled'].shift(1)

        clean = group.dropna(subset=['earnings_scaled', 'earnings_lag'])
        if len(clean) >= min_obs:
            y = clean['earnings_scaled']
            X = sm.add_constant(clean[['earnings_lag']])
            model = sm.OLS(y, X).fit()
            results.append({
                'ticker': ticker,
                'persistence': model.params['earnings_lag'],
                'persistence_se': model.bse['earnings_lag'],
                'r_squared': model.rsquared,
                'n_obs': model.nobs
            })

    return pd.DataFrame(results)

persistence_df = estimate_persistence(financials)
print(persistence_df[['persistence', 'r_squared']].describe().round(3))

30.4.4 Composite Disclosure Quality Index

Individual quality proxies capture different facets of the information environment. To aggregate them into a single score while avoiding arbitrary weighting, we follow Lang and Lundholm (1993) and use a rank-based composite. Within each fiscal year, we rank firms on each of the dimensions listed in Table 30.2, oriented so that a higher rank indicates higher quality.

Table 30.2: Components of the composite disclosure quality index.

| Dimension | Proxy | Direction |
|---|---|---|
| Timeliness | Reporting lag | Lower is better |
| Specificity | Numerical density | Higher is better |
| Forward-looking | FLS density | Higher is better |
| Earnings quality | Accruals quality (DD) | Lower σ is better |
| Persistence | AR(1) coefficient | Higher is better |

We convert each proxy to a percentile rank within each fiscal year (so each component ranges from 0 to 1), then average across components:

\[ DQ_{i,t} = \frac{1}{K} \sum_{k=1}^{K} \text{Rank}_{k,i,t} \tag{30.3}\]

where \(K\) is the number of available components and \(\text{Rank}_{k,i,t}\) is the percentile rank of firm \(i\) in year \(t\) on dimension \(k\).
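A small worked example may help fix ideas about equation (30.3), in particular how \(K\) varies when a component is missing. The three firms and their values below are invented for illustration, and only two of the five components are used.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'ticker': ['A', 'B', 'C'],
    'fiscal_year': [2020, 2020, 2020],
    'reporting_lag': [60, 85, 95],              # lower is better
    'numerical_density': [0.10, 0.05, np.nan],  # higher is better; missing for C
})

# Percentile ranks within fiscal year; the lag is negated so that the
# fastest filer receives the highest rank
toy['rank_lag'] = (
    toy.groupby('fiscal_year')['reporting_lag']
       .transform(lambda x: (-x).rank(pct=True))
)
toy['rank_num'] = toy.groupby('fiscal_year')['numerical_density'].rank(pct=True)

# The row-wise mean skips missing ranks, so K = 2 for A and B but K = 1 for C
toy['dq_index'] = toy[['rank_lag', 'rank_num']].mean(axis=1)
print(toy[['ticker', 'rank_lag', 'rank_num', 'dq_index']].round(3))
```

Firm A files fastest and has the highest numerical density, so it scores 1.0. Firm C's score rests on timeliness alone, which shows that firms with missing components are scored on fewer dimensions and are therefore not strictly comparable to firms with complete data.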

# Merge all quality proxies
quality_panel = (
    annual_filings[['ticker', 'fiscal_year', 'reporting_lag']]
    .merge(
        annual_text[['ticker', 'fiscal_year', 'numerical_density',
                      'forward_looking_density']],
        on=['ticker', 'fiscal_year'], how='left'
    )
    .merge(aq_df, on=['ticker', 'fiscal_year'], how='left')
    .merge(persistence_df[['ticker', 'persistence']],
           on='ticker', how='left')
)

# Rank each component within fiscal year (higher rank = better quality)
rank_cols = {}
for col, higher_is_better in [
    ('reporting_lag', False),            # lower lag = better → invert
    ('numerical_density', True),         # higher = better
    ('forward_looking_density', True),   # higher = better
    ('accruals_quality', False),         # lower volatility = better → invert
    ('persistence', True)                # higher = better
]:
    sign = 1 if higher_is_better else -1
    rank_cols[f'rank_{col}'] = (
        (sign * quality_panel[col])
        .groupby(quality_panel['fiscal_year'])
        .rank(pct=True)
    )

rank_df = pd.DataFrame(rank_cols)
quality_panel = pd.concat([quality_panel, rank_df], axis=1)

# Composite index: average of available ranks
rank_columns = [c for c in quality_panel.columns if c.startswith('rank_')]
quality_panel['dq_index'] = quality_panel[rank_columns].mean(axis=1)

print("Disclosure Quality Index Distribution:")
print(quality_panel['dq_index'].describe().round(3))
fig, ax = plt.subplots(figsize=(10, 5))
ax.hist(quality_panel['dq_index'].dropna(), bins=50,
        color='#2C5F8A', edgecolor='white', alpha=0.85)
ax.axvline(quality_panel['dq_index'].median(), color='#E67E22',
           linestyle='--', linewidth=2, label='Median')
ax.set_xlabel('Disclosure Quality Index')
ax.set_ylabel('Number of Firm-Years')
ax.set_title('Distribution of Composite Disclosure Quality')
ax.legend()
plt.tight_layout()
plt.show()
Figure 30.2: Distribution of the composite disclosure quality index.

30.5 Determinants of Disclosure Quality

What drives variation in disclosure quality across Vietnamese firms? We estimate a pooled regression of the composite DQ index on firm characteristics and governance variables, with year fixed effects and standard errors clustered by firm:

\[ DQ_{i,t} = \alpha + \beta_1 \ln(\text{Size}_{i,t}) + \beta_2 \text{ROA}_{i,t} + \beta_3 \text{Lev}_{i,t} + \beta_4 \text{StateOwn}_{i,t} + \beta_5 \text{ForeignOwn}_{i,t} + \beta_6 \text{Big4}_{i,t} + \beta_7 \text{BoardIndep}_{i,t} + \gamma_t + \varepsilon_{i,t} \tag{30.4}\]

where \(\gamma_t\) are year fixed effects.

The theoretical predictions, drawing on Lang and Lundholm (1993), Hope (2003), and Bushman et al. (2004), are:

  • Size (+): Larger firms face greater public scrutiny and have lower proprietary costs relative to the benefits of disclosure.
  • ROA (+/−): Profitable firms may disclose more to signal quality, but firms managing earnings downward (for tax purposes) may reduce disclosure to avoid scrutiny.
  • Leverage (+): Sengupta (1998) argues that firms with more debt have stronger incentives to maintain disclosure quality to lower borrowing costs.
  • State ownership (−): SOEs may face weaker market discipline and political incentives to limit transparency.
  • Foreign ownership (+): Foreign institutional investors demand higher transparency.
  • Big 4 auditor (+): High-quality auditors constrain earnings management and indirectly improve disclosure quality.
  • Board independence (+): Independent directors improve monitoring and encourage more informative disclosure.

# Merge quality index with financials and governance
det_panel = (
    quality_panel[['ticker', 'fiscal_year', 'dq_index']]
    .merge(financials, on=['ticker', 'fiscal_year'], how='left')
    .merge(governance, on=['ticker', 'fiscal_year'], how='left')
)

# Construct variables
det_panel['log_size'] = np.log(det_panel['total_assets'])
det_panel['roa'] = det_panel['net_income'] / det_panel['total_assets']
det_panel['leverage'] = (
    (det_panel['total_assets'] - det_panel['total_equity'])
    / det_panel['total_assets']
)

# Panel regression with year FE
det_panel = det_panel.set_index(['ticker', 'fiscal_year'])

model_det = PanelOLS(
    dependent=det_panel['dq_index'],
    exog=sm.add_constant(det_panel[[
        'log_size', 'roa', 'leverage',
        'state_ownership_pct', 'foreign_ownership_pct',
        'big4_auditor', 'board_independence_pct'
    ]]),
    entity_effects=False,
    time_effects=True,
    check_rank=False
).fit(cov_type='clustered', cluster_entity=True)

print(model_det.summary)
coefs = model_det.params.drop('const')
ci = model_det.conf_int().drop('const')

fig, ax = plt.subplots(figsize=(8, 5))
y_pos = range(len(coefs))
labels = [
    'ln(Assets)', 'ROA', 'Leverage', 'State Own %',
    'Foreign Own %', 'Big 4 Auditor', 'Board Indep %'
]

colors = ['#2C5F8A' if c > 0 else '#C0392B' for c in coefs.values]
ax.barh(y_pos, coefs.values, color=colors, alpha=0.8, height=0.6)
ax.errorbar(
    coefs.values, y_pos,
    xerr=[coefs.values - ci.iloc[:, 0].values,
          ci.iloc[:, 1].values - coefs.values],
    fmt='none', color='black', capsize=3
)
ax.axvline(x=0, color='gray', linewidth=0.8, linestyle='-')
ax.set_yticks(y_pos)
ax.set_yticklabels(labels)
ax.set_xlabel('Coefficient Estimate')
ax.set_title('Determinants of Disclosure Quality')
plt.tight_layout()
plt.show()
Figure 30.3: Determinants of disclosure quality (coefficient estimates with 95% confidence intervals).
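Because the regressors are measured in different units (log assets, ratios, percentages, a binary indicator), raw coefficients are hard to compare in a single plot. One option, sketched here under our own assumptions, is to rescale continuous regressors to unit variance before estimation so that each coefficient measures the effect of a one-standard-deviation change. The helper name `standardize` and the toy data are hypothetical, and plain pandas is used in place of `StandardScaler`.

```python
import numpy as np
import pandas as pd


def standardize(df, cols):
    """Return a copy with `cols` rescaled to mean 0 and standard deviation 1."""
    out = df.copy()
    out[cols] = (out[cols] - out[cols].mean()) / out[cols].std(ddof=0)
    return out


# Toy check: y = 1 + 2x exactly, so the raw slope is 2
toy = pd.DataFrame({'x': [10.0, 12.0, 15.0, 19.0, 24.0]})
toy['y'] = 1.0 + 2.0 * toy['x']

raw_slope = np.polyfit(toy['x'], toy['y'], 1)[0]
z = standardize(toy, ['x'])
std_slope = np.polyfit(z['x'], z['y'], 1)[0]

# The standardized slope equals the raw slope times the std dev of x
print(raw_slope, std_slope, raw_slope * toy['x'].std(ddof=0))
```

Binary regressors such as the Big 4 indicator are usually left unscaled, since a one-standard-deviation change in a dummy has no natural interpretation.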

30.6 Strategic Disclosure Timing

30.6.1 Day-of-Week Effects

DellaVigna and Pollet (2009) document that Friday earnings announcements receive less immediate market attention. We test whether this pattern holds in Vietnam, where the trading week runs Monday through Friday but the retail-dominated investor base may exhibit different attention patterns.

annual_filings['announcement_dow'] = (
    annual_filings['announcement_date'].dt.dayofweek
)
annual_filings['day_name'] = (
    annual_filings['announcement_date'].dt.day_name()
)

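# Compute surprise: actual earnings minus naive expectation (last year's earnings)
annual_filings = annual_filings.merge(
    financials[['ticker', 'fiscal_year', 'net_income', 'total_assets']],
    on=['ticker', 'fiscal_year'], how='left'
)
annual_filings['earnings_scaled'] = (
    annual_filings['net_income'] / annual_filings['total_assets']
)
# Sort within firm so that diff() compares consecutive observations
annual_filings = (
    annual_filings
    .sort_values(['ticker', 'fiscal_year'])
    .reset_index(drop=True)
)
annual_filings['earnings_surprise'] = (
    annual_filings
    .groupby('ticker')['earnings_scaled']
    .diff()
)

# Classify as good/bad news; leave it missing when the surprise is undefined
# (e.g., a firm's first year) rather than coding those rows as good news
annual_filings['bad_news'] = (
    (annual_filings['earnings_surprise'] < 0)
    .astype(float)
    .where(annual_filings['earnings_surprise'].notna())
)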

# Day-of-week distribution by news type
dow_crosstab = pd.crosstab(
    annual_filings['day_name'],
    annual_filings['bad_news'].map({0: 'Good News', 1: 'Bad News'}),
    normalize='columns'
)
# Reorder days
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
dow_crosstab = dow_crosstab.reindex(day_order)

print("Proportion of Announcements by Day and News Type:")
print(dow_crosstab.round(3))
fig, ax = plt.subplots(figsize=(10, 5))
x = np.arange(len(day_order))
width = 0.35

bad_pct = dow_crosstab['Bad News'].values
good_pct = dow_crosstab['Good News'].values

ax.bar(x - width/2, good_pct, width, label='Good News',
       color='#27AE60', alpha=0.8)
ax.bar(x + width/2, bad_pct, width, label='Bad News',
       color='#C0392B', alpha=0.8)
ax.set_xticks(x)
ax.set_xticklabels(day_order)
ax.set_ylabel('Proportion of Announcements')
ax.set_title('Strategic Timing: Day-of-Week Announcement Patterns')
ax.legend()
plt.tight_layout()
plt.show()
Figure 30.4: Day-of-week announcement patterns by news type.

30.6.2 Announcement Congestion

When many firms announce on the same day, each announcement receives less attention. We measure announcement congestion as the number of other firms making earnings announcements on the same date:

\[ \text{Congestion}_{i,t} = \sum_{j \neq i} \mathbf{1}\{\text{AnnDate}_{j} = \text{AnnDate}_{i}\} \tag{30.5}\]

The distraction effect documented by Hirshleifer, Lim, and Teoh (2009) implies that firms wishing to bury bad news have an incentive to announce on high-congestion days. We test this by regressing the congestion variable on an indicator for negative earnings news:

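# Count announcements per calendar date (strip any time-of-day component so
# that timestamps on the same day fall into the same group)
annual_filings['ann_day'] = annual_filings['announcement_date'].dt.normalize()
ann_counts = (
    annual_filings
    .groupby('ann_day')
    .size()
    .reset_index(name='n_announcements')
)
annual_filings = annual_filings.merge(ann_counts, on='ann_day', how='left')
annual_filings['congestion'] = annual_filings['n_announcements'] - 1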

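# Regression: congestion ~ bad_news + controls.
# Build the estimation sample first so the cluster groups line up with the
# rows that survive listwise deletion of missing values.
reg_sample = annual_filings.assign(
    log_size=np.log(annual_filings['total_assets']),
    roa=annual_filings['net_income'] / annual_filings['total_assets']
).dropna(subset=['congestion', 'bad_news', 'log_size', 'roa', 'fiscal_year'])

congestion_model = smf.ols(
    'congestion ~ bad_news + log_size + roa + C(fiscal_year)',
    data=reg_sample
).fit(cov_type='cluster', cov_kwds={'groups': reg_sample['ticker']})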

print("Congestion Regression:")
print(congestion_model.summary().tables[1])

30.6.3 After-Hours and Weekend Announcements

Vietnamese regulations require disclosure within 24 hours of material events, but firms retain discretion over the exact timing. Announcements made after the trading session closes (after 3:00 PM on HOSE/HNX) or on weekends delay the market’s opportunity to react by at least one trading day.

# Assume announcement timestamps are available
annual_filings['ann_hour'] = (
    annual_filings['announcement_date'].dt.hour
)
annual_filings['after_hours'] = (
    (annual_filings['ann_hour'] >= 15) |
    (annual_filings['announcement_dow'] >= 5)  # Saturday/Sunday
).astype(int)

# Cross-tabulate after-hours by news type
afterhours_crosstab = pd.crosstab(
    annual_filings['after_hours'].map({0: 'During Hours', 1: 'After Hours'}),
    annual_filings['bad_news'].map({0: 'Good News', 1: 'Bad News'}),
    normalize='index'
)
print("News Distribution by Announcement Timing:")
print(afterhours_crosstab.round(3))

# Chi-squared test
contingency = pd.crosstab(
    annual_filings['after_hours'], annual_filings['bad_news']
)
chi2, p_val, _, _ = stats.chi2_contingency(contingency)
print(f"\nChi-squared = {chi2:.2f}, p-value = {p_val:.4f}")
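The one-trading-day delay described above can be made explicit by mapping each announcement timestamp to the first session in which the market can react. The sketch below is a minimal illustration using only the standard library: the helper name `effective_trading_date` is hypothetical, the 15:00 close follows the text, and exchange holidays are ignored for simplicity.

```python
from datetime import date, datetime, timedelta


def effective_trading_date(ts: datetime, close_hour: int = 15) -> date:
    """Return the first trading day on which the market can react.

    Assumes a Monday-to-Friday trading week and ignores exchange
    holidays (a simplification, not a full trading calendar).
    """
    d = ts.date()
    # Weekday announcement before the close: the market reacts the same day
    if ts.weekday() < 5 and ts.hour < close_hour:
        return d
    # Otherwise roll forward to the next weekday
    d += timedelta(days=1)
    while d.weekday() >= 5:
        d += timedelta(days=1)
    return d


# Illustrative timestamps (2024-03-15 is a Friday)
friday_evening = effective_trading_date(datetime(2024, 3, 15, 16, 30))
wednesday_morning = effective_trading_date(datetime(2024, 3, 13, 10, 0))
print(friday_evening, wednesday_morning)
```

An event-day variable built this way could replace the raw calendar date when aligning announcements with returns, so that after-hours and weekend releases are measured against the first session that can actually price them.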

30.7 Market Consequences of Disclosure Quality

30.7.1 Disclosure Quality and the Cost of Equity

The central prediction of Diamond and Verrecchia (1991) and Botosan (1997) is that higher-quality disclosure lowers the cost of equity capital by reducing information asymmetry. We test this using the implied cost of capital (ICC) approach, where we estimate the discount rate that equates the current price to the present value of expected future earnings.

We use the PEG ratio approach as a simple ICC estimate:

\[ r_{PEG,i,t} = \sqrt{\frac{\hat{E}_{i,t+2} - \hat{E}_{i,t+1}}{P_{i,t}}} \tag{30.6}\]

where \(\hat{E}_{i,t+k}\) is the consensus earnings forecast (or, in the absence of analyst coverage, a model-based forecast) and \(P_{i,t}\) is the current stock price.

# Construct earnings forecasts by extrapolating recent earnings growth
forecasts = financials.sort_values(['ticker', 'fiscal_year']).copy()
# Earnings scaled by market capitalization (E/P), i.e. earnings per unit of price
forecasts['eps'] = forecasts['net_income'] / forecasts['market_cap']
forecasts['eps_growth'] = forecasts.groupby('ticker')['eps'].pct_change()

# Simple forecast: E[t+1] = E[t] * (1 + avg_growth)
forecasts['avg_growth'] = (
    forecasts.groupby('ticker')['eps_growth']
    .transform(lambda x: x.rolling(3, min_periods=2).mean())
)
forecasts['eps_f1'] = forecasts['eps'] * (1 + forecasts['avg_growth'])
forecasts['eps_f2'] = forecasts['eps_f1'] * (1 + forecasts['avg_growth'])

# PEG-based ICC (Eq. 30.6). Because eps is already scaled by price, the
# forecast difference equals (E[t+2] - E[t+1]) / P and needs no further
# division. The measure is undefined when forecast earnings growth is not
# positive, so those firm-years are set to NaN rather than zero.
eps_change = forecasts['eps_f2'] - forecasts['eps_f1']
forecasts['icc_peg'] = np.sqrt(eps_change.where(eps_change > 0))

# Merge with disclosure quality
icc_panel = (
    forecasts[['ticker', 'fiscal_year', 'icc_peg']]
    .merge(quality_panel[['ticker', 'fiscal_year', 'dq_index']].reset_index(drop=True),
           on=['ticker', 'fiscal_year'], how='inner')
    .merge(governance, on=['ticker', 'fiscal_year'], how='left')
    .merge(financials[['ticker', 'fiscal_year', 'total_assets',
                        'book_to_market', 'market_cap']],
           on=['ticker', 'fiscal_year'], how='left')
)

icc_panel['log_size'] = np.log(icc_panel['market_cap'])
icc_panel = icc_panel.set_index(['ticker', 'fiscal_year'])

# Panel regression: ICC ~ DQ + controls
icc_model = PanelOLS(
    dependent=icc_panel['icc_peg'],
    exog=sm.add_constant(icc_panel[[
        'dq_index', 'log_size', 'book_to_market'
    ]]),
    entity_effects=True,
    time_effects=True,
    check_rank=False
).fit(cov_type='clustered', cluster_entity=True)

print("Implied Cost of Capital ~ Disclosure Quality:")
print(icc_model.summary)
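As a quick sanity check on Eq. 30.6, the PEG estimate can be worked through by hand for a single firm. The sketch below uses made-up numbers (a price of 20,000 VND and per-share forecasts of 2,000 and 2,300 VND); the helper `peg_icc` is a hypothetical name, and returning `None` when forecast earnings do not grow is an assumption about how to treat undefined cases.

```python
import math


def peg_icc(eps_f1: float, eps_f2: float, price: float):
    """r_PEG = sqrt((E[t+2] - E[t+1]) / P); None when the ratio is undefined."""
    if price <= 0 or eps_f2 <= eps_f1:
        return None
    return math.sqrt((eps_f2 - eps_f1) / price)


# Illustrative inputs, not taken from the chapter's data
r = peg_icc(eps_f1=2000.0, eps_f2=2300.0, price=20000.0)
print(f"Implied cost of equity: {r:.4f}")

# Declining forecast earnings give no valid estimate
print(peg_icc(eps_f1=2300.0, eps_f2=2000.0, price=20000.0))
```

With these inputs the forecast earnings change is 1.5% of price, which implies a cost of equity of roughly 12.2%.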

30.7.2 Disclosure Quality and Liquidity

Diamond and Verrecchia (1991) predict that better disclosure reduces adverse selection and improves liquidity. We measure liquidity through bid-ask spreads and the Amihud illiquidity ratio:

\[ \text{Amihud}_{i,t} = \frac{1}{D_{i,t}} \sum_{d=1}^{D_{i,t}} \frac{|R_{i,d}|}{\text{Volume}_{i,d}} \tag{30.7}\]

where \(R_{i,d}\) is the daily return and \(\text{Volume}_{i,d}\) is the daily trading volume in VND.
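To make Eq. 30.7 concrete, the following minimal sketch computes the ratio for one hypothetical stock in plain Python. The function name `amihud_ratio` and the price and volume figures are illustrative assumptions, not chapter data. Days with no trading value are skipped, because the ratio is undefined when nothing trades.

```python
def amihud_ratio(closes, volumes):
    """Average of |daily return| / daily VND trading value.

    closes:  closing prices in VND, ordered by date
    volumes: shares traded on the same dates
    """
    ratios = []
    for d in range(1, len(closes)):
        ret = closes[d] / closes[d - 1] - 1.0
        vnd_value = closes[d] * volumes[d]
        if vnd_value > 0:  # skip zero-volume days
            ratios.append(abs(ret) / vnd_value)
    return sum(ratios) / len(ratios) if ratios else float("nan")


# Four trading days for one illustrative stock; the third day has no trades
closes = [10000.0, 10100.0, 10100.0, 9999.0]
volumes = [1000, 2000, 0, 1000]
illiq = amihud_ratio(closes, volumes)
print(f"Amihud illiquidity: {illiq:.3e}")
```

The raw ratio is tiny because the denominator is measured in VND, which is one reason to work with its logarithm or a rescaled version.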

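# Compute annual Amihud illiquidity
# Returns must be computed within each ticker so that the first observation of
# one stock is not compared with the last price of another
trading = trading.sort_values(['ticker', 'date'])
trading['abs_return'] = trading.groupby('ticker')['close'].pct_change().abs()
# Daily VND trading value; zero-volume days are excluded rather than producing inf
vnd_value = (trading['volume'] * trading['close']).replace(0, np.nan)
trading['amihud_daily'] = trading['abs_return'] / vnd_value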

amihud_annual = (
    trading
    .assign(fiscal_year=trading['date'].dt.year)
    .groupby(['ticker', 'fiscal_year'])
    .agg(
        amihud=('amihud_daily', 'mean'),
        avg_spread=('bid_ask_spread', 'mean'),
        avg_turnover=('turnover', 'mean')
    )
    .reset_index()
)

# Log transform for better distributional properties
amihud_annual['log_amihud'] = np.log(amihud_annual['amihud'] + 1e-10)
amihud_annual['log_spread'] = np.log(amihud_annual['avg_spread'] + 1e-6)

# Merge and run regression
liq_panel = (
    amihud_annual
    .merge(quality_panel[['ticker', 'fiscal_year', 'dq_index']].reset_index(drop=True),
           on=['ticker', 'fiscal_year'], how='inner')
    .merge(financials[['ticker', 'fiscal_year', 'market_cap', 'total_assets']],
           on=['ticker', 'fiscal_year'], how='left')
)
liq_panel['log_size'] = np.log(liq_panel['market_cap'])
liq_panel = liq_panel.set_index(['ticker', 'fiscal_year'])

liq_model = PanelOLS(
    dependent=liq_panel['log_amihud'],
    exog=sm.add_constant(liq_panel[['dq_index', 'log_size']]),
    entity_effects=True,
    time_effects=True,
    check_rank=False
).fit(cov_type='clustered', cluster_entity=True)

print("Amihud Illiquidity ~ Disclosure Quality:")
print(liq_model.summary)

# Mean log illiquidity by disclosure-quality quintile
liq_panel_plot = liq_panel.reset_index()
liq_panel_plot['dq_quintile'] = pd.qcut(
    liq_panel_plot['dq_index'], 5, labels=['Q1\n(Low)', 'Q2', 'Q3', 'Q4', 'Q5\n(High)']
)

quintile_liq = (
    liq_panel_plot
    .groupby('dq_quintile')['log_amihud']
    .agg(['mean', 'sem'])
)

fig, ax = plt.subplots(figsize=(8, 5))
bars = ax.bar(
    range(5), quintile_liq['mean'],
    yerr=1.96 * quintile_liq['sem'],
    color=['#C0392B', '#E67E22', '#F1C40F', '#27AE60', '#2C5F8A'],
    alpha=0.85, capsize=4, edgecolor='white'
)
ax.set_xticks(range(5))
ax.set_xticklabels(quintile_liq.index)
ax.set_xlabel('Disclosure Quality Quintile')
ax.set_ylabel('Log Amihud Illiquidity')
ax.set_title('Disclosure Quality and Market Liquidity')
plt.tight_layout()
plt.show()
Figure 30.5

30.7.3 Event Study: Market Reaction to Filing Lag

We examine whether the market reacts differently to early vs. late filers by computing cumulative abnormal returns (CARs) around the filing date:

\[ CAR_{i}[\tau_1, \tau_2] = \sum_{t=\tau_1}^{\tau_2} (R_{i,t} - \hat{R}_{i,t}) \tag{30.8}\]

where \(\hat{R}_{i,t}\) is the expected return from a market model estimated over a pre-event window \([-250, -30]\).

def compute_car(ticker, event_date, trading_df,
                est_window=(-250, -30), event_window=(-5, 10)):
    """Compute CAR around an event date using market model."""
    firm_data = trading_df[trading_df['ticker'] == ticker].copy()
    firm_data = firm_data.sort_values('date').reset_index(drop=True)
    # Compute returns before slicing so the first day of each window keeps
    # a valid return instead of the NaN that pct_change() yields on a slice
    firm_data['ret'] = firm_data['close'].pct_change()

    # Position of the first trading day on or after the event date
    on_or_after = np.flatnonzero((firm_data['date'] >= event_date).to_numpy())
    if len(on_or_after) == 0:
        return None
    event_pos = on_or_after[0]

    # Check sufficient pre-event data
    if event_pos + est_window[0] < 0:
        return None

    # Estimation window
    est_start = event_pos + est_window[0]
    est_end = event_pos + est_window[1]
    est_data = firm_data.iloc[est_start:est_end + 1]

    firm_ret = est_data['ret']
    mkt_ret = est_data['market_return']

    valid = firm_ret.notna() & mkt_ret.notna()
    if valid.sum() < 100:
        return None

    # Market model
    X = sm.add_constant(mkt_ret[valid])
    model = sm.OLS(firm_ret[valid], X).fit()

    # Event window (a window truncated by the end of the sample comes back
    # shorter than expected; the caller filters on length)
    ev_start = event_pos + event_window[0]
    ev_end = event_pos + event_window[1]
    ev_data = firm_data.iloc[ev_start:ev_end + 1]

    ev_ret = ev_data['ret']
    ev_mkt = ev_data['market_return']
    expected_ret = model.params['const'] + model.params['market_return'] * ev_mkt
    abnormal_ret = ev_ret - expected_ret

    return abnormal_ret.cumsum().values

# Sample: compute CARs for annual filings
car_results = []
# random_state fixes the subsample so the results are reproducible
for _, row in annual_filings.sample(min(2000, len(annual_filings)),
                                    random_state=42).iterrows():
    car = compute_car(row['ticker'], row['filing_date'], trading)
    if car is not None and len(car) == 16:  # -5 to +10
        car_results.append({
            'ticker': row['ticker'],
            'fiscal_year': row['fiscal_year'],
            'lag_tercile': row['lag_tercile'],
            'car': car
        })

car_df = pd.DataFrame(car_results)
print(f"Computed CARs for {len(car_df)} firm-year events")

# Plot average CAR paths with 95% confidence bands by filing tercile
event_days = range(-5, 11)

fig, ax = plt.subplots(figsize=(10, 6))
colors = {'Early': '#27AE60', 'Middle': '#F1C40F', 'Late': '#C0392B'}

for tercile in ['Early', 'Middle', 'Late']:
    subset = car_df[car_df['lag_tercile'] == tercile]
    if len(subset) > 0:
        avg_car = np.mean(np.stack(subset['car'].values), axis=0)
        se_car = np.std(np.stack(subset['car'].values), axis=0) / np.sqrt(len(subset))
        ax.plot(event_days, avg_car, color=colors[tercile],
                linewidth=2, label=tercile)
        ax.fill_between(event_days,
                        avg_car - 1.96 * se_car,
                        avg_car + 1.96 * se_car,
                        color=colors[tercile], alpha=0.15)

ax.axvline(x=0, color='gray', linestyle='--', linewidth=0.8)
ax.axhline(y=0, color='gray', linewidth=0.5)
ax.set_xlabel('Event Day (Relative to Filing Date)')
ax.set_ylabel('Cumulative Abnormal Return')
ax.set_title('Market Reaction Around Filing Date by Timeliness')
ax.legend(title='Filing Tercile')
plt.tight_layout()
plt.show()
Figure 30.6
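The visual comparison of early and late filers can be complemented with a formal test of the difference in mean CARs. The sketch below computes a Welch t-statistic (unequal variances) with the standard library only. The helper name `welch_t` and the CAR values are made up for illustration; in practice the inputs would be the end-of-window CARs of each tercile, and firm-year observations that cluster in calendar time would call for more conservative inference than this simple statistic.

```python
import math
import statistics


def welch_t(a, b):
    """Welch t-statistic for the difference in means of two samples."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    se = math.sqrt(var_a / len(a) + var_b / len(b))
    return (mean_a - mean_b) / se


# Hypothetical CAR[-5, +10] values for a handful of events
early_cars = [0.02, 0.01, 0.03, 0.00, 0.04]
late_cars = [-0.01, 0.00, -0.02, 0.01, -0.03]

t_stat = welch_t(early_cars, late_cars)
print(f"Early minus late CAR: t = {t_stat:.2f}")
```

Applied to the chapter's data, the two inputs would be the last element of each `car` array in `car_df`, split by `lag_tercile`.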

30.8 Filing Timeliness and Earnings Quality

Givoly and Palmon (1982) and Chambers and Penman (1984) establish that the content of disclosed information is correlated with its timing. We test this link formally: do late filers have worse earnings quality?

tq_panel = (
    annual_filings[['ticker', 'fiscal_year', 'lag_tercile', 'reporting_lag']]
    .merge(aq_df, on=['ticker', 'fiscal_year'], how='inner')
    .merge(persistence_df[['ticker', 'persistence']], on='ticker', how='left')
)

fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Panel A: Accruals quality by tercile
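# Reindex so the bars follow the Early/Middle/Late axis labels even if the
# tercile column is stored as plain strings (which would sort alphabetically)
aq_by_tercile = (
    tq_panel.groupby('lag_tercile')['accruals_quality'].mean()
    .reindex(['Early', 'Middle', 'Late'])
)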
axes[0].bar(
    range(3), aq_by_tercile.values,
    color=['#27AE60', '#F1C40F', '#C0392B'], alpha=0.85,
    edgecolor='white'
)
axes[0].set_xticks(range(3))
axes[0].set_xticklabels(['Early', 'Middle', 'Late'])
axes[0].set_ylabel('Accruals Quality (σ of DD Residuals)')
axes[0].set_title('Panel A: Accruals Quality by Filing Tercile')
axes[0].text(0.05, 0.95, 'Higher = lower quality',
             transform=axes[0].transAxes, fontsize=9,
             verticalalignment='top', style='italic', color='gray')

# Panel B: Persistence by tercile
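# Reindex so the bars follow the Early/Middle/Late axis labels
per_by_tercile = (
    tq_panel.groupby('lag_tercile')['persistence'].mean()
    .reindex(['Early', 'Middle', 'Late'])
)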
axes[1].bar(
    range(3), per_by_tercile.values,
    color=['#27AE60', '#F1C40F', '#C0392B'], alpha=0.85,
    edgecolor='white'
)
axes[1].set_xticks(range(3))
axes[1].set_xticklabels(['Early', 'Middle', 'Late'])
axes[1].set_ylabel('Earnings Persistence (AR(1) Coefficient)')
axes[1].set_title('Panel B: Earnings Persistence by Filing Tercile')

plt.tight_layout()
plt.show()
Figure 30.7

We formalize this with a regression that controls for firm characteristics:

tq_panel_reg = tq_panel.merge(
    financials[['ticker', 'fiscal_year', 'total_assets', 'net_income',
                'total_equity']],
    on=['ticker', 'fiscal_year'], how='left'
).merge(governance, on=['ticker', 'fiscal_year'], how='left')

tq_panel_reg['log_size'] = np.log(tq_panel_reg['total_assets'])
tq_panel_reg['roa'] = tq_panel_reg['net_income'] / tq_panel_reg['total_assets']
tq_panel_reg['late'] = (tq_panel_reg['lag_tercile'] == 'Late').astype(int)

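# Drop rows with missing regressors first so the cluster groups align with
# the estimation sample
reg_vars = ['accruals_quality', 'late', 'log_size', 'roa',
            'state_ownership_pct', 'big4_auditor', 'fiscal_year', 'ticker']
tq_reg_sample = tq_panel_reg.dropna(subset=reg_vars)

model_tq = smf.ols(
    'accruals_quality ~ late + log_size + roa + state_ownership_pct '
    '+ big4_auditor + C(fiscal_year)',
    data=tq_reg_sample
).fit(cov_type='cluster', cov_kwds={'groups': tq_reg_sample['ticker']})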

print("Accruals Quality ~ Late Filing:")
print(model_tq.summary().tables[1])

Note: Endogeneity Caveat

The association between filing timeliness and earnings quality is likely endogenous: firms with complex accounting issues take longer to prepare financial statements, and the same complexity drives lower earnings quality. The filing lag is thus best interpreted as an observable signal of underlying accounting difficulty rather than a causal determinant. Instrumental variable approaches (e.g., using auditor busyness during peak filing season as an instrument for filing lag) can partially address this concern.
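The logic of the instrumental variable approach can be illustrated on simulated data. In the sketch below, an unobserved `complexity` factor drives both the filing lag and accruals quality, which biases OLS, while a hypothetical `busyness` instrument shifts the lag only. All variable names, coefficients, and the data-generating process are assumptions made for illustration. The manual two-stage procedure recovers the point estimate only; its second-stage standard errors would not be valid, so a dedicated IV routine should be used for inference.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
true_effect = 0.5

busyness = rng.normal(size=n)     # hypothetical instrument
complexity = rng.normal(size=n)   # unobserved confounder
lag = 1.0 * busyness + 1.0 * complexity + rng.normal(size=n)
aq = true_effect * lag + 1.0 * complexity + rng.normal(size=n)


def ols_coefs(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)

# Naive OLS: biased upward because complexity is omitted
beta_ols = ols_coefs(aq, np.column_stack([const, lag]))[1]

# Stage 1: project the filing lag on the instrument
Z = np.column_stack([const, busyness])
lag_hat = Z @ ols_coefs(lag, Z)

# Stage 2: regress accruals quality on the fitted lag
beta_iv = ols_coefs(aq, np.column_stack([const, lag_hat]))[1]

print(f"OLS estimate:  {beta_ols:.3f}")
print(f"2SLS estimate: {beta_iv:.3f} (true effect {true_effect})")
```

The simulation only demonstrates the mechanics. Whether auditor busyness satisfies the exclusion restriction in Vietnamese data is an empirical question the sketch cannot answer.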

30.9 Disclosure Quality and Investment Efficiency

Biddle, Hilary, and Verdi (2009) demonstrate that higher financial reporting quality is associated with more efficient investment. Specifically, it reduces both over-investment (in firms with excess cash) and under-investment (in firms that are financially constrained). The mechanism is that better disclosure reduces information asymmetry between managers and capital providers, improving the allocation of capital.

We test this prediction in Vietnam using the Biddle, Hilary, and Verdi (2009) framework:

\[ \text{Investment}_{i,t+1} = \alpha + \beta_1 \text{SalesGrowth}_{i,t} + \varepsilon_{i,t+1} \tag{30.9}\]

The residual \(\hat{\varepsilon}_{i,t+1}\) measures deviation from expected investment. Positive residuals indicate over-investment; negative residuals indicate under-investment. We then test whether the absolute value of this residual is lower for firms with higher disclosure quality.

inv_panel = financials.sort_values(['ticker', 'fiscal_year']).copy()

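# Investment = change in total assets / lagged total assets
inv_panel['investment'] = (
    inv_panel.groupby('ticker')['total_assets'].pct_change()
)
inv_panel['sales_growth'] = (
    inv_panel.groupby('ticker')['revenue'].pct_change()
)
# Eq. 30.9 relates investment in year t+1 to sales growth in year t,
# so lag sales growth by one year within each firm
inv_panel['sales_growth_lag'] = (
    inv_panel.groupby('ticker')['sales_growth'].shift(1)
)

# Expected investment model
inv_model = smf.ols(
    'investment ~ sales_growth_lag',
    data=inv_panel
).fit()
# Residuals align on the index; rows dropped for missing data remain NaN
inv_panel['inv_residual'] = inv_model.resid
inv_panel['abs_inv_residual'] = inv_panel['inv_residual'].abs()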

# Merge with disclosure quality
inv_eff = (
    inv_panel[['ticker', 'fiscal_year', 'abs_inv_residual',
               'investment', 'total_assets']]
    .merge(quality_panel[['ticker', 'fiscal_year', 'dq_index']].reset_index(drop=True),
           on=['ticker', 'fiscal_year'], how='inner')
)
inv_eff['log_size'] = np.log(inv_eff['total_assets'])
inv_eff = inv_eff.set_index(['ticker', 'fiscal_year'])

# Panel regression
inv_eff_model = PanelOLS(
    dependent=inv_eff['abs_inv_residual'],
    exog=sm.add_constant(inv_eff[['dq_index', 'log_size']]),
    entity_effects=True,
    time_effects=True,
    check_rank=False
).fit(cov_type='clustered', cluster_entity=True)

print("Investment Inefficiency ~ Disclosure Quality:")
print(inv_eff_model.summary)

A negative coefficient on dq_index would indicate that higher disclosure quality is associated with lower investment inefficiency: firms with better disclosure make investment decisions closer to what their growth opportunities warrant.

30.10 Vietnamese Institutional Context

30.10.1 State Ownership and Disclosure

SOEs account for a substantial share of Vietnamese market capitalization. The relationship between state ownership and disclosure quality is theoretically ambiguous. On one hand, political connections may reduce the pressure to disclose transparently; government shareholders may tolerate opacity that private shareholders would not. On the other hand, post-equitization monitoring by multiple stakeholders (MOF, SCIC, minority shareholders) may create competing disclosure demands.

soe_panel = (
    quality_panel[['ticker', 'fiscal_year', 'dq_index',
                    'reporting_lag']].reset_index(drop=True)
    .merge(governance[['ticker', 'fiscal_year', 'state_ownership_pct']],
           on=['ticker', 'fiscal_year'], how='inner')
)

soe_panel['soe'] = (soe_panel['state_ownership_pct'] >= 50).astype(int)
soe_panel['soe_label'] = soe_panel['soe'].map(
    {1: 'SOE (≥50%)', 0: 'Private (<50%)'}
)

# Compare means
comparison = (
    soe_panel
    .groupby('soe_label')
    .agg(
        n=('dq_index', 'count'),
        mean_dq=('dq_index', 'mean'),
        median_dq=('dq_index', 'median'),
        mean_lag=('reporting_lag', 'mean'),
        median_lag=('reporting_lag', 'median')
    )
    .round(3)
)
print("SOE vs Private Firm Disclosure Comparison:")
print(comparison)

# Formal t-test (Welch's version, which does not assume equal variances
# across the SOE and private groups)
soe_dq = soe_panel[soe_panel['soe'] == 1]['dq_index']
priv_dq = soe_panel[soe_panel['soe'] == 0]['dq_index']
t_stat, p_val = stats.ttest_ind(soe_dq.dropna(), priv_dq.dropna(),
                                equal_var=False)
print(f"\nt-test: t = {t_stat:.3f}, p = {p_val:.4f}")

30.10.2 IFRS Convergence and Disclosure Quality

Vietnam has been pursuing a phased convergence toward IFRS, with the Ministry of Finance issuing a roadmap for voluntary adoption by large listed firms. The transition from VAS to IFRS-aligned standards is expected to expand disclosure requirements—particularly for financial instruments (IFRS 9), revenue recognition (IFRS 15), and leases (IFRS 16). Barth, Landsman, and Lang (2008) provide evidence that IFRS adoption is associated with improvements in earnings quality and disclosure, though the effect depends on enforcement strength.

We can exploit the staggered timing of voluntary IFRS adoption across Vietnamese firms as a natural experiment:

# Assume DataCore provides IFRS adoption dates
ifrs_adoption = client.get_ifrs_adoption(
    exchanges=['HOSE', 'HNX'],
    fields=['ticker', 'ifrs_adoption_year']
)

# Merge with quality panel
ifrs_panel = (
    quality_panel[['ticker', 'fiscal_year', 'dq_index']].reset_index(drop=True)
    .merge(ifrs_adoption, on='ticker', how='left')
)

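# Treatment indicator: 1 from the adoption year onward. Never-adopters have a
# missing adoption year, so the comparison is False and post_ifrs stays 0.
ifrs_panel['post_ifrs'] = (
    ifrs_panel['fiscal_year'] >= ifrs_panel['ifrs_adoption_year']
).astype(int)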

ifrs_panel['treated'] = ifrs_panel['ifrs_adoption_year'].notna().astype(int)

# Simple DiD
ifrs_panel = ifrs_panel.set_index(['ticker', 'fiscal_year'])
did_model = PanelOLS(
    dependent=ifrs_panel['dq_index'],
    exog=sm.add_constant(ifrs_panel[['post_ifrs']]),
    entity_effects=True,
    time_effects=True,
    check_rank=False
).fit(cov_type='clustered', cluster_entity=True)

print("DiD: IFRS Adoption and Disclosure Quality:")
print(did_model.summary)

Note: Identification Concern

Voluntary IFRS adoption is endogenous because firms that choose to adopt early may already have higher-quality disclosure. The two-way fixed effects DiD absorbs time-invariant firm characteristics and common time trends, but cannot fully address selection on time-varying unobservables. Researchers should consider matching estimators (e.g., propensity score matching on pre-adoption characteristics) or instrumental variable approaches as robustness checks.

30.11 Predicting Late Filings

Can we predict which firms will file late? This is valuable for portfolio construction (avoiding potential bad-news firms) and for regulators (targeting enforcement resources). We use a logistic model with financial and governance predictors:

\[ \Pr(\text{Late}_{i,t} = 1) = \Lambda\left(\alpha + \boldsymbol{\beta}'\mathbf{X}_{i,t-1}\right) \tag{30.10}\]

where \(\Lambda(\cdot)\) is the logistic function and \(\mathbf{X}_{i,t-1}\) are lagged predictors.

from sklearn.metrics import roc_auc_score, classification_report
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

pred_panel = (
    annual_filings[['ticker', 'fiscal_year', 'late_filing']]
    .merge(financials, on=['ticker', 'fiscal_year'], how='left')
    .merge(governance, on=['ticker', 'fiscal_year'], how='left')
)

# Lagged predictors
pred_panel = pred_panel.sort_values(['ticker', 'fiscal_year'])
for col in ['total_assets', 'net_income', 'operating_cash_flow',
            'total_equity', 'revenue']:
    pred_panel[f'{col}_lag'] = pred_panel.groupby('ticker')[col].shift(1)

pred_panel['log_size_lag'] = np.log(pred_panel['total_assets_lag'])
pred_panel['roa_lag'] = (
    pred_panel['net_income_lag'] / pred_panel['total_assets_lag']
)
pred_panel['leverage_lag'] = (
    (pred_panel['total_assets_lag'] - pred_panel['total_equity_lag'])
    / pred_panel['total_assets_lag']
)
pred_panel['cfo_ratio_lag'] = (
    pred_panel['operating_cash_flow_lag'] / pred_panel['total_assets_lag']
)

# Previous late filing indicator
pred_panel['prev_late'] = (
    pred_panel.groupby('ticker')['late_filing'].shift(1)
)

features = [
    'log_size_lag', 'roa_lag', 'leverage_lag', 'cfo_ratio_lag',
    'state_ownership_pct', 'foreign_ownership_pct',
    'big4_auditor', 'board_independence_pct', 'prev_late'
]

clean = pred_panel.dropna(subset=features + ['late_filing'])
X = clean[features]
y = clean['late_filing']

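from sklearn.pipeline import make_pipeline

# Standardize on the full sample (used below for coefficient interpretation)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Logistic regression with cross-validation. The scaler is refit inside each
# training fold so that test-fold information does not leak into the scaling.
lr = LogisticRegression(max_iter=1000, penalty='l2', C=1.0)
cv_pipe = make_pipeline(StandardScaler(), lr)
cv_scores = cross_val_score(cv_pipe, X, y, cv=5, scoring='roc_auc')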

print(f"5-Fold Cross-Validated AUC: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")

# Fit on full sample for coefficient interpretation
lr.fit(X_scaled, y)
coef_df = pd.DataFrame({
    'Feature': features,
    'Coefficient': lr.coef_[0],
    'Odds Ratio': np.exp(lr.coef_[0])
}).sort_values('Coefficient', ascending=False)

print("\nLogistic Regression Coefficients:")
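print(coef_df.to_string(index=False))

from sklearn.metrics import roc_curve, auc

# In-sample ROC curve from the full-sample fit; the cross-validated AUC above
# is the better guide to out-of-sample performance
lr.fit(X_scaled, y)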
y_prob = lr.predict_proba(X_scaled)[:, 1]
fpr, tpr, _ = roc_curve(y, y_prob)
roc_auc = auc(fpr, tpr)

fig, ax = plt.subplots(figsize=(7, 7))
ax.plot(fpr, tpr, color='#2C5F8A', linewidth=2,
        label=f'Logistic Model (In-Sample AUC = {roc_auc:.3f})')
ax.plot([0, 1], [0, 1], color='gray', linestyle='--', linewidth=1)
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('Late Filing Prediction: ROC Curve')
ax.legend(loc='lower right')
ax.set_aspect('equal')
plt.tight_layout()
plt.show()
Figure 30.8
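Because the stated goal is to predict future late filings, random folds that mix years can flatter the model: a firm's later filings may end up in the training set while its earlier ones are scored. One alternative is an expanding-window split that trains only on years before the test year. The generator below is a minimal sketch; the name `expanding_year_splits` and the `min_train_years` parameter are illustrative assumptions rather than part of any library.

```python
import numpy as np


def expanding_year_splits(years, min_train_years=3):
    """Yield (train_idx, test_idx) pairs: train on all earlier years,
    test on the next year, using positional row indices."""
    years = np.asarray(years)
    unique_years = np.unique(years)  # sorted ascending
    for k in range(min_train_years, len(unique_years)):
        test_year = unique_years[k]
        train_idx = np.flatnonzero(years < test_year)
        test_idx = np.flatnonzero(years == test_year)
        yield train_idx, test_idx


# Toy fiscal-year column for seven firm-year rows
years = [2018, 2018, 2019, 2020, 2020, 2021, 2022]
splits = list(expanding_year_splits(years, min_train_years=3))
for train_idx, test_idx in splits:
    print("train rows:", train_idx.tolist(), "| test rows:", test_idx.tolist())
```

scikit-learn's `cross_val_score` accepts an iterable of index pairs through its `cv` argument, so a list built from `clean['fiscal_year'].to_numpy()` could stand in for `cv=5` above, provided every test year contains both late and on-time filers so that the AUC is defined.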

30.12 Summary

This chapter has examined corporate disclosure quality and timing in Vietnam along several dimensions. The key findings and methodological contributions are summarized in Table 30.3.

Table 30.3: Summary of findings by theme.

| Theme | Key Result | Reference |
|---|---|---|
| Good news early | Early filers earn positive CARs around filing dates | Givoly and Palmon (1982) |
| Textual quality | Forward-looking density and numerical specificity vary substantially | Li (2008) |
| Composite DQ index | Foreign ownership and Big 4 auditors are strongest determinants | Botosan (1997) |
| Cost of capital | Higher DQ is associated with lower implied cost of equity | Diamond and Verrecchia (1991) |
| Liquidity | Higher DQ firms have lower Amihud illiquidity | Lang, Lins, and Maffett (2012) |
| Investment efficiency | Higher DQ reduces absolute investment residuals | Biddle, Hilary, and Verdi (2009) |
| Strategic timing | Evidence of bad-news clustering on high-congestion days | Hirshleifer, Lim, and Teoh (2009) |
| IFRS adoption | Preliminary evidence of DQ improvement post-adoption | Barth, Landsman, and Lang (2008) |

The Vietnamese disclosure environment is shaped by a combination of regulatory mandates (Circular 155, Securities Law 2019), enforcement capacity (SSC penalties and trading suspensions), and firm-level incentives (ownership structure, auditor choice, governance quality). As Vietnam continues its IFRS convergence and capital market development, the information environment is expected to evolve, creating opportunities for researchers to study the dynamics of disclosure quality in a rapidly changing institutional setting.