The Intuition

Pairs trading exploits long-run equilibrium relationships between two economically related assets. When two stocks (or ETFs, or futures) share common risk factors — same sector, same supply chain, same macro driver — their prices should move together in the long run. Short-term divergences from that co-movement relationship are the tradeable signal.

The statistical foundation is cointegration, formalised by Engle and Granger (1987). Two non-stationary price series are cointegrated if there exists a linear combination of them that is stationary. In practice, this means the spread between two assets (after accounting for a hedge ratio) should have a finite, bounded variance — it oscillates around a constant, rather than random-walking away.
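To make the definition concrete, here is a minimal simulation sketch. It assumes a synthetic pair built to be cointegrated by construction (a hedge ratio of 0.8 and an AR(1) deviation with persistence 0.9, both arbitrary choices), and the helper `ar1_coefficient` is a hypothetical name introduced for illustration. It compares how persistent the spread is against one of the raw log-price series.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# log_b follows a random walk, so it is non-stationary.
log_b = np.cumsum(rng.normal(0.0, 0.01, n))

# Stationary AR(1) deviation around zero (persistence 0.9, an assumption).
shocks = rng.normal(0.0, 0.005, n)
noise = np.zeros(n)
for t in range(1, n):
    noise[t] = 0.9 * noise[t - 1] + shocks[t]

# log_a shares log_b's stochastic trend, so the pair is cointegrated
# by construction with hedge ratio 0.8 and constant 0.5.
true_hedge = 0.8
log_a = 0.5 + true_hedge * log_b + noise

# The linear combination that removes the shared trend.
spread = log_a - true_hedge * log_b


def ar1_coefficient(x: np.ndarray) -> float:
    """Least-squares slope of x[t] on x[t-1], fitted with an intercept."""
    slope, _intercept = np.polyfit(x[:-1], x[1:], 1)
    return float(slope)


phi_spread = ar1_coefficient(spread)
phi_log_a = ar1_coefficient(log_a)

print(f"AR(1) coefficient of spread: {phi_spread:.3f}")
print(f"AR(1) coefficient of log_a:  {phi_log_a:.3f}")
print(f"spread std: {spread.std():.4f}, log_a std: {log_a.std():.4f}")
```

A coefficient close to 1 is what a random walk looks like; a coefficient clearly below 1 means deviations decay. This is only an illustration of the idea, not a substitute for a formal test such as Engle-Granger.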

Gatev, Goetzmann, and Rouwenhorst (2006) showed that a simple pairs trading rule, which matched stocks by the distance between their normalised price paths rather than by a cointegration test, earned positive excess returns in US equities over 1962–2002. One common interpretation of these returns is temporary liquidity provision: when one stock falls and another rises, pairs traders buy the loser and short the winner, effectively providing liquidity to the market. Under that reading, convergence is what pays the liquidity premium.

Key assumptions: (1) The pair is genuinely cointegrated. The Engle-Granger test supports this when it rejects the null of no cointegration (p-value < 0.05), but it cannot guarantee that the relationship will persist. (2) The hedge ratio from OLS is stable over the trading period; in practice it can drift. (3) The spread z-score is a reliable entry/exit signal, meaning the spread actually converges after it diverges. If the fundamental relationship breaks (e.g., one company is acquired, or a supply shock hits only one of the two), the spread can diverge permanently, causing large losses.
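One way to check assumption (2) is to re-estimate the hedge ratio on a trailing window and watch how it moves. The sketch below is an illustration under stated assumptions: `rolling_hedge_ratio` is a hypothetical helper, the 120-bar window is arbitrary, and the data is synthetic with a true hedge ratio that jumps from 1.0 to 1.5 halfway through.

```python
import numpy as np
import pandas as pd


def rolling_hedge_ratio(log_a: pd.Series, log_b: pd.Series, window: int) -> pd.Series:
    """Trailing-window OLS slope of log_a on log_b, i.e. cov(a, b) / var(b)."""
    cov_ab = log_a.rolling(window).cov(log_b)
    var_b = log_b.rolling(window).var()
    # Guard against division by zero when log_b is flat inside a window.
    return cov_ab / var_b.replace(0.0, np.nan)


# Synthetic pair: log_a moves 1.0x with log_b in the first half, 1.5x after.
rng = np.random.default_rng(1)
n = 1000
d_b = rng.normal(0.0, 0.01, n)
beta = np.where(np.arange(n) < n // 2, 1.0, 1.5)

log_b = pd.Series(np.cumsum(d_b))
log_a = pd.Series(np.cumsum(beta * d_b) + rng.normal(0.0, 0.0005, n))

hedge = rolling_hedge_ratio(log_a, log_b, window=120)

print(f"hedge ratio at bar 499: {hedge.iloc[499]:.3f}")
print(f"hedge ratio at last bar: {hedge.iloc[-1]:.3f}")
```

A full-sample OLS fit on this data would return a single blended number that matches neither regime, which is the risk the assumption describes.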

Classic pairs in practice: Coke vs. Pepsi, gold vs. gold miners (GDX), crude oil vs. natural gas during periods when their prices are linked, and treasury futures of adjacent maturities. The strategy has become crowded in equities since the 2000s, compressing returns. Modern practitioners apply cointegration to futures term structures, FX crosses, or cross-country equity index pairs, where fewer arbitrageurs operate.

The Math

Read this as a compact model summary: what the signal sees, what it ignores, and where fragility can creep in.

log_A(t) = log(Close_A(t))
log_B(t) = log(Close_B(t))
hedge_ratio = OLS coefficient of log_A on log_B
spread(t) = log_A(t) - hedge_ratio × log_B(t)
z(t) = (spread(t) - mean(spread[t-n:t])) / std(spread[t-n:t])

Entry: z(t) < -entry_z → long spread (long A, short B)
       z(t) > entry_z → short spread
Exit:  |z(t)| < exit_z → flat
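The z-score and the entry/exit rules can be sketched as two small functions. The names `rolling_zscore` and `positions_from_z` are hypothetical, and two conventions here are assumptions of the sketch: the rolling window includes the current observation (pandas' default), and a missing z-score holds the previous position.

```python
import numpy as np
import pandas as pd


def rolling_zscore(spread: pd.Series, window: int) -> pd.Series:
    """Standardise the spread against its trailing mean and sample std."""
    mu = spread.rolling(window).mean()
    sigma = spread.rolling(window).std()
    return (spread - mu) / sigma.replace(0.0, np.nan)


def positions_from_z(z: pd.Series, entry_z: float, exit_z: float) -> pd.Series:
    """+1 = long spread, -1 = short spread, 0 = flat."""
    pos = np.zeros(len(z))
    for i in range(1, len(z)):
        zi = z.iloc[i]
        prev = pos[i - 1]
        if np.isnan(zi):
            pos[i] = prev            # no signal: hold (sketch assumption)
        elif prev == 0.0:
            if zi < -entry_z:
                pos[i] = 1.0         # spread is cheap: long A, short B
            elif zi > entry_z:
                pos[i] = -1.0        # spread is rich: short the spread
        elif abs(zi) < exit_z:
            pos[i] = 0.0             # spread has reverted: go flat
        else:
            pos[i] = prev            # still open: keep the position
    return pd.Series(pos, index=z.index)


# Worked z-score: the last value sits 1.5 sample standard deviations
# above the 4-bar mean of 0.75.
z_demo = rolling_zscore(pd.Series([0.0, 0.0, 0.0, 3.0]), window=4)

# Worked signal path with entry_z=2.0 and exit_z=0.5.
z_path = pd.Series([np.nan, 0.1, -2.5, -1.0, -0.3, 0.2, 2.2, 1.0, 0.4])
pos_path = positions_from_z(z_path, entry_z=2.0, exit_z=0.5)
print(pos_path.tolist())
```

The gap between `entry_z` and `exit_z` acts as hysteresis: a trade opened at |z| > 2.0 stays open while |z| is between 0.5 and 2.0, which avoids flipping in and out on small moves around a single threshold.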

Parameters

Parameter	Type	Default	Description
ticker_b	str	QQQ	Second ticker for the pair
window	int	60	Rolling window for spread z-score
entry_z	float	2.0	Z-score threshold to enter trade
exit_z	float	0.5	Z-score threshold to exit trade

Source Code

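import numpy as np
import pandas as pd
from statsmodels.regression.linear_model import OLS
from statsmodels.tools.tools import add_constant
from statsmodels.tsa.stattools import coint

# fetch_ohlcv, run_backtest, and METADATA are assumed to be provided by the
# surrounding project; their definitions are not shown in this listing.


def run(ticker: str, start: str, end: str, **params) -> dict: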
    ticker_b = str(params.get("ticker_b", "QQQ"))
    window = int(params.get("window", 60))
    entry_z = float(params.get("entry_z", 2.0))
    exit_z = float(params.get("exit_z", 0.5))
    cost_bps = float(params.get("cost_bps", 0.0) or 0.0)
    slippage_bps = float(params.get("slippage_bps", 0.0) or 0.0)
    oos_split_pct = float(params.get("oos_split_pct", 0.0) or 0.0)

    df_a = fetch_ohlcv(ticker, start, end)
    df_b = fetch_ohlcv(ticker_b, start, end)

    # Align on common dates
    common = df_a.index.intersection(df_b.index)
    if len(common) < max(window + 3, 6):
        raise ValueError("Not enough overlapping history to run the selected pair.")
    df_a = df_a.loc[common]
    df_b = df_b.loc[common]

    log_a = np.log(df_a["Close"])
    log_b = np.log(df_b["Close"])

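    # Engle-Granger cointegration test. The result is informational only:
    # a p-value above 0.05 attaches a warning to the output but does not
    # stop the backtest.
    _, pvalue, _ = coint(log_a, log_b)
    coint_warning = pvalue > 0.05

    # OLS hedge ratio on full series. Note that fitting on the whole sample
    # uses data that was not yet available at each trade date (look-ahead).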
    X = add_constant(log_b.values)
    model = OLS(log_a.values, X).fit()
    hedge_ratio = model.params[1]

    spread = log_a - hedge_ratio * log_b

    # Rolling z-score of spread
    mu = spread.rolling(window).mean()
    sigma = spread.rolling(window).std()
    z = (spread - mu) / sigma.replace(0, np.nan)

    pos = pd.Series(0.0, index=df_a.index)
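    for i in range(1, len(z)):
        zi = z.iloc[i]
        prev = pos.iloc[i - 1]
        if pd.isna(zi):
            # No usable z-score (warm-up or zero variance): carry the
            # previous position instead of silently going flat.
            pos.iloc[i] = prev
            continue
        if prev == 0:
            # Flat: open a trade only when the spread is stretched.
            if zi < -entry_z:
                pos.iloc[i] = 1.0
            elif zi > entry_z:
                pos.iloc[i] = -1.0
        else:
            # In a trade: close once the spread has reverted, else hold.
            if abs(zi) < exit_z:
                pos.iloc[i] = 0.0
            else:
                pos.iloc[i] = prev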

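    # Normalise leg weights so that gross exposure (|A| + |B|) equals 1.
    gross_notional = 1.0 + abs(float(hedge_ratio))
    weight_a = 1.0 / gross_notional
    weight_b = -float(hedge_ratio) / gross_notional
    leg_a = pos * weight_a
    leg_b = pos * weight_b
    # Next-bar returns: a position set at the close of bar t earns the
    # return from t to t+1, so the signal never trades on its own bar.
    ret_a = df_a["Close"].pct_change().shift(-1).fillna(0.0)
    ret_b = df_b["Close"].pct_change().shift(-1).fillna(0.0)
    raw_pair_returns = (leg_a * ret_a + leg_b * ret_b).fillna(0.0)
    # Turnover is the absolute change in each leg's weight; the first bar
    # counts the initial weight itself.
    gross_turnover = leg_a.diff().abs().fillna(leg_a.abs()) + leg_b.diff().abs().fillna(leg_b.abs())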

    result = run_backtest(
        df_a,
        pos,
        ticker=ticker,
        start=start,
        end=end,
        strategy=METADATA["slug"],
        params={
            "ticker_b": ticker_b,
            "window": window,
            "entry_z": entry_z,
            "exit_z": exit_z,
        },
        cost_bps=cost_bps,
        slippage_bps=slippage_bps,
        oos_split_pct=oos_split_pct,
        include_return_series=True,
        raw_strategy_returns=raw_pair_returns,
        turnover=gross_turnover,
        trade_price_series=spread,
    )
    result["ticker_b"] = ticker_b
    result["pair_weights"] = {"ticker_a": round(weight_a, 4), "ticker_b": round(weight_b, 4)}
    result["hedge_ratio"] = round(float(hedge_ratio), 4)
    result["spread_latest"] = round(float(spread.iloc[-1]), 4) if len(spread) else None
    result["zscore_latest"] = round(float(z.iloc[-1]), 4) if len(z) and not pd.isna(z.iloc[-1]) else None
    result["coint_pvalue"] = round(float(pvalue), 4)
    if coint_warning:
        result["warning"] = f"Cointegration p-value {pvalue:.3f} > 0.05 — pair may not be cointegrated"
    return result
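The leg weighting in the listing above (`weight_a = 1 / (1 + |h|)`, `weight_b = -h / (1 + |h|)`) can be checked in isolation. The sketch below reuses that arithmetic; `leg_weights` and `net_exposure` are hypothetical helper names, and the hedge ratios are arbitrary examples.

```python
def leg_weights(hedge_ratio: float) -> tuple[float, float]:
    """Weights for legs A and B of a long-spread position, scaled to gross 1."""
    gross = 1.0 + abs(hedge_ratio)
    return 1.0 / gross, -hedge_ratio / gross


def net_exposure(hedge_ratio: float) -> float:
    """Sum of the two leg weights; zero means dollar-neutral."""
    weight_a, weight_b = leg_weights(hedge_ratio)
    return weight_a + weight_b


for h in (1.0, 0.5, 1.5):
    w_a, w_b = leg_weights(h)
    print(f"h={h}: weight_a={w_a:.4f}, weight_b={w_b:.4f}, net={net_exposure(h):+.4f}")
```

Gross exposure is always 1, but the two legs offset exactly only when the hedge ratio is 1. With a hedge ratio of 0.5, for example, the position is net long by one third of gross exposure, so some directional exposure remains.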

When It Works / When It Fails

Works

Asset pairs with demonstrated cointegration relationship
Sector pairs with structural long-run price linkage
Stable vol, non-directional (Transition) regimes

Fails

Cointegration breaks — M&A, sector rotation, regime change
OLS hedge ratio shifts over time (non-stationary relationship)
Stressed-vol regimes where the spread blows out sharply

Regime Fit

Transition / Calm vol

Fit score 1.0 (best conditions). No directional regime bias; the spread reverts cleanly around equilibrium.

Transition / Normal vol

Fit score 0.85 (strong). The regime is non-directional, with slightly more noise in spread dynamics.

Bull or Bear / Calm vol

Fit score 0.85. A directional regime adds risk, but low vol keeps spread behavior contained.

Any regime / Stressed vol

Fit score 0.0–0.25. The spread can gap violently, and the cointegration assumption is most likely to break here.

Compared to Alternatives

vs OU Reversion

OU applies mean-reversion to a single series; Pairs uses two-asset cointegration with an OLS hedge ratio. Pairs weights its two legs by that hedge ratio, so it is dollar-neutral only when the hedge ratio equals 1.

vs Zscore

Zscore standardizes single-asset return deviations; Pairs requires OLS hedge ratio estimation on two assets and adds a cointegration validation step.

vs Bollinger

Bollinger mean-reverts a single price series to its SMA; Pairs uses a spread between two assets. Pairs aims to be market-neutral by offsetting one leg against the other; Bollinger retains single-asset directional exposure.
