Pairs Trading (Cointegration)
Trade the spread between two cointegrated assets back to equilibrium.
The Intuition
Pairs trading exploits long-run equilibrium relationships between two economically related assets. When two stocks (or ETFs, or futures) share common risk factors — same sector, same supply chain, same macro driver — their prices should move together in the long run. Short-term divergences from that co-movement relationship are the tradeable signal.
The statistical foundation is cointegration, formalised by Engle and Granger (1987). Two non-stationary price series are cointegrated if there exists a linear combination of them that is stationary. In practice, this means the spread between two assets (after accounting for a hedge ratio) should have a finite, bounded variance — it oscillates around a constant, rather than random-walking away.
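To make "a linear combination that is stationary" concrete, here is a minimal sketch on synthetic data. It assumes two log-price series driven by one shared random walk; the helper names (`ols_slope_intercept`, `lag1_autocorr`), the seed, and the true ratio of 1.5 are all illustrative. It is not a formal Engle-Granger test (the source code uses statsmodels' `coint` for that); it only shows that each price is highly persistent while the hedged spread is not.

```python
import random


def ols_slope_intercept(x, y):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lag1_autocorr(series):
    """Lag-1 autocorrelation: close to 1 for a random walk, close to 0 for noise."""
    n = len(series)
    mean = sum(series) / n
    num = sum((series[i] - mean) * (series[i - 1] - mean) for i in range(1, n))
    den = sum((s - mean) ** 2 for s in series)
    return num / den


# Synthetic pair (assumption): both series load on the same random-walk trend.
rng = random.Random(42)
trend = 0.0
log_a, log_b = [], []
for _ in range(2000):
    trend += rng.gauss(0.0, 1.0)                      # shared non-stationary factor
    log_b.append(trend + rng.gauss(0.0, 0.5))         # B = trend + noise
    log_a.append(1.5 * trend + rng.gauss(0.0, 0.5))   # A = 1.5 * trend + noise

# Hedge ratio: slope from regressing log_A on log_B.
hedge_ratio, intercept = ols_slope_intercept(log_b, log_a)
spread = [a - hedge_ratio * b for a, b in zip(log_a, log_b)]

price_persistence = lag1_autocorr(log_a)
spread_persistence = lag1_autocorr(spread)
print(f"hedge ratio        : {hedge_ratio:.3f}")
print(f"lag-1 autocorr, A  : {price_persistence:.3f}")
print(f"lag-1 autocorr, spr: {spread_persistence:.3f}")
```

The estimated hedge ratio should land near the 1.5 used to build the data, and the spread's autocorrelation should sit far below that of either price. On real pairs the gap is much less clean, which is why a proper test statistic is needed.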
Gatev, Goetzmann, and Rouwenhorst (2006) showed that a simple pairs trading rule generated positive Sharpe ratios in US equities over 1962–2002. Note that they matched pairs by the distance between normalised price paths rather than by a formal cointegration test. The returns are commonly interpreted as compensation for temporary liquidity provision: when one stock falls and another rises, pairs traders buy the loser and short the winner, effectively supplying liquidity to the market, and the subsequent convergence earns the liquidity premium.
Key assumptions: (1) The pair is genuinely cointegrated — confirmed by the Engle-Granger test (p-value < 0.05). (2) The hedge ratio from OLS is stable over the trading period (it can drift in practice). (3) The spread z-score is a reliable entry/exit signal — the spread actually converges. If the fundamental relationship breaks (e.g., one company gets acquired, or a supply shock hits only one), the spread can diverge permanently, causing large losses.
Classic pairs in practice: Coke vs. Pepsi, gold vs. gold miners (GDX), crude oil vs. natural gas during linked periods, treasury futures of adjacent maturities. The strategy has become crowded in equities since the 2000s, compressing returns. Modern practitioners use cointegration in futures term structures, FX crosses, or across-country equity index pairs where fewer arbitrageurs operate.
The Math
Read this as a compact model summary: what the signal sees, what it ignores, and where fragility can creep in.
log_A(t) = log(Close_A(t))
log_B(t) = log(Close_B(t))
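hedge_ratio = slope β of the full-sample OLS regression log_A(t) = α + β × log_B(t) + ε(t)
spread(t) = log_A(t) - hedge_ratio × log_B(t)
z(t) = (spread(t) - mean(spread[t-n+1..t])) / std(spread[t-n+1..t]), with n = window (the trailing window includes the current bar t)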
Entry: z(t) < -entry_z → long spread (long A, short B)
       z(t) > +entry_z → short spread (short A, long B)
Exit:  |z(t)| < exit_z → flat
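The z-score and entry/exit rules above can be sketched in a few lines of standard-library Python. The function names `rolling_zscore` and `positions_from_z` are illustrative, not part of the source. The sketch assumes a trailing window that includes the current bar and a sample standard deviation (ddof = 1), which is what pandas' rolling `std` computes by default, and it keeps the source's convention that a bar with an undefined z-score is flat.

```python
import math
from statistics import mean, stdev


def rolling_zscore(spread, window):
    """z[t] from the trailing `window` values ending at t; NaN during warm-up.

    Requires window >= 2 because a sample std needs at least two points.
    """
    z = []
    for t in range(len(spread)):
        if t + 1 < window:
            z.append(math.nan)
            continue
        chunk = spread[t + 1 - window : t + 1]
        sd = stdev(chunk)  # sample standard deviation (ddof=1)
        z.append((spread[t] - mean(chunk)) / sd if sd > 0 else math.nan)
    return z


def positions_from_z(z, entry_z=2.0, exit_z=0.5):
    """+1 = long spread, -1 = short spread, 0 = flat."""
    positions = []
    prev = 0
    for zi in z:
        if math.isnan(zi):
            cur = 0                      # undefined z-score: stay flat
        elif prev == 0:
            if zi < -entry_z:
                cur = 1                  # spread is cheap: long A, short B
            elif zi > entry_z:
                cur = -1                 # spread is rich: short A, long B
            else:
                cur = 0
        elif abs(zi) < exit_z:
            cur = 0                      # spread has reverted: close the trade
        else:
            cur = prev                   # otherwise hold the open position
        positions.append(cur)
        prev = cur
    return positions


z_demo = rolling_zscore([1.0, 2.0, 3.0], window=3)
pos_demo = positions_from_z([0.0, -2.5, -1.0, -0.3, 2.1, 1.0, 0.4])
print(z_demo)    # [nan, nan, 1.0]
print(pos_demo)  # [0, 1, 1, 0, -1, -1, 0]
```

Note the gap between `entry_z` and `exit_z`: a trade opened at |z| > 2.0 is held until |z| falls below 0.5, so small oscillations around the entry threshold do not churn the position.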
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| ticker_b | str | QQQ | Second ticker for the pair |
| window | int | 60 | Rolling window for spread z-score |
| entry_z | float | 2.0 | Z-score threshold to enter trade |
| exit_z | float | 0.5 | Z-score threshold to exit trade |
Source Code
import numpy as np
import pandas as pd
from statsmodels.api import OLS, add_constant
from statsmodels.tsa.stattools import coint

# fetch_ohlcv, run_backtest and METADATA are project helpers defined elsewhere (not shown here).


def run(ticker: str, start: str, end: str, **params) -> dict:
    ticker_b = str(params.get("ticker_b", "QQQ"))
    window = int(params.get("window", 60))
    entry_z = float(params.get("entry_z", 2.0))
    exit_z = float(params.get("exit_z", 0.5))
    cost_bps = float(params.get("cost_bps", 0.0) or 0.0)
    slippage_bps = float(params.get("slippage_bps", 0.0) or 0.0)
    oos_split_pct = float(params.get("oos_split_pct", 0.0) or 0.0)

    df_a = fetch_ohlcv(ticker, start, end)
    df_b = fetch_ohlcv(ticker_b, start, end)

    # Align on common dates
    common = df_a.index.intersection(df_b.index)
    if len(common) < max(window + 3, 6):
        raise ValueError("Not enough overlapping history to run the selected pair.")
    df_a = df_a.loc[common]
    df_b = df_b.loc[common]

    log_a = np.log(df_a["Close"])
    log_b = np.log(df_b["Close"])

    # Engle-Granger cointegration test (full sample); a high p-value only raises a warning
    _, pvalue, _ = coint(log_a, log_b)
    coint_warning = pvalue > 0.05

    # OLS hedge ratio on full series: the slope is estimated in-sample,
    # so it uses data from after each simulated trade
    X = add_constant(log_b.values)
    model = OLS(log_a.values, X).fit()
    hedge_ratio = model.params[1]  # params[0] is the intercept
    spread = log_a - hedge_ratio * log_b

    # Rolling z-score of spread (zero std becomes NaN to avoid division by zero)
    mu = spread.rolling(window).mean()
    sigma = spread.rolling(window).std()
    z = (spread - mu) / sigma.replace(0, np.nan)

    # Position state machine: +1 long spread, -1 short spread, 0 flat
    pos = pd.Series(0.0, index=df_a.index)
    for i in range(1, len(z)):
        zi = z.iloc[i]
        if pd.isna(zi):
            continue  # warm-up or undefined z-score: this bar stays flat
        prev = pos.iloc[i - 1]
        if prev == 0:
            if zi < -entry_z:
                pos.iloc[i] = 1.0
            elif zi > entry_z:
                pos.iloc[i] = -1.0
        else:
            if abs(zi) < exit_z:
                pos.iloc[i] = 0.0
            else:
                pos.iloc[i] = prev

    # Leg weights, normalised so that |weight_a| + |weight_b| = 1
    gross_notional = 1.0 + abs(float(hedge_ratio))
    weight_a = 1.0 / gross_notional
    weight_b = -float(hedge_ratio) / gross_notional
    leg_a = pos * weight_a
    leg_b = pos * weight_b

    # Next-bar returns, aligned with the position decided at the close of bar t
    ret_a = df_a["Close"].pct_change().shift(-1).fillna(0.0)
    ret_b = df_b["Close"].pct_change().shift(-1).fillna(0.0)
    raw_pair_returns = (leg_a * ret_a + leg_b * ret_b).fillna(0.0)

    # Turnover summed across both legs
    gross_turnover = leg_a.diff().abs().fillna(leg_a.abs()) + leg_b.diff().abs().fillna(leg_b.abs())

    result = run_backtest(
        df_a,
        pos,
        ticker=ticker,
        start=start,
        end=end,
        strategy=METADATA["slug"],
        params={
            "ticker_b": ticker_b,
            "window": window,
            "entry_z": entry_z,
            "exit_z": exit_z,
        },
        cost_bps=cost_bps,
        slippage_bps=slippage_bps,
        oos_split_pct=oos_split_pct,
        include_return_series=True,
        raw_strategy_returns=raw_pair_returns,
        turnover=gross_turnover,
        trade_price_series=spread,
    )
    result["ticker_b"] = ticker_b
    result["pair_weights"] = {"ticker_a": round(weight_a, 4), "ticker_b": round(weight_b, 4)}
    result["hedge_ratio"] = round(float(hedge_ratio), 4)
    result["spread_latest"] = round(float(spread.iloc[-1]), 4) if len(spread) else None
    result["zscore_latest"] = round(float(z.iloc[-1]), 4) if len(z) and not pd.isna(z.iloc[-1]) else None
    result["coint_pvalue"] = round(float(pvalue), 4)
    if coint_warning:
        result["warning"] = f"Cointegration p-value {pvalue:.3f} > 0.05 — pair may not be cointegrated"
    return result
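The leg-weight arithmetic in the source is easy to misread, so here is a small worked sketch of it in isolation. The helper names `pair_weights` and `pair_return` and the numbers (hedge ratio 0.8, next-bar returns of 1% and 0.5%) are illustrative assumptions. The formulas mirror `weight_a`, `weight_b` and `raw_pair_returns` above.

```python
def pair_weights(hedge_ratio):
    """Leg weights scaled so that |weight_a| + |weight_b| == 1."""
    gross_notional = 1.0 + abs(hedge_ratio)
    weight_a = 1.0 / gross_notional
    weight_b = -hedge_ratio / gross_notional
    return weight_a, weight_b


def pair_return(position, hedge_ratio, ret_a, ret_b):
    """One-bar return of the pair for position in {+1, 0, -1}."""
    weight_a, weight_b = pair_weights(hedge_ratio)
    return position * (weight_a * ret_a + weight_b * ret_b)


w_a, w_b = pair_weights(0.8)
long_spread = pair_return(+1, 0.8, ret_a=0.01, ret_b=0.005)
short_spread = pair_return(-1, 0.8, ret_a=0.01, ret_b=0.005)

print(f"weight_a={w_a:.4f}, weight_b={w_b:.4f}")   # weight_a=0.5556, weight_b=-0.4444
print(f"long spread : {long_spread:.6f}")          # 0.003333
print(f"short spread: {short_spread:.6f}")         # -0.003333
```

Dividing by the gross notional means the reported return is per unit of total capital deployed across both legs, which keeps results comparable between pairs with very different hedge ratios.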
Further Reading
- Engle, R. & Granger, C. (1987). Co-integration and Error Correction: Representation, Estimation, and Testing. Econometrica, 55(2), 251–276.
- Gatev, E., Goetzmann, W. & Rouwenhorst, K. (2006). Pairs Trading: Performance of a Relative-Value Arbitrage Rule. Review of Financial Studies, 19(3), 797–827.
- Vidyamurthy, G. (2004). Pairs Trading: Quantitative Methods and Analysis. Wiley.
- Chan, E. (2013). Algorithmic Trading: Winning Strategies and Their Rationale, Ch. 3. Wiley.
When It Works / When It Fails
When it works:
- Asset pairs with a demonstrated cointegration relationship
- Sector pairs with a structural long-run price linkage
- Stable-vol, non-directional (Transition) regimes
When it fails:
- Cointegration breaks: M&A, sector rotation, regime change
- OLS hedge ratio shifts over time (non-stationary relationship)
- Stressed-vol regimes where the spread blows out sharply
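One way to watch for the hedge-ratio drift listed above is to re-estimate the OLS slope on a trailing window and compare it with the full-sample value. The sketch below is a possible diagnostic, not something the source code does. The function name `rolling_hedge_ratio`, the window of 10, and the synthetic series (whose true ratio jumps from 1.0 to 2.0 halfway through) are assumptions for illustration.

```python
def rolling_hedge_ratio(log_a, log_b, window):
    """OLS slope of log_a on log_b over each trailing window; None during warm-up."""
    ratios = []
    for t in range(len(log_a)):
        if t + 1 < window:
            ratios.append(None)
            continue
        xs = log_b[t + 1 - window : t + 1]
        ys = log_a[t + 1 - window : t + 1]
        mean_x = sum(xs) / window
        mean_y = sum(ys) / window
        sxx = sum((x - mean_x) ** 2 for x in xs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        ratios.append(sxy / sxx if sxx > 0 else None)
    return ratios


# Synthetic pair (assumption): the relationship changes at bar 20.
log_b = [0.01 * i for i in range(40)]
log_a = [b if i < 20 else 2.0 * b - 0.2 for i, b in enumerate(log_b)]

ratios = rolling_hedge_ratio(log_a, log_b, window=10)
valid = [r for r in ratios if r is not None]
drift = max(valid) - min(valid)

print(f"early ratio: {ratios[9]:.3f}")   # window covers bars 0..9
print(f"late ratio : {ratios[-1]:.3f}")  # window covers bars 30..39
print(f"drift      : {drift:.3f}")
```

A single full-sample regression on this data would report a ratio somewhere between 1.0 and 2.0 that is wrong in both halves, so a large rolling drift is a hint that the fixed hedge ratio, and the spread built from it, should not be trusted.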