51 · Multi-Market Backtest Results — 2026-05-18

Generated by scripts/_backtest-stats.mjs from the BacktestResult table after the May 2026 recompute. Source: 285,857 priced rows (priceAtTrade > 0), covering 227,594 BUY signals and 55,004 SELL signals across 17 markets with at least one priced trade. The BUY and SELL counts sum to 282,598, so 3,259 priced rows fall outside both cohorts.

Stats persisted to Setting:

stats.backtest.global
stats.backtest.<ISO> for each market

Raw JSON: /tmp/backtest-stats-2026-05-18.json.
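The key layout above can be sketched as a small helper. This is a minimal illustration only: the function name settingKeyFor and the two-letter validation rule are assumptions, not taken from scripts/_backtest-stats.mjs.

```typescript
// Hypothetical helper: the key shapes come from the list above,
// but the function name and the validation rule are assumptions.
function settingKeyFor(market?: string): string {
  if (market === undefined) return "stats.backtest.global";
  const code = market.trim().toUpperCase();
  // Market codes in this report are two letters (FR, US, UK, ...).
  if (!/^[A-Z]{2}$/.test(code)) {
    throw new Error("Unexpected market code: " + market);
  }
  return "stats.backtest." + code;
}

const globalKey = settingKeyFor();   // "stats.backtest.global"
const frKey = settingKeyFor("fr");   // "stats.backtest.FR"
```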

TL;DR

Cohort	n	Avg T+90	Win T+90	Sharpe ann	Max DD
BUY · full universe	227,594	-1.70 %	41.82 %	-0.08	-100 %
BUY · out-of-sample (last 6 mo)	9,473	+2.69 %	53.27 %	+0.11	-100 %

Reading: the raw cross-market BUY universe is flat to slightly negative at T+90 (mean -1.70 %, win rate 41.82 %). Sigma's alpha comes from selection, not from buying every insider ticket. Out-of-sample (post-2025-11-18) the picture inverts to +2.69 % at T+90 with a 53.27 % win rate, consistent with the v5.1 scoring rerun in audit 43.

maxDDPct = -100 % for most equity curves is an arithmetic artifact: the BUY universe contains names that compounded to ruin (delisted / bankrupt). Each row is treated as an isolated +90d lottery ticket, and the rows are summed in log space along pubDate. When a single row contributes ln(1 + r/100) with r ≈ -99 (a -99 % return), it subtracts about 4.6 log units and the curve floors. This is intentional disclosure, not a portfolio metric. The portfolio-level max DD remains the audited -17.9 % from rerun 43-scoring-rerun-2026-05.md.
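The artifact can be reproduced with a few lines. This is a minimal sketch, assuming returns are given in percent and already sorted by pubDate; the function names and the clamp for r = -100 are illustrative assumptions, not the script's actual code.

```typescript
// Cumulative log-return curve: each row adds ln(1 + r/100).
function logEquityCurve(returnsPct: number[]): number[] {
  const curve: number[] = [];
  let cumLog = 0;
  for (const r of returnsPct) {
    // Clamp so that r = -100 does not produce ln(0) = -Infinity (assumption).
    const gross = Math.max(1 + r / 100, 1e-6);
    cumLog += Math.log(gross);
    curve.push(cumLog);
  }
  return curve;
}

// Max drawdown in percent, measured from the running peak of the log curve.
function maxDrawdownPct(logCurve: number[]): number {
  let peak = 0; // the curve starts at ln(1) = 0
  let worst = 0;
  for (const v of logCurve) {
    if (v > peak) peak = v;
    const dd = Math.exp(v - peak) - 1; // fractional drawdown, always <= 0
    if (dd < worst) worst = dd;
  }
  return worst * 100;
}

// One ruined row dominates three ordinary ones: drawdown is about -99 %.
const exampleDd = maxDrawdownPct(logEquityCurve([5, 10, -99, 5]));
```

Two such rows in sequence compound to exp(2 · ln 0.01) - 1, about -99.99 %, which displays as -100 % at two-decimal precision.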

Score-band hit rate (global, BUY 90d)

Band	n	Mean 90d	Win %
50-60	749	+7.87 %	56.61 %
60-70	82	+26.11 %	82.93 %
70-80	0	n/a	n/a
80+	0	n/a	n/a

Score bands above 70 are empty. Composite signalScore is currently computed only on the AMF (FR) subset under the v5.1 path (non-FR rows carry signalScore = null), and within that subset no priced BUY row scores 70 or higher. The 60-70 band's 82.93 % win rate and +26.11 % mean are the clearest empirical evidence that the composite filter has selection power.
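The band table can be rebuilt from raw rows along these lines. A minimal sketch: the row shape, the half-open band boundaries and the definition of a win as a strictly positive return90d are assumptions, since the report does not spell them out.

```typescript
interface ScoredRow {
  signalScore: number | null; // null on non-FR rows, per the paragraph above
  return90d: number | null;   // T+90 return in percent
}

interface BandStats {
  band: string;
  n: number;
  meanPct: number | null;
  winPct: number | null;
}

// Half-open bands [lo, hi); the boundary convention is an assumption.
const BANDS: Array<[string, number, number]> = [
  ["50-60", 50, 60],
  ["60-70", 60, 70],
  ["70-80", 70, 80],
  ["80+", 80, Infinity],
];

function scoreBandStats(rows: ScoredRow[]): BandStats[] {
  return BANDS.map(([band, lo, hi]): BandStats => {
    const returns: number[] = [];
    for (const row of rows) {
      const s = row.signalScore;
      const r = row.return90d;
      if (s === null || r === null) continue; // unscored, or no T+90 price yet
      if (s >= lo && s < hi) returns.push(r);
    }
    const n = returns.length;
    if (n === 0) return { band, n, meanPct: null, winPct: null }; // shown as n/a
    const sum = returns.reduce((a, b) => a + b, 0);
    const wins = returns.filter((r) => r > 0).length; // win = strictly positive (assumption)
    return { band, n, meanPct: sum / n, winPct: (100 * wins) / n };
  });
}
```

Empty bands return null rather than 0 so the render layer can print n/a instead of a misleading 0 %.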

Top 5 markets by Sharpe (BUY 90d, n ≥ 50)

Rank	Market	nBuys	Avg 90d	Win %	Sharpe ann
1	AT (FMA)	1,022	+4.80 %	72.84 %	+0.625
2	ES (CNMV)	2,448	+4.35 %	62.89 %	+0.482
3	CH (SIX)	184	+5.44 %	68.81 %	+0.422
4	DK (FI-DK)	3,104	+4.69 %	54.61 %	+0.328
5	NO (Oslo)	361	+13.67 %	54.29 %	+0.324
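The report does not state how "Sharpe ann" is annualised. One common convention, shown here purely as an assumption, scales the per-trade mean/standard-deviation ratio by the square root of the number of 90-day periods per year, with a zero risk-free rate. The function name and the 365/90 factor are illustrative and may differ from what scripts/_backtest-stats.mjs does.

```typescript
// Assumed convention: Sharpe = (mean / sample std dev) * sqrt(365 / horizonDays),
// with a zero risk-free rate. Returns null when the ratio is undefined.
function annualisedSharpe(returnsPct: number[], horizonDays: number = 90): number | null {
  const n = returnsPct.length;
  if (n < 2) return null; // a single row has no sample variance
  const mean = returnsPct.reduce((a, b) => a + b, 0) / n;
  const variance =
    returnsPct.reduce((a, b) => a + (b - mean) * (b - mean), 0) / (n - 1);
  const sd = Math.sqrt(variance);
  if (sd === 0) return null; // identical returns: ratio undefined
  return (mean / sd) * Math.sqrt(365 / horizonDays);
}
```

Because the T+90 windows of neighbouring trades overlap, the rows are not independent, so a figure computed this way is best read as a ranking aid rather than a tradable Sharpe.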

Bottom 5 markets (red flags)

Rank	Market	nBuys	Avg 90d	Win %	Sharpe ann	Comment
1	IT (CONSOB)	503	-2.48 %	51.54 %	-0.299	Insider signal anti-predictive on Italian mid-caps; OOS recovers to +3.9 % so probably a stale-data tail.
2	US (SEC)	190,055	-2.28 %	40.37 %	-0.09	Universe dominated by automated 10b5-1 plans, dilution issuances. No selection = no edge.
3	FR (AMF)	17,013	-0.24 %	43.74 %	-0.083	The unfiltered FR universe is flat. Selection through the six Sigma filters lifts it to the +13.2 % cohort in `STRATEGY_PROOF`.
4	AU (ASX)	264	+113.11 %	100 %	n/a	Data quality flag: priced rows are penny stocks where Yahoo close prices are corrupted or split-unaware. Suspend display.
5	CA (SEDI)	30	+10 %	100 %	n/a	n < 50, statistically meaningless. Drop from public UI.

UK (RNS) has only 63 priced rows, none with return90d != null, so no Sharpe is computable. IE has 1 row. These markets need a separate price-coverage backfill before any claim can be made.

Out-of-sample (last 6 months, post-2025-11-18)

Global OOS metrics improve materially:

BUY universe n = 9,473, avg T+90 +2.69 %, win rate 53.27 %, Sharpe +0.11.
Best OOS market: AT with avg T+90 +19.23 %, win 95.65 %, Sharpe +1.46.
Worst OOS market: BE with avg T+90 -3.72 %, Sharpe -0.66.

This split is consistent with the walk-forward window in audit 43 (combined portfolio Sharpe 1.32, hit rate 52 %, DSR +0.31).
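The OOS cohort is a date filter on pubDate. A minimal sketch of that split, assuming pubDate is stored as an ISO "YYYY-MM-DD" string and that "post-2025-11-18" means strictly after the cutoff; both points are assumptions, and the function name is hypothetical.

```typescript
// pubDate is assumed to be an ISO "YYYY-MM-DD" string, so plain string
// comparison orders dates correctly.
const OOS_CUTOFF = "2025-11-18";

function splitOutOfSample<T extends { pubDate: string }>(
  rows: T[],
  cutoff: string = OOS_CUTOFF
): { inSample: T[]; outOfSample: T[] } {
  const inSample: T[] = [];
  const outOfSample: T[] = [];
  for (const row of rows) {
    // "post-2025-11-18" is read as strictly after the cutoff (assumption).
    if (row.pubDate > cutoff) outOfSample.push(row);
    else inSample.push(row);
  }
  return { inSample, outOfSample };
}
```

Keeping the cutoff as a parameter lets the same function serve a rolling 6-month window when the stats are next recomputed.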

All 17 markets — full table

Market	nBuys	Avg 90d	Win %	Sharpe ann	nOOS	OOS avg	OOS sharpe
US	190,055	-2.28 %	40.37 %	-0.09	3,783	+0.74 %	-0.02
FR	17,013	-0.24 %	43.74 %	-0.08	1,546	+4.53 %	+0.25
FI	5,676	+2.06 %	52.58 %	+0.14	1,278	+0.67 %	-0.04
NL	4,762	+2.63 %	55.11 %	+0.16	430	+1.08 %	+0.01
DK	3,104	+4.69 %	54.61 %	+0.33	240	+4.36 %	+0.62
ES	2,448	+4.35 %	62.89 %	+0.48	515	+1.71 %	+0.15
AT	1,022	+4.80 %	72.84 %	+0.62	73	+19.23 %	+1.46
BR	978	+2.94 %	47.44 %	+0.19	0	n/a	n/a
BE	906	+1.23 %	50.00 %	+0.03	303	-3.72 %	-0.66
IT	503	-2.48 %	51.54 %	-0.30	438	+3.89 %	+0.31
NO	361	+13.67 %	54.29 %	+0.32	361	+13.67 %	+0.32
AU	264	+113.11 %	100.00 %	n/a	264	+113.11 %	n/a
DE	224	+3.61 %	48.72 %	+0.26	73	+6.57 %	+0.53
CH	184	+5.44 %	68.81 %	+0.42	75	n/a	n/a
UK	63	n/a	n/a	n/a	63	n/a	n/a
CA	30	+10.00 %	100.00 %	n/a	30	+10.00 %	n/a
IE	1	n/a	n/a	n/a	1	n/a	n/a

Insights

Which score bands actually win. Only 60-70 is empirically convincing (n=82, 82.9 % win, +26 % mean). The 50-60 band is positive but smaller in magnitude. The 70+ bands are empty in the BUY-priced subset: the scoring pipeline currently produces very few rows above 70, which suggests either an overly conservative composite cap or cluster/role gates that filter too aggressively. Action: re-examine signalScore clamps.
Cluster contrarian flip status. Not directly measured here (we don't carry the cluster flag in BacktestResult). The OOS uplift in FR (-0.24 % → +4.53 %) is consistent with the v5.1 contrarian-flip rule activating, but a dedicated breakdown would need to join against the Declaration.signals JSON.
Regime sensitivity. Five markets flip sign between full-window and OOS on at least one metric: US, FR, IT and BE on mean T+90, and FR, IT, BE and FI on Sharpe. AT and ES keep their sign but change magnitude (AT Sharpe +0.62 → +1.46, ES Sharpe +0.48 → +0.15). The OOS Sharpe ranking is unstable enough that no single-market claim should appear on the landing page without a 6-month freshness disclaimer. Use the portfolio-level numbers from audit 43 (Sharpe 1.32 walk-forward) as the headline instead.
Universe vs subset. Confirms feedback_anti_hallucination priors: the raw BUY universe across 17 markets is a net loser at T+90 (mean -1.7 %). All advertised positive numbers must explicitly cite the filtered subset, never the universe.
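The regime-sensitivity check can be run mechanically against the full table. A minimal sketch: the row values are copied from that table, while the interface and helper names are illustrative assumptions.

```typescript
interface MarketRow {
  market: string;
  avg: number;       // full-window mean T+90, percent
  oosAvg: number;    // OOS mean T+90, percent
  sharpe: number;    // full-window annualised Sharpe
  oosSharpe: number; // OOS annualised Sharpe
}

// Rows copied from the full table. Markets with n/a cells (BR, CH, UK, IE)
// or whose OOS row is identical to the full window (NO, AU, CA) are left out.
const marketRows: MarketRow[] = [
  { market: "US", avg: -2.28, oosAvg: 0.74, sharpe: -0.09, oosSharpe: -0.02 },
  { market: "FR", avg: -0.24, oosAvg: 4.53, sharpe: -0.08, oosSharpe: 0.25 },
  { market: "FI", avg: 2.06, oosAvg: 0.67, sharpe: 0.14, oosSharpe: -0.04 },
  { market: "NL", avg: 2.63, oosAvg: 1.08, sharpe: 0.16, oosSharpe: 0.01 },
  { market: "DK", avg: 4.69, oosAvg: 4.36, sharpe: 0.33, oosSharpe: 0.62 },
  { market: "ES", avg: 4.35, oosAvg: 1.71, sharpe: 0.48, oosSharpe: 0.15 },
  { market: "AT", avg: 4.8, oosAvg: 19.23, sharpe: 0.62, oosSharpe: 1.46 },
  { market: "BE", avg: 1.23, oosAvg: -3.72, sharpe: 0.03, oosSharpe: -0.66 },
  { market: "IT", avg: -2.48, oosAvg: 3.89, sharpe: -0.3, oosSharpe: 0.31 },
  { market: "DE", avg: 3.61, oosAvg: 6.57, sharpe: 0.26, oosSharpe: 0.53 },
];

// Two values have opposite signs exactly when their product is negative.
const signFlips = (a: number, b: number): boolean => a * b < 0;

const flipsOnMean = marketRows
  .filter((r) => signFlips(r.avg, r.oosAvg))
  .map((r) => r.market);

const flipsOnSharpe = marketRows
  .filter((r) => signFlips(r.sharpe, r.oosSharpe))
  .map((r) => r.market);
```

Running the check on both metrics matters because a market can keep a positive mean while its Sharpe turns negative, or the reverse.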

Markets to suspend from public UI

AU (264 rows, suspicious +113 % mean — Yahoo split / penny-stock pricing).
UK (n_90d = 0 — price coverage absent).
CA (n = 30, below the 50-row floor).
IE (n = 1).

Keep these in the Setting payload for transparency but flag display = false in the UI render layer.
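The suspension rule can be expressed as a derived flag at render time, leaving the stored payload untouched. A minimal sketch: the n90d field name, the hold list and the function name are assumptions; only the 50-row floor and the AU data-quality suspension come from this audit.

```typescript
interface MarketStats {
  market: string;
  n90d: number; // rows with a non-null return90d (field name is an assumption)
}

const MIN_ROWS = 50;              // the 50-row floor cited above
const DATA_QUALITY_HOLD = ["AU"]; // suspended for pricing issues, regardless of n

// Returns a copy of the stats with a display flag; the input is not mutated,
// so the Setting payload stays complete for transparency.
function withDisplayFlag<T extends MarketStats>(stats: T): T & { display: boolean } {
  const held = DATA_QUALITY_HOLD.indexOf(stats.market) !== -1;
  const display = !held && stats.n90d >= MIN_ROWS;
  return { ...stats, display };
}
```

Deriving the flag from n90d rather than hardcoding market codes means UK, CA and IE would reappear automatically once a price-coverage backfill lifts them above the floor, while AU stays hidden until it is removed from the hold list.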

Files touched in this audit

scripts/_backtest-stats.mjs (new): full recompute + Setting persistence.
src/lib/backtest-stats.ts (new): cached fetcher of stats.backtest.*.
src/components/landing/LandingTrackRecord.tsx: snapshot now sourced from Setting rather than the live aggregate.
src/components/landing/LandingFeatures.tsx: Win rate / T+90 / Sharpe cells now read from the stored stats instead of "67%" / "+8.4%" / "1.02" hardcodes.
src/app/performance/page.tsx: backtest microcopy reflects n=227,594 rather than 24,000.