51 · Multi-Market Backtest Results — 2026-05-18
Generated by scripts/_backtest-stats.mjs from the BacktestResult table
after the May 2026 recompute. Source: 285,857 priced rows (priceAtTrade > 0),
covering 227,594 BUY signals and 55,004 SELL signals across 17 markets
with at least one priced trade.
Stats persisted to Setting:
- `stats.backtest.global`
- `stats.backtest.<ISO>` for each market

Raw JSON: /tmp/backtest-stats-2026-05-18.json.
TL;DR
| Cohort | n | Avg T+90 | Win T+90 | Sharpe ann | Max DD |
|---|---|---|---|---|---|
| BUY · full universe | 227,594 | -1.70 % | 41.82 % | -0.08 | -100 % |
| BUY · out-of-sample (last 6 mo) | 9,473 | +2.69 % | 53.27 % | +0.11 | -100 % |
Reading: the raw cross-market BUY universe is essentially flat / slightly
negative on T+90. Sigma's alpha is in selection, not in buying every insider
ticket. Out-of-sample (post-2025-11-18) the picture inverts to +2.7 % at T+90
with a 53 % win rate, consistent with the v5.1 scoring rerun in audit 43.
maxDDPct = -100 % for most equity curves is an arithmetic artifact of
the BUY universe containing names that compounded to ruin (delisted /
bankrupt). Each row is treated as an isolated +90d lottery ticket, summed in
log space along pubDate: a single row with r ≈ -99 % contributes
ln(1 + r/100) ≈ -4.6, which alone pulls the curve about 99 % below its peak,
so the curve floors. This is intentional disclosure, not a
portfolio metric. The portfolio-level max DD remains the audited
-17.9 % from rerun 43-scoring-rerun-2026-05.md.
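
The flooring effect described above can be reproduced with a minimal sketch. The `PricedRow` shape and the `maxDrawdownPct` helper are hypothetical names for illustration, not the code in scripts/_backtest-stats.mjs; the clamp on total losses is also an assumption.

```typescript
// Hypothetical row shape: only pubDate and return90d (in percent) are assumed.
interface PricedRow {
  pubDate: string;   // ISO date, e.g. "2025-03-01"
  return90d: number; // percent, e.g. -99 for a 99 % loss
}

// Sums per-row log returns in pubDate order and tracks the worst
// peak-to-trough drop of the resulting curve, returned in percent.
function maxDrawdownPct(rows: PricedRow[]): number {
  const sorted = [...rows].sort((a, b) => a.pubDate.localeCompare(b.pubDate));
  let logEquity = 0; // curve starts at equity 1, i.e. log equity 0
  let peak = 0;
  let worst = 0;     // most negative drawdown seen so far, as a fraction
  for (const row of sorted) {
    // Assumption: clamp the gross return so a -100 % row does not hit ln(0).
    const gross = Math.max(1 + row.return90d / 100, 1e-9);
    logEquity += Math.log(gross);
    peak = Math.max(peak, logEquity);
    worst = Math.min(worst, Math.exp(logEquity - peak) - 1);
  }
  return worst * 100;
}
```

With this construction a single row near -99 % is enough to put the reported drawdown near -99 %, whatever the other rows do, which is why the figure is not comparable to a portfolio-level drawdown.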
Score-band hit rate (global, BUY 90d)
| Band | n | Mean 90d | Win % |
|---|---|---|---|
| 50-60 | 749 | +7.87 % | 56.61 % |
| 60-70 | 82 | +26.11 % | 82.93 % |
| 70-80 | 0 | n/a | n/a |
| 80+ | 0 | n/a | n/a |
Score bands above 70 are empty because composite signalScore is currently
only computed on the AMF (FR) subset under the v5.1 path. Non-FR rows carry
signalScore = null. The 60-70 band's 82.9 % win rate / +26 % mean is the
clearest empirical confirmation that the composite filter selects.
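
As an illustration of how a band table like the one above can be derived, here is a sketch under stated assumptions: bands are half-open intervals `[min, max)`, a win is `return90d > 0`, and rows with a null `signalScore` or null `return90d` are skipped. The `ScoredRow` and `scoreBandStats` names are hypothetical.

```typescript
interface ScoredRow {
  signalScore: number | null; // null for rows without a composite score
  return90d: number | null;   // percent
}

interface BandStats {
  band: string;
  n: number;
  meanPct: number | null; // null when the band is empty
  winPct: number | null;
}

// Assumed band edges, matching the labels in the table above.
const BANDS = [
  { label: "50-60", min: 50, max: 60 },
  { label: "60-70", min: 60, max: 70 },
  { label: "70-80", min: 70, max: 80 },
  { label: "80+", min: 80, max: Infinity },
];

function scoreBandStats(rows: ScoredRow[]): BandStats[] {
  return BANDS.map(({ label, min, max }) => {
    const returns = rows
      .filter((r) => r.signalScore !== null && r.signalScore >= min && r.signalScore < max)
      .map((r) => r.return90d)
      .filter((x): x is number => x !== null);
    const n = returns.length;
    if (n === 0) return { band: label, n, meanPct: null, winPct: null };
    const meanPct = returns.reduce((sum, x) => sum + x, 0) / n;
    const winPct = (returns.filter((x) => x > 0).length / n) * 100;
    return { band: label, n, meanPct, winPct };
  });
}
```

Empty bands come back with `n = 0` and null statistics, which maps directly onto the `n/a` cells in the table.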
Top 5 markets by Sharpe (BUY 90d, n ≥ 50)
| Rank | Market | nBuys | Avg 90d | Win % | Sharpe ann |
|---|---|---|---|---|---|
| 1 | AT (FMA) | 1,022 | +4.80 % | 72.84 % | +0.625 |
| 2 | ES (CNMV) | 2,448 | +4.35 % | 62.89 % | +0.482 |
| 3 | CH (SIX) | 184 | +5.44 % | 68.81 % | +0.422 |
| 4 | DK (FI-DK) | 3,104 | +4.69 % | 54.61 % | +0.328 |
| 5 | NO (Oslo) | 361 | +13.67 % | 54.29 % | +0.324 |
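
The ranking rule in the heading (n ≥ 50, sorted by annualised Sharpe) can be sketched as follows. The `MarketStats` shape and `topBySharpe` helper are hypothetical; markets without a computable Sharpe are assumed to carry `null` and are excluded.

```typescript
interface MarketStats {
  market: string;
  nBuys: number;
  sharpeAnn: number | null; // null when no Sharpe is computable
}

// Keeps markets with enough rows and a defined Sharpe, then ranks descending.
function topBySharpe(rows: MarketStats[], minN = 50, limit = 5): MarketStats[] {
  return rows
    .filter(
      (r): r is MarketStats & { sharpeAnn: number } =>
        r.sharpeAnn !== null && r.nBuys >= minN,
    )
    .sort((a, b) => b.sharpeAnn - a.sharpeAnn)
    .slice(0, limit);
}
```

Because `filter` returns a new array, the sort does not reorder the caller's input.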
Bottom 5 markets (red flags)
| Rank | Market | nBuys | Avg 90d | Win % | Sharpe ann | Comment |
|---|---|---|---|---|---|---|
| 1 | IT (CONSOB) | 503 | -2.48 % | 51.54 % | -0.299 | Insider signal anti-predictive on Italian mid-caps; OOS recovers to +3.9 % so probably a stale-data tail. |
| 2 | US (SEC) | 190,055 | -2.28 % | 40.37 % | -0.09 | Universe dominated by automated 10b5-1 plans and dilutive issuances. No selection = no edge. |
| 3 | FR (AMF) | 17,013 | -0.24 % | 43.74 % | -0.083 | The unfiltered FR universe is flat. Selection through the six Sigma filters lifts it to the +13.2 % cohort in STRATEGY_PROOF. |
| 4 | AU (ASX) | 264 | +113.11 % | 100 % | n/a | Data quality flag: priced rows are penny stocks where Yahoo close prices are corrupted or split-unaware. Suspend display. |
| 5 | CA (SEDI) | 30 | +10 % | 100 % | n/a | n < 50, statistically meaningless. Drop from public UI. |
UK (RNS) has only 63 priced rows, none with return90d != null, so no Sharpe is
computable. IE has 1 row. These markets need a separate price-coverage
backfill before any claim can be made.
Out-of-sample (last 6 months, post-2025-11-18)
Global OOS metrics improve materially:
- BUY universe: n = 9,473, avg T+90 +2.69 %, win rate 53.27 %, Sharpe +0.11.
- Best OOS market: AT with avg T+90 +19.23 %, win 95.65 %, Sharpe +1.46.
- Worst OOS market: BE with avg T+90 -3.72 %, Sharpe -0.66.

This split is consistent with the walk-forward window in audit 43 (combined
portfolio Sharpe 1.32, hit rate 52 %, DSR +0.31).
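
A minimal sketch of the out-of-sample split, assuming `pubDate` is a date-only ISO string and that "post-2025-11-18" means strictly after the cutoff. The `DatedRow`, `outOfSample`, and `meanReturn` names are hypothetical.

```typescript
interface DatedRow {
  pubDate: string;          // date-only ISO string, e.g. "2025-12-01"
  return90d: number | null; // percent
}

// Rows published strictly after the cutoff date (assumed boundary convention).
function outOfSample<T extends DatedRow>(rows: T[], cutoffIso: string): T[] {
  const cutoff = Date.parse(cutoffIso);
  return rows.filter((r) => Date.parse(r.pubDate) > cutoff);
}

// Mean T+90 return over rows that have one; null when none do.
function meanReturn(rows: DatedRow[]): number | null {
  const xs = rows.map((r) => r.return90d).filter((x): x is number => x !== null);
  return xs.length === 0 ? null : xs.reduce((sum, x) => sum + x, 0) / xs.length;
}
```

For example, `meanReturn(outOfSample(rows, "2025-11-18"))` would give the OOS average for whatever row set is passed in. If `pubDate` carries a time component, rows on the cutoff day itself would also pass the strict comparison, so the boundary convention should be checked against the real data.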
All 17 markets — full table
| Market | nBuys | Avg 90d | Win % | Sharpe ann | nOOS | OOS avg | OOS sharpe |
|---|---|---|---|---|---|---|---|
| US | 190,055 | -2.28 % | 40.37 % | -0.09 | 3,783 | +0.74 % | -0.02 |
| FR | 17,013 | -0.24 % | 43.74 % | -0.08 | 1,546 | +4.53 % | +0.25 |
| FI | 5,676 | +2.06 % | 52.58 % | +0.14 | 1,278 | +0.67 % | -0.04 |
| NL | 4,762 | +2.63 % | 55.11 % | +0.16 | 430 | +1.08 % | +0.01 |
| DK | 3,104 | +4.69 % | 54.61 % | +0.33 | 240 | +4.36 % | +0.62 |
| ES | 2,448 | +4.35 % | 62.89 % | +0.48 | 515 | +1.71 % | +0.15 |
| AT | 1,022 | +4.80 % | 72.84 % | +0.62 | 73 | +19.23 % | +1.46 |
| BR | 978 | +2.94 % | 47.44 % | +0.19 | 0 | n/a | n/a |
| BE | 906 | +1.23 % | 50.00 % | +0.03 | 303 | -3.72 % | -0.66 |
| IT | 503 | -2.48 % | 51.54 % | -0.30 | 438 | +3.89 % | +0.31 |
| NO | 361 | +13.67 % | 54.29 % | +0.32 | 361 | +13.67 % | +0.32 |
| AU | 264 | +113.11 % | 100.00 % | n/a | 264 | +113.11 % | n/a |
| DE | 224 | +3.61 % | 48.72 % | +0.26 | 73 | +6.57 % | +0.53 |
| CH | 184 | +5.44 % | 68.81 % | +0.42 | 75 | n/a | n/a |
| UK | 63 | n/a | n/a | n/a | 63 | n/a | n/a |
| CA | 30 | +10.00 % | 100.00 % | n/a | 30 | +10.00 % | n/a |
| IE | 1 | n/a | n/a | n/a | 1 | n/a | n/a |
Insights
- Which score bands actually win. Only `60-70` is empirically convincing (n=82, 82.9 % win, +26 % mean). The `50-60` band is positive but smaller in magnitude. The `70+` bands are empty in the BUY-priced subset: the scoring pipeline currently produces very few rows above 70, suggesting either an overly conservative composite cap or the cluster/role gates filtering too aggressively. Action: re-examine `signalScore` clamps.
- Cluster contrarian flip status. Not directly measured here (we don't carry the cluster flag in `BacktestResult`); the OOS uplift in FR (-0.24 % → +4.53 %) is consistent with the v5.1 contrarian-flip rule activating, but a dedicated breakdown should join `Declaration.signals` JSON.
- Regime sensitivity. Five markets flip sign between full-window and OOS on avg T+90 or Sharpe (IT, BE, FR, FI, US); AT and ES keep their sign but move sharply in magnitude. The OOS Sharpe ranking is unstable enough that no single-market claim should appear on the landing page without a 6-month freshness disclaimer. Use the portfolio-level numbers from audit 43 (Sharpe 1.32 walk-forward) as the headline instead.
- Universe vs subset. Confirms `feedback_anti_hallucination` priors: the raw BUY universe across 17 markets is a net loser at T+90 (mean -1.7 %). All advertised positive numbers must explicitly cite the filtered subset, never the universe.
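
The regime-sensitivity check can be made mechanical. The sketch below flags a market when its average T+90 return or its Sharpe changes sign between the full window and OOS; the `MarketWindowStats` shape and the `signFlips` helper are hypothetical, and markets with `n/a` on either side are assumed to be skipped rather than counted.

```typescript
interface MarketWindowStats {
  market: string;
  avg: number | null;       // full-window avg 90d, percent
  sharpe: number | null;    // full-window annualised Sharpe
  oosAvg: number | null;    // OOS avg, percent
  oosSharpe: number | null; // OOS Sharpe
}

interface FlipReport {
  market: string;
  avgFlips: boolean;
  sharpeFlips: boolean;
}

// True only when both values exist and have strictly opposite signs.
function flipsSign(a: number | null, b: number | null): boolean {
  return a !== null && b !== null && a * b < 0;
}

// Returns one entry per market where at least one metric changes sign.
function signFlips(rows: MarketWindowStats[]): FlipReport[] {
  return rows
    .map((r) => ({
      market: r.market,
      avgFlips: flipsSign(r.avg, r.oosAvg),
      sharpeFlips: flipsSign(r.sharpe, r.oosSharpe),
    }))
    .filter((f) => f.avgFlips || f.sharpeFlips);
}
```

Feeding it the rows of the full table gives a reproducible list to cite in place of a hand count.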
Markets to suspend from public UI
- `AU` (264 rows, suspicious +113 % mean: Yahoo split / penny-stock pricing).
- `UK` (n_90d = 0: price coverage absent).
- `CA` (n = 30, below the 50-row floor).
- `IE` (n = 1).

Keep these in the Setting payload for transparency but flag display = false
in the UI render layer.
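
One way to derive the display flag at render time is sketched below. The `MarketCoverage` shape, the `shouldDisplay` helper, and the idea of a manually maintained data-quality hold list (for cases like AU, which passes the row floor but has suspect prices) are assumptions for illustration, not the existing UI code.

```typescript
interface MarketCoverage {
  market: string;
  nBuys: number; // priced BUY rows
  n90d: number;  // rows with a non-null return90d
}

const MIN_ROWS = 50; // the 50-row floor used in this audit

// A market is shown only if it is not on hold, has T+90 coverage,
// and meets the minimum row count.
function shouldDisplay(m: MarketCoverage, dataQualityHolds: ReadonlySet<string>): boolean {
  if (dataQualityHolds.has(m.market)) return false; // manual suspension
  if (m.n90d === 0) return false;                   // no price coverage
  return m.nBuys >= MIN_ROWS;
}
```

Keeping the rule in one function means the stored payload stays complete while the render layer decides what to hide.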
Files touched in this audit
- `scripts/_backtest-stats.mjs` (new): full recompute + Setting persistence.
- `src/lib/backtest-stats.ts` (new): cached fetcher of `stats.backtest.*`.
- `src/components/landing/LandingTrackRecord.tsx`: snapshot now sourced from `Setting` rather than the live aggregate.
- `src/components/landing/LandingFeatures.tsx`: Win rate / T+90 / Sharpe cells now read from the stored stats instead of `"67%"` / `"+8.4%"` / `"1.02"` hardcodes.
- `src/app/performance/page.tsx`: backtest microcopy reflects `n=227,594` rather than `24,000`.