Backtest per market with subgroups, 2026-05-18
Generated 2026-05-18 from Setting.stats.backtest.* + the new
stats.backtest.subgroups Setting key produced by
scripts/_backtest-stats-subgroups.mjs. BUY direction only, 90-day horizon,
insider-view returns (entry at transactionDate, exit T+90).
TL;DR
- Universe 227,594 priced BUY trades across 17 live markets. Global win rate 41.8 %, mean 90d return -1.70 %, annualised Sharpe -0.08 (max DD reaches -100 % on the chronological in-sample cohort; this is the raw multi-market equity curve, not the Sigma-filtered strategy).
- The OOS cohort (last 6 months, n=9,473) tells a different story: mean 90d +2.69 %, win 53.3 %, Sharpe +0.11. The regime shift is what powers the public landing snapshot.
- Best Sharpe in OOS: AT (1.46), DK (0.62), DE (0.53). Worst: BE (-0.66), FI (-0.04). The full in-sample table below shows why a flat unfiltered strategy is a losing one across the older history; the Sigma filter exists to drop the bottom half.
Top 5 by full-sample Sharpe (n90d >= 50)
| Mkt | n | Mean 90d | Win % | Sharpe | OOS Sharpe |
|---|---|---|---|---|---|
| AT | 1,022 | +4.80 % | 72.8 | 0.625 | 1.457 |
| ES | 2,448 | +4.35 % | 62.9 | 0.482 | 0.150 |
| CH | 184 | +5.44 % | 68.8 | 0.422 | n/a |
| DK | 3,104 | +4.69 % | 54.6 | 0.328 | 0.621 |
| NO | 361 | +13.67 % | 54.3 | 0.324 | 0.324 |
Bottom 5 by full-sample Sharpe
| Mkt | n | Mean 90d | Win % | Sharpe | OOS Sharpe |
|---|---|---|---|---|---|
| IT | 503 | -2.48 % | 51.5 | -0.299 | 0.305 |
| US | 190,055 | -2.28 % | 40.4 | -0.090 | -0.016 |
| FR | 17,013 | -0.24 % | 43.7 | -0.083 | 0.251 |
| BE | 906 | +1.23 % | 50.0 | 0.032 | -0.664 |
| FI | 5,676 | +2.06 % | 52.6 | 0.144 | -0.039 |
US dominates the universe (190k of 228k rows), so its negative Sharpe (-0.090) drags the global figure down even though 10 of the 12 other markets with a valid Sharpe are positive.
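The pull of the US sample weight can be checked from the n and Mean 90d columns of the full table below. The following is a minimal sketch, assuming the published n is a reasonable proxy for the number of priced trades per market; it covers only the 13 markets with a valid Sharpe and uses the rounded table figures, so it will not reproduce the global mean exactly.

```javascript
// Sketch only. Rows are [market, n, mean 90d return in %], copied from the
// full table for the 13 markets that have a valid Sharpe.
// Assumption: n is used as the weight, as if every counted trade were priced.
const rows = [
  ["AT", 1022, 4.80], ["ES", 2448, 4.35], ["CH", 184, 5.44],
  ["DK", 3104, 4.69], ["NO", 361, 13.67], ["DE", 224, 3.61],
  ["BR", 978, 2.94], ["NL", 4762, 2.63], ["FI", 5676, 2.06],
  ["BE", 906, 1.23], ["FR", 17013, -0.24], ["US", 190055, -2.28],
  ["IT", 503, -2.48],
];

// Sample-weighted: every trade counts once, so US carries most of the weight.
const totalN = rows.reduce((sum, [, n]) => sum + n, 0);
const sampleWeighted =
  rows.reduce((sum, [, n, mean]) => sum + n * mean, 0) / totalN;

// Equal-weighted: every market counts once, regardless of its size.
const equalWeighted =
  rows.reduce((sum, [, , mean]) => sum + mean, 0) / rows.length;

console.log(sampleWeighted.toFixed(2), equalWeighted.toFixed(2));
```

On these rounded inputs the sample-weighted mean comes out near -1.64 % and the equal-weighted mean near +3.11 %, which is the gap the paragraph above describes.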
Full table, 17 markets
| Mkt | n | Mean 90d | Win % | Sharpe | OOS Sharpe | Max DD % |
|---|---|---|---|---|---|---|
| AT | 1,022 | +4.80 % | 72.8 | 0.625 | 1.457 | -99.25 |
| ES | 2,448 | +4.35 % | 62.9 | 0.482 | 0.150 | -100 |
| CH | 184 | +5.44 % | 68.8 | 0.422 | n/a | -99.51 |
| DK | 3,104 | +4.69 % | 54.6 | 0.328 | 0.621 | -100 |
| NO | 361 | +13.67 % | 54.3 | 0.324 | 0.324 | -83.93 |
| DE | 224 | +3.61 % | 48.7 | 0.256 | 0.525 | -91.47 |
| BR | 978 | +2.94 % | 47.4 | 0.192 | n/a | -100 |
| NL | 4,762 | +2.63 % | 55.1 | 0.158 | 0.009 | -100 |
| FI | 5,676 | +2.06 % | 52.6 | 0.144 | -0.039 | -100 |
| BE | 906 | +1.23 % | 50.0 | 0.032 | -0.664 | -100 |
| FR | 17,013 | -0.24 % | 43.7 | -0.083 | 0.251 | -100 |
| US | 190,055 | -2.28 % | 40.4 | -0.090 | -0.016 | -100 |
| IT | 503 | -2.48 % | 51.5 | -0.299 | 0.305 | -100 |
| CA | 30 | +10.0 % | 100 | n/a | n/a | 0 |
| UK | 63 | n/a | n/a | n/a | n/a | 0 |
| AU | 264 | +113.1 % | 100 | n/a | n/a | 0 |
| IE | 1 | n/a | n/a | n/a | n/a | n/a |
Sharpe is annualised on quarterly 90d returns vs a 1 % quarterly risk-free floor (RF=4 %/yr). The CA/UK/AU/IE rows have too few priced trades to compute a stable Sharpe; the AU mean is a small-sample artefact dominated by a few microcap outliers and should be treated as noise.
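As a rough illustration of the Sharpe definition above, the sketch below computes an annualised Sharpe from a list of 90-day returns. The function name, the use of the sample standard deviation, and the annualisation factor of sqrt(4) are assumptions made for this example; the production script may differ on each point.

```javascript
// Sketch only: annualised Sharpe from per-trade 90-day returns.
// Assumptions (not confirmed by the source): sample standard deviation,
// and four 90-day periods per year, so the annualisation factor is sqrt(4).
const RF_QUARTERLY = 0.01; // 1 % per quarter, i.e. roughly 4 %/yr
const PERIODS_PER_YEAR = 4;

function annualisedSharpe(returns90d, rf = RF_QUARTERLY) {
  const n = returns90d.length;
  if (n < 2) return null; // too few trades for a standard deviation
  const mean = returns90d.reduce((sum, r) => sum + r, 0) / n;
  const variance =
    returns90d.reduce((sum, r) => sum + (r - mean) ** 2, 0) / (n - 1);
  const sd = Math.sqrt(variance);
  if (sd === 0) return null; // undefined when every return is identical
  return ((mean - rf) / sd) * Math.sqrt(PERIODS_PER_YEAR);
}

// Hypothetical returns, expressed as fractions (0.05 = +5 %).
const sample = [0.05, -0.03, 0.12, 0.01, -0.08];
console.log(annualisedSharpe(sample));
```

Returning null for tiny or zero-variance samples mirrors the n/a cells in the table, where a ratio would be numerically defined but not meaningful.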
The advertised "28 markets" ISO list still has 11 markets in active backfill
(KR, JP, HK, IN, CN, PL, NZ, ZA, SA, BG, PH); they have ingestion but not yet
enough Yahoo-priced trades to land in BacktestResult. Expect them to
populate progressively over the next 4-6 weeks of cron sweeps.
Role segmentation, global BUY cohort
| Role | n | Mean 90d | Win % | Sharpe |
|---|---|---|---|---|
| BOARD_MEMBER | 11,375 | +1.42 % | 50.5 | +0.031 |
| OFFICER | 125,419 | -1.72 % | 41.4 | -0.070 |
| CFO | 15,474 | -2.46 % | 39.2 | -0.095 |
| CEO | 45,557 | -2.32 % | 40.0 | -0.110 |
| OTHER | 26,674 | -1.15 % | 45.4 | -0.164 |
| CHAIRMAN | 2,802 | -3.02 % | 41.8 | -0.349 |
| OWNER_10PCT | 6 | -7.33 % | 20.0 | n/a |
Only the BOARD_MEMBER bucket is positive. The intuition lines up with the literature: independent directors trade less often and tend to time their buys around earnings windows where information asymmetry is smallest; operating officers (CEO, CFO, CHAIRMAN) buy into bad news more often than they should. OWNER_10PCT is unusable at n=6 - we publish it for completeness but ignore it for routing.
Gender insights, global BUY cohort
| Gender | n | Mean 90d | Win % | Sharpe |
|---|---|---|---|---|
| M | 152,536 | -1.88 % | 41.1 | -0.073 |
| F | 40,696 | -1.71 % | 41.7 | -0.146 |
| N (anon) | 34,362 | -0.87 % | 45.5 | -0.130 |
M and F outcomes are statistically indistinguishable on Sharpe. The "N" bucket shows a better mean return and win rate than either named bucket (its Sharpe, -0.130, sits between the two) because it is structurally biased toward markets that anonymise filings (CH SIX SER, BR CVM) where the positive-Sharpe regimes dominate (see anonymised section below). Gender on its own is not a routing feature.
Anonymised feeds (SIX SER + CVM-BR)
Switzerland (CH, SIX SER) and Brazil (BR, CVM) do not publish insider names in their bulk feeds, only the role. We synthesise an "Anonymous (role)" insider in both cases. Gender is therefore always N.
| Mkt | n | Mean 90d | Win % | Sharpe | Role mix |
|---|---|---|---|---|---|
| CH | 184 | +5.44 % | 68.8 | +0.422 | OTHER 109 · BOARD_MEMBER 75 |
| BR | 978 | +2.94 % | 47.4 | +0.192 | OTHER 978 (CVM Tipo_Cargo not yet normalised) |
CH has the third-highest full-sample Sharpe in the dataset (0.422) despite only 184 trades. CVM-BR is the largest anonymised feed and is positive in absolute terms. The CVM role normaliser should be extended so that the Tipo_Cargo string ("Diretor", "Conselheiro", "Membro do Conselho Fiscal") feeds into our 7-role taxonomy instead of collapsing to OTHER; this is expected to lift the per-role Sharpe ranking.
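One possible shape for that normaliser extension is sketched below. The mapping itself is an assumption made for illustration (in particular, where "Membro do Conselho Fiscal" should land is a judgment call for the maintainers), and normaliseTipoCargo is a hypothetical name, not an existing function in the codebase.

```javascript
// Sketch only: this mapping is an assumption, not the project's normaliser.
// Role names on the right come from the 7-role taxonomy used in this report.
const TIPO_CARGO_TO_ROLE = new Map([
  ["diretor", "OFFICER"],
  ["conselheiro", "BOARD_MEMBER"],
  // Assumed: fiscal council members stay in OTHER until a decision is made.
  ["membro do conselho fiscal", "OTHER"],
]);

function normaliseTipoCargo(raw) {
  if (typeof raw !== "string") return "OTHER";
  // Trim, lower-case and collapse internal whitespace before the lookup.
  const key = raw.trim().toLowerCase().replace(/\s+/g, " ");
  return TIPO_CARGO_TO_ROLE.get(key) ?? "OTHER";
}

console.log(normaliseTipoCargo("Diretor"));
console.log(normaliseTipoCargo("  Conselheiro "));
```

Unknown or missing strings fall back to OTHER, so the sketch never does worse than the current behaviour of collapsing every CVM row to that bucket.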
Delta vs yesterday
This is the first run of _backtest-stats-subgroups.mjs, so the snapshot
key stats.backtest.subgroups.snapshot has just been seeded. Tomorrow's
run will diff against it. The non-subgroup keys (stats.backtest.global,
stats.backtest.<MK>) were last rebuilt 2026-05-18 11:44 UTC by the
existing _backtest-stats.mjs cron and are stable.
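For orientation, a day-over-day diff of this kind could look like the sketch below. The snapshot shape (an object keyed by bucket, each holding numeric fields such as n, mean90d and sharpe) and the function name diffSnapshots are hypothetical; the source does not describe the actual layout of stats.backtest.subgroups.snapshot.

```javascript
// Sketch only: the snapshot shape { [bucket]: { n, mean90d, sharpe } } is
// hypothetical and may not match the stored Setting value.
function diffSnapshots(previous, current, fields = ["n", "mean90d", "sharpe"]) {
  const deltas = [];
  for (const [bucket, now] of Object.entries(current)) {
    const before = previous?.[bucket];
    if (!before) {
      // First run, or a bucket that did not exist yesterday.
      deltas.push({ bucket, status: "new" });
      continue;
    }
    for (const field of fields) {
      // Skip n/a cells: only compare fields that are numeric on both days.
      if (typeof now[field] !== "number" || typeof before[field] !== "number") {
        continue;
      }
      const delta = now[field] - before[field];
      if (delta !== 0) {
        deltas.push({ bucket, field, from: before[field], to: now[field], delta });
      }
    }
  }
  return deltas;
}

// Hypothetical values for illustration only.
const yesterday = { CEO: { n: 10, sharpe: -0.1 } };
const today = { CEO: { n: 12, sharpe: -0.1 }, CFO: { n: 3, sharpe: 0.2 } };
console.log(diffSnapshots(yesterday, today));
```

On a first run with no previous snapshot, every bucket is reported as new, which matches the seeding behaviour described above.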
Regime caveat
Three biases the reader should hold in mind:
- The OOS cohort is the last 6 months only. It overlaps the 2026 bull regime where insider buys have been broadly rewarded. The full-sample Sharpe of -0.08 is the more conservative number.
- Max DD reaching -100 % is mechanical: the chronological equity curve
compounds every 90d trade as if held in sequence with full re-investment.
The Sigma production strategy applies a score >= 60 filter and ATR-based
stops; its drawdown profile is documented in
docs/method-review/49-*.
- The US row dominates the global mean by sample weight (84 % of trades). Equal-weighting markets would flip the global Sharpe to slightly positive (mean of the 13 markets with valid Sharpe in the full table: +0.19). We publish the sample-weighted figure because that is what a naive cross-market book would actually deliver.
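To see why a -100 % max drawdown is mechanical under sequential compounding, consider the minimal sketch below. It assumes returns are fractions applied in chronological order with full re-investment; the function name maxDrawdownPct and the sample sequence are hypothetical.

```javascript
// Sketch only: compounds each 90-day return in chronological order with
// full re-investment, then reports the deepest peak-to-trough loss in %.
function maxDrawdownPct(returns) {
  let equity = 1; // start with one unit of capital
  let peak = 1;
  let maxDd = 0;
  for (const r of returns) {
    equity *= 1 + r;
    if (equity > peak) peak = equity;
    const drawdown = equity / peak - 1; // zero at a peak, negative below it
    if (drawdown < maxDd) maxDd = drawdown;
  }
  return maxDd * 100;
}

// Hypothetical sequence: a single total loss zeroes the curve for good,
// and no later gain can lift an equity of zero.
console.log(maxDrawdownPct([0.10, 0.05, -1.0, 0.50]));
```

With thousands of trades chained this way, one delisting or near-total loss is enough to pin the curve at or near -100 %, independent of how the rest of the cohort performs.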