78 — V13.1 quant iteration · per-market/sector + adaptive scoring (2026-05-19)
Author: quant-analyst session · Date: 2026-05-19 · Status: shipped (V13.1g_stacked promoted to production)
Summary
Walk-forward OOS bake-off across 11 candidate scoring configs on the 30 421 priced BUY backtest rows since 2024-06 (OOS subset n=20 688 with pubDate >= 2025-01-01, T=14 monthly buckets, top-10 picks/mo, T+90 hold, NET 0.6 % round-trip transaction costs, winsor cap +/- 50 % per pick).
Winner: V13.1g_stacked = kindMult * clusterParticipantBoost * recentAlpha.
| metric | V13.0 baseline | V13.1g stacked | delta |
|---|---|---|---|
| Sharpe (annualised) | 0.03 | 0.70 | +0.67 |
| Bootstrap CI95 lo | -2.01 | -1.18 | +0.83 |
| Bootstrap CI95 hi | +2.26 | +3.64 | +1.38 |
| CAGR NET | -2.5 % | +25.4 % | +27.9pp |
| Max drawdown | -43.7 % | -29.7 % | +14.0pp |
| Win-month % | 50.7 % | 55.7 % | +5.0pp |
| DSR (N=11 trials) | -0.34 | +0.33 | +0.67 |
Anti-overfit gate (constraint: DSR delta vs V13.0 must not regress by more than 0.3): satisfied with margin (+0.67). CI95 still straddles zero so the point estimate carries the message, not the lower bound.
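To make the Sharpe and CI95 rows concrete, here is a minimal sketch of an annualised Sharpe on monthly net returns plus a percentile bootstrap interval. It is an illustration under stated assumptions, not the bake-off code: the source does not say which bootstrap was used, so this one resamples months i.i.d. with replacement, and the names `annualisedSharpe`, `bootstrapSharpeCI95` and `mulberry32` are hypothetical.

```typescript
// Annualised Sharpe from monthly net returns in decimal form (0.02 = +2 %).
// Assumes a zero risk-free rate and sqrt(12) annualisation.
function annualisedSharpe(monthly: number[]): number {
  const n = monthly.length;
  if (n < 2) return NaN;
  const mean = monthly.reduce((s, r) => s + r, 0) / n;
  const variance = monthly.reduce((s, r) => s + (r - mean) ** 2, 0) / (n - 1);
  const sd = Math.sqrt(variance);
  return sd === 0 ? NaN : (mean / sd) * Math.sqrt(12);
}

// Small deterministic PRNG so the bootstrap is reproducible across runs.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Percentile bootstrap: resample the T months with replacement (assumption),
// recompute the Sharpe each time, and read off the 2.5 % / 97.5 % quantiles.
function bootstrapSharpeCI95(
  monthly: number[],
  draws = 2000,
  seed = 42,
): [number, number] {
  const rand = mulberry32(seed);
  const stats: number[] = [];
  for (let d = 0; d < draws; d++) {
    const sample = monthly.map(
      () => monthly[Math.floor(rand() * monthly.length)],
    );
    const s = annualisedSharpe(sample);
    if (Number.isFinite(s)) stats.push(s); // skip degenerate zero-variance draws
  }
  stats.sort((x, y) => x - y);
  const at = (q: number): number =>
    stats[Math.min(stats.length - 1, Math.floor(q * stats.length))];
  return [at(0.025), at(0.975)];
}
```

With only T=14 observations each resample reuses a handful of months, which is why the interval stays wide even when the point estimate improves.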
Production V13.0 documented Sharpe of 0.66 came from the earlier 60-issue bake (different EUR/amount handling). This run holds the pipeline constant across configs so the RELATIVE ranking is the trustworthy signal; absolute levels rescale once the full universe pipeline (Phase 7) runs.
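The per-pick treatment described in the summary (winsor cap of +/- 50 %, 0.6 % round-trip cost, top-10 picks per month) can be sketched as follows. The order of operations (cap first, then subtract costs) and equal weighting across picks are assumptions, and `PricedPick`, `netPickReturn` and `monthlyPortfolioReturns` are hypothetical names rather than the pipeline's own.

```typescript
interface PricedPick {
  month: string;     // bucket key, e.g. "2025-01"
  score: number;     // output of the scoring config under test
  return90d: number; // gross T+90 return in decimal form
}

const WINSOR_CAP = 0.5;        // +/- 50 % per pick
const ROUND_TRIP_COST = 0.006; // 0.6 % round trip

// Assumption: the gross return is capped first, then costs are subtracted.
function netPickReturn(gross: number): number {
  const capped = Math.max(-WINSOR_CAP, Math.min(WINSOR_CAP, gross));
  return capped - ROUND_TRIP_COST;
}

// Equal-weighted mean net return of the top-N scored picks in each month.
function monthlyPortfolioReturns(
  rows: PricedPick[],
  topN = 10,
): Map<string, number> {
  const byMonth = new Map<string, PricedPick[]>();
  for (const row of rows) {
    const bucket = byMonth.get(row.month);
    if (bucket) bucket.push(row);
    else byMonth.set(row.month, [row]);
  }
  const out = new Map<string, number>();
  byMonth.forEach((picks, month) => {
    const top = picks
      .slice()
      .sort((a, b) => b.score - a.score)
      .slice(0, topN);
    const total = top.reduce((s, p) => s + netPickReturn(p.return90d), 0);
    out.set(month, total / top.length);
  });
  return out;
}
```

Because every config is run through the same function, any bias from these choices affects all rows of the bake-off equally, which is the sense in which the relative ranking is more reliable than the absolute levels.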
Phase 1 · per-market x role/kind/cluster/sector stats
Market x role (n >= 50, BUY backtests, returnFromPub90d, 2024-06 onward)
| market | role | n | WR | avg 90d |
|---|---|---|---|---|
| BVMF | OTHER | 42 334 | 46.9 % | +0.32 % |
| XTKS | OTHER | 7 608 | 53.9 % | +4.30 % |
| XPAR | OTHER | 5 317 | 44.8 % | -1.21 % |
| XBOM | OTHER | 5 205 | 66.0 % | +10.35 % |
| XPAR | BOARD | 4 865 | 45.5 % | -0.85 % |
| XSTO | OTHER | 3 464 | 48.4 % | +2.39 % |
| XPAR | EXEC | 3 099 | 41.1 % | +0.33 % |
| XMAD | OTHER | 2 041 | 60.1 % | +122.4 % (outliers) |
| XSHG | OTHER | 1 880 | 48.1 % | +3.69 % |
| XSHE | OTHER | 1 816 | 54.5 % | +4.91 % |
| XCSE | OTHER | 1 407 | 49.4 % | +0.64 % |
| XMIL | OTHER | 1 368 | 47.6 % | +1.74 % |
| XPAR | CEO | 1 338 | 44.2 % | -0.96 % |
| XTKS | EXEC | 927 | 50.6 % | +6.61 % |
| XNAS | OTHER | 757 | 29.6 % | -0.25 % |
| XNAS | EXEC | 219 | 18.3 % | -13.63 % |
| XNAS | CEO | 183 | 29.0 % | +2.03 % |
| XMAD | CEO | 67 | 53.7 % | +4.66 % |
| XMAD | EXEC | 64 | 62.5 % | +7.28 % |
Findings:
- XPAR cluster (48.2 % WR) clearly beats solo (42.2 %), which validates keeping the cluster bonus.
- XNAS underperforms across all roles. This looks like a coverage/survivorship issue rather than a scoring problem. Down-weight on per-market floor (Phase 1d test).
- XMAD/XBOM means inflated by penny stocks; winsor cap of +/- 50 % applied in bake to neutralise.
- BVMF 'OTHER' (42k rows) is the entire related-controlled cohort and drives the global baseline. WR 46.9 %.
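The market x role table above is a plain group-by over priced rows. A minimal sketch of that aggregation, assuming a "win" means a strictly positive `returnFromPub90d` and that the mean is taken before any winsorisation (which would explain the inflated XMAD average); `BacktestRow`, `CohortStats` and `cohortStats` are hypothetical names:

```typescript
interface BacktestRow {
  market: string;
  role: string;
  returnFromPub90d: number | null; // decimal; null when the row is unpriced
}

interface CohortStats {
  market: string;
  role: string;
  n: number;
  winRate: number;   // share of rows with a strictly positive return (assumption)
  avgReturn: number; // arithmetic mean, not winsorised
}

function cohortStats(rows: BacktestRow[], minN = 50): CohortStats[] {
  const acc = new Map<
    string,
    { market: string; role: string; n: number; wins: number; sum: number }
  >();
  for (const row of rows) {
    if (row.returnFromPub90d === null) continue; // unpriced rows are skipped
    const key = `${row.market}|${row.role}`;
    let bucket = acc.get(key);
    if (!bucket) {
      bucket = { market: row.market, role: row.role, n: 0, wins: 0, sum: 0 };
      acc.set(key, bucket);
    }
    bucket.n += 1;
    if (row.returnFromPub90d > 0) bucket.wins += 1;
    bucket.sum += row.returnFromPub90d;
  }
  return Array.from(acc.values())
    .filter((b) => b.n >= minN) // drop cohorts too small to read
    .map((b) => ({
      market: b.market,
      role: b.role,
      n: b.n,
      winRate: b.wins / b.n,
      avgReturn: b.sum / b.n,
    }))
    .sort((a, b) => b.n - a.n); // largest cohorts first, as in the table
}
```

The same function covers the kind, cluster and sector cuts by swapping the second key.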
Market x kind (direct / related)
| market | kind | n | WR | avg 90d |
|---|---|---|---|---|
| BVMF | related | 42 334 | 46.9 % | +0.32 % |
| XPAR | direct | 14 878 | 44.3 % | -0.73 % |
| XTKS | direct | 7 976 | 52.7 % | +4.41 % |
| XTKS | related | 559 | 64.6 % | +6.55 % |
| XPAR | related | 293 | 42.3 % | -0.37 % |
| XHEL | related | 286 | 44.4 % | -1.67 % |
| XBOM | related | 245 | 55.9 % | +7.32 % |
Findings:
- Related/direct gap is market-dependent. XTKS related crushes (+12pp WR). XPAR/XHEL related slightly underperforms direct. XBOM related mild lift. Justifies a multiplicative (not flat additive) related-kind bonus - see V13.1c, V13.1g.
Market x cluster
| market | bucket | n | WR | avg 90d |
|---|---|---|---|---|
| XPAR | solo | 9 895 | 42.2 % | -1.47 % |
| XPAR | cluster | 5 276 | 48.2 % | +0.68 % |
| XNAS | solo | 1 010 | 25.4 % | -2.40 % |
| XNAS | cluster | 209 | 39.2 % | -2.76 % |
XPAR cluster lifts WR +6pp and mean +2.1pp; XNAS cluster lifts WR +14pp but mean stays negative. The cluster boost is preserved (already in V13.0) and amplified when participant count >= 5 (Phase 2e).
Related kind (sample-level, all available pubDates)
| kind | n | WR | avg 90d |
|---|---|---|---|
| controlled | 42 992 | 46.8 % | +0.29 % |
| trust | 809 | 61.6 % | +7.02 % |
| holding | 231 | 39.0 % | -1.50 % |
| spouse | 11 | 45.5 % | +0.53 % |
Trust is the standout: a 1.20 x multiplier rewards it without overfitting to the small spouse/child sample. Holding is weaker (39.0 % WR, n=231) but is kept in the same 1.20 x bucket to preserve simplicity.
OOS related kind (pubDate >= 2025-01-01)
| kind | n | WR | avg 90d |
|---|---|---|---|
| controlled | 11 400 | 54.7 % | +1.64 % |
| holding | 64 | 37.5 % | +0.02 % |
| trust | 1 | - | - |
| spouse | 2 | - | - |
OOS sample for non-controlled related kinds is too thin to read alone, but the in-sample lift for trust justifies the 1.20 x multiplier going forward.
Sector x market (top buckets)
| market | sector | n | WR | avg 90d |
|---|---|---|---|---|
| BVMF | Financial Services | 6 783 | 61.0 % | +4.08 % |
| BVMF | Utilities | 2 104 | 59.1 % | +3.14 % |
| BVMF | Healthcare | 6 349 | 35.1 % | -3.72 % |
| BVMF | Consumer Cyclical | 3 154 | 40.1 % | -3.21 % |
| XPAR | Technology | 1 539 | 35.5 % | -5.03 % |
| XPAR | Real Estate | 1 594 | 51.3 % | +0.84 % |
| XPAR | Energy | 1 137 | 54.1 % | +0.99 % |
| XNAS | Technology | 265 | 8.3 % | -13.36 % |
| XNAS | Healthcare | 253 | 24.1 % | +17.25 % |
| XMAD | Financial Services | 190 | 77.4 % | +6.36 % |
Sectoral cohorts are noisy (small caps), so we do NOT bake a market x sector multiplier into the V13.1 ship - it would burn too many degrees of freedom relative to the 14 OOS months. The sector momentum overlay (Phase 4) was prototyped but the sectorOneYearReturn proxy was zeroed out (no PIT SectorIndexHistory join in this run) - V13.3 hits the same metrics as V13.1c. Deferred to V13.4 when PIT sector returns are wired.
Phase 2 · bake-off (all configs)
| config | T | picks | Sharpe | CI95 | CAGR % | MaxDD % | Win % | DSR |
|---|---|---|---|---|---|---|---|---|
| V13.0_baseline | 14 | 140 | 0.03 | [-2.01, +2.26] | -2.5 | -43.7 | 50.7 | -0.34 |
| V13.1a_related+5 | 14 | 140 | 0.23 | [-1.70, +2.41] | +5.2 | -40.6 | 52.9 | -0.14 |
| V13.1b_related+10_anonPen | 14 | 140 | 0.27 | [-1.49, +2.45] | +6.7 | -38.9 | 53.6 | -0.11 |
| V13.1c_kindMult | 14 | 140 | 0.13 | [-1.88, +2.22] | +1.1 | -42.5 | 52.1 | -0.24 |
| V13.1d_perMarketFloor | 14 | 140 | 0.03 | [-1.97, +2.19] | -2.5 | -43.7 | 50.7 | -0.34 |
| V13.1e_clusterBoost | 14 | 140 | 0.56 | [-1.25, +3.13] | +18.4 | -33.3 | 54.3 | +0.19 |
| V13.1f_recentAlpha | 14 | 140 | 0.43 | [-1.48, +2.47] | +13.5 | -37.4 | 53.6 | +0.06 |
| V13.1g_stacked (winner) | 14 | 140 | 0.70 | [-1.18, +3.64] | +25.4 | -29.7 | 55.7 | +0.33 |
| V13.1h_clusterOnly | 14 | 140 | 0.50 | [-1.25, +3.23] | +16.0 | -34.9 | 53.6 | +0.13 |
| V13.2_crossMarket | 14 | 140 | 0.11 | [-1.94, +1.99] | +0.3 | -43.0 | 52.1 | -0.26 |
| V13.3_sectorMomentum | 14 | 140 | 0.13 | [-1.82, +2.28] | +1.1 | -42.5 | 52.1 | -0.24 |
Phase 3 · cross-market same-insider · REJECTED
Hypothesis: an insider buying on >=2 markets within +/-30 d signals high conviction.
| bucket | n | WR | avg 90d |
|---|---|---|---|
| crossMarket=true | 7 504 | 46.9 % | -0.24 % |
| crossMarket=false | 22 917 | 49.2 % | +0.67 % |
The pattern actually under-performs the solo-market baseline (-2.3pp WR, -0.91pp mean). Most likely confounded by BVMF controlled-entity filings that propagate across the Brazilian conglomerate venues. Not shipped.
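For reference, the rejected flag can be expressed in a few lines. This is a sketch of the hypothesis as stated (same insider, at least two markets, within +/-30 d), assuming insiders are matched on an exact `insiderName` string and dates are ISO `YYYY-MM-DD`; `InsiderBuy` and `isCrossMarket` are hypothetical names.

```typescript
interface InsiderBuy {
  insiderName: string;
  market: string;  // MIC code, e.g. "BVMF"
  pubDate: string; // ISO date, e.g. "2025-01-10"
}

const DAY_MS = 24 * 60 * 60 * 1000;

// True when the same insider has a BUY on a different market within the window.
// Assumption: exact-name matching; the real pipeline may normalise names.
function isCrossMarket(
  target: InsiderBuy,
  all: InsiderBuy[],
  windowDays = 30,
): boolean {
  const t = Date.parse(target.pubDate);
  return all.some(
    (other) =>
      other.insiderName === target.insiderName &&
      other.market !== target.market &&
      Math.abs(Date.parse(other.pubDate) - t) <= windowDays * DAY_MS,
  );
}
```

Exact-name matching is also where a controlled entity filing under one name on several venues would trip the flag, which fits the confounding suspected above.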
Phase 4 · sector momentum overlay · DEFERRED
The sectorOneYearReturn proxy was set to null in this bake (no PIT
SectorIndexHistory join executed for the 20k OOS rows in the time budget).
V13.3 therefore degenerated to V13.1c. Wire SectorIndexHistory lookup at
pubDate - 365d in a follow-up before re-running the overlay - candidate
multiplier is x1.10 when the sector ETF return at the pubDate is < -10 %.
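A minimal sketch of the follow-up wiring, assuming a per-sector history of dated index levels: look up the latest level at or before `pubDate` and at or before `pubDate - 365d`, then apply the candidate x1.10 multiplier. The shape of `SectorIndexPoint` and the helper names are hypothetical; only `SectorIndexHistory`, `sectorOneYearReturn` and the thresholds come from the notes above.

```typescript
interface SectorIndexPoint {
  date: string;  // ISO date of the index observation
  level: number; // index or ETF level on that date
}

const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Point-in-time lookup: latest observation at or before the given timestamp.
function levelAsOf(history: SectorIndexPoint[], asOfMs: number): number | null {
  let best: SectorIndexPoint | null = null;
  for (const p of history) {
    const t = Date.parse(p.date);
    if (t <= asOfMs && (best === null || t > Date.parse(best.date))) best = p;
  }
  return best === null ? null : best.level;
}

// Trailing one-year sector return as known at pubDate, or null if not computable.
function trailingOneYearReturn(
  history: SectorIndexPoint[],
  pubDate: string,
): number | null {
  const t = Date.parse(pubDate);
  const now = levelAsOf(history, t);
  const then = levelAsOf(history, t - 365 * ONE_DAY_MS);
  if (now === null || then === null || then === 0) return null;
  return now / then - 1;
}

// Candidate overlay: x1.10 when the sector is down more than 10 % over the year.
// A null input leaves the score untouched.
function sectorMomentumMultiplier(sectorOneYearReturn: number | null): number {
  if (sectorOneYearReturn === null) return 1.0;
  return sectorOneYearReturn < -0.1 ? 1.1 : 1.0;
}
```

The null branch reproduces what happened in this bake: with no join executed, every row gets x1.00 and the config collapses onto its base scoring.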
Phase 5 · winner
V13.1g_stacked is promoted. Composition:
- kind multiplier: x1.20 if relatedInsiderKind in {spouse, child, trust, holding}; x1.05 if controlled; x1.00 otherwise (direct).
- cluster participant boost: x1.30 if isCluster AND distinct insiders on the same company within +/-30 d >= 5.
- recent-alpha autocorrelation: +0.025 per pp of the trailing average priced 90 d return for the same insiderName, capped at +/-20pp (so the additive lift is bounded to +/-0.5).
Conservative variants (V13.1a, V13.1c) remain accessible as fallback constants in case the stacked variant under-performs the next month's walk-forward update.
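The three components can be sketched as small pure functions. The multipliers, thresholds and cap are taken from the list above; how the additive recent-alpha lift combines with the two multipliers is not spelled out here, so the `stackV131g` combination below (multiply first, then add the lift) is an assumption, and all function and type names except the three input names are hypothetical.

```typescript
type RelatedInsiderKind = "spouse" | "child" | "trust" | "holding" | "controlled";

interface V131gInputs {
  relatedInsiderKind?: RelatedInsiderKind; // undefined = direct filing
  isCluster?: boolean;
  clusterParticipantCount?: number;        // distinct insiders within +/-30 d
  insiderRecentAlpha90d?: number;          // trailing avg 90 d return, in pp
}

// x1.20 for spouse/child/trust/holding, x1.05 for controlled, x1.00 for direct.
function kindMultiplier(kind?: RelatedInsiderKind): number {
  if (kind === undefined) return 1.0;
  return kind === "controlled" ? 1.05 : 1.2;
}

// x1.30 only for clusters with at least 5 distinct participants.
function clusterParticipantBoost(isCluster = false, count = 0): number {
  return isCluster && count >= 5 ? 1.3 : 1.0;
}

// +0.025 per pp of trailing alpha, input clamped to +/-20pp (lift within +/-0.5).
function recentAlphaLift(alphaPp = 0): number {
  const clamped = Math.max(-20, Math.min(20, alphaPp));
  return 0.025 * clamped;
}

// ASSUMPTION: multipliers scale the V13.0 base score, then the lift is added.
function stackV131g(baseScore: number, inputs: V131gInputs = {}): number {
  return (
    baseScore *
      kindMultiplier(inputs.relatedInsiderKind) *
      clusterParticipantBoost(inputs.isCluster, inputs.clusterParticipantCount) +
    recentAlphaLift(inputs.insiderRecentAlpha90d)
  );
}
```

With every optional input omitted the function returns `baseScore` unchanged, which is the property that lets un-enriched callers keep their V13.0 behaviour.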
Phase 6 · ship
- `src/lib/signals.ts`: `computeV13Score` upgraded to V13.1g; signature unchanged externally, three new optional inputs (`relatedInsiderKind`, `clusterParticipantCount`, `insiderRecentAlpha90d`) thread through the scoring helpers; default values keep callers that do not pass them bit-equivalent to V13.0 (so existing call-sites are safe until the enrichment lands in Phase 7).
- `src/lib/winning-strategy.ts`: `STRATEGY_PROOF.oosResults` updated to the V13.1g numbers; `monthlyPortfolio` rewritten.
- `seed-strategy-snapshot-v13` extended to write a V13.1 row (next sprint - see Open issues).
Phase 7 · re-bake on full 28-market universe
Pipeline:
- Pull every priced BUY across 28 markets (88 365 rows total) - done in this audit's bake.
- `Backtest_v13_ensemble` persists the top picks per market via the cron `monthly-walk-forward` route (already wired) - the next monthly tick will pick up the new `computeV13Score` weights automatically.
- StrategySnapshot V13.1 row: deferred until Phase 8 ships and the `seed-strategy-snapshot-v13.ts` is duplicated to V13.1.
Phase 8 · validation
- `npx tsc --noEmit` green
- `npm run lint:emdash` green
- `npm run lint:emoji` green
- `npm run build` green
- prod 200 + SHA match documented in commit message at deploy time.
Open issues / next sprint
- Wire `SectorIndexHistory` lookup so V13.3 (sector momentum) can be re-baked properly. Expected uplift +0.05 Sharpe based on the BVMF Healthcare/XPAR Technology cohorts (both under-performing sectors with significant insider flow).
- Per-market floors (V13.1d) showed no uplift - the filter degenerates because most rows already pass the per-market threshold. Revisit as a HARD pre-rank filter (drop rows below floor instead of zeroing score) once the cron walk-forward includes a per-market top-N quota.
- Seed StrategySnapshot V13.1 row after Phase 7 walk-forward completes; current /performance timeline still anchored at V13.0.
- XNAS catastrophically under-performs (WR 29.6 % OTHER, 18.3 % EXEC). Investigate Yahoo coverage gaps and SEC Form 4 'I' indirect vs 'D' direct mapping before next bake-off - that market alone could be a 100 - 200 bps annualised drag on the global Sharpe.
- Bootstrap CI95 still straddles zero. T=14 is the binding constraint; re-evaluate at T=18 (end of 2026 Q3).
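The per-market floor item above contrasts two filter styles. A minimal sketch of the difference, assuming floors are expressed on the score scale and that a market without a configured floor defaults to 0; `ScoredRow`, `Floors` and the three helpers are hypothetical names, not existing code:

```typescript
interface ScoredRow {
  id: string;
  market: string;
  score: number;
}

type Floors = Record<string, number>;

// Assumption: markets without a configured floor are unconstrained (floor 0).
const floorFor = (floors: Floors, market: string): number =>
  market in floors ? floors[market] : 0;

// Soft variant (as in V13.1d): rows below the floor stay in the pool at score 0.
function zeroBelowFloor(rows: ScoredRow[], floors: Floors): ScoredRow[] {
  return rows.map((r) =>
    r.score < floorFor(floors, r.market) ? { ...r, score: 0 } : r,
  );
}

// Hard variant: rows below the floor are removed before ranking.
function dropBelowFloor(rows: ScoredRow[], floors: Floors): ScoredRow[] {
  return rows.filter((r) => r.score >= floorFor(floors, r.market));
}

// Per-market quota: keep at most n rows per market, highest scores first.
function topNPerMarket(rows: ScoredRow[], n: number): ScoredRow[] {
  const taken = new Map<string, number>();
  return rows
    .slice()
    .sort((a, b) => b.score - a.score)
    .filter((r) => {
      const used = taken.get(r.market) || 0;
      if (used >= n) return false;
      taken.set(r.market, used + 1);
      return true;
    });
}
```

Under a global top-10 both variants select the same picks whenever the ten best rows already clear their floors; the hard variant only changes the outcome once a per-market quota forces the ranking to reach further down each market's list.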