56 · Aggressive scoring variant bake-off (10 variants) — 2026-05-18

TL;DR

Re-ranked the last-6-month OOS BUY cohort (n=2 572 priced) under 10 incremental variants on top of the already-promoted V2 senior+20 production baseline. Winner is V11 (role multiplier) with ΔSharpe +0.247. Two runners-up: V7 (sector momentum) at +0.237, which is diagnostic only because of a look-ahead leak, and V10 (net-buyer history) at +0.175, the only clean one. V11 was promoted into src/lib/signals.ts as a final multiplicative tilt inside functionScore(). No DB recompute and no commit yet.

Methodology

Cohort: every priced BacktestResult row with direction=BUY, priceAtTrade > 0, return90d != null, pubDate ≥ 2025-11-16 (6 mo). n = 2 572.
Each variant applies an adjustment on top of the existing signalScore (which already contains the V2 senior+20 tilt from method review 53).
Selection: top-quintile by adjusted score (k=219).
Metrics: annualized Sharpe (rf = 4 %/yr, applied quarterly), win%, mean 90d return, and max DD on an equal-weight cumulative basket ordered by pubDate.
Runner: scripts/_scoring-variant-bake-v2.mjs. All 10 variants run in parallel (Promise.all, ~2 s wall-clock).
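
To make the selection and metric steps above concrete, here is a minimal sketch, not the actual runner. The Row shape, the function names, the 1 %-per-period risk-free charge and the sqrt(4) annualization are all assumptions for illustration; the review does not show how scripts/_scoring-variant-bake-v2.mjs implements them.

```typescript
// Hypothetical row shape: only the fields this sketch needs.
interface Row {
  signalScore: number; // existing score, already includes the V2 senior+20 tilt
  return90d: number;   // 90-day return as a fraction (0.05 = +5 %)
}

// A variant maps a row to its adjusted score.
type Variant = (row: Row) => number;

// Rank by adjusted score, descending, and keep the top k rows.
function selectTopK(rows: Row[], adjust: Variant, k: number): Row[] {
  return rows
    .map((row) => ({ row, score: adjust(row) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.row);
}

interface CohortMetrics {
  n: number;
  mean90d: number;
  winPct: number;
  sharpeAnn: number;
}

// Assumptions: rf of 4 %/yr is charged as 1 % per 90-day period, and the
// per-period Sharpe is annualized by sqrt(periodsPerYear).
function cohortMetrics(
  picks: Row[],
  rfPerPeriod = 0.01,
  periodsPerYear = 4,
): CohortMetrics {
  const rets = picks.map((p) => p.return90d);
  const n = rets.length;
  if (n === 0) return { n: 0, mean90d: 0, winPct: 0, sharpeAnn: 0 };
  const mean90d = rets.reduce((a, b) => a + b, 0) / n;
  // Sample variance (n - 1 denominator); zero when only one pick exists.
  const variance =
    n > 1 ? rets.reduce((a, r) => a + (r - mean90d) ** 2, 0) / (n - 1) : 0;
  const sd = Math.sqrt(variance);
  const winPct = (100 * rets.filter((r) => r > 0).length) / n;
  const sharpeAnn =
    sd > 0 ? ((mean90d - rfPerPeriod) / sd) * Math.sqrt(periodsPerYear) : 0;
  return { n, mean90d, winPct, sharpeAnn };
}
```

A variant is then just a function such as (row) => row.signalScore + bonus, and ΔSharpe is the difference between its sharpeAnn and the baseline's on the same cohort.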

Results

Variant	n	mean90d	win%	Sharpe	Max DD	ΔSharpe
V2 baseline (senior+20)	219	+1.62 %	54.79	0.047	-99.97 %	0
V5 cluster +12	219	+2.22 %	50.23	0.077	-100 %	+0.030
V6 amount band >1M +8	219	+2.20 %	51.14	0.080	-99.97 %	+0.033
V7 sector momentum +15 †	219	+5.15 %	61.19	0.284	-99.83 %	+0.237
V8 sell-off contrarian +20	219	+3.49 %	56.16	0.174	-99.85 %	+0.127
V9 small mkt cap +10	219	+1.06 %	53.42	0.004	-99.99 %	-0.043
V10 net-buyer history +15	219	+4.39 %	56.62	0.222	-99.96 %	+0.175
V11 role multiplier	219	+4.98 %	62.10	0.294	-98.43 %	+0.247
V12 post-earnings drift +10	219	+1.93 %	49.32	0.058	-100 %	+0.011
V13 Jan/Feb seasonality +8	219	+3.07 %	52.97	0.122	-99.94 %	+0.075
V14 gender audit ±4	219	+2.95 %	54.79	0.125	-99.98 %	+0.078

† V7 uses a same-cohort sector-mean proxy → look-ahead leak. Treat as diagnostic only; revisit after sector-index enrichment lands on all OOS rows (cf. method review 53 caveat).

Max DD column reads ~-100 % because the equal-weight cumulative compounder chains a few extreme small-cap drawdowns; the cohort metric (mean + win%) is the headline. Position-sizing and stop-loss are out of scope for the scoring engine.
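
The effect can be illustrated with a small sketch. It assumes the basket simply compounds each trade's 90d return in pubDate order, which is one reading of "equal-weight cumulative compounder"; the function name and the sample returns are made up for illustration.

```typescript
// Max drawdown of a sequentially compounded equity curve.
// Input: per-trade returns as fractions, already ordered by pubDate.
// Output: the worst peak-to-trough loss as a non-positive fraction.
function maxDrawdown(returns: number[]): number {
  let equity = 1;
  let peak = 1;
  let worst = 0;
  for (const r of returns) {
    equity *= 1 + r;                 // chain the trade onto the curve
    peak = Math.max(peak, equity);   // running high-water mark
    worst = Math.min(worst, equity / peak - 1);
  }
  return worst;
}

// Illustrative series: the arithmetic mean is positive (+17.5 %),
// yet two deep losses dominate the compounded curve.
const sample = [0.5, -0.9, 0.8, -0.85, 0.9, 0.6];
const sampleMean = sample.reduce((a, b) => a + b, 0) / sample.length;
const sampleDd = maxDrawdown(sample);
```

With these sample values the mean is +17.5 % while the drawdown is about -97.3 %, which is why a handful of extreme losers is enough to pin the column near -100 % even when mean 90d and win% look healthy.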

Decision

Promote V11 (role multiplier CEO×1.3 / CFO×1.2 / Chairman×1.1 / Board×0.9 / Officer×0.7). The promotion hurdle was ΔSharpe +0.10; V11 cleared it at +0.247.
V7 deferred until sector momentum can be computed point-in-time without look-ahead.
V10 (net-buyer history) is a candidate for a future additive pass; for now, keep V11 as the primary tilt to avoid double-counting role conviction.

Implementation

src/lib/signals.ts · functionScore() now applies ROLE_MULTIPLIER_V11 to the role-budget output. The composite is still clamped to [0, 100] in computeScore(). The effective senior-role contribution rises ~30 % at the top of the tier (PDG/DG, i.e. CEO-level: 17 × 1.3 ≈ 22 raw before the clamp, which prevents runaway).
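
A minimal sketch of the tilt, for orientation only. The multiplier values come from the Decision section; the role keys, the neutral fallback of 1.0 for unknown roles, and both helper names are assumptions, since the review does not show the real functionScore() or computeScore() bodies.

```typescript
// Multipliers as listed in the Decision section.
// The key spelling is hypothetical; the real code may key on other labels.
const ROLE_MULTIPLIER_V11: Record<string, number> = {
  CEO: 1.3,
  CFO: 1.2,
  Chairman: 1.1,
  Board: 0.9,
  Officer: 0.7,
};

// Final multiplicative tilt on the role-budget output.
// Assumption: roles missing from the table are left unchanged (x1.0).
function applyRoleMultiplier(roleBudget: number, role: string): number {
  const multiplier = ROLE_MULTIPLIER_V11[role] ?? 1.0;
  return roleBudget * multiplier;
}

// Composite clamp to [0, 100], as the review says computeScore() does.
function clampScore(raw: number): number {
  return Math.min(100, Math.max(0, raw));
}

// Top-tier example from the text: 17 raw points become about 22.
const ceoContribution = applyRoleMultiplier(17, "CEO");
```

Keeping the multiplier as a lookup table makes a later re-bake a one-line change, and the clamp stays the single place where runaway composites are cut off.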

No DB recompute yet. Next signals:recompute cron will rescale all rows; backtest stats will refresh on the following _backtest-stats.mjs run.

Caveats

Same dataset reused for V2 calibration (method review 53) and this V11 bake. Treat as confirmatory rather than fully out-of-sample.
n=219 per top-quintile bucket → Sharpe stderr ≈ 0.07 (≈ 1/√n). ΔSharpe +0.247 is statistically meaningful but not bulletproof. Re-bake at next quarterly cohort.
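
As a rough check on that stderr figure, the sketch below uses the standard i.i.d. approximation SE ≈ sqrt((1 + SR²/2) / n). It treats the table's Sharpe values as given and, as a simplifying assumption, treats the baseline and V11 buckets as independent; in practice the two buckets come from the same cohort and probably overlap, so the z-score is indicative only. Function names are hypothetical.

```typescript
// Approximate standard error of a Sharpe ratio under i.i.d. returns.
function sharpeStdErr(sharpe: number, n: number): number {
  return Math.sqrt((1 + (sharpe * sharpe) / 2) / n);
}

// z-score of a Sharpe difference between two buckets of equal size n.
// Assumption: the buckets are independent (any overlap is ignored).
function deltaSharpeZ(sharpeA: number, sharpeB: number, n: number): number {
  const se = Math.sqrt(
    sharpeStdErr(sharpeA, n) ** 2 + sharpeStdErr(sharpeB, n) ** 2,
  );
  return (sharpeA - sharpeB) / se;
}

// Values from the results table: V11 = 0.294, V2 baseline = 0.047, n = 219.
const seV11 = sharpeStdErr(0.294, 219);
const zV11 = deltaSharpeZ(0.294, 0.047, 219);
```

Under these assumptions seV11 comes out close to 0.07 and zV11 is a little above 2.5, consistent with "meaningful but not bulletproof".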
Max DD figures are unrealistic without position sizing. Headline metric is mean 90d return + win%, not the basket DD.

User-customizable weights — wiring path

The bake-off also opens the door to per-user signal weights. Proposed architecture (plan only, not yet implemented in code):

Schema: User.signalWeights Json? (nullable; fallback to defaults). Defaults encoded in src/lib/signals.ts as DEFAULT_SIGNAL_WEIGHTS.
API:
- GET /api/account/signal-weights → { roleWeight, clusterBonus, amountTier, sectorMomentum, contrarian, mktCap, netPosition, postEarnings } (returns user override or defaults).
- PATCH /api/account/signal-weights body = partial weights JSON; validates with Zod, stores on User.signalWeights.
UI: /portfolio settings tab My signals with 8 sliders, live preview of recommendation re-rank on a 10-row sample (client-side).
Recommendations: in /recommendations, if session user has custom weights, re-rank the already-fetched server payload client-side using the same formula exposed via src/lib/signals-client.ts (extract the pure scoring functions into a shared module — no DB recompute). Cache per-user 1 h via React Query.
Lightweight by design: server still serves the canonical V11-scored recommendations; the user override is a pure client-side re-rank. No per-user backend load beyond 1 row read of User.signalWeights at session hydration.
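
The client-side re-rank described above could look roughly like the sketch below. Everything here is hypothetical: the review names the eight weight keys and a rerank(items, weights) helper, but not the scoring formula, the payload shape, or the default values. The sketch assumes each recommendation ships its per-factor components and that the score is a clamped weighted sum, with defaults of 1.0 standing in for the real DEFAULT_SIGNAL_WEIGHTS.

```typescript
interface SignalWeights {
  roleWeight: number;
  clusterBonus: number;
  amountTier: number;
  sectorMomentum: number;
  contrarian: number;
  mktCap: number;
  netPosition: number;
  postEarnings: number;
}

// Placeholder defaults (assumption): 1.0 everywhere means "canonical scoring".
const DEFAULT_SIGNAL_WEIGHTS: SignalWeights = {
  roleWeight: 1, clusterBonus: 1, amountTier: 1, sectorMomentum: 1,
  contrarian: 1, mktCap: 1, netPosition: 1, postEarnings: 1,
};

const WEIGHT_KEYS = Object.keys(DEFAULT_SIGNAL_WEIGHTS) as (keyof SignalWeights)[];

// Hypothetical payload item: per-factor raw contributions sent by the server.
interface Recommendation {
  id: string;
  components: Record<keyof SignalWeights, number>;
}

// Merge a stored partial override (User.signalWeights, possibly null)
// over the defaults; missing or undefined keys fall back to the default.
function resolveWeights(override: Partial<SignalWeights> | null): SignalWeights {
  const resolved = { ...DEFAULT_SIGNAL_WEIGHTS };
  for (const key of WEIGHT_KEYS) {
    resolved[key] = override?.[key] ?? DEFAULT_SIGNAL_WEIGHTS[key];
  }
  return resolved;
}

// Assumed formula: weighted sum of components, clamped to [0, 100].
function userScore(item: Recommendation, weights: SignalWeights): number {
  const raw = WEIGHT_KEYS.reduce((sum, k) => sum + weights[k] * item.components[k], 0);
  return Math.min(100, Math.max(0, raw));
}

// Pure re-rank: returns a new array and leaves the fetched payload untouched.
function rerank(items: Recommendation[], weights: SignalWeights): Recommendation[] {
  return [...items].sort((a, b) => userScore(b, weights) - userScore(a, weights));
}
```

Because rerank() is pure and has no Prisma imports, the same module could be shared by the settings preview and /recommendations without any server round-trip.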

Files to touch when implementing:

prisma/schema.prisma — add signalWeights Json? on User + migration.
src/lib/signals.ts — export DEFAULT_SIGNAL_WEIGHTS + a pure rerank(items, weights) helper.
src/lib/signals-client.ts (new) — mirror of the pure scoring math, no Prisma imports.
src/app/api/account/signal-weights/route.ts (new) — GET/PATCH.
src/app/portfolio/settings/my-signals/page.tsx (new) — sliders UI.
src/app/recommendations/page.tsx — read user weights, apply client-side rerank.

Estimated cost: ~3 h dev, no infra cost. Defer until V11 has shipped to production and stats have refreshed.