V14 related-insider boost · audit (2026-05-20)
TL;DR
Ship verdict: SHIP V14e_tuned_light (Sharpe 1.31 OOS, beats V13.5_stack 1.22).
The aggressive boost variants (+0.20 entity / +0.35 family) destroy Sharpe (0.77 vs 1.22 baseline) because the family bucket is statistically empty in the current snapshot (spouse=7, no child/parent rows). The light-coefficient variant (+0.10 entity / +0.20 family) extracts a modest lift without overfitting the tiny family cohort.
Cohort
- 59 952 BUY rows since 2023-01-01 with returnFromPub90d filled.
- Train IS (pub < 2025): 38 163 rows.
- OOS (pub >= 2025): 21 789 rows, 14 monthly buckets.
Related-insider kind distribution
| kind | rows | share |
|---|---|---|
| controlled | 30 297 | 50.5% |
| trust | 550 | 0.92% |
| holding | 139 | 0.23% |
| spouse | 7 | 0.01% |
| child / parent | 0 | 0% |
The family bucket (spouse/child/parent) is essentially empty in the current data snapshot. The backfill job a2165bf79758ddaed should enrich this; rerun V14 after backfill completes to retest the family hypothesis fairly.
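To make the "essentially empty" claim concrete, the row counts in the table can be rolled up into the two boost buckets. This is a minimal sketch, not the audit script: the type and function names are hypothetical, and the grouping of controlled/trust/holding into an "entity" bucket is an assumption inferred from the variant definitions (only the spouse/child/parent family grouping is stated explicitly).

```typescript
// Hypothetical types; kind names and counts are taken from the table above.
type RelatedKind = "controlled" | "trust" | "holding" | "spouse" | "child" | "parent";
type BoostBucket = "entity" | "family";

// Assumption: everything that is not spouse/child/parent counts as "entity".
function bucketOf(kind: RelatedKind): BoostBucket {
  return kind === "spouse" || kind === "child" || kind === "parent" ? "family" : "entity";
}

const TOTAL_ROWS = 59952; // full BUY cohort
const rowsByKind: Record<RelatedKind, number> = {
  controlled: 30297,
  trust: 550,
  holding: 139,
  spouse: 7,
  child: 0,
  parent: 0,
};

// Sum row counts per boost bucket.
function bucketTotals(counts: Record<RelatedKind, number>): Record<BoostBucket, number> {
  const totals: Record<BoostBucket, number> = { entity: 0, family: 0 };
  for (const kind of Object.keys(counts) as RelatedKind[]) {
    totals[bucketOf(kind)] += counts[kind];
  }
  return totals;
}

const kindTotals = bucketTotals(rowsByKind);
const familySharePct = (100 * kindTotals.family) / TOTAL_ROWS;
const entitySharePct = (100 * kindTotals.entity) / TOTAL_ROWS;
```

Under these assumptions the family bucket holds 7 rows against 30 986 entity rows, which is why any family coefficient is effectively untested in this run.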
Variants tested
| variant | entity boost | family boost | gate |
|---|---|---|---|
| V14a_symmetric | +0.20 | +0.35 | fails |
| V14b_asymmetric | +0.20 | +0.35 | fails (same as 14a, BUY-only cohort) |
| V14c_size_gated | +0.20 | +0.35 (only if pctMcap > 0.1) | borderline |
| V14d_proxy_only | 0 | +0.35 | identical to baseline |
| V14e_tuned_light | +0.10 | +0.20 | passes |
OOS results (2025+, Top-10/mo, T+90, NET 0.6% RT, winsor +/-50%)
| config | T | picks | Sharpe | CI95 | CAGR% | MaxDD% | Win% | DSR |
|---|---|---|---|---|---|---|---|---|
| V13.5_stack_baseline | 14 | 140 | 1.22 | [-0.56, 4.12] | 46.1 | -14.1 | 57.9 | 0.31 |
| V14a_symmetric | 14 | 140 | 0.77 | [-1.04, 4.07] | 26.3 | -25.4 | 56.4 | -0.14 |
| V14b_asymmetric | 14 | 140 | 0.77 | [-1.04, 4.07] | 26.3 | -25.4 | 56.4 | -0.14 |
| V14c_size_gated | 14 | 140 | 1.23 | [-0.64, 4.41] | 44.6 | -19.5 | 58.6 | 0.32 |
| V14d_proxy_only | 14 | 140 | 1.22 | [-0.56, 4.12] | 46.1 | -14.1 | 57.9 | 0.31 |
| V14e_tuned_light | 14 | 140 | 1.31 | [-0.50, 4.80] | 47.4 | -14.1 | 58.6 | 0.40 |
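The table header packs several conventions together (Top-10/mo, NET 0.6% RT, winsor +/-50%). The sketch below shows one plausible way a single monthly bucket return could be built from them. The interface and function names are hypothetical, and the order of operations (winsorize first, then subtract the round-trip cost, then equal-weight) is an assumption; the bake-off script may differ.

```typescript
// Hypothetical row shape: a model score and the 90-day return as a fraction.
interface ScoredRow {
  score: number;
  ret90d: number;
}

const ROUND_TRIP_COST = 0.006; // 0.6% round trip
const WINSOR_LIMIT = 0.5; // +/-50%
const TOP_N = 10; // Top-10 picks per month

// Clamp a return into [-limit, +limit].
function winsorize(x: number, limit: number): number {
  return Math.max(-limit, Math.min(limit, x));
}

// Equal-weighted net return of the top-N rows by score for one monthly bucket.
function monthlyNetReturn(rows: ScoredRow[]): number {
  const top = [...rows].sort((a, b) => b.score - a.score).slice(0, TOP_N);
  if (top.length === 0) return 0;
  const sum = top.reduce(
    (acc, r) => acc + winsorize(r.ret90d, WINSOR_LIMIT) - ROUND_TRIP_COST,
    0,
  );
  return sum / top.length;
}
```

With 14 monthly buckets and 10 picks each, this convention yields the 140 picks reported for every config.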
Why V14a/b fail
The +0.35 family boost was sized assuming a non-trivial family cohort. With 7 spouse rows and zero child/parent rows it almost never fires, so in practice only the +0.20 entity boost is active, on trust/holding/controlled rows. Since controlled is 50.5% of all rows, V14a effectively multiplies half the universe by 1.20. That compresses score differentiation: it pushes previously-vetoed CNMV/CVM markets (which are -100, not multiplied) to look comparatively better, and it lets low-conviction controlled rows displace high-conviction direct rows. Net result: the same 140 picks but of lower quality, with more variance (MaxDD -25.4% vs -14.1%) and worse Sharpe (0.77 vs 1.22).
V14e (+0.10 entity, +0.20 family) is light enough that the controlled boost only matters at the margin (close-call rankings) and preserves the V13.5 selection quality while adding a small lift.
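The displacement effect described above can be illustrated with a toy ranking. This is a sketch under stated assumptions: the boost is applied as a multiplier of (1 + coefficient) on a positive score, vetoed rows sit at -100 and are passed through unchanged, and all type names, function names, and example scores are hypothetical.

```typescript
type RowBucket = "direct" | "entity" | "family";

interface BoostConfig {
  entity: number;
  family: number;
}

interface CandidateRow {
  id: string;
  score: number;
  bucket: RowBucket;
}

const VETO_SCORE = -100; // vetoed markets are pinned here and never multiplied

// Apply the related-insider boost as a multiplier (assumed form).
function boosted(row: CandidateRow, cfg: BoostConfig): number {
  if (row.score <= VETO_SCORE) return row.score;
  const coeff =
    row.bucket === "entity" ? cfg.entity : row.bucket === "family" ? cfg.family : 0;
  return row.score * (1 + coeff);
}

// Return row ids ordered by boosted score, best first.
function rank(rows: CandidateRow[], cfg: BoostConfig): string[] {
  return [...rows].sort((a, b) => boosted(b, cfg) - boosted(a, cfg)).map((r) => r.id);
}

const V14A: BoostConfig = { entity: 0.2, family: 0.35 };
const V14E: BoostConfig = { entity: 0.1, family: 0.2 };

// Illustrative scores only, not taken from the cohort.
const exampleRows: CandidateRow[] = [
  { id: "direct-high", score: 1.0, bucket: "direct" },
  { id: "controlled-low", score: 0.85, bucket: "entity" },
];

const rankAggressive = rank(exampleRows, V14A); // controlled-low is boosted past direct-high
const rankLight = rank(exampleRows, V14E); // direct-high keeps the top slot
```

In this toy case a controlled row scoring 15% below a direct row overtakes it under the +0.20 coefficient but not under +0.10, which matches the "only matters at the margin" behavior attributed to V14e.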
Ship gate (strict)
| check | threshold | actual | pass |
|---|---|---|---|
| Sharpe | >= 1.21 | 1.31 | y |
| Sharpe > V13.5 baseline | strict | 1.31 > 1.22 | y |
| DSR drop vs baseline | <= 0.30 | -0.09 (improvement) | y |
| CI95Lo | >= -2.0 | -0.50 | y |
Verdict: SHIP V14e_tuned_light.
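The gate table can be expressed as a small predicate, which makes the thresholds easy to re-run after the backfill. This is a minimal sketch: the names are hypothetical, the thresholds are copied from the table, and the DSR drop is assumed to be baseline DSR minus candidate DSR (so a negative value is an improvement).

```typescript
interface OosStats {
  sharpe: number;
  dsr: number;
  ci95Lo: number;
}

// Thresholds from the ship gate table.
const GATE = { minSharpe: 1.21, maxDsrDrop: 0.3, minCi95Lo: -2.0 };

function shipGate(candidate: OosStats, baseline: OosStats) {
  const dsrDrop = baseline.dsr - candidate.dsr; // negative means DSR improved
  const checks = {
    sharpeFloor: candidate.sharpe >= GATE.minSharpe,
    beatsBaseline: candidate.sharpe > baseline.sharpe, // strict inequality
    dsrDrop: dsrDrop <= GATE.maxDsrDrop,
    ci95Lo: candidate.ci95Lo >= GATE.minCi95Lo,
  };
  const pass =
    checks.sharpeFloor && checks.beatsBaseline && checks.dsrDrop && checks.ci95Lo;
  return { checks, dsrDrop, pass };
}

// OOS figures from the results table.
const baselineStats: OosStats = { sharpe: 1.22, dsr: 0.31, ci95Lo: -0.56 };
const v14eStats: OosStats = { sharpe: 1.31, dsr: 0.4, ci95Lo: -0.5 };
const v14aStats: OosStats = { sharpe: 0.77, dsr: -0.14, ci95Lo: -1.04 };

const v14eGate = shipGate(v14eStats, baselineStats);
const v14aGate = shipGate(v14aStats, baselineStats);
```

Here `v14eGate.pass` is true and `v14aGate.pass` is false, with V14a failing both Sharpe checks and the DSR-drop check (0.45 > 0.30).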
Caveats
- IS train Sharpe is -0.31 (V14e), same negative regime as V13.5 in train. The 2023-2024 cohort was unfavorable for top-10 cross-market BUYs. Live performance depends on OOS regime persistence.
- CI95 lower bound (-0.50) is below zero, so a negative true Sharpe cannot be ruled out at the 95% level. Same caveat as V13.5.
- Sample is only 14 monthly buckets. Robust deflation pending more OOS history.
- The family hypothesis is not validated by this run because the data isn't there yet. Re-run after backfill.
Action
- Wire V14e_tuned_light into the live scoring path (replacing V13.5_stack).
- Schedule a re-bake of V14 once the related-insider backfill (job a2165bf79758ddaed) completes and family-kind cohort grows to n>=200.
- Keep V13.5_stack hot-swap available; if V14e under-performs in the next quarterly review, rollback is a one-line change.
Files
- scripts/_v13_bakeoff/bake-v14_related.ts
- scripts/_v13_bakeoff/stats-v14.json
- /tmp/bake-v14.log