V14 related-insider boost · audit (2026-05-20)

TL;DR

Ship verdict: SHIP V14e_tuned_light (Sharpe 1.31 OOS, beats V13.5_stack 1.22).

The ungated aggressive-boost variants V14a/V14b (+0.20 entity / +0.35 family) cut OOS Sharpe to 0.77, versus 1.22 for the baseline, because the family bucket is statistically empty (spouse=7, no child/parent rows in the current snapshot, taken before the backfill completes), so in practice only the entity boost fires. The light-coefficient variant V14e (+0.10 entity / +0.20 family) extracts modest lift without overfitting the tiny family cohort.

Cohort

59 952 BUY rows since 2023-01-01 with returnFromPub90d filled.
Train IS (pub < 2025): 38 163 rows.
OOS (pub >= 2025): 21 789 rows, 14 monthly buckets.
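
The cohort split above can be sketched as follows. This is a minimal illustration, not the code in bake-v14_related.ts: the `BuyRow` shape, the `pubDate` field name and the helper `splitCohort` are assumptions; only `returnFromPub90d` and the two date cutoffs come from this audit.

```typescript
// Hypothetical row shape; only returnFromPub90d is named in the audit.
interface BuyRow {
  pubDate: string; // ISO date such as "2025-03-14" (assumed field name)
  returnFromPub90d: number | null;
}

const COHORT_START = "2023-01-01"; // "since 2023-01-01"
const OOS_CUTOFF = "2025-01-01"; // pub < 2025 is train, pub >= 2025 is OOS

function splitCohort(rows: BuyRow[]) {
  // Keep rows inside the cohort window whose 90-day return is filled.
  const cohort = rows.filter(
    (r) => r.returnFromPub90d !== null && r.pubDate >= COHORT_START,
  );
  // ISO dates compare correctly as plain strings.
  const train = cohort.filter((r) => r.pubDate < OOS_CUTOFF);
  const oos = cohort.filter((r) => r.pubDate >= OOS_CUTOFF);
  // One bucket per publication month, keyed by "YYYY-MM".
  const oosMonths: Record<string, true> = {};
  for (const r of oos) oosMonths[r.pubDate.slice(0, 7)] = true;
  return { train, oos, oosMonthCount: Object.keys(oosMonths).length };
}
```

On the real cohort this split should reproduce the 38 163 / 21 789 row counts and the 14 monthly buckets reported above.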

Related-insider kind distribution

kind	rows	share
controlled	30 297	50.5%
trust	550	0.92%
holding	139	0.23%
spouse	7	0.01%
child / parent	0	0%

The family bucket (spouse/child/parent) is essentially empty in the current data snapshot. The backfill job a2165bf79758ddaed should enrich this; rerun V14 after backfill completes to retest the family hypothesis fairly.
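
As a quick arithmetic check on the table above, the shares can be recomputed from the row counts against the 59 952-row cohort. The names `kindCounts`, `sharePct` and `familyRows` are illustrative, not taken from the bake script.

```typescript
const TOTAL_BUY_ROWS = 59952; // cohort size from the Cohort section

// Row counts copied from the kind-distribution table.
const kindCounts = [
  { kind: "controlled", rows: 30297 },
  { kind: "trust", rows: 550 },
  { kind: "holding", rows: 139 },
  { kind: "spouse", rows: 7 },
  { kind: "child / parent", rows: 0 },
];

function sharePct(rows: number, total: number): number {
  return (rows / total) * 100;
}

// Family bucket = spouse + child/parent, as defined in the text.
const FAMILY_KINDS = ["spouse", "child / parent"];
const familyRows = kindCounts
  .filter((k) => FAMILY_KINDS.indexOf(k.kind) !== -1)
  .reduce((sum, k) => sum + k.rows, 0);

for (const k of kindCounts) {
  console.log(`${k.kind}\t${k.rows}\t${sharePct(k.rows, TOTAL_BUY_ROWS).toFixed(2)}%`);
}
console.log(`family rows: ${familyRows}`);
```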

Variants tested

variant	entity boost	family boost	gate
V14a_symmetric	+0.20	+0.35	fails
V14b_asymmetric	+0.20	+0.35	fails (same as 14a, BUY-only cohort)
V14c_size_gated	+0.20	+0.35 (only if pctMcap > 0.1)	borderline
V14d_proxy_only	0	+0.35	identical to baseline
V14e_tuned_light	+0.10	+0.20	passes
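
A minimal sketch of how such boost coefficients could be applied. It assumes the boost is multiplicative (score times 1 + boost, consistent with the "multiplies half the universe by 1.20" description below) and that vetoed scores at -100 are left untouched. The type and function names are hypothetical, and the V14c size gate is omitted.

```typescript
// Kinds from the distribution table; null means a direct (non-related) insider.
type RelatedKind = "controlled" | "trust" | "holding" | "spouse" | "child" | "parent" | null;

interface BoostConfig {
  entityBoost: number; // applied to controlled / trust / holding
  familyBoost: number; // applied to spouse / child / parent
}

// Coefficients copied from the variants table (V14b and V14c omitted).
const BOOSTS = {
  V14a_symmetric: { entityBoost: 0.2, familyBoost: 0.35 },
  V14d_proxy_only: { entityBoost: 0, familyBoost: 0.35 },
  V14e_tuned_light: { entityBoost: 0.1, familyBoost: 0.2 },
};

const VETO_SCORE = -100; // vetoed markets are pinned here and not multiplied

// Assumption: multiplicative boost on the base score.
function boostedScore(base: number, kind: RelatedKind, cfg: BoostConfig): number {
  if (kind === null || base <= VETO_SCORE) return base;
  const isFamily = kind === "spouse" || kind === "child" || kind === "parent";
  return base * (1 + (isFamily ? cfg.familyBoost : cfg.entityBoost));
}
```

Under this reading, V14d_proxy_only changes only the 7 spouse rows, which is consistent with its results being identical to the baseline.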

OOS results (2025+, Top-10/mo, T+90, NET 0.6% RT, winsor +/-50%)

config	T	picks	Sharpe	CI95	CAGR%	MaxDD%	Win%	DSR
V13.5_stack_baseline	14	140	1.22	[-0.56, 4.12]	46.1	-14.1	57.9	0.31
V14a_symmetric	14	140	0.77	[-1.04, 4.07]	26.3	-25.4	56.4	-0.14
V14b_asymmetric	14	140	0.77	[-1.04, 4.07]	26.3	-25.4	56.4	-0.14
V14c_size_gated	14	140	1.23	[-0.64, 4.41]	44.6	-19.5	58.6	0.32
V14d_proxy_only	14	140	1.22	[-0.56, 4.12]	46.1	-14.1	57.9	0.31
V14e_tuned_light	14	140	1.31	[-0.50, 4.80]	47.4	-14.1	58.6	0.40
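
For readers who want to reproduce the table's inputs, the sketch below shows one plausible way to get from per-pick returns to a Sharpe figure under the stated settings (winsor +/-50%, 0.6% round-trip cost). The order of winsorizing and cost deduction, the equal weighting within a bucket, and the annualization factor are all assumptions; the audit does not specify them, so this is not necessarily how stats-v14.json was produced.

```typescript
const ROUND_TRIP_COST = 0.006; // NET 0.6% RT
const WINSOR_LIMIT = 0.5; // winsor +/-50%

// Assumed order: clip the gross 90-day return first, then deduct costs.
function netPickReturn(gross: number): number {
  const clipped = Math.max(-WINSOR_LIMIT, Math.min(WINSOR_LIMIT, gross));
  return clipped - ROUND_TRIP_COST;
}

// Assumed equal-weight average over the picks of one monthly bucket.
function bucketReturn(grossReturns: number[]): number {
  const net = grossReturns.map(netPickReturn);
  return net.reduce((s, x) => s + x, 0) / net.length;
}

// Mean over sample standard deviation, scaled by sqrt(periodsPerYear).
// The right periodsPerYear for overlapping T+90 holds is a judgment call.
function sharpeRatio(bucketReturns: number[], periodsPerYear: number): number {
  const n = bucketReturns.length;
  const mean = bucketReturns.reduce((s, x) => s + x, 0) / n;
  const variance =
    bucketReturns.reduce((s, x) => s + (x - mean) * (x - mean), 0) / (n - 1);
  return (mean / Math.sqrt(variance)) * Math.sqrt(periodsPerYear);
}
```

With only 14 buckets the sample standard deviation is itself noisy, which is one reason the CI95 ranges in the table are so wide.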

Why V14a/b fail

The +0.35 family boost was sized assuming a non-trivial family cohort. With 7 spouse rows and zero child/parent rows, the boost in practice fires only on trust/holding/controlled. Since controlled is 50.5% of all rows, V14a effectively multiplies half the universe by 1.20. That compresses score differentiation, makes previously-vetoed CNMV/CVM markets (which are pinned at -100, not multiplied) look comparatively better, and lets low-conviction controlled rows displace high-conviction direct rows. Net result: the same 140 picks, but a weaker selection with more variance, a deeper drawdown (-25.4% vs -14.1%) and worse Sharpe.

V14e (+0.10 entity, +0.20 family) is light enough that the controlled boost only matters at the margin (close-call rankings). It preserves the V13.5 selection quality while adding a small lift: OOS Sharpe rises from 1.22 to 1.31 with MaxDD unchanged at -14.1%.
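
The displacement effect can be illustrated with two made-up rows. The scores below are invented for the example, and the multiplicative form of the boost is an assumption based on the "multiplies ... by 1.20" wording above.

```typescript
interface ScoredRow {
  id: string;
  baseScore: number; // hypothetical pre-boost score
  controlled: boolean; // true if the row is a "controlled" related insider
}

// Rank rows by score after applying the entity boost to controlled rows only.
function rankWithEntityBoost(rows: ScoredRow[], entityBoost: number): string[] {
  return rows
    .map((r) => ({
      id: r.id,
      score: r.controlled ? r.baseScore * (1 + entityBoost) : r.baseScore,
    }))
    .sort((a, b) => b.score - a.score)
    .map((r) => r.id);
}

const sampleRows: ScoredRow[] = [
  { id: "direct_high", baseScore: 0.8, controlled: false },
  { id: "controlled_mid", baseScore: 0.7, controlled: true },
];

// +0.20: 0.70 * 1.20 = 0.84 > 0.80, so the controlled row takes the top slot.
console.log(rankWithEntityBoost(sampleRows, 0.2));
// +0.10: 0.70 * 1.10 = 0.77 < 0.80, so the direct row keeps the top slot.
console.log(rankWithEntityBoost(sampleRows, 0.1));
```

In this toy case the lighter coefficient only reorders rows whose base scores are within 10% of each other, which is the "close-call rankings" behavior described above.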

Ship gate (strict)

check	threshold	actual	pass
Sharpe	>= 1.21	1.31	y
Sharpe > V13.5 baseline	strict	1.31 > 1.22	y
DSR drop vs baseline	<= 0.30	-0.09 (improvement)	y
CI95Lo	>= -2.0	-0.50	y

Verdict: SHIP V14e_tuned_light.
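
The four checks in the gate table can be expressed as a small function. The thresholds and the input figures are taken from the tables in this audit; the `VariantStats` shape and the `shipGate` name are hypothetical and not the bake script's API.

```typescript
interface VariantStats {
  sharpe: number;
  ci95Lo: number;
  dsr: number;
}

// Thresholds from the ship-gate table.
const MIN_SHARPE = 1.21;
const MAX_DSR_DROP = 0.3;
const MIN_CI95_LO = -2.0;

function shipGate(candidate: VariantStats, baseline: VariantStats) {
  // Positive dsrDrop means the candidate's DSR is worse than the baseline's.
  const dsrDrop = baseline.dsr - candidate.dsr;
  const checks = {
    sharpeFloor: candidate.sharpe >= MIN_SHARPE,
    beatsBaseline: candidate.sharpe > baseline.sharpe, // strict inequality
    dsrDropOk: dsrDrop <= MAX_DSR_DROP,
    ci95LoOk: candidate.ci95Lo >= MIN_CI95_LO,
  };
  const ship =
    checks.sharpeFloor && checks.beatsBaseline && checks.dsrDropOk && checks.ci95LoOk;
  return { checks, dsrDrop, ship };
}

// OOS figures copied from the results table.
const v135Baseline: VariantStats = { sharpe: 1.22, ci95Lo: -0.56, dsr: 0.31 };
const v14eLight: VariantStats = { sharpe: 1.31, ci95Lo: -0.5, dsr: 0.4 };
const v14aSymmetric: VariantStats = { sharpe: 0.77, ci95Lo: -1.04, dsr: -0.14 };

console.log(shipGate(v14eLight, v135Baseline).ship); // all four checks pass
console.log(shipGate(v14aSymmetric, v135Baseline).checks); // Sharpe and DSR checks fail
```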

Caveats

IS train Sharpe is -0.31 (V14e), same negative regime as V13.5 in train. The 2023-2024 cohort was unfavorable for top-10 cross-market BUYs. Live performance depends on OOS regime persistence.
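CI95 lower bound (-0.50) is below zero, so the interval does not exclude a negative true Sharpe. The 2.5% tail refers to values below -0.50, so the chance of a negative true Sharpe is higher than 2.5%. Same caveat as V13.5.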
Sample is only 14 monthly buckets. Robust deflation pending more OOS history.
The family hypothesis is not validated by this run because the data isn't there yet. Re-run after backfill.

Action

Wire V14e_tuned_light into the live scoring path (replacing V13.5_stack).
Schedule a re-bake of V14 once the related-insider backfill (job a2165bf79758ddaed) completes and family-kind cohort grows to n>=200.
Keep V13.5_stack hot-swap available; if V14e under-performs in next quarterly review, rollback is one-line.

Files

scripts/_v13_bakeoff/bake-v14_related.ts
scripts/_v13_bakeoff/stats-v14.json
/tmp/bake-v14.log
