V15 pattern-boost stack · audit (2026-05-20)
TL;DR
Ship verdict: KEEP V14e_tuned_light. V15 produces zero improvement over V14e across all 5 variants tested. Pattern boost does not move the needle.
Why V15 fails (mechanically)
The pattern miner (see 96-top-patterns-2026-05-20.md) surfaced 11 stable patterns. 10 of the 11 are signatures dominated by KIND_direct + EP_0 + PMC_<0.01, i.e. small direct buys with no earnings proximity. These rows have very low V13.1g base scores (no role bonus, no cluster bonus, no pctMcap bonus) and sit systematically outside the top-10 monthly selection.
Multiplying a near-zero base score by 1.05-1.40 still produces a near-zero post-boost score, so the boost never displaces a real top-10 pick. Result: V15 picks the exact same 140 declarations as V14e (top 10 per month over the OOS window), and Sharpe is identical to 4 decimals.
Match rate: 4 168 of 60 735 rows (6.9%) match a stable pattern. Of those, almost none survive the top-10 cut.
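The mechanism can be illustrated with a toy month. This is a minimal sketch, not the real scoring stack: the `Row` shape, the field names and the scores are invented for illustration, and only the 1.40 multiplier and the top-10 cut come from the text above.

```typescript
// Hypothetical row shape; field names are illustrative, not from the real pipeline.
interface Row {
  id: string;
  baseScore: number;
  matchesPattern: boolean;
}

// Return the ids of the top-n rows by score, highest first.
function topN(rows: Row[], score: (r: Row) => number, n: number): string[] {
  return [...rows]
    .sort((a, b) => score(b) - score(a))
    .slice(0, n)
    .map((r) => r.id);
}

// Multiplicative boost: scale the base score only when the row matches a pattern.
function boosted(multiplier: number): (r: Row) => number {
  return (r) => (r.matchesPattern ? r.baseScore * multiplier : r.baseScore);
}

// Toy month: 10 high-scoring rows with no pattern match,
// plus 5 near-zero rows that do match a pattern.
const month: Row[] = [
  ...Array.from({ length: 10 }, (_, i) => ({
    id: `top-${i}`,
    baseScore: 3 + i * 0.3,
    matchesPattern: false,
  })),
  ...Array.from({ length: 5 }, (_, i) => ({
    id: `pattern-${i}`,
    baseScore: 0.1 + i * 0.05,
    matchesPattern: true,
  })),
];

const baselinePicks = topN(month, (r) => r.baseScore, 10);
const boostedPicks = topN(month, boosted(1.4), 10); // largest multiplier in the range
const unchanged = baselinePicks.join(",") === boostedPicks.join(",");
console.log(unchanged);
```

In this toy setup the best pattern row goes from 0.30 to 0.42 after the boost, still far below the lowest incumbent at 3.0, so the selection is unchanged.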
OOS results (2025+, Top-10/mo, T+90, NET 0.6% RT, winsor +/-50%)
| config | Sharpe | CI95 | CAGR% | DSR |
|---|---|---|---|---|
| V14e_baseline | 1.39 | [-0.45, 4.83] | 50.3 | 0.29 |
| V15a_full +0.05 | 1.39 | [-0.45, 4.83] | 50.3 | 0.29 |
| V15b_strict +0.05 | 1.39 | [-0.45, 4.83] | 50.3 | 0.29 |
| V15c_top3 +0.05 | 1.39 | [-0.45, 4.83] | 50.3 | 0.29 |
| V15d_full +0.03 | 1.39 | [-0.45, 4.83] | 50.3 | 0.29 |
| V15e_full +0.10 | 1.39 | [-0.45, 4.83] | 50.3 | 0.29 |
Note: V14e Sharpe 1.39 here vs 1.31 in V14 bake is reproducibility variance from a slightly different row-count snapshot (60 735 vs 59 952) and the additional sig-computation step. Both pass the gate.
Ship gate
| check | threshold | actual | pass |
|---|---|---|---|
| Sharpe | >= 1.31 | 1.39 | y (but identical to baseline) |
| Sharpe > V14e baseline | strict | 1.39 == 1.39 | no |
| DSR drop | <= 0.30 | 0 | y |
| CI95Lo | >= -2.0 | -0.45 | y |
Verdict: KEEP V14e_tuned_light LIVE. V15 brings no incremental value.
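The gate above can be written down as a small check. This is a sketch under assumptions: the `BakeStats` shape and the `shipGate` function are hypothetical names, and "DSR drop" is assumed to mean baseline DSR minus candidate DSR. The thresholds and the input values are taken from the tables in this audit.

```typescript
// Hypothetical summary shape for one bake; not the real stats-v15.json schema.
interface BakeStats {
  sharpe: number;
  ci95Lo: number;
  dsr: number;
}

interface GateResult {
  check: string;
  pass: boolean;
}

// Evaluate the four ship-gate checks for a candidate against the baseline.
function shipGate(candidate: BakeStats, baseline: BakeStats): GateResult[] {
  return [
    { check: "Sharpe >= 1.31", pass: candidate.sharpe >= 1.31 },
    { check: "Sharpe > baseline (strict)", pass: candidate.sharpe > baseline.sharpe },
    // Assumption: "DSR drop" = baseline DSR minus candidate DSR.
    { check: "DSR drop <= 0.30", pass: baseline.dsr - candidate.dsr <= 0.3 },
    { check: "CI95Lo >= -2.0", pass: candidate.ci95Lo >= -2.0 },
  ];
}

// Values from the OOS results table; every V15 variant matches the baseline.
const v14e: BakeStats = { sharpe: 1.39, ci95Lo: -0.45, dsr: 0.29 };
const v15a: BakeStats = { sharpe: 1.39, ci95Lo: -0.45, dsr: 0.29 };

const gate = shipGate(v15a, v14e);
const ship = gate.every((g) => g.pass);
console.log(gate, ship);
```

With identical stats the strict-improvement check is the only one that fails, which is enough to block the ship.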
Why the pattern miner produced unusable boosts
Two structural issues:
- Look-ahead bias. The miner was fed the full 2023-now cohort and asked which signatures had high mean returns. By construction, the discovered patterns are best-case in-sample. We did not split mining IS / validation OOS. Even if V15 had moved Sharpe, we could not have trusted the lift.
- Bucket-mean optimization is orthogonal to top-10 selection. The scoring stack picks the top-10 ranked rows per month, not the top-10 highest-expected-return. A pattern with mean 47% but n=31 spread over 24 months contributes ~1 row/month, and that row needs to outrank the existing top-10 to matter. Boosting an already-mid-rank row by 5% rarely changes ranking when score variance is high.
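One way to avoid the look-ahead problem in the first bullet is to mine on rows published before a cutoff and keep only patterns that also hold after it. The sketch below is illustrative only: the `Obs` shape, the field names `pub`, `sig` and `ret`, the thresholds and the toy data are all assumptions, not the behaviour of the actual miner.

```typescript
// Hypothetical observation shape; field names are illustrative.
interface Obs {
  pub: string; // publication date as ISO yyyy-mm-dd, so string comparison orders by date
  sig: string; // pattern signature
  ret: number; // realised return as a fraction (0.10 = 10%)
}

interface SigStats {
  n: number;
  mean: number;
}

// Count and mean return per signature.
function statsBySig(rows: Obs[]): Map<string, SigStats> {
  const sums = new Map<string, { n: number; total: number }>();
  for (const r of rows) {
    const s = sums.get(r.sig) ?? { n: 0, total: 0 };
    s.n += 1;
    s.total += r.ret;
    sums.set(r.sig, s);
  }
  const out = new Map<string, SigStats>();
  sums.forEach((s, sig) => out.set(sig, { n: s.n, mean: s.total / s.n }));
  return out;
}

// Keep signatures that clear the thresholds in-sample AND out-of-sample.
function validatedPatterns(rows: Obs[], cutoff: string, minMean: number, minN: number): string[] {
  const inSample = statsBySig(rows.filter((r) => r.pub < cutoff));
  const outSample = statsBySig(rows.filter((r) => r.pub >= cutoff));
  const kept: string[] = [];
  inSample.forEach((s, sig) => {
    const o = outSample.get(sig);
    const holdsIS = s.n >= minN && s.mean >= minMean;
    const holdsOOS = o !== undefined && o.n >= minN && o.mean >= minMean;
    if (holdsIS && holdsOOS) kept.push(sig);
  });
  return kept;
}

// Toy data: A holds up after the cutoff, B looks great in-sample but reverses.
const toy: Obs[] = [
  { pub: "2024-03-01", sig: "A", ret: 0.2 },
  { pub: "2024-06-01", sig: "A", ret: 0.3 },
  { pub: "2025-02-01", sig: "A", ret: 0.1 },
  { pub: "2025-04-01", sig: "A", ret: 0.2 },
  { pub: "2024-03-01", sig: "B", ret: 0.4 },
  { pub: "2024-06-01", sig: "B", ret: 0.5 },
  { pub: "2025-02-01", sig: "B", ret: -0.2 },
  { pub: "2025-04-01", sig: "B", ret: -0.1 },
];

const kept = validatedPatterns(toy, "2025-01-01", 0.05, 2);
console.log(kept);
```

In the toy data, B has the higher in-sample mean but is rejected because it does not hold after the cutoff, which is exactly the case full-cohort mining cannot detect.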
What would actually work (V16+ proposals)
- IS-only pattern mining: rerun mine-patterns.ts filtering pub < 2025-01-01, then validate the surfaced patterns OOS. Ship boosts only for patterns that hold OOS.
- Ranking-aware boost: instead of multiplicative on base score, add a fixed score delta sized to actually move rankings (e.g. +1.0 per matched pattern, which for a typical V13.5 score in [2, 6] is a meaningful bump).
- Pattern-conditioned alternative selection: bypass top-10 ranking entirely for rows matching ultra-high-confidence patterns (n>=100, OOS mean>=15%, OOS win-rate>=60%); add 1-2 "guaranteed" picks per month from these.
- Cross-validate with role / kind dimensions that the V14 backfill will refresh. Spouse/child/parent cohort will grow; rerun mining then.
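The ranking-aware proposal can be compared with the current multiplicative boost on a toy month. This is a rough sketch: the `ScoredRow` shape, the `rankTop` helper and all scores are hypothetical. Only the 1.40 multiplier, the +1.0 delta and the top-10 cut come from this document.

```typescript
// Hypothetical row shape; field names are illustrative.
interface ScoredRow {
  id: string;
  baseScore: number;
  patternMatches: number; // number of stable patterns the row matches
}

type Scorer = (r: ScoredRow) => number;

// Ids of the top-n rows by score, highest first.
function rankTop(rows: ScoredRow[], score: Scorer, n: number): string[] {
  return [...rows]
    .sort((a, b) => score(b) - score(a))
    .slice(0, n)
    .map((r) => r.id);
}

// Current V15 style: multiply the base score when any pattern matches.
const multiplicative = (multiplier: number): Scorer => (r) =>
  r.patternMatches > 0 ? r.baseScore * multiplier : r.baseScore;

// Proposed style: add a fixed delta per matched pattern.
const additive = (delta: number): Scorer => (r) =>
  r.baseScore + delta * r.patternMatches;

// Toy month: 10 incumbents scoring 2.5 to 5.2, plus one pattern-matched challenger.
const rows: ScoredRow[] = [
  ...Array.from({ length: 10 }, (_, i) => ({
    id: `incumbent-${i}`,
    baseScore: 2.5 + i * 0.3,
    patternMatches: 0,
  })),
  { id: "challenger", baseScore: 1.6, patternMatches: 1 },
];

const multPicks = rankTop(rows, multiplicative(1.4), 10);
const addPicks = rankTop(rows, additive(1.0), 10);
console.log(multPicks.includes("challenger"), addPicks.includes("challenger"));
```

Here the multiplicative boost lifts the challenger to about 2.24, still below the weakest incumbent, while the additive delta lifts it to 2.6 and into the top-10. Whether +1.0 is enough in practice depends on the real gap between matched rows and the 10th-ranked score; if matched rows sit near zero, as the earlier section reports, the delta would likely need to be sized against that gap.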
Action
- Do not ship V15. V14e remains live.
- Re-run pattern mining on IS-only after the data backfill completes.
- Prototype V16 with score-delta boost (proposal 2 above) before next sprint.
Files
- scripts/_v13_bakeoff/bake-v15_patterns.ts
- scripts/_v13_bakeoff/stats-v15.json
- scripts/_v13_bakeoff/patterns-stable.json
- /tmp/bake-v15.log