106 - signalScoreV13 (V14e) Distribution Audit - 2026-05-21
Context
After the rescore job ran on 2026-05-20, the T+90 coherence audit flagged near-zero
coverage for signalScoreV13 > 30 over the last 90 days. This document records the
real distribution, explains why thresholds from the legacy signalScore (V12) do
not transfer to V14e, and documents the fix applied.
DB Snapshot (2026-05-21)
Queried on production Neon (DATABASE_URL).
V14e (signalScoreV13) distribution
| Metric | Value |
|---|---|
| Rows scored | 595,069 |
| Min | -100 |
| Max | 19.92 |
| Mean | -90.33 |
| p25 | -100 |
| p50 | -100 |
| p75 | -100 |
| p95 | 2.24 |
Note: -100 is the sentinel value for blacklisted rows (31,937 confirmed in earlier audit). The effective score range for non-blacklisted rows is 0 to 19.92.
V14e coverage breakdown
| Threshold | Row count |
|---|---|
| signalScoreV13 > 0 | 50,402 |
| signalScoreV13 > 1 | 42,168 |
| signalScoreV13 > 5 | 5,585 |
| signalScoreV13 > 10 | 665 |
| signalScoreV13 = -100 (blacklisted) | 538,986 |
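The percentiles and the mean above follow almost entirely from the share of rows sitting on the -100 sentinel. The sketch below redoes that arithmetic from the counts in the two tables. The residual bucket (rows that are neither blacklisted nor strictly positive) is assumed to score exactly 0, which fits the stated 0 to 19.92 range but is not confirmed by the snapshot.

```typescript
// Counts copied from the distribution and coverage tables.
const rowsScored = 595069;
const blacklisted = 538986; // signalScoreV13 = -100
const positive = 50402;     // signalScoreV13 > 0

// Rows that are neither blacklisted nor positive.
// Assumption: these score exactly 0 (not stated in the snapshot).
const residual = rowsScored - blacklisted - positive;

const blacklistedShare = blacklisted / rowsScored;
const positiveShare = positive / rowsScored;

// A percentile p lands on the sentinel whenever at least a fraction p
// of all rows is blacklisted.
function percentileIsSentinel(p: number): boolean {
  return blacklistedShare >= p;
}

console.log(`blacklisted share: ${(blacklistedShare * 100).toFixed(1)}%`);
console.log(`positive share: ${(positiveShare * 100).toFixed(1)}%`);
console.log(`residual rows: ${residual}`);
console.log(`p75 on sentinel: ${percentileIsSentinel(0.75)}`);
console.log(`p95 on sentinel: ${percentileIsSentinel(0.95)}`);
```

With roughly 90.6% of rows on the sentinel, p25 through p75 all read -100 while p95 falls among the positive scores. The sentinel alone contributes about -90.6 to the mean, which is consistent with the reported -90.33 once the positive rows are added back.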
Recent 30-day window
- Total declarations in last 30d: 17,230
- Declarations with signalScoreV13 > 0 in last 30d: 1,496
The rescore job propagated successfully to recent declarations (1,496 active out of 17,230 is consistent with the EU-only filter applied post-fix).
Legacy V12 (signalScore) distribution (reference)
| Metric | Value |
|---|---|
| Min | 0 |
| Max | 73 |
| Mean | 16.37 |
| p50 | 13 |
| p95 | 38 |
Why legacy thresholds do not apply to V14e
The V12 signalScore was calibrated on a 0-100 scale where the majority of eligible
rows scored 10-40 and the best signals reached 60-73. A threshold of signalScore >= 15
was meaningful: it passed roughly 50% of scored rows.
V14e (signalScoreV13) uses a different normalization: the formula produces scores in
[0, 20] for non-blacklisted rows after the rescore (observed max: 19.92). Applying the
legacy >= 15 floor would pass only ~3 rows per market per day instead of the intended
top-N per bucket.
Additionally, blacklisted rows now carry the -100 sentinel (they scored 0 in V12), so
a naive >= 0 check excludes them instead of passing every scored row.
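To make the scale mismatch concrete, here is a minimal sketch that applies the three candidate predicates to a handful of V14e-style scores. The sample array is hypothetical and only mimics the shape of the real distribution (sentinel rows, a zero, a few positives up to the observed max).

```typescript
// Hypothetical V14e scores: three blacklisted rows, one zero, four positives.
const sampleV14e: number[] = [-100, -100, -100, 0, 0.8, 2.24, 6.5, 19.92];

// Count how many scores satisfy a given threshold predicate.
function countPassing(scores: number[], predicate: (s: number) => boolean): number {
  return scores.filter(predicate).length;
}

const legacyFloor = countPassing(sampleV14e, (s) => s >= 15);     // V12-style floor
const nonNegative = countPassing(sampleV14e, (s) => s >= 0);      // drops the sentinel only
const strictlyPositive = countPassing(sampleV14e, (s) => s > 0);  // drops sentinel and zeros

console.log({ legacyFloor, nonNegative, strictlyPositive });
```

On this toy sample the legacy floor keeps a single row, while the two V14e-aware predicates differ only in how they treat rows scored exactly 0.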
Fix applied (2026-05-21)
Updated the BUY candidate SQL in src/lib/recommendation-engine.ts from:
AND d."signalScore" >= 15
to:
AND (
(d."signalScoreV13" IS NOT NULL AND d."signalScoreV13" > 0)
OR (d."signalScoreV13" IS NULL AND d."signalScore" >= 15)
)
This produces a V14e-aware threshold (> 0) for rows that have been rescored, while
preserving the V12 >= 15 floor as fallback for rows where the rescore cron has not yet
run. The COALESCE ranking key (signalScoreV13, signalScore) ensures rescored rows
still sort above un-rescored ones at equivalent quality.
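The same two-branch condition can be mirrored as a TypeScript predicate, which is convenient for unit-testing the threshold logic outside the database. This is a sketch only: the ScoredRow shape and the function name passesBuyThreshold are hypothetical, and only the two column names come from the SQL above.

```typescript
// Hypothetical row shape; only the two score columns appear in the document.
interface ScoredRow {
  signalScoreV13: number | null; // V14e score, null until the rescore has run
  signalScore: number | null;    // legacy V12 score
}

// Mirrors the SQL: rescored rows use the V14e threshold (> 0),
// rows not yet rescored fall back to the V12 floor (>= 15).
function passesBuyThreshold(row: ScoredRow): boolean {
  if (row.signalScoreV13 !== null) {
    return row.signalScoreV13 > 0;
  }
  // A null V12 score fails, matching SQL's NULL >= 15 not being true.
  return row.signalScore !== null && row.signalScore >= 15;
}

console.log(passesBuyThreshold({ signalScoreV13: 2.24, signalScore: 10 }));
console.log(passesBuyThreshold({ signalScoreV13: -100, signalScore: 50 }));
console.log(passesBuyThreshold({ signalScoreV13: null, signalScore: 15 }));
```

Note that the fallback applies only when signalScoreV13 is NULL: a rescored, blacklisted row (-100) is rejected even if its legacy score clears the V12 floor.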
Recommendation for future calibration
Once the backfill is complete (all eligible rows have signalScoreV13 IS NOT NULL),
consider:
- Raising the threshold to signalScoreV13 >= 1 or >= 2 to filter marginal rows.
- Running a precision/recall curve on the EU_strict cohort to find the optimal cutoff.
- Removing the V12 fallback branch once backfill coverage exceeds 99%.
The current > 0 threshold is conservative (passes ~50k rows) and appropriate for
the transition period. With the EU_strict filter active, the effective candidate pool
is already reduced to 4,012 EU rows with signalScoreV13 > 0 in the last 90 days
across 8 markets.
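A precision/recall sweep over candidate cutoffs could look like the sketch below. It assumes each declaration carries a boolean outcome label, here called good; this audit does not define what counts as a good signal, so the label, the type names, and the toy data are all hypothetical.

```typescript
interface LabeledSignal {
  score: number; // signalScoreV13
  good: boolean; // hypothetical outcome label, not defined in this audit
}

interface CutoffMetrics {
  cutoff: number;
  passed: number;
  precision: number;
  recall: number;
}

// For each cutoff, keep rows with score >= cutoff and measure how many
// of the kept rows are good (precision) and how many good rows are kept (recall).
function sweepCutoffs(rows: LabeledSignal[], cutoffs: number[]): CutoffMetrics[] {
  const totalGood = rows.filter((r) => r.good).length;
  return cutoffs.map((cutoff) => {
    const passed = rows.filter((r) => r.score >= cutoff);
    const truePositives = passed.filter((r) => r.good).length;
    return {
      cutoff,
      passed: passed.length,
      precision: passed.length === 0 ? 0 : truePositives / passed.length,
      recall: totalGood === 0 ? 0 : truePositives / totalGood,
    };
  });
}

// Toy data for illustration only.
const toy: LabeledSignal[] = [
  { score: 0.4, good: false },
  { score: 0.9, good: false },
  { score: 1.5, good: true },
  { score: 2.2, good: false },
  { score: 3.1, good: true },
  { score: 6.0, good: true },
];

const curve = sweepCutoffs(toy, [1, 2]);
console.log(curve);
```

Comparing the rows of the resulting curve shows what each candidate cutoff costs in recall for a given gain in precision, which is the trade-off the >= 1 versus >= 2 decision hinges on.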
EU_strict validation query result
SELECT c.market, COUNT(*) AS recos_eu_only
FROM "Declaration" d JOIN "Company" c ON c.id=d."companyId"
WHERE d."signalScoreV13" > 0
AND c.market IN ('XPAR','XAMS','XWBO','XBRU','XHEL','XOSL','XSTO','XETR')
AND d."pubDate" >= NOW() - INTERVAL '90 days'
GROUP BY c.market ORDER BY recos_eu_only DESC;
| market | recos_eu_only |
|---|---|
| XSTO | 1,278 |
| XHEL | 998 |
| XPAR | 897 |
| XAMS | 323 |
| XOSL | 237 |
| XBRU | 160 |
| XWBO | 85 |
| XETR | 34 |
| Total | 4,012 |
Healthy signal density across all 8 EU venues. The per-market ROW_NUMBER partition in the BUY query will surface the best signal from each venue before ranking globally.
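The per-market partition can be illustrated in memory as follows. This is one reading of "best signal from each venue before ranking globally": number rows within each market by descending score, then order globally by that per-market rank and break ties by score. The actual ROW_NUMBER clause is not shown in this audit, so the ordering rule, the type names, and the sample rows are assumptions.

```typescript
interface Candidate {
  id: string;
  market: string;
  score: number; // signalScoreV13
}

interface RankedCandidate extends Candidate {
  marketRank: number; // 1 = best signal within its market
}

// Emulates ROW_NUMBER() OVER (PARTITION BY market ORDER BY score DESC),
// followed by a global ordering on (marketRank ASC, score DESC).
function rankPerMarketThenGlobal(rows: Candidate[]): RankedCandidate[] {
  const byMarket: Record<string, Candidate[]> = {};
  rows.forEach((r) => {
    if (!byMarket[r.market]) byMarket[r.market] = [];
    byMarket[r.market].push(r);
  });

  const ranked: RankedCandidate[] = [];
  Object.keys(byMarket).forEach((market) => {
    byMarket[market]
      .slice()
      .sort((a, b) => b.score - a.score)
      .forEach((r, i) => {
        ranked.push({ id: r.id, market: r.market, score: r.score, marketRank: i + 1 });
      });
  });

  return ranked.sort((a, b) => a.marketRank - b.marketRank || b.score - a.score);
}

// Hypothetical sample rows.
const ordered = rankPerMarketThenGlobal([
  { id: "a", market: "XSTO", score: 8.0 },
  { id: "b", market: "XSTO", score: 6.0 },
  { id: "c", market: "XETR", score: 1.2 },
  { id: "d", market: "XPAR", score: 4.0 },
]);
console.log(ordered.map((r) => r.id).join(","));
```

Under this rule the top row of a thin venue such as XETR is listed ahead of the second-best row of a dense venue such as XSTO, even when the latter has the higher score.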