75. Multi-market backtest + scoring extension (2026-05-19)
Status: in-flight (rescore-null + bt-v3 recompute running locally).
Executive summary
The mission claim "only XPAR has backtest coverage (9977 r365d; XNAS=11, XLON=0, XETR=0)" was based on a stale or mis-grouped query. Live audit shows the opposite: SEC (US) is the single largest source of backtested rows (229,923 r365d insider-view) and the AMF (XPAR) cohort is 18,597.
The real production gap is not pricing, it is scoring. Across 30 source markets only AMF and SEC have meaningfully populated signalScore. 290k+ rows in CVM / EDINET / SEBI / DART / SZSE / SSE / HKEX / CONSOB / CNMV / HEL / FI / DK / AFM / FMA / SIX / OSLO / BAFIN / RNS / SEDI / ASX / FSMA had pdfParsed=true and totalAmount present but never went through scoreDeclarations. The downstream effect: /recommendations and /performance appeared FR+US-only.
Live coverage before this session (audit)
scripts/_audit-bt-coverage.mjs output (selected rows, sorted by declaration count):
| market | decls | bt | priced | r90_ins | r365_ins | r90_pub | r365_pub | yh/co |
|---|---|---|---|---|---|---|---|---|
| SEC | 324942 | 241081 | 241081 | 235234 | 229923 | 1396 | 1398 | 6693/6792 |
| CVM | 114417 | 45987 | 45987 | 43872 | 37267 | 42366 | 35078 | 158/541 |
| EDINET | 29142 | 7875 | 7875 | 7848 | 7848 | 7848 | 7848 | 2080/5518 |
| AMF | 27012 | 23250 | 23250 | 22072 | 18597 | 20656 | 17163 | 521/711 |
| SEBI | 12939 | 11936 | 11936 | 11864 | 11861 | 11862 | 11862 | 670/698 |
| HEL | 10000 | 7878 | 7878 | 6858 | 5947 | 2174 | 1818 | 195/217 |
| DART | 10000 | 7755 | 7755 | 165 | 47 | 0 | 0 | 2772/2870 |
| DK | 8993 | 6011 | 6011 | 5834 | 5435 | 2500 | 2380 | 143/230 |
| SZSE | 8041 | 8041 | 8041 | 7767 | 6637 | 7767 | 6633 | 1716/1716 |
| FI | 6884 | 5653 | 5653 | 4419 | 1141 | 4053 | 1004 | 1426/1685 |
| AFM | 6392 | 5988 | 5988 | 5673 | 4969 | 263 | 214 | 119/121 |
| SSE | 5863 | 5843 | 5843 | 5515 | 4407 | 5515 | 4403 | 1069/1069 |
| CNMV | 5760 | 5509 | 5509 | 4978 | 3972 | 2607 | 2147 | 325/349 |
| CONSOB | 5649 | 2932 | 2932 | 2492 | 1804 | 2152 | 1603 | 810/961 |
| FMA | 4220 | 2152 | 2152 | 2079 | 1947 | 1083 | 1050 | 90/146 |
| PSE | 3228 | 0 | 0 | 0 | 0 | 0 | 0 | 7/210 |
| HKEX | 2926 | 2922 | 2922 | 108 | 14 | 0 | 0 | 521/521 |
| KNF | 2348 | 0 | 0 | 0 | 0 | 0 | 0 | 105/191 |
| SIX | 1716 | 1090 | 1090 | 937 | 937 | 745 | 745 | 166/184 |
| ASX | 1406 | 1117 | 1117 | 8 | 2 | 0 | 0 | 786/791 |
| TADAWUL | 1085 | 0 | 0 | 0 | 0 | 0 | 0 | 255/261 |
| FSMA | 906 | 906 | 906 | 668 | 506 | 0 | 0 | 23/23 |
| OSLO | 603 | 495 | 495 | 141 | 5 | 17 | 0 | 179/186 |
| BAFIN | 295 | 246 | 246 | 216 | 0 | 4 | 0 | 287/289 |
| RNS | 176 | 173 | 173 | 0 | 0 | 0 | 0 | 56/56 |
PSE / KNF / TADAWUL show 0 priced because they have 0 pdfParsed && totalAmount && transactionNature != null declarations (parser still TODO upstream).
Backtesting them is moot until ingestion fills the eligible pool.
DART / HKEX / RNS show priced but near-zero r90 / r365 because their entire declaration cohort has pubDate >= 2026-Q1 (less than 90/365 days forward, so the horizons have not matured). This is expected and self-heals as time passes.
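The maturity condition behind those near-zero r90 / r365 counts can be sketched as a plain date check. This is a minimal illustration, not the project's code: `isHorizonMatured` and its parameters are hypothetical names, and it assumes a horizon counts as matured once `horizonDays` full days have elapsed since `pubDate`.

```typescript
// Hypothetical helper: name and signature are illustrative, not from the codebase.
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** True when at least `horizonDays` days have elapsed between `pubDate` and `asOf`. */
function isHorizonMatured(pubDate: Date, horizonDays: number, asOf: Date): boolean {
  const elapsedDays = (asOf.getTime() - pubDate.getTime()) / MS_PER_DAY;
  return elapsedDays >= horizonDays;
}

const auditDate = new Date("2026-05-19T00:00:00Z");

// Published 2026-03-01: 79 days before the audit, so neither horizon is available yet.
const marchPub = new Date("2026-03-01T00:00:00Z");
const marchR90 = isHorizonMatured(marchPub, 90, auditDate);   // false
const marchR365 = isHorizonMatured(marchPub, 365, auditDate); // false

// Published 2026-01-05: 134 days before the audit, so r90 has matured but r365 has not.
const januaryPub = new Date("2026-01-05T00:00:00Z");
const januaryR90 = isHorizonMatured(januaryPub, 90, auditDate);   // true
const januaryR365 = isHorizonMatured(januaryPub, 365, auditDate); // false
```

A cohort published entirely in 2026 therefore cannot have any r365 rows until 2027, whatever the price coverage.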
signalScore distribution before this session
scripts/_audit-pdfparsed.mjs output (selected rows):
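| market | eligible | scored | has_score | note |
|---|---|---|---|---|
| SEC | 294489 | 126366 | 126366 | |
| AMF | 25692 | 25692 | 25692 | |
| CVM | 114417 | 0 | 0 | |
| EDINET | 16061 | 0 | 0 | |
| SEBI | 12906 | 0 | 0 | |
| DART | 3377 | 0 | 0 | |
| HKEX | 1572 | 0 | 0 | |
| SZSE | 0 | 0 | 0 | totalAmount=null universally |
| SSE | 0 | 0 | 0 | totalAmount=null universally |
| KNF | 0 | 0 | 0 | pdfParsed=false universally |
| TADAWUL | 0 | 0 | 0 | pdfParsed=false universally |
| ... | | | | |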
So 290k+ "eligible" rows had never gone through scoring. The weekly cron only processed 5k per Sunday run, which translates to ~58 weeks to drain the backlog (and it would re-grow daily as new declarations land).
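The ~58-week figure is just backlog divided by per-run capacity. A minimal sketch of that arithmetic, where `weeksToDrain` and the optional `weeklyInflow` parameter are hypothetical (the inflow rate is not measured in this audit):

```typescript
// Hypothetical helper: illustrates the drain-time arithmetic only.
function weeksToDrain(backlog: number, rowsPerRun: number, weeklyInflow = 0): number {
  const netPerWeek = rowsPerRun - weeklyInflow;
  // If new rows arrive at least as fast as they are scored, the backlog never drains.
  if (netPerWeek <= 0) return Infinity;
  return Math.ceil(backlog / netPerWeek);
}

const weeksWithoutInflow = weeksToDrain(290_000, 5_000); // 58
// Illustrative inflow value only: any inflow >= 5k/week means the backlog never clears.
const weeksWithMatchingInflow = weeksToDrain(290_000, 5_000, 5_000); // Infinity
```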
Root causes
- `/api/cron/weekly-rescore` was capped at 5k/run via `Math.min(limit, 5_000)`, but that value was passed as `batchSize`, not as a row cap. `scoreDeclarations` actually scanned ALL pending rows, so a single run would hit the time limit on Vercel's 300s budget when the backlog was large; the cap was symbolic.
- `getPersonalRecommendations` (smart mix) gated on `signalScore > 35`, which the non-AMF/non-SEC universe almost never reaches because their bucket-shrinkage priors compress scores to ~25-35.
- `getBuyRecommendations` had a `take: limit * 8` and `orderBy: signalScore desc`, which saturated the candidate pool with AMF+SEC even after the floor was relaxed to 15 in v6.1. Top-80 sample was 71 AMF / 9 SEC, 0 elsewhere.
Fixes shipped this session
- `src/lib/signals.ts`: added `maxRows` + `deadlineMs` to `ScoreOptions` so long-running rescore loops stop gracefully under Vercel limits.
- `src/app/api/cron/weekly-rescore/route.ts`: first pass now uses `maxRows: min(limit, 20_000)` and a wall-clock deadline at half the hard limit. Stale pass uses `maxRows: min(limit, 10_000)`.
- `src/app/api/cron/route.ts`: daily catch-up scoring now caps at 8k rows / 90s so it never starves the rest of the pipeline.
- `src/lib/recommendation-engine.ts`:
  - `getPersonalRecommendations`: floor lowered 35 → 15 (matches general path).
  - `getBuyRecommendations`: candidate pool widened from `limit * 8` to `limit * 20`, plus a per-market diversification cap of `max(4, limit / 2)` applied before slug dedup so DART / HKEX / SIX / OSLO / CONSOB / CNMV / HEL signals can surface even when their score caps below the AMF top tier.
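The difference between a page size and a real row cap is easiest to see in a loop. The sketch below is an assumption about how `maxRows` and `deadlineMs` could bound a scoring run, not the code in `src/lib/signals.ts`: `runBoundedScoring`, `fetchPending`, `scoreBatch`, and `RunResult` are hypothetical, `deadlineMs` is treated as an absolute epoch-ms timestamp, and the loop is synchronous for brevity where the real one would be async.

```typescript
// Sketch only: option names follow the text; everything else is hypothetical.
interface ScoreOptions {
  batchSize: number;   // rows fetched per page (NOT a cap on total work)
  maxRows?: number;    // hard cap on rows processed in this run
  deadlineMs?: number; // assumed: absolute wall-clock deadline, epoch ms
}

interface RunResult {
  processed: number;
  stoppedBy: "drained" | "maxRows" | "deadline";
}

function runBoundedScoring(
  fetchPending: (take: number) => number[], // returns ids of still-pending rows
  scoreBatch: (ids: number[]) => void,      // scores rows so they leave the pending set
  opts: ScoreOptions,
  now: () => number = Date.now,
): RunResult {
  const maxRows = opts.maxRows ?? Infinity;
  let processed = 0;
  while (true) {
    // Check the deadline before starting another batch so the run exits cleanly.
    if (opts.deadlineMs !== undefined && now() >= opts.deadlineMs) {
      return { processed, stoppedBy: "deadline" };
    }
    const remaining = maxRows - processed;
    if (remaining <= 0) return { processed, stoppedBy: "maxRows" };
    const ids = fetchPending(Math.min(opts.batchSize, remaining));
    if (ids.length === 0) return { processed, stoppedBy: "drained" };
    scoreBatch(ids);
    processed += ids.length;
  }
}

// In-memory stand-in for the pending queue.
const pendingQueue = Array.from({ length: 50 }, (_, i) => i);
const cappedRun = runBoundedScoring(
  (take) => pendingQueue.slice(0, take),
  (ids) => { pendingQueue.splice(0, ids.length); },
  { batchSize: 8, maxRows: 20 },
);
// cappedRun.processed is 20 and 30 rows stay pending for the next run.
```

With only `batchSize` set, the same loop keeps paging until the queue is empty, which is the behaviour that made the old 5k cap symbolic.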
Backfills run this session (2026-05-19)
- `scripts/rescore-null.ts` (full-corpus rescore-null) processing 415,409 pending rows. Local throughput ~25-35k rows/hour. Kept running in background; partial progress after 5 minutes was ~21k rows scored. Full ETA ~12-15h, idempotent (safe to restart). At checkpoint: `scored=195053 / parsed=577228 scored_last_hour=33297`
- `scripts/recompute-backtest-v3.mjs --market=ALL --workers=30` finished in 33 seconds. Computed: 712 new priced rows, 89,889 no-price (most of which point at symbols already in the `/tmp/bt-v3/symbol-blacklist.json` set of 1,671 dead Yahoo tickers). Per-market delta vs the audit table above:

| market | bt before | bt after | delta |
|---|---|---|---|
| SEC | 241,081 | 324,262 | +83,181 |
| EDINET | 7,875 | 8,603 | +728 |
| CVM | 45,987 | 47,336 | +1,349 |
| SEBI | 11,936 | 12,294 | +358 |
| AMF | 23,250 | 24,291 | +1,041 |
| FMA | 2,152 | 2,661 | +509 |
| SIX | 1,090 | 1,531 | +441 |
| FI | 5,653 | 6,090 | +437 |
| DART | 7,755 | 7,927 | +172 |
| OSLO | 495 | 542 | +47 |
| BAFIN | 246 | 293 | +47 |

The "noPrice" delta is rows that resolve to symbols on the blacklist (delisted ADRs, defunct tickers). They get an empty BacktestResult row inserted so they are no longer re-attempted on every cron tick.
Markets still blocked (upstream ingestion)
| Market | Pending fix |
|---|---|
| PSE | PDF parser yields 12/3228 parsed. Filing format needs revisit. |
| KNF | Source ingestion 0/2348 parsed (pdfParsed=0 → no path to score). ESPI feed parser TODO. |
| TADAWUL | Source ingestion 0/1085 parsed. Saudi PDF layout TODO. |
| SZSE | 8041 rows but totalAmount=null universally. CN format issue. |
| SSE | Same as SZSE. CN value normalization needed. |
These are documented as separate work items, out of scope for this audit.
Post-fix sample (proof: non-FR winners now visible)
scripts/_audit-reco-window.mjs simulates getBuyRecommendations after the ROW_NUMBER() PARTITION rewrite. Pool composition for limit=10:
{ AMF: 5, SEC: 5, CONSOB: 5, FI: 5, IE: 1, FMA: 5, DK: 4, HEL: 5,
FSMA: 5, OSLO: 5, AFM: 5, HKEX: 5, DART: 5, CNMV: 5, RNS: 5, SIX: 5, SEDI: 1 }
Total candidates: 80, across 17 markets.
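The per-market cap that produces a pool shaped like this can be sketched in memory. This is an illustrative equivalent of the `ROW_NUMBER()` partition approach, not the query in `src/lib/recommendation-engine.ts`: `Candidate` and `capPerMarket` are hypothetical names, and rounding `limit / 2` down with `Math.floor` is an assumption.

```typescript
// Hypothetical shape: only the fields needed for the cap and the dedup.
interface Candidate {
  market: string;
  slug: string;
  signalScore: number;
}

function capPerMarket(candidates: Candidate[], limit: number): Candidate[] {
  // Cap from the text: max(4, limit / 2). Flooring the division is an assumption.
  const perMarketCap = Math.max(4, Math.floor(limit / 2));
  const ranked = [...candidates].sort((a, b) => b.signalScore - a.signalScore);

  // Step 1: keep at most perMarketCap rows per market, best scores first.
  const takenPerMarket = new Map<string, number>();
  const capped: Candidate[] = [];
  for (const c of ranked) {
    const taken = takenPerMarket.get(c.market) ?? 0;
    if (taken >= perMarketCap) continue;
    takenPerMarket.set(c.market, taken + 1);
    capped.push(c);
  }

  // Step 2: slug dedup runs after the cap, keeping the highest-scored row per slug.
  const seenSlugs = new Set<string>();
  return capped.filter((c) => {
    if (seenSlugs.has(c.slug)) return false;
    seenSlugs.add(c.slug);
    return true;
  });
}

const samplePool: Candidate[] = [
  ...Array.from({ length: 10 }, (_, i) => ({ market: "AMF", slug: `amf-${i}`, signalScore: 90 - i })),
  { market: "SEC", slug: "sec-a", signalScore: 50 },
  { market: "SEC", slug: "sec-a", signalScore: 49 }, // same slug, removed by the dedup
  { market: "DART", slug: "dart-a", signalScore: 30 },
];
const diversifiedPool = capPerMarket(samplePool, 10);
// With limit=10 the cap is 5: AMF contributes 5 rows instead of 10, and DART survives.
```

Without the cap, a score-descending `take` of the same pool would fill with AMF rows before a score-30 DART row was ever considered.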
Top non-AMF / non-SEC sample (5 winners that the old engine excluded):
| Market | Score | Company | Role | PubDate |
|---|---|---|---|---|
| SEDI | 38 | Heritage Mining Ltd. | Director | 2026-05-15 |
| DART | 36 | MCNEX (엠씨넥스) | VP (부사장) | 2026-05-12 |
| RNS | 33 | Robert Walters plc | NED | 2026-05-12 |
| CNMV | 29 | Pharma Mar, S.A. | Consejero | 2026-04-22 |
| CONSOB | 29 | Italian Wine Brands SPA | Presidente | 2026-04-24 |
The tail also includes HKEX MS Group, AFM Digi Communications, FSMA Barco, FMA BAWAG, OSLO Bouvet, DK Gubra, IE Flutter Entertainment, FI Calviks AB.
Validation steps
- After rescore completes, run `node scripts/_audit-cross-market-signals.mjs` to confirm all 18 active markets have >0 signalScore-populated rows.
- Run `node scripts/_audit-top80.mjs` and check the market distribution includes at least DART, SIX, HKEX, CONSOB, RNS in addition to AMF + SEC.
- Hit `/recommendations?mode=general&limit=10` and verify at least 3 markets appear in the rendered cards (per-market diversification cap).
Caveats
- The mission specified XPAR / XNAS / XLON / XETR MIC codes; the codebase uses `amfId` prefixes (`AMF`, `SEC`, `RNS`, `BAFIN`, …) which roughly map to Paris / US (SEC composite) / London (RNS) / Frankfurt (BAFIN). A flat MIC -> prefix mapping is documented in `docs/method-review/74-cross-market-unification-2026-05-19.md`.
- V13_ensemble scoring still hard-excludes CVM and CNMV at the `computeV13Score` level (penny-stock dispersion). Their `signalScore` is populated by the v3 composite path and will show up in recommendations, but V13-ranked selections will skip them. This is intentional and pre-existing.
- Yahoo daily price API has no SLA. The 200ms throttle in `src/lib/price-history.ts` and 32-cap concurrency in `recompute-backtest-v3.mjs` keep us under their soft limit; persistent 404 symbols are blacklisted to `/tmp/bt-v3/symbol-blacklist.json`.
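For the first caveat, the rough MIC-to-prefix correspondence can be written as a lookup. This sketch covers only the four pairs named in this note; the authoritative mapping is the one in `docs/method-review/74-cross-market-unification-2026-05-19.md`, and `MIC_TO_PREFIX` and `prefixForMic` are hypothetical names.

```typescript
// Only the four pairs named in this note; the mapping is approximate
// (SEC is a US composite, not a single exchange).
const MIC_TO_PREFIX: Record<string, string> = {
  XPAR: "AMF",
  XNAS: "SEC",
  XLON: "RNS",
  XETR: "BAFIN",
};

/** Returns the amfId prefix for a MIC code, or undefined if the MIC is not mapped here. */
function prefixForMic(mic: string): string | undefined {
  const key = mic.trim().toUpperCase();
  return Object.prototype.hasOwnProperty.call(MIC_TO_PREFIX, key) ? MIC_TO_PREFIX[key] : undefined;
}

const parisPrefix = prefixForMic("xpar"); // "AMF"
const unmappedPrefix = prefixForMic("XTKS"); // undefined: not listed in this sketch
```

Queries grouped by MIC rather than by prefix are one plausible way to end up with the mis-grouped counts in the original mission claim.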
Follow-ups
- Add a Datadog / Sentry counter so we notice if any market's `signalScore = null` rate creeps back above 5%.
- Per-market backtest stats endpoint (`/api/v1/perf/markets`) so the homepage can show "X signals captured in Y, Z, W markets" rather than a single FR-only number. Tracked in `STRATEGY_PROOF` work.
- Walk-forward refit when SEC + CVM + EDINET reach >5k r365d each (already satisfied for SEC; CVM at 35k; EDINET 7.8k). The walk-forward refit pipeline in `src/lib/scoring/walk-forward.ts` accepts the pooled universe.
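The null-rate alert in the first follow-up reduces to a per-market ratio check. A minimal sketch, assuming the monitor receives the same `eligible` / `scored` counts the audit script prints; `marketsAboveNullRate` and its types are hypothetical, and emitting the result to Datadog or Sentry is left out.

```typescript
interface MarketScoreCounts {
  market: string;
  eligible: number; // rows that qualify for scoring
  scored: number;   // rows with a non-null signalScore
}

interface NullRateAlert {
  market: string;
  nullRate: number; // share of eligible rows still missing a signalScore
}

function marketsAboveNullRate(rows: MarketScoreCounts[], threshold = 0.05): NullRateAlert[] {
  return rows
    .filter((r) => r.eligible > 0) // markets with no eligible rows have no defined rate
    .map((r) => ({ market: r.market, nullRate: (r.eligible - r.scored) / r.eligible }))
    .filter((a) => a.nullRate > threshold);
}

// Counts taken from the pre-session signalScore audit in this note.
const nullRateAlerts = marketsAboveNullRate([
  { market: "SEC", eligible: 294489, scored: 126366 },
  { market: "AMF", eligible: 25692, scored: 25692 },
  { market: "CVM", eligible: 114417, scored: 0 },
  { market: "SZSE", eligible: 0, scored: 0 },
]);
// Flags SEC and CVM; AMF is fully scored and SZSE has no eligible rows.
```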