47 · Site state audit + 20 improvement leads

Date: 2026-05-17
Author: read-only audit agent (no code changed)
Scope: snapshot the live state of the codebase, data, content, and infrastructure, then propose the top 20 improvement levers ranked by impact-over-effort.

Sources cited inline as file:line where applicable. Anything not verifiable from this read-only pass is marked "unverified".

A. Data coverage

Per-source row counts come from docs/method-review/41-history-depth-audit.md (last full snapshot 2026-05-17). Sample sufficient = ≥100 rows.

Source	Rows	Oldest	Newest	Depth	Sample OK?	Notes
AMF (FR)	25,733	2020-03-17	2026-05-14	6.2y	yes	gold standard
SEC Form 4 (US)	156,888	2021-05-03	2026-05-15	5.0y	yes	could push to 20y
BaFin (DE)	295	2025-05-02	2026-05-15	1.0y	yes	rolling 12mo cap
SIX SER (CH)	226	2026-04-15	2026-05-15	0.1y	yes	RSS 30d cap
RNS (UK)	192	2026-05-12	2026-05-16	0.0y	yes	backfill 49 mo possible
SEDI (CA)	68	2026-04-30	2026-05-16	0.0y	borderline	ceo.ca 30d cap
CONSOB (IT)	360	2026-04-02	2026-05-15	0.1y	yes	backfill 59 mo possible
CNMV (ES)	101	2026-04-16	2026-05-15	0.1y	yes	backfill 60 mo possible
AFM (NL)	6,392	2006-11-01	2026-05-13	19.5y	yes	bulk XML, exceptional
FSMA (BE)	906	2020-03-31	2026-05-15	6.1y	yes	50/issuer rolling
Oslo Børs (NO)	601	2025-12-23	2026-05-15	0.4y	yes	backfill 56 mo possible
Nasdaq Helsinki (FI)	114	2026-05-13	2026-05-16	0.0y	yes	backfill 61 mo
Nasdaq Copenhagen (DK)	182	2026-02-25	2026-05-15	0.2y	yes	backfill 58 mo
ASX (AU)	202	2026-05-13	2026-05-15	0.0y	yes	Markit ~3 mo cap
FMA (AT)	50	2026-03-31	2026-05-15	0.1y	no	thin, backfill 59 mo
Euronext Dublin (IE)	2	2026-05-12	2026-05-16	0.0y	no	live but empty
SGX (SG)	1	2026-05-17	2026-05-17	0.0y	no	live but empty
CVM (BR)	29,307	2025-02-01	2026-05-15	1.3y	yes	backfill 45 mo, top ROI
DART (KR)	672 (per task brief)	unverified	unverified	unverified	yes (per brief)	history audit run before fresh ingest
EDINET (JP)	971 (per task brief)	unverified	unverified	unverified	yes (per brief)	key live, backfill in flight
SEBI (IN)	0	n/a	n/a	0	no	first ingest pending
HKEX	unverified	unverified	unverified	unverified	unverified	seed mentioned, not in history audit

Aggregate (verified from audit doc): ~221k rows across 18 sources with measurable history. The task brief claims 162,615 declarations and 8.3k companies; the declaration count is plausible for the DIRIGEANTS view only, since SEC Form 4 lives in its own table. Both numbers should be restated as "~220k filings, ~8.3k companies" on the landing page to remove ambiguity.
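The aggregate can be re-derived from the table rather than taken on trust. A minimal sketch that sums the 18 verified rows above; the variable names are illustrative and the counts are copied from the table, not re-queried from the database:

```typescript
// Row counts copied from the per-source table above (verified sources only;
// DART, EDINET, SEBI and HKEX are excluded because they lack measurable history).
const secForm4Rows = 156888;

const verifiedSources: { source: string; rows: number }[] = [
  { source: "AMF (FR)", rows: 25733 },
  { source: "SEC Form 4 (US)", rows: secForm4Rows },
  { source: "BaFin (DE)", rows: 295 },
  { source: "SIX SER (CH)", rows: 226 },
  { source: "RNS (UK)", rows: 192 },
  { source: "SEDI (CA)", rows: 68 },
  { source: "CONSOB (IT)", rows: 360 },
  { source: "CNMV (ES)", rows: 101 },
  { source: "AFM (NL)", rows: 6392 },
  { source: "FSMA (BE)", rows: 906 },
  { source: "Oslo Børs (NO)", rows: 601 },
  { source: "Nasdaq Helsinki (FI)", rows: 114 },
  { source: "Nasdaq Copenhagen (DK)", rows: 182 },
  { source: "ASX (AU)", rows: 202 },
  { source: "FMA (AT)", rows: 50 },
  { source: "Euronext Dublin (IE)", rows: 2 },
  { source: "SGX (SG)", rows: 1 },
  { source: "CVM (BR)", rows: 29307 },
];

// Total across all verified sources, and the same total without SEC Form 4.
const totalRows = verifiedSources.reduce((sum, s) => sum + s.rows, 0);
const rowsWithoutSecForm4 = totalRows - secForm4Rows;

console.log(verifiedSources.length, totalRows, rowsWithoutSecForm4);
```

Keeping the derivation next to the KPI makes it easy to see how far each candidate figure sits from the brief's 162,615.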

Sources with <100 rows that are publicly listed as covered markets: SEDI (68), FMA-AT (50), Euronext Dublin (2), SGX (1), SEBI (0). All of these appear in homepage copy (src/app/page.tsx:34, src/components/landing/LandingHero.tsx:147) and on regulator cards on / (commit d72aa72 says "all 16 markets LIVE on cards"). For Dublin, Singapore, and SEBI this is premature.

B. Ingestion health

Cron topology (from vercel.json):

Hourly sync-latest (legacy AMF heartbeat).
Three regional wave-cron triggers, each fired 3x/day in UTC:
- wave-eu at 06:05 / 14:05 / 22:05
- wave-americas at 06:25 / 14:25 / 22:25
- wave-apac at 06:45 / 14:45 / 22:45
sec-form4 separate 3x/day cron (05:30 / 13:30 / 21:30).
Individual fallback crons still wired for each regulator (BAFIN, RNS, SEDI, CONSOB, CNMV, FMA-AT, FI-SE, FSMA, ASX, DK, AFM, OSLO-NO, HEL-FI, Euronext-IE, DART-KR, CVM-BR, SEBI-IN). Mild duplication.
enrich-new (logo, description, gender) at 07:00 / 15:00.
sources-watchdog daily at 09:30 (alerts on stale sources).
alerts-realtime every 15 min, plus daily/weekly digest.
backtest/compute weekly Sun 05:00.
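For reference, the 3x/day UTC slots listed above map onto standard five-field cron expressions. A minimal sketch, assuming a hypothetical toCronExpression helper (not part of the codebase) and that all slots of one trigger share the same minute, as they do in the topology above:

```typescript
// Convert UTC "HH:MM" slots that share one minute into a single
// five-field cron expression (minute hour day-of-month month day-of-week).
function toCronExpression(slots: string[]): string {
  const parsed = slots.map((slot) => {
    const match = /^(\d{2}):(\d{2})$/.exec(slot);
    if (!match) throw new Error(`invalid slot: ${slot}`);
    const hour = Number(match[1]);
    const minute = Number(match[2]);
    if (hour > 23 || minute > 59) throw new Error(`out of range: ${slot}`);
    return { hour, minute };
  });

  const first = parsed[0];
  if (first === undefined) throw new Error("at least one slot is required");
  if (!parsed.every((p) => p.minute === first.minute)) {
    throw new Error("all slots must share the same minute");
  }

  const hours = parsed
    .map((p) => p.hour)
    .sort((a, b) => a - b)
    .join(",");
  return `${first.minute} ${hours} * * *`;
}

// Slots taken from the cron topology above.
const waveEu = toCronExpression(["06:05", "14:05", "22:05"]);
const waveApac = toCronExpression(["06:45", "14:45", "22:45"]);
const secForm4 = toCronExpression(["05:30", "13:30", "21:30"]);
console.log(waveEu, waveApac, secForm4);
```

Comparing generated expressions against vercel.json is one cheap way to spot a per-regulator cron that fires in the same slot as its wave trigger.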

Last 3 runs per source / 7-day success rate: powered by the IngestionRun table and exposed at /admin/sources/page.tsx:84 (last 10 runs as sparklines) and /status (public). Live values were not captured in this read-only pass and are flagged as unverified, but the infrastructure appears correct and per-source success-rate badges render on the admin source page. Recommended: have a separate agent run a production curl on /admin/sources and capture the JSON.
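The 7-day success rate is a simple ratio per source. A minimal sketch of that computation, assuming a hypothetical row shape (source, startedAt, ok); the real IngestionRun columns were not inspected in this pass:

```typescript
// Hypothetical row shape: the actual IngestionRun columns may differ.
interface IngestionRunRow {
  source: string;
  startedAt: Date;
  ok: boolean;
}

// Share of successful runs per source over the trailing window.
// Sources with no run inside the window are absent from the result.
function successRateBySource(
  runs: IngestionRunRow[],
  now: Date,
  windowDays = 7,
): Record<string, number> {
  const cutoff = now.getTime() - windowDays * 24 * 60 * 60 * 1000;
  const tally: Record<string, { ok: number; total: number }> = {};

  for (const run of runs) {
    if (run.startedAt.getTime() < cutoff) continue; // outside the window
    const entry = tally[run.source] ?? (tally[run.source] = { ok: 0, total: 0 });
    entry.total += 1;
    if (run.ok) entry.ok += 1;
  }

  const rates: Record<string, number> = {};
  for (const source of Object.keys(tally)) {
    const entry = tally[source];
    if (entry) rates[source] = entry.ok / entry.total;
  }
  return rates;
}
```

A source that is missing from the result has not run at all in the window, which is itself a signal the sources-watchdog should treat as stale.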

Drift risk: the per-source crons that are listed in vercel.json alongside the wave-* crons can over-ingest if both are kept. Consider deleting the per-regulator crons once the wave triggers are proven stable.

C. Code quality

npx tsc --noEmit → 19 errors in 6 files. Top offender: missing prisma.jpFiling accessor in src/lib/ingest/jp-edinet.ts:439 (model declared in migrations but Prisma client not regenerated, or schema.prisma not updated). TS2339 ×13, TS7006 ×6.
npm run lint:emdash → green, "no em-dashes in user-facing copy".
npm run lint:emoji → red, 26 violations. Visible offender: src/lib/email.ts:833 ("Top X signaux vente"). Purge agent flagged as in-flight in task brief.
Prisma migration drift: migration prisma/migrations/20260517200000_blog_articles/migration.sql creates BlogArticle but prisma/schema.prisma has no BlogArticle model. Other DB tables (*Filing migrations for KR, JP, HK, SG, IN) likewise appear in migrations only; the missing schema.prisma models are the source of the TS2339 errors. Either run prisma db pull to regenerate the schema (followed by prisma generate to rebuild the client) or hand-add the models.
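The drift can be detected mechanically by comparing model names in schema.prisma with table names created in the migration SQL. A minimal sketch, assuming model names equal table names (no @@map) and quoted CREATE TABLE statements as Prisma emits for Postgres; the function names are illustrative:

```typescript
// Collect every capture group 1 match of a global regex in a source string.
function captureAll(re: RegExp, source: string): string[] {
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = re.exec(source)) !== null) {
    const name = match[1];
    if (name && found.indexOf(name) === -1) found.push(name);
  }
  return found;
}

// Model names declared as `model Name {` in a schema.prisma source.
function modelsInSchema(schemaSource: string): string[] {
  return captureAll(/^\s*model\s+(\w+)\s*\{/gm, schemaSource);
}

// Table names created by `CREATE TABLE "Name"` in one migration.sql source.
function tablesInMigration(sql: string): string[] {
  return captureAll(/CREATE TABLE\s+(?:IF NOT EXISTS\s+)?"(\w+)"/gi, sql);
}

// Tables that migrations create but schema.prisma does not declare.
function tablesMissingFromSchema(schemaSource: string, migrations: string[]): string[] {
  const models = modelsInSchema(schemaSource);
  const missing: string[] = [];
  for (const sql of migrations) {
    for (const table of tablesInMigration(sql)) {
      if (models.indexOf(table) === -1 && missing.indexOf(table) === -1) {
        missing.push(table);
      }
    }
  }
  return missing;
}
```

Run in CI against the file contents, a non-empty result would catch this class of drift before tsc does. Implicit many-to-many join tables would show up as false positives under the stated assumption.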

D. Feature inventory

57 distinct page.tsx files (find src/app -name page.tsx | wc -l).

Public, indexable

/, /methodologie, /performance, /heatmap, /companies, /companies/by-market/[market], /companies/by-sector/[sector], /company/[slug], /insider/[slug], /insiders, /insiders/by-market/[market], /insiders/by-role/[role], /leaderboard/insiders, /top-movers, /earnings-radar, /clusters/recent, /recommendations, /backtest, /portfolio (mixed), /hubs/cluster-signals, /hubs/insider-buying-this-week, /compare/openinsider, /compare/quiver-quant, /use-cases/quant-fund, /pricing, /blog, /blog/[slug], /blog/category/[key], /docs, /docs/competitive, /docs/mcp, /docs/method-review, /docs/method-review/[slug], /status, /privacy, /terms.

Auth

/auth/login, /auth/register, /auth/magic, /auth/verify, /auth/forgot-password, /auth/reset-password.

Account (beta-gated logged-in)

/account/alerts, /account/api-keys, /companies/add.

Admin

/admin (overview shell tabs), plus dedicated pages: /admin/overview, /admin/pipeline, /admin/alerts, /admin/users, /admin/analytics, /admin/audits, /admin/settings, /admin/sources, /admin/recos/quality, /admin/tech, /admin/legacy. 12 pages.

API

~40 route handlers under src/app/api/, plus the cron/*, v1/* REST, mcp/* MCP server, billing/checkout, billing/webhook, weekly-digest, and openapi.json.

E. Scoring and recos

Scoring v5.1 with MARKET_WEIGHTS referenced repeatedly in commit log; live values per recommendation visible at /recommendations.
RecoSnapshot model in prisma/schema.prisma:239 with verifiedAt field used by the 3h coherence cron.
Reco verifier admin panel exists at /admin/recos/quality/page.tsx.
Quant rerun on enriched dataset noted as in-flight in task brief (separate agent). Latest backtest doc is 30-backtest-final-17markets.md plus alpha-discovery rounds 1 and 2 (32-, 34-). No 36-, 38-, 39-, 40-, 43-, 44-, 46- in the method-review/ folder (skipped numbers, normal because tasks reserve slots).

F. Marketing surfaces

Surface	Latest numbers shown	Status
`/` hero	"16 regulators" copy at `page.tsx:34`, `LandingHero.tsx:147`	stale: 17 ingested per task brief, 21 declared
`/` JSON-LD Org	"16 regulators" + offers `Developer 19 / Pro 99` (`page.tsx:404`)	conflicts with /pricing which lists Free/Pro 19/Quant 99
`/` regulator cards	"all 16 markets LIVE" per commit `d72aa72`	over-states Dublin (2 rows), SGX (1 row), SEBI (0 rows)
Hero subtitle	"5 markets · FX-normalised · API + MCP" (`LandingHero.tsx:127`)	stale, contradicts the headline
`/methodologie`	unverified; text is likely current	unverified
`/performance`	quant rerun pending, may show pre-v5.1 numbers	likely stale
`/compare/*`	flagged in-flight in task brief	in-flight
`/pricing`	Free / Pro €19 / Quant €99 (`pricing/page.tsx:71,90`)	OK

Recommended pre-launch: single sweep on every number on /, /methodologie, /performance to reach one canonical set ("~220k filings, 8.3k companies, 17 live markets, 4 capped, 2 seeding").

G. SEO

Sitemaps: 11 distinct sitemap routes for EN/FR companies, insiders, landings, docs, static, plus index /sitemap.xml. Entry counts not captured live but the architecture is well-segmented.
Hreflang: page-level alternates plus layout-level fallback at layout.tsx:200-206. Only EN+FR pair declared. Good but minimal.
Internal mesh: 26-internal-linking-audit.md exists and a refresh shipped in 45924c6. The blog → method-review → hubs spine is the weakest link; homepage barely surfaces the blog.
Schema.org coverage: Organization, SoftwareApplication with Offers on /. No Article, no Dataset, no FAQPage, no BreadcrumbList visible from this pass.
Structured-data validation: never run end-to-end. Recommended.

H. Performance

.next/static size on local build is ~3.9 MB, which is healthy.
No Sentry, no Datadog, no Web Vitals collector wired. Only mention is admin/tech/page.tsx:872 recommending Sentry.
LCP/CLS field data: missing; nothing in /admin/analytics exposes CWV. Either wire the web-vitals library or pull from CrUX.
Biggest pages by source size: /admin shell (AdminDashboard.tsx 46.6 KB raw, AdminShell.tsx 23.1 KB). Public pages all <50 KB raw.
globals.css is 128 KB. Worth a Tailwind purge audit (Tailwind 4 postcss is in postcss.config.mjs).

I. Security

Cron routes gated by Authorization: Bearer $CRON_SECRET (wave-americas/route.ts:9, all the /cron/* files).
/api/migrate is gated behind the ALLOW_MIGRATE_ROUTE env var (returns 404 by default).
Admin pages call getCurrentUser() (src/lib/auth.ts:92); it was not directly verified that every admin page redirects unauthenticated visitors, but the sampled analytics/page.tsx does call redirect(...).
Magic-link auth + password auth + JWT (jose). Reset / forgot flows shipped. Email verification present.
Public /api/v1/* should use API-key gating (rows present in ApiKey model at schema.prisma:373).
.env.example documents UNSUB secret, ALERT webhooks (Slack + Discord), Stripe stubs.
No secret scanner in CI; no Dependabot config visible. Worth adding.
No CSP header set in next.config.ts (unverified line-by-line).
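The cron gate described at the top of this section reduces to one header comparison. A minimal sketch of a fail-closed check, assuming hypothetical helper names; the actual logic in wave-americas/route.ts was not copied here:

```typescript
// Compare two strings without exiting early on the first differing character.
// The length check still reveals a length mismatch, which is acceptable here.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Returns true only when the Authorization header is exactly "Bearer <secret>".
// Fails closed: an unset or empty secret rejects every request.
function isAuthorizedCronRequest(
  authorizationHeader: string | null,
  cronSecret: string | undefined,
): boolean {
  if (!cronSecret) return false;
  if (!authorizationHeader) return false;
  return constantTimeEqual(authorizationHeader, `Bearer ${cronSecret}`);
}
```

In a route handler this would be called with request.headers.get("authorization") and process.env.CRON_SECRET. On Node, crypto.timingSafeEqual is the sturdier choice for the comparison itself.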

J. Pricing and billing

Tiers and copy live at /pricing. Plans: Free €0, Pro €19/mo, Quant €99/mo, FR + EN dictionaries.
STRIPE_* env vars documented but not wired. api/billing/checkout/route.ts returns 503 until env set. api/billing/webhook/route.ts is a logging stub. CTAs on /pricing render as disabled "Coming soon" when env unset (per file header).
Quota enforcement uses UserPageView table (src/lib/quota/page-quota.ts). Fails open on infra hiccup. Sandbox 10-free-call API mechanism flagged as in-flight in task brief.
Inconsistent offers in homepage JSON-LD (Free / Developer 19 / Pro 99) vs pricing page (Free / Pro 19 / Quant 99). Fix before any schema validation run.
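One way to stop the homepage JSON-LD and /pricing from drifting again is to derive both from a single plan list. A minimal sketch, assuming a hypothetical PLANS constant and toJsonLdOffers helper (neither exists in the audited code); the prices are the ones /pricing shows:

```typescript
interface Plan {
  name: string;
  monthlyPriceEur: number;
}

// Single source of truth; values taken from the /pricing tiers listed above.
const PLANS: Plan[] = [
  { name: "Free", monthlyPriceEur: 0 },
  { name: "Pro", monthlyPriceEur: 19 },
  { name: "Quant", monthlyPriceEur: 99 },
];

// Map plans to schema.org Offer objects for the homepage JSON-LD block.
function toJsonLdOffers(plans: Plan[]) {
  return plans.map((plan) => ({
    "@type": "Offer",
    name: plan.name,
    price: plan.monthlyPriceEur.toFixed(2),
    priceCurrency: "EUR",
  }));
}

const offers = toJsonLdOffers(PLANS);
console.log(JSON.stringify(offers, null, 2));
```

If the pricing page renders from the same list, a rename such as Developer to Pro cannot land on one surface only.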

K. Observability

No Sentry, no PostHog, no Logflare wired.
Server logs default to Vercel function logs.
Email events tracked in EmailEvent table.
/status public page exists (commit b4413ac).
Admin debug panels: /admin/sources, /admin/tech, /admin/audits, /admin/analytics, /admin/recos/quality. Per-page verification of "do they work" not done in this pass.
Alert paths: ALERT_SLACK_WEBHOOK_URL and ALERT_DISCORD_WEBHOOK_URL documented in .env.example. sources-watchdog cron is wired. No PagerDuty / OnCall.

L. Honest gaps (claims vs reality)

Claim on site	Reality	Action
"16 regulators live" (hero, JSON-LD)	17 ingested with >1 row, 4 of those have <250 rows total and 2 have <5 rows	Restate as "17 regulators, ~220k filings, 4 markets in seed phase".
"5 markets" (hero subtitle)	Contradicts the headline, leftover copy	Single-pass purge.
"All 16 markets LIVE on cards" (commit `d72aa72`)	Dublin (2), SGX (1), SEBI (0)	Mark these "seeding" with a clear badge, not "live".
Schema.org `Offer`s: Free / Developer 19 / Pro 99	Pricing page sells Free / Pro 19 / Quant 99	Align JSON-LD with `/pricing`.
"Open backtest" (hero)	`/backtest` exists; results from quant rerun in flight	Verify `/backtest` shows v5.1 numbers post-rerun.
50 blog articles FR+EN with categories	`BlogArticle` table created via migration but no Prisma model in `schema.prisma`	Either generate the model or document drift, otherwise tsc fails.
"MCP server"	`/api/mcp/*` and `/docs/mcp` exist; live registry listing unverified	Confirm public MCP endpoint reachable and listed somewhere indexable.
"tsc clean" (implicit)	19 errors in 6 files	Fix before next deploy.
`lint:emoji` (per AGENTS.md no-emoji policy)	26 emoji violations remain	Purge agent must finish before the policy can be claimed as enforced.
162,615 declarations	Plausible only if SEC Form 4 (~157k) is counted alongside the `DIRIGEANTS` AMF view; per-table sum is ~221k.	Adopt one canonical KPI; show derivation in `/methodologie`.

M. Top 20 improvement leads (ranked impact-over-effort)

Effort: S < ½ day, M 1-3 days, L >3 days. Impact: high / med / low.

1. Fix tsc + Prisma drift before next deploy

Impact high (CI green is a precondition for everything else). Effort S. First action: prisma db pull + diff schema.prisma, then resolve the 13× TS2339. Owner: backend-engineer.

2. One-pass copy sweep on landing numbers

Impact high (founder-credibility, also unblocks press / pitch). Effort S. First action: produce a single canonical KPI block (filings, companies, markets-live, markets-seeding, history depth) and substitute into /, /methodologie, /performance, JSON-LD, social OG. Owner: content-strategist + product-marketing.

3. CVM Brazil 5-year backfill

Impact high (quadruples CVM dataset, unlocks LATAM alpha). Effort M. First action: extend scripts/ingest-cvm-br.ts to walk weekly ZIPs back to 2021. Owner: data-engineer.

4. CONSOB + Oslo + CNMV 5y backfill bundle

Impact high (3 high-PDMR-discipline markets, same scrape patterns, parallelisable). Effort M. First action: stub three worktrees, one per regulator, run in parallel. Owner: data-engineer.

5. Programmatic SEO insider + company hubs with rich Article schema

Impact high (long-tail organic search is the dominant traffic source for this vertical; competitor openinsider.com thrives on it). Effort M. First action: add Article and BreadcrumbList JSON-LD to /company/[slug] and /insider/[slug] plus per-page meta tuned to recent-trade-context. Owner: seo-specialist.

6. Stripe wire-up sequence

Impact high (zero revenue today). Effort M. First action: wire test-mode keys, ship checkout for Pro, observe one real success, then live mode. Owner: billing-engineer.

7. Sentry + web-vitals + CrUX board

Impact high (one ingestion-cron failure or one bad LCP regression is currently invisible). Effort S. First action: install @sentry/nextjs, wire DSN, send Web Vitals to a /api/vitals collector backed by Postgres. Owner: sre-engineer.

8. Cluster-trading detection on multi-market dataset

Impact high (proprietary alpha angle, competitor differentiator). Effort L. First action: define "cluster" (>=3 PDMRs at same issuer, same direction, same 5-day window) and run sweep over Declaration. Owner: quant-analyst.
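The cluster definition above can be prototyped in memory before it becomes a query. A minimal sketch, assuming a hypothetical Trade shape (the real Declaration columns were not inspected) and reading "same 5-day window" as trades dated less than 5 × 24h after the first trade of the window:

```typescript
type Direction = "BUY" | "SELL";

// Hypothetical shape: field names are illustrative, not the Declaration schema.
interface Trade {
  issuer: string;
  insider: string;
  direction: Direction;
  tradedAt: Date;
}

interface Cluster {
  issuer: string;
  direction: Direction;
  insiders: string[];
  from: Date;
  to: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function findClusters(trades: Trade[], minInsiders = 3, windowDays = 5): Cluster[] {
  // Group by issuer and direction so buys and sells never mix.
  const groups: Record<string, Trade[]> = {};
  for (const trade of trades) {
    const key = `${trade.issuer}|${trade.direction}`;
    const bucket = groups[key] ?? (groups[key] = []);
    bucket.push(trade);
  }

  const clusters: Cluster[] = [];
  for (const key of Object.keys(groups)) {
    const sorted = (groups[key] ?? [])
      .slice()
      .sort((a, b) => a.tradedAt.getTime() - b.tradedAt.getTime());

    let i = 0;
    while (i < sorted.length) {
      const start = sorted[i];
      if (start === undefined) break;
      const startMs = start.tradedAt.getTime();
      // Trades are sorted, so the filter keeps a contiguous run from index i.
      const inWindow = sorted
        .slice(i)
        .filter((t) => t.tradedAt.getTime() - startMs < windowDays * DAY_MS);
      const insiders: string[] = [];
      for (const t of inWindow) {
        if (insiders.indexOf(t.insider) === -1) insiders.push(t.insider);
      }
      const last = inWindow[inWindow.length - 1];
      if (insiders.length >= minInsiders && last !== undefined) {
        clusters.push({
          issuer: start.issuer,
          direction: start.direction,
          insiders,
          from: start.tradedAt,
          to: last.tradedAt,
        });
        i += inWindow.length; // skip trades already covered by this cluster
      } else {
        i += 1;
      }
    }
  }
  return clusters;
}
```

Repeat trades by the same insider count once, so one executive splitting an order across days cannot form a cluster alone.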

9. Backtest in-browser UI on the `/backtest` page

Impact high (live demo lets prospects test before they buy). Effort M. First action: ship a constrained interactive sweep (market, period, top-N filings) feeding /api/backtest/compute. Owner: product-engineer + quant-analyst.

10. Onboarding + first-run flow with sandbox key

Impact high (currently signup → empty page; massive activation loss). Effort M. First action: post-verify, drop the user on a guided 60-second tour ending on a one-click sandbox key issuance. Owner: ui-designer + growth-engineer.

11. Tighten admin auth gate audit

Impact med (currently relying on per-page getCurrentUser + redirect; one missing call leaks the panel). Effort S. First action: middleware that forces role === ADMIN for /admin/* and /api/admin/*. Owner: security-engineer.
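The decision such a middleware has to make can be kept as a pure function and unit-tested apart from Next.js. A minimal sketch, assuming a hypothetical Role type with a USER value alongside ADMIN and illustrative function names:

```typescript
// ADMIN comes from the text above; USER is an assumed second role.
type Role = "ADMIN" | "USER";
type GateDecision = "allow" | "redirect-to-login" | "forbidden";

// Matches /admin, /admin/..., /api/admin and /api/admin/... only.
function isAdminPath(pathname: string): boolean {
  return /^\/(api\/)?admin(\/|$)/.test(pathname);
}

// role is null for unauthenticated visitors.
function adminGate(pathname: string, role: Role | null): GateDecision {
  if (!isAdminPath(pathname)) return "allow";
  if (role === null) {
    // API callers get an error status; browsers get sent to the login page.
    return pathname.startsWith("/api/") ? "forbidden" : "redirect-to-login";
  }
  return role === "ADMIN" ? "allow" : "forbidden";
}
```

Centralising the check means a new admin page is protected by default, instead of relying on each page remembering to call getCurrentUser and redirect.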

12. Watchlist + custom-alerts builder

Impact high (single highest-asked feature in this category; raises DAU and converts free users). Effort L. First action: ship "follow this insider/company" CTA on /insider/[slug] and /company/[slug], backed by UserAlert. Owner: product-engineer.

13. CSV + parquet export

Impact med (one-click "give me everything" is a Quant-tier hook). Effort S. First action: signed-URL export from /api/v1/export gated by Quant tier. Owner: backend-engineer.

14. Discord + Slack webhook deliveries for user alerts

Impact med (sticky integration; competitor quiverquant lacks it). Effort S. First action: extend UserAlert channel enum to SLACK | DISCORD. Owner: integration-engineer.
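The two channels differ mainly in payload key: Slack incoming webhooks read a text field, Discord webhooks read content. A minimal sketch, assuming a hypothetical AlertChannel union (EMAIL is assumed to be the existing channel) and an illustrative buildWebhookRequest helper:

```typescript
// EMAIL is assumed to exist already; SLACK and DISCORD are the proposed additions.
type AlertChannel = "EMAIL" | "SLACK" | "DISCORD";
type WebhookChannel = Exclude<AlertChannel, "EMAIL">;

interface WebhookRequest {
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the POST request for a user-supplied webhook URL.
function buildWebhookRequest(
  channel: WebhookChannel,
  webhookUrl: string,
  message: string,
): WebhookRequest {
  const payload = channel === "SLACK" ? { text: message } : { content: message };
  return {
    url: webhookUrl,
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  };
}
```

The result can be passed to fetch(request.url, { method: "POST", headers: request.headers, body: request.body }). Validating that user-supplied URLs point at the expected Slack or Discord hosts is left out of this sketch.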

15. GDPR + MAR disclosure polish + right-to-erasure UI

Impact med (regulatory risk reduction). Effort S. First action: /privacy review + add /account/delete self-service. Owner: compliance-officer + product-engineer.

16. Image + font + ISR perf pass

Impact med (LCP shave 200-500ms, helps SEO). Effort S. First action: subset JetBrains Mono + main display font; convert hero PNGs to AVIF; tune revalidate on /company/[slug]. Owner: frontend-engineer.

17. Schema.org enrichment: Article, FAQPage, Dataset, BreadcrumbList

Impact med. Effort S. First action: ship Article on every blog post and method-review page; Dataset on /api. Owner: seo-specialist.

18. Annual discount + team seats + enterprise tier

Impact med (annual lifts ARR per active user, team seats unlock agencies). Effort M. First action: add Stripe price IDs for annual; price page UX for billing toggle. Owner: billing-engineer.

19. Editorial calendar + weekly digest cadence + founder voice channel

Impact med (organic content moat compounds slowly). Effort L. First action: lock 12-week editorial plan, queue the weekly-digest cron output as a public newsletter page with sign-up. Owner: content-strategist.

20. MCP marketplace listing + Zapier connector

Impact med (distribution leverage for the API). Effort M. First action: submit MCP server to the public registry; build a Zapier trigger on "new cluster signal" + "new buy by tracked insider". Owner: partnerships-engineer.

N. Skipped / unverified

DART (KR), EDINET (JP), HKEX live counts: the task brief gives values but the per-source history audit (41-) wasn't re-run for these.
Live BO debug panel checks (browser-based) skipped to keep this read-only.
Bundle analysis (next build --profile) skipped.
CSP headers in next.config.ts not fully audited.
/admin/analytics cohort numbers not pulled.
Live /status JSON body not curled.

Pick these up in a follow-up agent that is allowed to hit production.
