Audit 109 · Ingest health · 2026-05-21
Period audited: 2026-05-14 to 2026-05-21 (7 days)
1. Summary
| Dimension | Count |
|---|---|
| Sources tracked (SOURCE_DEFS enabled) | 32 |
| Filing tables in staging | 26 |
| Sources with DB activity last 7d | 17 |
| Sources stale (no new rows > 48h) | 9 |
| GH Actions ingest workflows failing | 20 |
| Root causes identified | 1 (shared: DATABASE_URL secret missing in GH Actions) |
Verdict: 20_SOURCES_DOWN
All 20 GitHub Actions ingest workflows are failing with the same root cause. The three Vercel cron waves (wave-eu, wave-americas, wave-apac) are healthy -- they run in the Vercel runtime where DATABASE_URL is set. The GH Actions runners have no DATABASE_URL injected into the prisma generate step.
2. GH Actions workflow breakdown (2026-05-14 to 2026-05-21)
2a. Non-ingest workflows
| Workflow | Runs | Conclusion |
|---|---|---|
| CI Build and Type Check | 229 total (successes on every day 14-21; failures on 14, 15, 18, 19, 20) | Mixed |
| Auto-backup Tag on main push | 276 total | success |
Note: CI failures are pre-existing TypeScript/lint errors unrelated to ingest.
2b. Failing ingest workflows (most on 3 consecutive days: 2026-05-18, 19, 20)
| Workflow | Failure dates observed |
|---|---|
| Ingest -- AFM (Netherlands) | 2026-05-20 |
| Ingest -- ASX (Australia) | 2026-05-20 |
| Ingest -- BaFin Directors' Dealings | 2026-05-17, 18, 19, 20 (4 days) |
| Ingest -- CNMV (Spain) | 2026-05-18, 19, 20 |
| Ingest -- Consob (Italy) | 2026-05-18, 19, 20 |
| Ingest -- DART (South Korea) | 2026-05-18, 19, 20 |
| Ingest -- EDINET v2 (Japan) | 2026-05-18, 19, 20 |
| Ingest -- FMA / OeKB (Austria) | 2026-05-18, 19, 20 |
| Ingest -- FSMA (Belgium) | 2026-05-18, 19, 20 |
| Ingest -- Finansinspektionen (Sweden) | 2026-05-18, 19, 20 |
| Ingest -- HKEX (Hong Kong) | 2026-05-18 |
| Ingest -- Ireland (Euronext Dublin) | 2026-05-18, 19, 20 |
| Ingest -- Nasdaq Copenhagen (Denmark) | 2026-05-18, 19, 20 |
| Ingest -- Nasdaq Helsinki (Finland) | 2026-05-18, 19, 20 |
| Ingest -- Oslo Bors | 2026-05-18, 19, 20 |
| Ingest -- SEBI (India) | 2026-05-18, 19, 20 |
| Ingest -- SEDI (Canada) | 2026-05-18, 19, 20 |
| Ingest -- SGX (Singapore) | 2026-05-18, 19, 20 |
| Ingest -- UK RNS (Investegate) | 2026-05-17, 18, 19, 20 (4 days) |
2c. Sources not listed above (not observed failing)
AMF, SEC, BaFin wave-based, wave-eu, wave-americas, wave-apac run via Vercel cron -- not GH Actions. No failure observed on Vercel side.
3. Root cause (confirmed via gh run view ... --log-failed)
All 20 failing GH Actions ingest workflows produce the same error:
Failed to load config file as a TypeScript/JavaScript module.
Error: PrismaConfigEnvError: Missing required environment variable: DATABASE_URL
npm error command sh -c prisma generate
The postinstall script runs prisma generate, which requires DATABASE_URL to parse prisma.config.ts. The GitHub Actions secret DATABASE_URL is not being injected into the environment before npm install / prisma generate.
Fix required: Add DATABASE_URL: ${{ secrets.DATABASE_URL }} to the env: block of the Setup Node + install deps step (or the postinstall-triggering step) in every failing workflow YAML.
This is a systemic issue -- likely introduced when prisma.config.ts was added (audit 109). All 20 workflows share the same runner template and all need the same fix.
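Since the same change has to land in every failing workflow, a text check can confirm that none was missed. The following is a minimal sketch, not part of the repo: `injectsDatabaseUrl` is a hypothetical helper, and the two sample steps are illustrative (only the step name "Setup Node + install deps" and the env line come from this audit). It is a plain text match, not a YAML parser, so it does not prove the env entry sits on the step that triggers postinstall.

```typescript
// Hypothetical helper: reports whether a workflow YAML contains a line that
// passes the DATABASE_URL secret into an env block.
const DATABASE_URL_ENV_LINE =
  /^[ \t]*DATABASE_URL:[ \t]*\$\{\{[ \t]*secrets\.DATABASE_URL[ \t]*\}\}[ \t]*$/m;

function injectsDatabaseUrl(workflowYaml: string): boolean {
  return DATABASE_URL_ENV_LINE.test(workflowYaml);
}

// Illustrative step after the fix (layout assumed, not copied from the repo).
const fixedStep = [
  "      - name: Setup Node + install deps",
  "        env:",
  "          DATABASE_URL: ${{ secrets.DATABASE_URL }}",
  "        run: npm install",
].join("\n");

// Illustrative step before the fix: no env block, so postinstall's
// prisma generate runs without DATABASE_URL.
const brokenStep = [
  "      - name: Setup Node + install deps",
  "        run: npm install",
].join("\n");

console.log(injectsDatabaseUrl(fixedStep));  // true
console.log(injectsDatabaseUrl(brokenStep)); // false
```

Running such a check over the contents of each file under .github/workflows would list the workflows that still lack the env entry.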
4. DB freshness per staging table (as of 2026-05-21)
Latest createdAt per table. A table is OK while its latest row is less than 48h old at audit time, even if some days have no rows (weekend skip logic). A latest row older than 48h is STALE; older than 72h is DOWN.
| Table | Total rows | Rows last 7d | Latest createdAt (UTC) | Status |
|---|---|---|---|---|
| SecForm4Filing | 676,862 | 676,862 | 2026-05-19 22:53 | OK |
| OsloFiling | 9,953 | 9,953 | 2026-05-19 10:24 | OK |
| ConsobFiling | 5,665 | 5,665 | 2026-05-19 10:19 | OK |
| JpFiling | 66,244 | 66,244 | 2026-05-19 01:38 | OK |
| RnsFiling | 257 | 257 | 2026-05-18 16:17 | OK |
| IeFiling | 4 | 4 | 2026-05-18 16:17 | OK |
| DartFiling | 98,101 | 98,101 | 2026-05-18 10:04 | OK |
| SebiFiling | 95,002 | 95,002 | 2026-05-18 08:02 | OK |
| KnfFiling | 2,348 | 2,348 | 2026-05-18 07:24 | OK |
| SediFiling | 80 | 80 | 2026-05-18 07:08 | OK |
| AsxFiling | 1,406 | 1,406 | 2026-05-18 06:59 | OK |
| JseFiling | 17 | 17 | 2026-05-18 06:50 | OK |
| NzxFiling | 19 | 19 | 2026-05-18 06:49 | OK |
| SerFiling | 6,941 | 6,941 | 2026-05-18 06:35 | OK |
| DkFiling | 8,993 | 8,993 | 2026-05-18 00:52 | OK |
| CnmvFiling | 8,212 | 8,212 | 2026-05-18 00:01 | OK |
| FmaFiling | 4,220 | 4,220 | 2026-05-17 22:57 | STALE (>48h) |
| FiFiling | 6,884 | 6,884 | 2026-05-17 22:21 | STALE (>48h) |
| HelFiling | 10,000 | 10,000 | 2026-05-17 22:12 | STALE (>48h) |
| HkexFiling | 2,926 | 2,926 | 2026-05-17 21:47 | STALE (>48h) |
| CvmFiling | 114,417 | 114,417 | 2026-05-17 21:14 | STALE (>48h) |
| SgxFiling | 1 | 1 | 2026-05-17 17:45 | STALE (deprecated) |
| FsmaFiling | 906 | 906 | 2026-05-16 22:14 | DOWN (>72h) |
| AfmFiling | 6,392 | 6,392 | 2026-05-16 22:05 | DOWN (>72h) |
| BafinFiling | 295 | 295 | 2026-05-16 20:02 | DOWN (>72h) |
| KapFiling | 0 | 0 | NULL | DOWN (geo-blocked) |
Note: AmfFiling table does not exist in staging -- AMF ingests directly to Declaration.
Last IngestionRun success for AMF-related sources: sec-form4 at 2026-05-18 16:31.
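The status column above follows a simple age rule, which can be sketched as below. This is an illustration under assumptions, not the watchdog's actual code: `classifyFreshness` is a hypothetical name, a NULL timestamp is mapped to DOWN (as for KapFiling), and the reference time of 2026-05-20 00:00 UTC is an assumption picked because the audit does not state the exact evaluation time. With that reference time the sketch reproduces the statuses of the three sample rows.

```typescript
type Freshness = "OK" | "STALE" | "DOWN";

const HOUR_MS = 60 * 60 * 1000;

// Hypothetical classifier using the thresholds named in the table:
// older than 48h is STALE, older than 72h is DOWN, no rows at all is DOWN.
function classifyFreshness(latestCreatedAt: Date | null, now: Date): Freshness {
  if (latestCreatedAt === null) return "DOWN";
  const ageHours = (now.getTime() - latestCreatedAt.getTime()) / HOUR_MS;
  if (ageHours > 72) return "DOWN";
  if (ageHours > 48) return "STALE";
  return "OK";
}

// Assumed reference time; the audit does not record the exact evaluation time.
const referenceTime = new Date("2026-05-20T00:00:00Z");

console.log(classifyFreshness(new Date("2026-05-18T00:01:00Z"), referenceTime)); // CnmvFiling: "OK"
console.log(classifyFreshness(new Date("2026-05-17T22:57:00Z"), referenceTime)); // FmaFiling: "STALE"
console.log(classifyFreshness(new Date("2026-05-16T22:14:00Z"), referenceTime)); // FsmaFiling: "DOWN"
console.log(classifyFreshness(null, referenceTime));                             // KapFiling: "DOWN"
```

Labels such as "deprecated" and "geo-blocked" in the table are manual annotations and are not derived by this rule.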
5. Last successful IngestionRun per source
| Source key (IngestionRun.source) | Last success (UTC) |
|---|---|
| jp-edinet | 2026-05-19 01:51 |
| pse-ph | 2026-05-18 18:49 |
| euronext-ie | 2026-05-18 16:34 |
| sec-form4 | 2026-05-18 16:31 |
| knf-pl | 2026-05-18 11:14 |
| nse-in | 2026-05-18 10:33 |
| dart-kr | 2026-05-18 10:19 |
| oslo-no | 2026-05-18 10:08 |
| jse-za | 2026-05-18 07:10 |
| sedi | 2026-05-18 07:08 |
| hkex | 2026-05-18 07:08 |
| nzx-nz | 2026-05-18 06:49 |
| six-ser-sheldon-backfill | 2026-05-18 06:35 |
| consob | 2026-05-18 06:25 |
| hel-fi | 2026-05-18 06:21 |
| bse-bg | 2026-05-17 22:56 |
| tadawul-sa | 2026-05-17 22:44 |
| dk-nasdaq | 2026-05-17 22:13 |
| sebi-in | 2026-05-17 21:38 |
| fi-se | 2026-05-17 21:16 |
| bafin | 2026-05-17 21:16 |
| cvm-br | 2026-05-17 21:15 |
| six-ser | 2026-05-17 21:14 |
| sgx | 2026-05-17 17:45 |
| asx-au | 2026-05-16 22:16 |
| fsma-be | 2026-05-16 22:14 |
| afm | 2026-05-16 22:09 |
| fma-at | 2026-05-16 22:08 |
| cnmv | 2026-05-16 21:32 |
| ser-rss | 2026-05-16 19:49 |
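A table like the one above can be derived by keeping, for each source, the newest run that succeeded. The sketch below shows that reduction in memory. Only `IngestionRun.source` is taken from this audit; the `status` and `finishedAt` fields, the `"success"` and `"failed"` values, and the failed sample run are assumptions for illustration.

```typescript
// Assumed row shape; only `source` is confirmed by the audit.
interface IngestionRunRow {
  source: string;
  status: string;   // assumption: "success" marks a successful run
  finishedAt: Date; // assumption: completion timestamp in UTC
}

// Keeps the most recent successful finishedAt per source.
function lastSuccessBySource(runs: IngestionRunRow[]): Map<string, Date> {
  const latest = new Map<string, Date>();
  for (const run of runs) {
    if (run.status !== "success") continue;
    const seen = latest.get(run.source);
    if (seen === undefined || run.finishedAt.getTime() > seen.getTime()) {
      latest.set(run.source, run.finishedAt);
    }
  }
  return latest;
}

// Illustrative input: the two success timestamps match the table above,
// the failed run is made up to show that failures are ignored.
const sampleRuns: IngestionRunRow[] = [
  { source: "bafin", status: "success", finishedAt: new Date("2026-05-17T21:16:00Z") },
  { source: "bafin", status: "failed", finishedAt: new Date("2026-05-18T21:16:00Z") },
  { source: "afm", status: "success", finishedAt: new Date("2026-05-16T22:09:00Z") },
];

const lastSuccess = lastSuccessBySource(sampleRuns);
console.log(lastSuccess.get("bafin")?.toISOString()); // 2026-05-17T21:16:00.000Z
```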
6. Vercel cron waves (not GH Actions)
The three Vercel cron waves are not observable via gh run list. They run in-process and are monitored via the sources-watchdog cron. DB freshness data above (sec-form4, jp-edinet, oslo-no, consob successful as of 2026-05-19) confirms the Vercel-side crons were healthy on 2026-05-18 and 2026-05-19.
7. Action items
| Priority | Item | Owner |
|---|---|---|
| P0 | Add DATABASE_URL: ${{ secrets.DATABASE_URL }} env var to all 20 failing GH Actions ingest workflow YAMLs (prisma generate step) | Simon |
| P0 | Verify DATABASE_URL secret is set in GitHub repo secrets (Settings > Secrets > Actions) | Simon |
| P1 | Investigate why KapFiling has 0 rows (KAP Turkey geo-block confirmed, KAP API 403s from non-TR IPs) | Known, tracked |
| P2 | SgxFiling deprecated (Akamai bot wall, 1 legacy row) -- disable cron cleanly | Simon |
| P2 | BafinFiling stale > 72h (latest row 2026-05-16; last IngestionRun success 2026-05-17) -- oldest GH Actions failure observed 2026-05-17 | Blocked by P0 |
8. Methodology note on GH Actions vs Vercel cron
Sources ingested via GH Actions (scheduled .github/workflows/*.yml) use npm install + prisma generate + ts-node or npx tsx. These require DATABASE_URL at generate time. Sources ingested via Vercel cron (vercel.json schedule, route handler) run in the Vercel Edge/Serverless runtime where DATABASE_URL is injected automatically from Vercel environment variables. The two pipelines are structurally different -- a secret missing on GH Actions does not affect Vercel waves.
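Because the GH Actions pipeline depends on DATABASE_URL being present before prisma generate runs, a preflight check could surface a missing secret with a clearer message than the Prisma config error. The following is a minimal sketch under assumptions: `missingEnvVars` is a hypothetical helper, not something in the repo. Empty values are treated as missing because a secret that is not set evaluates to an empty string in a GitHub Actions expression.

```typescript
// Hypothetical preflight helper: returns the names of required variables
// that are absent or blank in the given environment.
function missingEnvVars(
  required: string[],
  env: Record<string, string | undefined>,
): string[] {
  return required.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Usage sketch in a Node script run before `npm install`:
//   const missing = missingEnvVars(["DATABASE_URL"], process.env);
//   if (missing.length > 0) {
//     throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
//   }

console.log(missingEnvVars(["DATABASE_URL"], {}));                   // [ 'DATABASE_URL' ]
console.log(missingEnvVars(["DATABASE_URL"], { DATABASE_URL: "" })); // [ 'DATABASE_URL' ]
```

Passing the environment as a parameter keeps the check easy to exercise without touching the real process environment.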