totalAmount parser audit (2026-05-18)
Four markets had 100% null totalAmount. This audit documents the root cause
per market and the fix (or non-fix) applied.
1. DART KR (South Korea / FSS) — FIXED
Baseline: 5,243 ELESTOCK rows, 100% null totalAmountKrw.
Root cause: elestock.json OpenAPI typedef advertises trd_unit_qty and
trd_unit_amt, but real-world responses always leave those fields empty.
Per-trade prices are only exposed in the per-filing XBRL document
(/api/document.xml?rcept_no=...).
Fix: new script scripts/_dart-detail-backfill.ts fetches the per-filing
XBRL ZIP, parses the SECTION-3 "Details of changes" table-group
(ACLASS="RPT_RSN"), and extracts per-row:
- RPT_RSN (acquisition vs disposition flag)
- MDF_STK_CNT (shares changed)
- ACI_AMT2 (KRW unit price)
- MDF_DM (transaction date)
The script computes Σ |qty| × price for totalAmountKrw, a weighted average
for pricePerShareKrw, and a signed sum for sharesQty. It updates the
DartFiling staging rows, then propagates the values via
_dart-merge-update.ts to existing Declaration rows.
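The aggregation described above can be sketched as follows. This is a minimal illustration, not the code in _dart-detail-backfill.ts: the DetailRow shape and the aggregateDetailRows name are hypothetical, the weighting is assumed to be by absolute share count, and qty is assumed to be negative for dispositions.

```typescript
// Hypothetical per-row shape extracted from the "Details of changes" table.
interface DetailRow {
  qty: number;          // signed share change (assumed: negative for dispositions)
  unitPriceKrw: number; // ACI_AMT2
}

interface FilingTotals {
  totalAmountKrw: number | null;
  pricePerShareKrw: number | null;
  sharesQty: number;
}

function aggregateDetailRows(rows: DetailRow[]): FilingTotals {
  let totalAmount = 0; // Σ |qty| × price
  let absQty = 0;      // Σ |qty|, used as the averaging weight (assumption)
  let signedQty = 0;   // Σ qty

  for (const row of rows) {
    const magnitude = Math.abs(row.qty);
    totalAmount += magnitude * row.unitPriceKrw;
    absQty += magnitude;
    signedQty += row.qty;
  }

  // With no shares traded there is nothing to price, so keep the nulls.
  if (absQty === 0) {
    return { totalAmountKrw: null, pricePerShareKrw: null, sharesQty: 0 };
  }
  return {
    totalAmountKrw: totalAmount,
    pricePerShareKrw: totalAmount / absQty,
    sharesQty: signedQty,
  };
}
```

Using |qty| for the amount keeps a filing that mixes buys and sells from netting its traded value toward zero, while sharesQty still reflects the net position change.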
Concurrency: 4 workers, 500ms polite delay.
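One way to get this behaviour is a small worker pool, sketched below. The runPool and sleep helpers are hypothetical names, and the sketch assumes the 500ms delay is applied per worker after each request; the actual script may pace requests differently.

```typescript
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` over `items` with at most `workers` tasks in flight.
// Each worker pauses `delayMs` after finishing a task before taking the next.
async function runPool<T, R>(
  items: readonly T[],
  task: (item: T) => Promise<R>,
  workers = 4,
  delayMs = 500,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0; // shared cursor; safe because JS runs one callback at a time

  async function worker(): Promise<void> {
    while (true) {
      const index = next++;
      if (index >= items.length) return;
      results[index] = await task(items[index] as T);
      await sleep(delayMs);
    }
  }

  const count = Math.min(workers, items.length);
  await Promise.all(Array.from({ length: count }, () => worker()));
  return results; // same order as `items`
}
```

A call such as runPool(receiptNumbers, fetchFiling) would use the defaults of 4 workers and a 500ms pause, where receiptNumbers and fetchFiling stand in for whatever the backfill script actually iterates over.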
2. JP EDINET (FSA) — PARTIAL FIX
Baseline: 20,983 rows, 100% null totalAmountJpy.
Root cause: doc types 350 / 360 are Large Volume Holder reports under
FIEA Art 27-23. They DO NOT carry a transaction price × volume. The CSV
element IDs hardcoded in src/lib/ingest/jp-edinet.ts:375-389
(UnitPrice, PricePerShare, 単価, TransactionAmount, 金額) are
correct names but do not appear in the LVH (大量保有) taxonomy.
Inspecting 10 fresh documents confirmed that the LVH taxonomy exposes monetary fields under a different shape:
- jplvh_cor:TotalAmountOfFundingForAcquisition (取得資金合計, JPY)
- jplvh_cor:AmountOfOwnFund (自己資金額)
- jplvh_cor:TotalAmountOfBorrowings (借入金額計)
- jplvh_cor:TotalAmountFromOtherSources (その他金額計)
These are funding amounts committed for the disclosed acquisition, not a
transaction unit price. Treating them as totalAmountJpy is a reasonable
proxy: for an LVH disclosure the funding committed approximates the value
of the shares acquired, but it is not a reported trade value.
Fix: scripts/_jp-edinet-reparse.ts re-fetches CSVs, sums
TotalAmountOfFundingForAcquisition (fallback AmountOfOwnFund), updates
totalAmountJpy + sharesAfter + ownershipPctAfter. Propagated via
_jp-merge-update.ts.
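The sum-with-fallback step can be sketched like this. The LvhFunding shape and the sumFundingJpy name are hypothetical, the values are assumed to be already parsed into JPY numbers, and the fallback is assumed to apply per holder entry rather than per document; _jp-edinet-reparse.ts may differ on each point.

```typescript
// Hypothetical per-holder funding fields read from one LVH CSV.
interface LvhFunding {
  totalFunding: number | null; // jplvh_cor:TotalAmountOfFundingForAcquisition
  ownFund: number | null;      // jplvh_cor:AmountOfOwnFund
}

// Sums the funding total per holder, falling back to own funds when the
// total is absent. Returns null when no holder reports either field.
function sumFundingJpy(holders: LvhFunding[]): number | null {
  let sum = 0;
  let seen = false;
  for (const holder of holders) {
    const amount = holder.totalFunding ?? holder.ownFund;
    if (amount !== null) {
      sum += amount;
      seen = true;
    }
  }
  return seen ? sum : null;
}
```

Returning null rather than 0 when nothing is reported keeps "no funding disclosed" distinguishable from a disclosed amount of zero.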
Caveat: per-share price stays null (no UnitPrice in source). Form
361 (officer holding change, FIEA Art 165-2) would expose price × qty,
but it's not currently ingested.
3. PSE PH (Philippines) — PARTIAL FIX
Baseline: 104 Declaration rows, 100% null totalAmount.
Root cause: the ingest stored only the listing metadata + first attachment URL. Form 17-7 transaction tables only exist inside the attached PDF.
Fix: scripts/_pse-pdf-backfill.ts downloads each PDF via
/downloadFile.do?file_id=<pdfFileId> and parses with pdf-parse. Two
PDF layouts supported:
- single-line:
  Common Shares M/D/YYYY <shares> (A|D) <price> <pct>%
- multi-row: tranche-by-tranche rows of
  <shares> I/D <amount> A/D <price> <pct>%
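As an illustration of the single-line layout, a regex-based parser might look like the sketch below. The pattern, the Pse177Row shape, and the parseSingleLine name are assumptions made for this example; the real text produced by pdf-parse may need looser whitespace handling, and the multi-row layout is not covered here.

```typescript
// Hypothetical parsed shape for one single-line Form 17-7 row.
interface Pse177Row {
  date: string;           // ISO yyyy-mm-dd
  shares: number;
  side: "A" | "D";        // acquisition or disposition
  pricePhp: number;
  pct: number;            // the trailing <pct>% column, meaning not interpreted here
  totalAmountPhp: number; // shares × price
}

// Common Shares M/D/YYYY <shares> (A|D) <price> <pct>%
const SINGLE_LINE =
  /Common Shares\s+(\d{1,2})\/(\d{1,2})\/(\d{4})\s+([\d,]+)\s+\(?([AD])\)?\s+([\d,]+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)%/;

// Strips thousands separators before converting.
const toNumber = (s: string): number => Number(s.replace(/,/g, ""));

function parseSingleLine(text: string): Pse177Row | null {
  const m = SINGLE_LINE.exec(text);
  if (m === null) return null; // e.g. a cover sheet with no transaction row
  const [, month, day, year, shares, side, price, pct] = m;
  const sharesNum = toNumber(shares);
  const priceNum = toNumber(price);
  return {
    date: `${year}-${month.padStart(2, "0")}-${day.padStart(2, "0")}`,
    shares: sharesNum,
    side: side as "A" | "D",
    pricePhp: priceNum,
    pct: Number(pct),
    totalAmountPhp: sharesNum * priceNum,
  };
}
```

Returning null on a non-match lets the caller skip PDFs that contain no transaction row without treating them as errors.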
PHP→EUR conversion uses the Yahoo EURPHP=X rate (1 EUR = X PHP, so PHP
amounts are divided by the rate). The rate is stored under a new
FxHistory pair, PHPEUR_INV.
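Because the quote is expressed as PHP per EUR, the conversion is a division rather than a multiplication. A minimal sketch, with phpToEur as a hypothetical helper name and the rate in the usage note chosen only for illustration:

```typescript
// Converts a PHP amount to EUR using a EURPHP=X style quote (PHP per 1 EUR).
function phpToEur(amountPhp: number, eurPhpRate: number): number {
  if (!Number.isFinite(eurPhpRate) || eurPhpRate <= 0) {
    throw new RangeError("eurPhpRate must be a positive, finite number");
  }
  return amountPhp / eurPhpRate;
}
```

With an illustrative rate of 60 PHP per EUR, phpToEur(6000, 60) gives 100.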
Caveat: roughly 80% of the source PDFs are SEC cover sheets (Form 23-B) attached as the only file in the disclosure. The actual 17-7 body is missing from the EDGE attachment; this is an upstream issuer error and no parser can recover the data. Coverage is therefore limited to the subset of disclosures that attach a real 17-7 PDF.
4. BSE Sofia BG — SKIPPED (source limitation)
Baseline: 1,320 Declaration rows, 100% null totalAmount.
Root cause: the feed (subject=2) is MAR Article 17 inside information
disclosures — financial reports, M&A signals, tender offers, treasury
share notifications. These are corporate news, NOT trade reports.
The BSE Sofia public news widget does NOT expose MAR Art 19 PDMR
("Persons Discharging Managerial Responsibilities") disclosures, which is
the regime that would carry per-trade values for insiders. The detail
page at /en/news/id/<newsId> is JavaScript-rendered, and the AJAX news
fragment endpoint only returns matches inside a from/to window, so the
news IDs we already have don't resolve to a parseable body.
The src/lib/ingest/bse-bg.ts header comment already states this:
The insider-info news doesn't expose per-trade values, so unitPrice / volume / totalAmount stay null.
Verdict: no action. The 1,320 rows correctly represent the
upstream signal. To get per-trade values for Bulgaria we'd need a
different feed (MAR Art 19 PDMR, not exposed publicly by BSE Sofia).
Recommendation: tag these rows as transactionNature = "Insider info (MAR Art 17)" so downstream filters can drop them from "manager
transaction" cohorts.
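A downstream filter built on that tag could be as small as the sketch below. The DeclarationRow shape and the managerTransactionCohort name are hypothetical; only the tag string comes from the recommendation above.

```typescript
const MAR17_TAG = "Insider info (MAR Art 17)";

// Hypothetical minimal view of a Declaration row.
interface DeclarationRow {
  id: string;
  transactionNature: string | null;
}

// Keeps only rows that are not tagged as MAR Art 17 inside information.
function managerTransactionCohort<T extends DeclarationRow>(rows: T[]): T[] {
  return rows.filter((row) => row.transactionNature !== MAR17_TAG);
}
```

Matching on an exact string keeps the filter simple, at the cost of requiring the tag to be written identically at ingest time.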