48 · Sources Intelligence Audit · 2026-05-17
Scope. Validate the 15 currently-ingested regulators and scout new public, free, machine-readable insider/PDMR registers worth adding. Pure intelligence report, no code changes. Web research date: 2026-05-17.
Glossary. PDMR = Person Discharging Managerial Responsibilities (MAR Art. 19). DD = Directors' Dealings. MAR = EU Market Abuse Regulation 596/2014. OAM = Officially Appointed Mechanism (national filing storage).
Section A · Current 15-regulator coverage health
Reading from src/lib/ingest/*.ts, the active source modules are: afm, asx-au, bafin, cnmv, consob, cvm-br, dart-kr, dk, euronext-ie, fi-se, fma-at, fsma-be, hel-fi, hkex (+ hkex-merge), jp-edinet, oslo-no, rns-investegate (UK), sebi-trendlyne, sedi (Canada), sgx, six-ser. That is 21 modules covering the 15 jurisdictions in the public pitch (the EU Nordics share a Nasdaq backend, so several modules feed one logical "Nordics" surface).
| # | Jurisdiction | Regulator/source | Endpoint type | Health | Notes & gaps |
|---|---|---|---|---|---|
| 1 | US | SEC EDGAR Forms 3/4/5 | Atom RSS + bulk submissions JSON, hard cap 10 req/s | Healthy in theory. Not currently in repo. | EDGAR is the canonical source. The repo has no sec/ or edgar/ module · this is the largest visible gap. See Section D #1. |
| 2 | France | AMF (geco.amf-france.org) · lestransactions.fr mirror | CSV daily; data.gouv.fr dataset also exposes it | Healthy. | Confirm we hit the AMF source directly, not a mirror. data.gouv.fr publishes a cleaned CSV at data.gouv.fr/datasets/transactions-des-dirigeants-publiees-par-lautorite-des-marches-financiers which is a safer canonical fallback. |
| 3 | Netherlands | AFM meldingenregister (publicatie voorwetenschap & geplaatst kapitaal) | XML download per register · no documented JSON API | Working but coverage thin. | The dedicated PDMR register lives under meldingenregisters/openbaarmaking-voorwetenschap. Verify we are pulling the PDMR file, not the home-member-state file. |
| 4 | Germany | BaFin portal.mvp.bafin.de/database/DealingsInfo/sucheForm.do | HTML form + CSV export via displaytag flag, A-Z letter pagination | CRITICAL · only 295 filings ingested | See Section B. Root cause is upstream: BaFin's database itself only stores rows whose publication is younger than 12 months · so the universe is rolling-window, not historical. There is no historical archive endpoint. |
| 5 | UK | FCA PDMR via RNS · scraped from Investegate | HTML scrape | Healthy. | RNS itself is paywalled (LSEG). Investegate is the standard free mirror. Consider also news.aquis.eu for AIM-quoted issuers not on LSE. |
| 6 | Italy | CONSOB internal-dealing area pubblica | HTML | Working. | No machine-readable API documented. Borsa Italiana mirror at borsaitaliana.it/notizie/sotto-la-lente/internaldealing.htm is parallel surface. |
| 7 | Spain | CNMV "Hechos Relevantes / Comunicaciones de directivos" | RSS channel exists (cnmv.es/portal/gpage?id=RSS) · standard forms register at Portal/Legislacion/ModelosN/ModelosN?id=COM | Working. | Confirm the specific RSS feed ID being consumed (CNMV has multiple feeds, only one carries directors' comms). |
| 8 | Sweden | Finansinspektionen Insynsregistret / PDMR transactions register | CSV direct download from fi.se/en/our-registers/pdmr-transactions/ · also documented open-data CSV | Healthy. | Best-in-class source. The Python lib insynsregistret (djonsson) is the de-facto reference. |
| 9 | Norway | Oslo Børs NewsWeb (OAM) · Finanstilsynet primærinnsider | HTML announcements feed | Working. | As of 2025-04-01, supervisory duties moved from Oslo Børs to Finanstilsynet, but NewsWeb remains the OAM. No JSON endpoint documented. |
| 10 | Finland | Nasdaq Helsinki news service | HTML + Nasdaq Nordic CSV export | Working. | Same Nasdaq Nordic plumbing as Copenhagen/Stockholm. |
| 11 | Denmark | Nasdaq Copenhagen news service | HTML + Nasdaq Nordic CSV export | Working. | Same plumbing as #10. |
| 12 | Ireland | Euronext Dublin · Central Bank of Ireland PDMR feed | RNS-style announcements (also re-published on Investegate) | Working but thin volume. | Most Irish PDMR notices also surface on RNS via UK module. De-duplication risk. |
| 13 | Belgium | FSMA fsma.be/nl/transaction-search + data portal | HTML search + bulk lists via data portal | Working. | Threshold is EUR 20k/year per PDMR. Confirm data portal export URL is stable. |
| 14 | Switzerland | SIX SER Management Transactions (six-group.com/.../official-notices.html) | HTML + commercial XML feed (Knowledge Direct, SOAP/XML, paid) | Working but free tier is HTML-only. | No free JSON. Scraping is the only cost-free path. |
| 15 | Austria | FMA Directors' Dealings (webhost.fma.gv.at/DirectorsDealings/) | HTML web form, no documented bulk export | Working. | XML schema exists for submission but not publication. Volume is naturally small (ATX is ~38 issuers). |
| 16 | Australia | ASX announcements (Appendix 3Y / 3Z change-of-director's-interest) | HTML company-news pages, no official RSS/JSON | Working via scrape. | Third-party WebLink Data API would be paid. Free path is HTML scrape of company-news lists. |
| 17 | Canada | SEDI insider transactions | HTML public-search forms | Working but painful. Search-only, no bulk export. | Public path is the only legal free option · SEDI has no API. TSX Insider has a paid JSON API. |
| 18 | Hong Kong | HKEX DI (Disclosure of Interests) via DION system + HKEXnews | HTML search at di.hkex.com.hk and hkexnews.hk | Working. | Forms 3B/3C are the director-specific filings. No JSON, scrape is the standard path. |
| 19 | Japan | EDINET API v2 (api.edinet-fsa.go.jp/api/v2/) | Documents JSON + XBRL/CSV downloads, free API key | Healthy. | Coverage is "large-shareholding reports" (5%+) · NOT classic insider trades. This is structurally a different filing type · see Section B caveat. |
| 20 | Korea | OpenDART (opendart.fss.or.kr) | JSON REST, free API key | Healthy. | Best-documented Asian source. Equity disclosure category DS004 carries 대량보유 상황보고 (5%+). |
| 21 | Singapore | SGX sgx.com/securities/company-announcements (Appendix 7B) | HTML, no public JSON | Working. | Disclosure is per-issuer announcement, not a central register. Scrape is the only path. |
| 22 | India | SEBI insider data via Trendlyne mirror | HTML scrape of Trendlyne | Working but fragile. | NSE (nseindia.com/companies-listing/corporate-filings-insider-trading) and BSE both publish first-party. Trendlyne is convenient but not authoritative · de-risk by adding NSE/BSE direct. |
| 23 | Brazil | CVM Portal Dados Abertos (dados.cvm.gov.br) | CSV datasets (CVM 44, formulário 5) | Working. | The open-data portal exposes structured CSV. This is one of the cleanest free Latin sources. |
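The de-duplication risk flagged in row 12 (Irish PDMR notices also surfacing through the UK RNS module) could be handled with a source-independent key. The sketch below is illustrative only: the Filing shape and its field names are assumptions, not the repo's actual types, and the choice of key fields would need checking against real overlapping notices.

```typescript
// Hypothetical normalized filing shape; field names are assumptions,
// not taken from src/lib/ingest.
interface Filing {
  source: string;    // e.g. "euronext-ie" or "rns-investegate"
  isin: string;
  pdmrName: string;
  tradeDate: string; // ISO date, YYYY-MM-DD
  shares: number;
  price: number;
}

// Build a key that ignores the source module, so the same notice
// seen through two modules maps to one key.
function dedupKey(f: Filing): string {
  const name = f.pdmrName.trim().toLowerCase().replace(/\s+/g, " ");
  return [
    f.isin.trim().toUpperCase(),
    name,
    f.tradeDate,
    String(f.shares),
    f.price.toFixed(4),
  ].join("|");
}

// Keep the first filing seen for each key; later duplicates are dropped.
function dedupe(filings: Filing[]): Filing[] {
  const seen = new Map<string, Filing>();
  for (const f of filings) {
    const key = dedupKey(f);
    if (!seen.has(key)) {
      seen.set(key, f);
    }
  }
  return Array.from(seen.values());
}
```

Because the first occurrence wins, the order in which modules are merged decides which source is kept for a duplicated notice.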
Headline gap. The repo lists "15 regulators" publicly but does not ingest SEC EDGAR. That is the world's deepest insider-trade source by 2 orders of magnitude. Section D treats this as priority 1.
Section B · BaFin deep-dive (CRITICAL)
Reported symptom. 295 ingested rows vs an expected 4-figure annual flow.
Root cause #1 · structural, not a scraper bug. BaFin explicitly states the public database "contains securities transactions whose publication is not older than one year." It is a 12-month rolling window. There is no historical archive on the public side. Pre-2025-05 filings have been purged from the source · we cannot recover them from BaFin.
Root cause #2 · 2026 threshold change. Effective 2026-01-01, BaFin raised the MAR Art. 19(9) per-PDMR-per-year threshold from EUR 20k to EUR 50k. Mechanical effect: roughly 30-50% drop in expected filing volume Q1 2026 onward compared to 2025 baselines. This is regulatory, not technical.
Root cause #3 · letter-pagination ceiling. The current scraper iterates emittentName={A-Z, Other}. BaFin's displaytag paginates results per letter at ~250 rows per page with no documented "all" parameter. If any letter exceeds the per-page display cap and the scraper does not follow pagination, filings get silently dropped. Recommend instrumenting per-letter row counts and comparing against the HTML "X results" header.
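The per-letter instrumentation recommended above could look roughly like the following sketch. The "N results" header pattern is an assumption about BaFin's markup and must be verified against the live page; LetterCount and both function names are hypothetical.

```typescript
// One row of instrumentation per emittentName bucket (A-Z, Other).
interface LetterCount {
  letter: string;
  reported: number | null; // total from the HTML header, null if not found
  scraped: number;         // rows the scraper actually parsed
}

// Extract the total from a header such as "312 results".
// ASSUMPTION: the real header text and markup may differ.
function parseReportedTotal(html: string): number | null {
  const m = /([\d.,]+)\s+results/i.exec(html);
  if (!m) return null;
  const digits = (m[1] ?? "").replace(/[.,]/g, "");
  const n = Number.parseInt(digits, 10);
  return Number.isNaN(n) ? null : n;
}

// Letters where fewer rows were scraped than the page reported,
// i.e. candidates for silently dropped pagination pages.
function findShortfalls(counts: LetterCount[]): LetterCount[] {
  return counts.filter((c) => c.reported !== null && c.scraped < c.reported);
}
```

Letters whose header could not be parsed are not flagged here; logging them separately would avoid mistaking a markup change for a clean run.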
Root cause #4 · date filter under-used. The form accepts dateVon / dateBis (custom From/To). Pulling once per letter without a date filter is correct for catch-up, but for daily delta we should query Last week (zeitraum=letzteWoche) once and union. Cuts traffic by ~50x and dodges the pagination ceiling.
Recommended endpoints (none undocumented, just used more aggressively):
- Browse, EN locale, with date filter: https://portal.mvp.bafin.de/database/DealingsInfo/sucheForm.do?emittentName=A&locale=en_GB&zeitraum=gesamterZeitraum
- CSV export of same query: append &6578706f7274=1&d-4000784-e=1 (hex of "export").
- Last-week-only daily delta (recommended): ...sucheForm.do?locale=en_GB&zeitraum=letzteWoche
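As a minimal sketch, the three URL shapes listed above can be assembled by one helper. The parameter names and values come from this section; the function and type names are hypothetical, and the export suffix is reproduced as observed rather than from any BaFin documentation.

```typescript
const BAFIN_BASE =
  "https://portal.mvp.bafin.de/database/DealingsInfo/sucheForm.do";

// Hex-encode an ASCII string, e.g. "export" -> "6578706f7274".
function hexOf(text: string): string {
  return Array.from(text)
    .map((ch) => ("0" + ch.charCodeAt(0).toString(16)).slice(-2))
    .join("");
}

// Only the two zeitraum values mentioned in this audit are modelled.
type Zeitraum = "gesamterZeitraum" | "letzteWoche";

interface BafinQuery {
  zeitraum: Zeitraum;
  letter?: string; // emittentName bucket; omit for the weekly delta
  csv?: boolean;   // append the displaytag export flags
}

function bafinUrl(q: BafinQuery): string {
  const parts: string[] = [];
  if (q.letter !== undefined) {
    parts.push("emittentName=" + encodeURIComponent(q.letter));
  }
  parts.push("locale=en_GB", "zeitraum=" + q.zeitraum);
  if (q.csv) {
    // The first flag name is the hex encoding of "export".
    parts.push(hexOf("export") + "=1", "d-4000784-e=1");
  }
  return BAFIN_BASE + "?" + parts.join("&");
}
```

For the daily delta, `bafinUrl({ zeitraum: "letzteWoche" })` yields the single last-week query; the catch-up path would loop over letters with `zeitraum: "gesamterZeitraum"`.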
Bottom line. BaFin is working as designed · the volume is genuinely lower than other large markets because (a) only 12 months are public, (b) the 2026 threshold hike is real, (c) the issuer universe is smaller than the US. The 295-row count is plausibly correct for the rolling window. For backfill, BaFin filings older than 12 months are not recoverable from the public portal. Commercial alternatives (mmvo.de, dgap.de) keep history.
Caveat on EDINET (related). The Japan ingester pulls EDINET document-type 2 (large-shareholding reports, "tairyou hoyu houkokusho"). This is a 5%-of-outstanding threshold filing, NOT an insider/PDMR transaction. It is closer to a 13G than a Form 4. If the product page describes Japan coverage as "insider trades", that is a stretch. To get actual Japanese officer/director transactions you need TSE TDnet "Change in Major Shareholders" plus issuer-side "Notification of Status Change of Officer" filings · neither is in EDINET-2. Flag for marketing review.
Section C · New markets shortlist (ranked by ease × value)
Rationale columns: legal regime, format quality, language barrier, free-tier filing-volume class (S = <500/yr, M = 500-5k, L = 5k-50k, XL = >50k).
| Rank | Market | Regulator / endpoint | Format | Volume | Lang | Effort | Notes |
|---|---|---|---|---|---|---|---|
| 1 | USA · SEC | EDGAR Atom RSS sec.gov/cgi-bin/browse-edgar?action=getcurrent&type=4 + bulk JSON data.sec.gov/submissions/CIK*.json | RSS + JSON | XL | EN | LOW | Standard 10 req/s ceiling. Forms 3/4/5 are the global benchmark. Mandatory. |
| 2 | Turkey · KAP | Public Disclosure Platform kap.org.tr/en/bildirim-sorgu · undocumented kap.org.tr/en/api/Bildirim* endpoints | JSON-ish (used by frontend) | L | TR/EN | LOW | English locale exists. cemsinano/pykap proves the API is scrapable. PDMR notices appear under "Material Event Disclosure" with subcategory "Şirket Genel Bilgi · Şirket Yöneticilerinin İşlemleri". |
| 3 | Taiwan · TWSE MOPS | mops.twse.com.tw/server-java/t58query + monthly insider-equity transfer report | HTML / Excel exports per query | M-L | ZH/EN | MED | EN locale partial. Directors' monthly shareholding change tables are downloadable per-issuer. No central daily feed. |
| 4 | Poland · KNF ESPI / GPW | espi.knf.gov.pl/ + gpw.pl/komunikaty + PAP mirror espiebi.pap.pl | XML reports | M | PL/EN | MED | PDMR notices flow through ESPI category 19. PAP exposes a public list page; XML body of each report is downloadable. |
| 5 | New Zealand · NZX | announcements.nzx.com + api.nzx.com/public/announcement/<id>/attachment/... | HTML list + per-announcement attachment URLs | S-M | EN | LOW | The PDMR-equivalent is the Ongoing Disclosure Notice for directors and senior managers (Appendix 3Y / 3Z are ASX forms). No documented RSS but per-announcement IDs are sequential; scraping the announcement type filter is trivial. |
| 6 | South Africa · JSE/FSCA | SENS via JSE jse.co.za/market-data/market-announcements + free mirrors sharenet.co.za/v3/sens.php, moneyweb.co.za/tools-and-data/moneyweb-sens | HTML | M | EN | LOW | Director-dealing announcements are SENS category 9. Mirrors are stable. FSCA itself is regulator-only, not the disclosure venue. |
| 7 | Mexico · CNBV/BMV | BMV bmv.com.mx/es/emisoras/eventosrelevantes + CNBV STIV-2 system + XBRL taxonomies | XBRL + HTML | M | ES | MED | "Eventos relevantes" includes PDMR-equivalent disclosures but they are mixed with all material events · keyword filter needed. |
| 8 | Chile · CMF | cmfchile.cl/institucional/hechos/hechos.php + the independent hechos-esenciales.cl aggregator | HTML | S-M | ES | MED | Article 20 of Law 18.045 covers director/manager share transactions. Aggregator site is a useful fallback. |
| 9 | Indonesia · IDX / OJK | IDX idx.co.id/en/listed-companies/disclosure + OJK Rule 31/POJK.04/2015 | HTML PDFs | M | ID/EN | HIGH | PDMR disclosures are buried inside per-issuer PDF announcements. OCR or NLP needed. Skip until pipeline supports PDF extraction. |
| 10 | Philippines · PSE EDGE | edge.pse.com.ph/companyDisclosures/ | HTML | S-M | EN | LOW | Disclosures are well-structured per issuer. Director/officer ownership filings tagged separately. |
| · | Saudi Arabia · Tadawul | saudiexchange.sa disclosures | HTML, AR/EN | M | AR/EN | HIGH | No clean free machine-readable feed; ICE feed is paid. Defer. |
| · | Thailand · SET / SEC | set.or.th + SMART Marketplace | API (paid tier dominant) | M | TH/EN | HIGH | Form 59 insider transactions exist but bulk free access is unclear. Defer. |
| · | Vietnam · HOSE/HNX | hnx.vn/en-gb/thong-tin-cong-bo-up-hnx.html | HTML | S | VI/EN | HIGH | Pre-trade announcement model (3-day advance notice) is structurally different · feature, not bug, but ingestion logic must accommodate. |
| · | Greece · HCMC, Czech · CNB, Romania · BVB, Hungary · MNB | National OAMs under MAR | varies | S each | local | HIGH | Low individual ROI. Roll up via a future "EU long-tail" pass. |
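The Volume column uses the class bands defined in the rationale line (S = <500/yr, M = 500-5k, L = 5k-50k, XL = >50k). A small sketch of that mapping follows; the handling of the exact boundaries 5,000 and 50,000 is an assumption, since the bands as written overlap at 5k, and the function name is hypothetical.

```typescript
type VolumeClass = "S" | "M" | "L" | "XL";

// Map an estimated annual filing count to the audit's volume class.
// ASSUMPTION: 5000 is treated as L and 50000 as L (XL is strictly >50k).
function volumeClass(filingsPerYear: number): VolumeClass {
  if (filingsPerYear < 500) return "S";
  if (filingsPerYear < 5000) return "M";
  if (filingsPerYear <= 50000) return "L";
  return "XL";
}
```

For example, an estimate of 10,000 to 20,000 disclosures per year falls in class L under this mapping.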
Section D · Implementation priority queue (next 5)
1. SEC EDGAR (United States). Estimated dev: 1.5-2 days. Two endpoints to wire: (a) Atom feed sec.gov/cgi-bin/browse-edgar?action=getcurrent&type=4&output=atom for the daily delta, (b) per-CIK data.sec.gov/submissions/CIK{padded}.json for backfill. XML body of Form 4 is parsed via the xbrl/<accession>/primary_doc.xml file. Rate-limit ceiling 10 req/s, User-Agent must include contact email. ROI: this single source likely doubles total ingested filings and is the strongest GEO/SEO signal because every comparator covers it.
2. Turkey · KAP. Estimated dev: 1 day. Reverse the frontend's JSON calls (no auth required). The pykap library on GitHub already maps the endpoints. Yields ~10-20k disclosures/yr, of which a few thousand are PDMR-relevant. High novelty value · few EN-language competitors index Borsa Istanbul PDMR data.
3. New Zealand · NZX. Estimated dev: 0.5 day. The api.nzx.com/public/announcement/{id}/attachment/... URL pattern is documented in our own search results · IDs are sequential, the announcement-list HTML carries the type tag. Pure scrape, no auth.
4. Poland · KNF ESPI via PAP. Estimated dev: 1 day. espiebi.pap.pl exposes a free, paginated list of ESPI/EBI reports. PDMR notices are typically ESPI report category 19; full XML available per-report.
5. South Africa · JSE SENS via Sharenet mirror. Estimated dev: 0.5 day. sharenet.co.za/v3/sens.php is stable and free. Category filter pulls director-dealings only. JSE Client Portal exists but requires registration; the mirror is operationally simpler.
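For priority 1, the two constraints named above (zero-padded CIK in the submissions URL, a 10 req/s ceiling with a contact email in the User-Agent) can be sketched as pure helpers. This is a sketch under assumptions: the function names and the "insider-ingest" application label are hypothetical, and the actual HTTP call and sleep are left to the caller.

```typescript
const SEC_MAX_RPS = 10; // ceiling cited in this audit

// data.sec.gov expects the CIK left-padded with zeros to 10 digits.
function submissionsUrl(cik: number | string): string {
  const digits = String(cik).replace(/\D/g, "");
  const padded = ("0000000000" + digits).slice(-10);
  return "https://data.sec.gov/submissions/CIK" + padded + ".json";
}

// Describe the request without sending it; the caller performs the fetch.
// "insider-ingest" is a placeholder application name.
function secRequest(cik: number | string, contactEmail: string) {
  return {
    url: submissionsUrl(cik),
    headers: { "User-Agent": "insider-ingest " + contactEmail },
  };
}

// Serial throttle: each call reserves the next send slot and returns
// how many milliseconds the caller should wait before sending.
function makeThrottle(
  maxPerSecond: number = SEC_MAX_RPS,
  now: () => number = Date.now,
): () => number {
  const minGapMs = 1000 / maxPerSecond;
  let nextSlot = 0;
  return function delayFor(): number {
    const t = now();
    const start = Math.max(t, nextSlot);
    nextSlot = start + minGapMs;
    return start - t;
  };
}
```

A backfill loop would call `delayFor()`, wait that long, then issue the request from `secRequest(cik, email)`; spacing requests evenly at 100 ms keeps a single worker at or below the ceiling, though several workers would need to share one throttle.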
Stretch (next-next 5, lower priority):
- SEC EDGAR Forms 3 & 5 (complement Form 4).
- Taiwan · TWSE monthly insider equity-transfer reports.
- Philippines · PSE EDGE.
- Mexico · CNBV STIV-2 / BMV eventos relevantes.
- Chile · CMF hechos esenciales.
Deferred (low ROI or hard access): Saudi Tadawul, Thailand SET, Vietnam HOSE/HNX, Indonesia IDX (PDF-locked), Greek HCMC, Czech CNB, Romanian BVB, Hungarian MNB. Recommend revisiting in 2026-H2 with a PDF/OCR pipeline in place.
Marketing-claim health check
The site currently markets "15 regulators". With the SEC gap closed (priority 1), the realistic count becomes 16. Adding priorities 2-5 brings it to 20 within ~4 dev-days. The "Japan insider" claim should be softened to "Japan large-shareholding reports" or supplemented by TSE TDnet ingestion to be technically accurate · see Section B caveat.
End of audit.