84. SEO deep audit — 2026-05-19

Scope

Full SEO sweep of https://insiders-trades.com:

Internal link mesh (depth-3 sample, 150 URLs)
4xx/5xx crawler over a sampled set
Sitemap coverage versus indexable surfaces
robots.txt and per-page meta robots
Canonical correctness on paginated and faceted listings
Structured data presence (JSON-LD)
Title and meta description length and uniqueness
Internal anchor descriptiveness
Top fixes shipped in this commit
Deferred recommendations

All counts captured from production HTML fetched at audit time.

TL;DR

Surface	Before	After
Root sitemap index children	119	119 (unchanged)
Blog EN entries	51	51 (1 DB row missing, see section 2.1)
Blog FR entries	52	52
Paginated `/companies/page/N/` entries	174 (EN) + 174 (FR)	unchanged (already covered)
Paginated `/insiders/page/N/` entries	732 (EN) + 732 (FR)	unchanged (already covered)
Paginated `/blog/?page=N` entries	0	+4 per locale (5 pages, page 1 is `/blog/`)
Paginated `/companies/by-market/[m]/?page=N` entries	0	+N per LIVE market per locale
Paginated `/insiders/by-market/[m]/?page=N` entries	0	+N per LIVE market per locale
Sets of `<link rel="alternate" hreflang>` tags per HTML page	2 (layout + page)	1 (page only)
4xx in 150-URL sample	0	0
5xx in 150-URL sample	1 transient (retry 200)	0

1. Internal link mesh

Random 150-URL sample drawn from the union of sitemap-static/{en,fr}/index.xml, sitemap-blog/{en,fr}/index.xml, and sitemap-docs.xml (2205 total URLs). Outbound internal link counts on key surfaces:

Page	Internal links
`/` (home)	159
`/blog/`	62
`/blog/closely-associated-insiders-the-hidden-alpha-signal/`	46
`/methodologie/`	41
`/companies/`	164
`/companies/by-market/us/`	90
`/insiders/by-role/ceo/`	39

All sampled pages clear the target floor of 30 internal links. No orphan pages detected in the sample. Sitemap surfaces with very low inbound link density (e.g. legacy /sandbox/) are intentionally excluded from LANDING_PATHS.
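The internal-link counts above could be reproduced with a check along these lines. This is a minimal sketch, not the audit's actual tooling: `countInternalLinks` and `clearsLinkFloor` are hypothetical names, and the regex extraction assumes links are plain `<a href>` tags in the fetched HTML.

```typescript
// Hypothetical helper: counts <a href> targets that stay on the audited site.
// Regex extraction is a simplification; a real crawler would use an HTML parser.
const SITE_ORIGIN = "https://insiders-trades.com";
const LINK_FLOOR = 30; // target floor quoted in the audit

function countInternalLinks(html: string, origin: string = SITE_ORIGIN): number {
  const hrefPattern = /<a\b[^>]*\bhref\s*=\s*["']([^"']+)["']/gi;
  let count = 0;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const href = match[1] ?? "";
    // Root-relative paths ("/blog/") and absolute URLs on the same origin are internal.
    const isRootRelative = href.startsWith("/") && !href.startsWith("//");
    const isSameOrigin = href === origin || href.startsWith(origin + "/");
    if (isRootRelative || isSameOrigin) count += 1;
  }
  return count;
}

function clearsLinkFloor(html: string): boolean {
  return countInternalLinks(html) >= LINK_FLOOR;
}
```

Protocol-relative links (`//host/...`) are treated as external here, which is a deliberate simplification.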

2. Sitemap coverage gaps

2.1 EN blog row missing for `closely-associated-insiders-the-hidden-alpha-signal`

The article is published in FR (/fr/blog/... → 200, in sitemap-blog/fr). The EN URL /blog/closely-associated-insiders-the-hidden-alpha-signal/ also returns 200 and renders correctly, yet the EN sitemap (sitemap-blog/en) omits it. Root cause: the BlogArticle row for locale=en is not present in the production DB (the seed scripts/_seed-blog-closely-associated.ts inserts both locales but was likely run before the EN section landed, or the EN row has status != "published").

Action: not a code fix in this commit. The owner needs to re-run the seed or set the production row's status to published. Once the row is published, the EN sitemap should pick it up on the next revalidation, at which point this audit row turns green without a code change.

2.2 Paginated surfaces missing from sitemap (FIXED)

Three indexable, self-canonicalised paginated patterns were absent:

/blog/?page=N — 5 pages exist, only page 1 was advertised.
/companies/by-market/[market]/?page=N — emitted for the 14 LIVE markets only.
/insiders/by-market/[market]/?page=N — emitted for the 14 LIVE markets only.

sitemap-static/[lang]/index.xml/route.ts now enumerates them, gated on LIVE markets (mirror of LIVE_MARKETS in src/app/companies/by-market/[market]/page.tsx) to avoid emitting URLs that the page-level generateMetadata flags as robots: noindex.
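The LIVE-market gating described above can be sketched as follows. Everything here is an assumption for illustration: the function names, the `companyCounts` input, and the `/fr` path prefix for French URLs are not taken from the route's real code.

```typescript
// Sketch only: names and inputs are assumptions, not the route's real code.
type Locale = "en" | "fr";

const BASE_URL = "https://insiders-trades.com";

// Pages 2..lastPage; page 1 is the bare listing URL and is listed separately.
function extraPageNumbers(itemCount: number, pageSize: number): number[] {
  const lastPage = Math.ceil(itemCount / pageSize);
  const pages: number[] = [];
  for (let page = 2; page <= lastPage; page += 1) pages.push(page);
  return pages;
}

function localePrefix(locale: Locale): string {
  return locale === "fr" ? "/fr" : ""; // assumption: EN is served unprefixed
}

// companyCounts is a hypothetical map of market code -> company count.
function marketPaginationUrls(
  locale: Locale,
  liveMarkets: string[],
  companyCounts: Record<string, number>,
  pageSize: number
): string[] {
  const urls: string[] = [];
  // Only LIVE markets are iterated, so noindex markets never reach the sitemap.
  for (const market of liveMarkets) {
    const count = companyCounts[market] ?? 0;
    for (const page of extraPageNumbers(count, pageSize)) {
      urls.push(`${BASE_URL}${localePrefix(locale)}/companies/by-market/${market}/?page=${page}`);
    }
  }
  return urls;
}
```

Driving the loop from the LIVE list, rather than filtering afterwards, keeps a market present in the counts but absent from the list from leaking into the output.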

2.3 `/insiders/by-role/[role]/?page=N` deliberately deferred

Role bucketing happens in JS (normalizeRole over Insider.primaryRole raw strings) and cannot be SQL-counted cheaply. Page 1 of each role remains in LANDING_PATHS. Deeper pages stay reachable via the page-1 UI.

Future: bake primaryRoleBucket as a denormalised column at ingest time, then enumerate pagination from the sitemap.

3. Errors crawl

Status	Count	Notes
200	149
500	1	`/companies/page/125/`, transient: retry returned 200.

No 4xx detected in the sample. No persistent 5xx.
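The transient-versus-persistent distinction in the table can be expressed as a small classifier over the statuses observed for each URL. This is an illustrative sketch; `classifyAttempts` and `tallyVerdicts` are hypothetical names, and the crawler that records the attempts is out of scope.

```typescript
// Hypothetical classifier: `attempts` holds the HTTP status of each try, in order.
type CrawlVerdict = "ok" | "client-error" | "transient-5xx" | "persistent-5xx";

function classifyAttempts(attempts: number[]): CrawlVerdict {
  if (attempts.length === 0) throw new Error("at least one attempt is required");
  const last = attempts[attempts.length - 1] ?? 0;
  const sawServerError = attempts.some((status) => status >= 500);
  // The final attempt decides; an earlier 5xx only downgrades a success to "transient".
  if (last >= 500) return "persistent-5xx";
  if (last >= 400) return "client-error";
  return sawServerError ? "transient-5xx" : "ok";
}

function tallyVerdicts(results: number[][]): Record<CrawlVerdict, number> {
  const tally: Record<CrawlVerdict, number> = {
    "ok": 0,
    "client-error": 0,
    "transient-5xx": 0,
    "persistent-5xx": 0,
  };
  for (const attempts of results) tally[classifyAttempts(attempts)] += 1;
  return tally;
}
```

Under this scheme, a URL that returned 500 and then 200 on retry, as `/companies/page/125/` did, lands in the `transient-5xx` bucket.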

4. robots.txt + per-page meta robots

/robots.txt returns 200. Sitemap declared. Allow list covers all canonical surfaces in both EN and FR. /admin/, /api/, /account/, /portfolio/, /recommendations/, /companies/add/, /_next/, and the legacy /en/ alias are disallowed.
Major AI scrapers (GPTBot, CCBot, anthropic-ai, Google-Extended) disallowed.
Per-page meta robots: non-LIVE markets (at, ie, kr, in) correctly emit robots: { index: false, follow: true } until they cross the declaration threshold. Filter params (?category=, ?market=) correctly demote to noindex,follow. Paginated URLs (?page=N with N >= 2) remain indexable, each with a self-canonical pointing to the same paginated URL.
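The per-page rules above reduce to a small decision function. The sketch below restates them under assumed names (`ListingView`, `robotsForListing`); it is not the project's `generateMetadata` code.

```typescript
// Sketch of the indexing rules described above; field and function names are illustrative.
interface RobotsDirective {
  index: boolean;
  follow: boolean;
}

interface ListingView {
  marketIsLive: boolean; // false for markets still below the declaration threshold
  filterParams: string[]; // names of active filters, e.g. ["category"]
  page: number; // 1 for the bare listing URL; deliberately not consulted below
}

function robotsForListing(view: ListingView): RobotsDirective {
  // Non-LIVE markets and filtered views stay crawlable but out of the index.
  if (!view.marketIsLive || view.filterParams.length > 0) {
    return { index: false, follow: true };
  }
  // Plain pagination (page >= 2) is indexable, same as page 1.
  return { index: true, follow: true };
}
```

Keeping `follow: true` in every branch means link discovery through noindex pages is unaffected.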

5. Canonical correctness

Sampled /blog/[slug]/, /methodologie/, /companies/, /companies/by-market/us/, /insiders/by-role/ceo/, /: each emits a single self-canonical pointing to the locale-appropriate URL. BASE is the production canonical https://insiders-trades.com on every sampled page.
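A check for "exactly one self-canonical" could look like the sketch below. The helper names are hypothetical, and the regexes stand in for a proper HTML parser.

```typescript
// Illustrative check, not the audit's tooling: regexes stand in for an HTML parser.
function readAttribute(tag: string, name: string): string | null {
  const match = new RegExp(`\\b${name}\\s*=\\s*["']([^"']*)["']`, "i").exec(tag);
  return match ? match[1] ?? null : null;
}

// Collects the href of every <link rel="canonical"> in the document.
function canonicalHrefs(html: string): string[] {
  const hrefs: string[] = [];
  const linkTag = /<link\b[^>]*>/gi;
  let match: RegExpExecArray | null;
  while ((match = linkTag.exec(html)) !== null) {
    const tag = match[0];
    if ((readAttribute(tag, "rel") ?? "").toLowerCase() === "canonical") {
      const href = readAttribute(tag, "href");
      if (href !== null) hrefs.push(href);
    }
  }
  return hrefs;
}

// True only when the page emits one canonical and it equals the expected URL.
function hasSingleSelfCanonical(html: string, expectedUrl: string): boolean {
  const hrefs = canonicalHrefs(html);
  return hrefs.length === 1 && hrefs[0] === expectedUrl;
}
```

The expected URL is passed in by the caller, so the same check covers both locales and paginated URLs.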

6. Structured data

Organization and WebSite (with SearchAction) emitted on every page via the root layout. Blog articles carry Article plus Person-typed author with sameAs. Listing pages (/companies/, /companies/by-market/[m]/, /insiders/by-role/[r]/) carry CollectionPage plus BreadcrumbList.

Sample @type counts (lower bound, since one JSON-LD blob may hold a graph):

Page	`@type` occurrences
`/`	1
`/methodologie/`	1
`/blog/closely-associated-...`	1
`/companies/`	2
`/companies/by-market/us/`	2
`/insiders/by-role/ceo/`	2

Layout-level JSON-LD is a single @graph array, hence the value of 1 on content pages that do not add their own collection schema.
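To see why the per-blob figures are a lower bound, the typed nodes inside each JSON-LD payload can be counted recursively, expanding `@graph` and nested entities. This is a sketch under assumed names (`countTypedNodes`, `countTypesInBlobs`); the input is the raw text of each `application/ld+json` script.

```typescript
// Hypothetical helper: counts nodes carrying "@type" across JSON-LD payloads,
// descending into @graph arrays and nested entities.
type JsonValue = string | number | boolean | null | JsonValue[] | { [key: string]: JsonValue };

function countTypedNodes(value: JsonValue): number {
  if (Array.isArray(value)) {
    return value.reduce((sum: number, item) => sum + countTypedNodes(item), 0);
  }
  if (value !== null && typeof value === "object") {
    let count = "@type" in value ? 1 : 0;
    for (const key of Object.keys(value)) {
      count += countTypedNodes(value[key] ?? null);
    }
    return count;
  }
  return 0; // strings, numbers, booleans and null carry no types
}

function countTypesInBlobs(blobs: string[]): number {
  return blobs.reduce((sum, blob) => sum + countTypedNodes(JSON.parse(blob) as JsonValue), 0);
}
```

A single layout blob whose `@graph` holds Organization and WebSite with a SearchAction therefore counts as one blob but three typed nodes.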

7. Titles and descriptions

Spot-checked the six pages listed below: all within length budget, locale-correct, and unique.

Page	Title length	Description length
`/`	72	not sampled (rendered via metadata)
`/blog/closely-associated-...`	73	159
`/methodologie/`	49	153
`/companies/`	56	n/a
`/companies/by-market/us/`	60	within budget
`/insiders/by-role/ceo/`	42	within budget

No duplicates detected across the sample.
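A length and uniqueness check of this kind can be sketched as below. The audit does not state its length budgets, so `maxTitle` and `maxDescription` are caller-supplied placeholders, and `auditMeta` is a hypothetical name.

```typescript
// Sketch only: budgets are parameters because the audit does not state them.
interface PageMeta {
  path: string;
  title: string;
  description?: string; // omitted when the description was not sampled
}

interface MetaIssue {
  path: string;
  problem: string;
}

function auditMeta(pages: PageMeta[], maxTitle: number, maxDescription: number): MetaIssue[] {
  const issues: MetaIssue[] = [];
  const firstPathByTitle = new Map<string, string>();
  for (const page of pages) {
    if (page.title.length > maxTitle) {
      issues.push({ path: page.path, problem: "title too long" });
    }
    if (page.description !== undefined && page.description.length > maxDescription) {
      issues.push({ path: page.path, problem: "description too long" });
    }
    // Titles are compared case-insensitively after trimming.
    const key = page.title.trim().toLowerCase();
    const earlier = firstPathByTitle.get(key);
    if (earlier !== undefined) {
      issues.push({ path: page.path, problem: `duplicate title of ${earlier}` });
    } else {
      firstPathByTitle.set(key, page.path);
    }
  }
  return issues;
}
```

An empty result corresponds to the "all within budget, no duplicates" outcome reported for the sample.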

8. Internal anchor text

Random pull from the /blog/ listing and /companies/: all anchors carry the article title or company name; no empty-context anchors such as "click here", "more", or "read more".
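The anchor check can be approximated with a short scan for generic link text. This is a hedged sketch: `genericAnchors` is a hypothetical name, the phrase list mirrors the three examples named above, and image-only links would need alt-text handling that is omitted here.

```typescript
// Phrases named in the audit as empty-context anchors.
const GENERIC_ANCHORS = ["click here", "more", "read more"];

// Returns the visible text of every anchor that is empty or generic.
// Simplification: image-only links have no text and are flagged as empty.
function genericAnchors(html: string): string[] {
  const anchor = /<a\b[^>]*>([\s\S]*?)<\/a>/gi;
  const flagged: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = anchor.exec(html)) !== null) {
    // Strip nested tags and collapse whitespace to get the visible text.
    const text = (match[1] ?? "").replace(/<[^>]*>/g, " ").replace(/\s+/g, " ").trim();
    if (text === "" || GENERIC_ANCHORS.indexOf(text.toLowerCase()) !== -1) {
      flagged.push(text);
    }
  }
  return flagged;
}
```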

9. Hreflang duplication (FIXED, critical)

Every HTML page in production emitted two identical sets of <link rel="alternate" hreflang="..."> tags — one from src/app/layout.tsx and one from each page's generateMetadata alternates.languages. Sample on /:

<link rel="alternate" hrefLang="en" href="https://insiders-trades.com/"/>
<link rel="alternate" hrefLang="fr" href="https://insiders-trades.com/fr/"/>
<link rel="alternate" hrefLang="x-default" href="https://insiders-trades.com/"/>
<link rel="alternate" hrefLang="en" href="https://insiders-trades.com/"/>
<link rel="alternate" hrefLang="fr" href="https://insiders-trades.com/fr/"/>
<link rel="alternate" hrefLang="x-default" href="https://insiders-trades.com/"/>

Per Google guidance, duplicate hreflang tags are treated as a soft conflict and may suppress the alternates entirely. The layout-level fallback was removed; every page already declares its own alternates.languages.
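A regression check for this issue could count how often each hreflang value appears on a page. The sketch below uses hypothetical helper names and a regex in place of an HTML parser; the attribute match is case-insensitive, so React's `hrefLang` spelling is covered.

```typescript
// Illustrative regression check; helper names are assumptions.
function readLinkAttribute(tag: string, name: string): string | null {
  const match = new RegExp(`\\b${name}\\s*=\\s*["']([^"']*)["']`, "i").exec(tag);
  return match ? match[1] ?? null : null;
}

// Maps each hreflang value to the number of rel="alternate" tags declaring it.
function hreflangCounts(html: string): Record<string, number> {
  const counts: Record<string, number> = {};
  const linkTag = /<link\b[^>]*>/gi;
  let match: RegExpExecArray | null;
  while ((match = linkTag.exec(html)) !== null) {
    const tag = match[0];
    const lang = readLinkAttribute(tag, "hreflang");
    if ((readLinkAttribute(tag, "rel") ?? "").toLowerCase() === "alternate" && lang !== null) {
      counts[lang] = (counts[lang] ?? 0) + 1;
    }
  }
  return counts;
}

// Lists hreflang values declared more than once on the page.
function duplicatedHreflangs(html: string): string[] {
  const counts = hreflangCounts(html);
  return Object.keys(counts).filter((lang) => (counts[lang] ?? 0) > 1);
}
```

Run against the six-tag sample above, this reports `en`, `fr`, and `x-default` as duplicated; after the layout-level tags are removed, the list should be empty.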

10. Top 20 issues + status

#	Issue	Severity	Status
1	Duplicate hreflang tags on every HTML page	High	Fixed
2	`closely-associated-insiders` EN blog missing from sitemap	Medium	Open (DB seed)
3	`/blog/?page=N` not in sitemap	Medium	Fixed
4	`/companies/by-market/[m]/?page=N` not in sitemap	Medium	Fixed
5	`/insiders/by-market/[m]/?page=N` not in sitemap	Medium	Fixed
6	`/insiders/by-role/[r]/?page=N` not in sitemap	Low	Deferred (needs schema change)
7	`/companies/page/125/` transient 500	Low	Monitor (retry 200)
8	robots.txt sitemap URL points to `BASE` (Vercel preview when env unset)	Low	Already handled by env override
9	Non-LIVE market pages reachable but noindex	None	By design, documented
10	Hreflang attribute uses React `hrefLang` (camelCase)	None	HTML attribute names are case-insensitive
11	No `lastmod` on per-blog entries when only `publishedAt` exists	None	Falls back via `toDate`
12	Listing pages all carry `CollectionPage` + `BreadcrumbList`	None	OK
13	Layout JSON-LD uses `@graph` (Org + WebSite)	None	OK
14	Internal link density >30 on every sampled surface	None	OK
15	All blog articles carry `Article` schema with `Person` author	None	OK
16	AI scrapers (GPTBot, CCBot, anthropic-ai, Google-Extended) blocked	None	Already in robots.txt
17	Filter params (`?category=`) correctly noindex	None	OK
18	Audits sitemap (`sitemap-docs.xml`) covers `/docs/method-review/*`	None	OK
19	No duplicate titles across sample	None	OK
20	No `click here` / empty-context anchors	None	OK

11. Sitemap delta after this commit

Per-locale URLs added to sitemap-static/[lang]/index.xml:

/blog/?page=2..ceil(blogCount/12) → roughly 4 URLs per locale (5-page index).
/companies/by-market/[m]/?page=2..ceil(marketCompanyCount/50) for 14 LIVE markets.
/insiders/by-market/[m]/?page=2..ceil(marketInsiderCount/60) for 14 LIVE markets.

The total number of new URLs across both locales scales with per-market company and insider counts. With a page size of 50, the US alone (6759 companies) spans ceil(6759/50) = 136 pages, which adds 135 paginated company URLs per locale (136 including page 1, already in LANDING_PATHS).
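The per-listing counts follow from simple page arithmetic. As a worked instance, with n the number of items and s the page size (symbols introduced here for illustration), and assuming blogCount equals the 51 EN entries reported in the TL;DR:

```latex
\[
  \text{extra URLs per listing} = \max\!\left(0,\ \left\lceil \frac{n}{s} \right\rceil - 1\right)
\]
\[
  n = 51,\ s = 12:\quad \left\lceil \tfrac{51}{12} \right\rceil - 1 = 5 - 1 = 4
\]
```

The subtraction of 1 accounts for page 1, which is the bare listing URL and is already listed separately.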

12. Next recommendations

Seed closely-associated-insiders-the-hidden-alpha-signal EN row to production DB or update status to published.
Add Insider.primaryRoleBucket as a denormalised column so role pagination can be SQL-counted and enumerated in the sitemap.
Audit blog Article schema dateModified versus dateCreated: confirm Google sees fresh dateModified on backfills.
Emit a WebPage @type entry for non-CollectionPage static surfaces (/about/, /how-it-works/, /pricing/) to harden entity coverage.
Wire up an internal-links footer on every blog article that surfaces the three most relevant cross-articles (boosts orphan resistance for future content).
Backfill og:image per-route via the /api/og/blog/[slug] and /api/og/* routes already present, and confirm absolute URLs in Open Graph metadata.
Long-tail: emit a Sitemap: <url> directive in robots.txt for the canonical domain only (currently driven by NEXT_PUBLIC_BASE_URL env).
CWV: pull CrUX field data per top-10 surface (separate audit, skill seo-google).
Backlinks: pull referring-domains report (skill seo-backlinks).
AI overviews: validate Article schema citations for the top five blog posts via the GEO skill.
