84. SEO deep audit — 2026-05-19
Scope
Full SEO sweep of https://insiders-trades.com:
- Internal link mesh (depth-3 sample, 150 URLs)
- 4xx/5xx crawler over a sampled set
- Sitemap coverage versus indexable surfaces
- robots.txt and per-page meta robots
- Canonical correctness on paginated and faceted listings
- Structured data presence (JSON-LD)
- Title and meta description length and uniqueness
- Internal anchor descriptiveness
- Top fixes shipped in this commit
- Deferred recos
All counts captured from production HTML fetched at audit time.
TL;DR
| Surface | Before | After |
|---|---|---|
| Root sitemap index children | 119 | 119 (unchanged) |
| Blog EN entries | 51 | 51 (1 DB row missing, see #2 below) |
| Blog FR entries | 52 | 52 |
| Paginated /companies/page/N/ entries | 174 (EN) + 174 (FR) | unchanged (already covered) |
| Paginated /insiders/page/N/ entries | 732 (EN) + 732 (FR) | unchanged (already covered) |
| Paginated /blog/?page=N entries | 0 | +4 per locale (5 pages, page 1 is /blog/) |
| Paginated /companies/by-market/[m]/?page=N entries | 0 | +N per LIVE market per locale |
| Paginated /insiders/by-market/[m]/?page=N entries | 0 | +N per LIVE market per locale |
| Duplicate <link rel="alternate" hreflang> tags per HTML page | 2 (layout + page) | 1 (page only) |
| 4xx in 150-URL sample | 0 | 0 |
| 5xx in 150-URL sample | 1 transient (retry 200) | 0 |
1. Internal link mesh
Random 150-URL sample drawn from the union of sitemap-static/{en,fr}/index.xml,
sitemap-blog/{en,fr}/index.xml, and sitemap-docs.xml (2205 total URLs).
Outbound internal link counts on key surfaces:
| Page | Internal links |
|---|---|
| / (home) | 159 |
| /blog/ | 62 |
| /blog/closely-associated-insiders-the-hidden-alpha-signal/ | 46 |
| /methodologie/ | 41 |
| /companies/ | 164 |
| /companies/by-market/us/ | 90 |
| /insiders/by-role/ceo/ | 39 |
All sampled pages clear the target floor of 30 internal links. No orphan pages
detected in the sample. Sitemap surfaces with very low inbound link density
(e.g. legacy /sandbox/) are intentionally excluded from LANDING_PATHS.
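The per-page link counts above could be reproduced with something like the following. This is a minimal sketch, not the audit's actual tooling: `countInternalLinks` and `meetsFloor` are hypothetical helpers, and the regex-based `href` extraction is a simplification that assumes double-quoted attributes.

```typescript
// Origin taken from the audit scope.
const ORIGIN = "https://insiders-trades.com";
// Target floor quoted in the audit.
const TARGET_FLOOR = 30;

// Counts <a href> values that are root-relative or on the audited origin.
function countInternalLinks(html: string, origin: string = ORIGIN): number {
  const hrefPattern = /<a\b[^>]*\bhref\s*=\s*"([^"]*)"/gi;
  let count = 0;
  let match: RegExpExecArray | null;
  while ((match = hrefPattern.exec(html)) !== null) {
    const href = match[1] || "";
    // "/path" is internal; "//host/path" is protocol-relative and external.
    const isRootRelative = href.charAt(0) === "/" && href.charAt(1) !== "/";
    const isSameOrigin = href === origin || href.indexOf(origin + "/") === 0;
    if (isRootRelative || isSameOrigin) count += 1;
  }
  return count;
}

function meetsFloor(html: string): boolean {
  return countInternalLinks(html) >= TARGET_FLOOR;
}
```

A real crawl would likely use an HTML parser rather than a regex, and would deduplicate repeated links if the floor is meant to count distinct targets.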
2. Sitemap coverage gaps
2.1 EN blog row missing for closely-associated-insiders-the-hidden-alpha-signal
The article is published in FR (/fr/blog/... → 200, in sitemap-blog/fr).
The EN URL /blog/closely-associated-insiders-the-hidden-alpha-signal/ also
returns 200 and renders correctly, yet the EN sitemap (sitemap-blog/en)
omits it. Root cause: the BlogArticle row for locale=en is not present in
the production DB (the seed scripts/_seed-blog-closely-associated.ts
inserts both locales but was likely run before the EN section landed, or
the EN row has status != "published").
Action: not a code fix in this commit. Owner needs to re-run the seed
or update the prod row status to published. Logged here so the audit row
turns green automatically on next sitemap revalidate.
2.2 Paginated surfaces missing from sitemap (FIXED)
Three indexable, self-canonicalised paginated patterns were absent:
- /blog/?page=N: 5 pages exist, only page 1 was advertised.
- /companies/by-market/[market]/?page=N: now emitted for the 14 LIVE markets only.
- /insiders/by-market/[market]/?page=N: now emitted for the 14 LIVE markets only.
sitemap-static/[lang]/index.xml/route.ts now enumerates them, gated on
LIVE markets (mirror of LIVE_MARKETS in
src/app/companies/by-market/[market]/page.tsx) to avoid emitting URLs that
the page-level generateMetadata flags as robots: noindex.
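The enumeration described above can be sketched as follows. The route's actual code is not shown in this audit, so `paginatedUrls`, `marketCompanyUrls`, and `LIVE_MARKETS_SAMPLE` are hypothetical names; the page sizes (12 for blog, 50 for companies) and the `/fr` prefix are taken from elsewhere in this report.

```typescript
type Locale = "en" | "fr";

const BASE = "https://insiders-trades.com";

// Hypothetical one-entry subset; the real LIVE_MARKETS list has 14 entries.
const LIVE_MARKETS_SAMPLE: string[] = ["us"];

function localePrefix(locale: Locale): string {
  return locale === "fr" ? "/fr" : "";
}

// URLs for pages 2..ceil(count / pageSize); page 1 is the bare listing URL.
function paginatedUrls(locale: Locale, path: string, count: number, pageSize: number): string[] {
  const lastPage = Math.ceil(count / pageSize);
  const urls: string[] = [];
  for (let page = 2; page <= lastPage; page++) {
    urls.push(BASE + localePrefix(locale) + path + "?page=" + page);
  }
  return urls;
}

// Only markets in the LIVE list are emitted; counts for other markets are ignored.
function marketCompanyUrls(locale: Locale, counts: Record<string, number>): string[] {
  let urls: string[] = [];
  for (const market of LIVE_MARKETS_SAMPLE) {
    const count = counts[market] || 0;
    urls = urls.concat(paginatedUrls(locale, "/companies/by-market/" + market + "/", count, 50));
  }
  return urls;
}
```

With 51 blog entries and 12 per page, `paginatedUrls("en", "/blog/", 51, 12)` yields four URLs (pages 2 to 5), which matches the "+4 per locale" figure in the TL;DR.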
2.3 /insiders/by-role/[role]/?page=N deliberately deferred
Role bucketing happens in JS (normalizeRole over Insider.primaryRole
raw strings) and cannot be SQL-counted cheaply. Page 1 of each role
remains in LANDING_PATHS. Deeper pages stay reachable via the page-1 UI.
Future: bake primaryRoleBucket as a denormalised column at ingest time,
then enumerate pagination from the sitemap.
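One way the denormalised column could work is sketched below. The real `normalizeRole` rules are not shown in this audit, so `normalizeRoleSketch` and its bucket list are illustrative stand-ins, and `InsiderRow` is a hypothetical shape rather than the actual schema.

```typescript
type RoleBucket = "ceo" | "cfo" | "director" | "other";

// Hypothetical stand-in for normalizeRole; the real matching rules may differ.
function normalizeRoleSketch(rawRole: string): RoleBucket {
  const role = rawRole.toLowerCase();
  if (/\bceo\b|chief executive/.test(role)) return "ceo";
  if (/\bcfo\b|chief financial/.test(role)) return "cfo";
  if (/\bdirector\b/.test(role)) return "director";
  return "other";
}

interface InsiderRow {
  name: string;
  primaryRole: string;
  primaryRoleBucket?: RoleBucket;
}

// At ingest time, persist the bucket next to the raw role string.
function withBucket(row: InsiderRow): InsiderRow {
  return {
    name: row.name,
    primaryRole: row.primaryRole,
    primaryRoleBucket: normalizeRoleSketch(row.primaryRole),
  };
}

// In-memory equivalent of a GROUP BY primaryRoleBucket count.
function countByBucket(rows: InsiderRow[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const row of rows) {
    const bucket = row.primaryRoleBucket || "other";
    counts[bucket] = (counts[bucket] || 0) + 1;
  }
  return counts;
}
```

Once the bucket is stored, the per-role page count becomes a cheap aggregate and the sitemap can enumerate `?page=N` the same way it does for markets.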
3. Errors crawl
| Status | Count | Notes |
|---|---|---|
| 200 | 149 | |
| 500 | 1 | /companies/page/125/, transient: retry returned 200. |
No 4xx detected in the sample. No persistent 5xx.
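The transient versus persistent distinction used in the table can be expressed as a small classifier. This is a sketch under stated assumptions: `classifyAttempts` is a hypothetical helper, and the recorded statuses are assumed to be final statuses after redirects have been followed.

```typescript
type CrawlVerdict = "ok" | "transient" | "persistent-4xx" | "persistent-5xx";

// statuses: HTTP status of the first fetch followed by any retries, in order.
// Assumes redirects were followed, so only 2xx, 4xx, and 5xx appear here.
function classifyAttempts(statuses: number[]): CrawlVerdict {
  if (statuses.length === 0) throw new Error("at least one attempt is required");
  const first = statuses[0];
  const last = statuses[statuses.length - 1];
  const isOk = (status: number): boolean => status >= 200 && status < 300;
  if (isOk(first)) return "ok";
  // An error that clears on retry is reported as transient, not as a failure.
  if (isOk(last)) return "transient";
  return last >= 500 ? "persistent-5xx" : "persistent-4xx";
}
```

Under this scheme `/companies/page/125/` (500, then 200 on retry) is classified as `"transient"`.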
4. robots.txt + per-page meta robots
- /robots.txt returns 200. Sitemap declared. Allow list covers all canonical surfaces in both EN and FR.
- /admin/, /api/, /account/, /portfolio/, /recommendations/, /companies/add/, /_next/, and the legacy /en/ alias are disallowed.
- Major AI scrapers (GPTBot, CCBot, anthropic-ai, Google-Extended) disallowed.
- Per-page meta robots: non-LIVE markets (at, ie, kr, in) correctly emit robots: { index: false, follow: true } until they cross the declaration threshold. Filter params (?category=, ?market=) correctly demote to noindex,follow. Paginated URLs with ?page=N (N >= 2) remain indexable, with a self-canonical pointing back to the same paginated URL.
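The per-page rules above amount to a small decision function. The sketch below is illustrative only: `robotsFor` is a hypothetical name, and the two lists contain just the markets and filter params named in this audit, not the full production configuration.

```typescript
interface RobotsDirective {
  index: boolean;
  follow: boolean;
}

// Markets named in the audit as not yet LIVE.
const NON_LIVE_MARKETS: string[] = ["at", "ie", "kr", "in"];
// Filter params named in the audit.
const FILTER_PARAMS: string[] = ["category", "market"];

function robotsFor(market: string | null, searchParams: Record<string, string>): RobotsDirective {
  // Non-LIVE market pages stay crawlable but out of the index.
  if (market !== null && NON_LIVE_MARKETS.indexOf(market) !== -1) {
    return { index: false, follow: true };
  }
  // Any filter param demotes the page to noindex,follow.
  for (const param of FILTER_PARAMS) {
    if (Object.prototype.hasOwnProperty.call(searchParams, param)) {
      return { index: false, follow: true };
    }
  }
  // ?page=N on its own does not change indexability.
  return { index: true, follow: true };
}
```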
5. Canonical correctness
Sampled /blog/[slug]/, /methodologie/, /companies/, /companies/by-market/us/,
/insiders/by-role/ceo/, /: each emits a single self-canonical pointing to
the locale-appropriate URL. BASE is the production canonical
https://insiders-trades.com on every sampled page.
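A check along these lines could verify the "single self-canonical" property. It is a sketch, not the audit's tooling: `selfCanonical` and `hasSingleSelfCanonical` are hypothetical helpers, the `/fr` prefix and the bare page-1 URL follow the conventions visible elsewhere in this report, and the regex assumes double-quoted attributes.

```typescript
type Lang = "en" | "fr";

const CANONICAL_BASE = "https://insiders-trades.com";

// Expected canonical: locale prefix for FR, ?page=N only from page 2 onwards.
function selfCanonical(lang: Lang, path: string, page: number = 1): string {
  const prefix = lang === "fr" ? "/fr" : "";
  const query = page >= 2 ? "?page=" + page : "";
  return CANONICAL_BASE + prefix + path + query;
}

// True only when the HTML holds exactly one canonical tag with the expected href.
function hasSingleSelfCanonical(html: string, expected: string): boolean {
  const tags = html.match(/<link\b[^>]*\brel="canonical"[^>]*>/gi) || [];
  if (tags.length !== 1) return false;
  const href = /\bhref="([^"]*)"/i.exec(tags[0]);
  return href !== null && href[1] === expected;
}
```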
6. Structured data
Organization and WebSite (with SearchAction) emitted on every page via
the root layout. Blog articles carry Article plus Person-typed author with
sameAs. Listing pages (/companies/, /companies/by-market/[m]/,
/insiders/by-role/[r]/) carry CollectionPage plus BreadcrumbList.
Sample @type counts (lower bound, since one JSON-LD blob may hold a graph):
| Page | @type occurrences |
|---|---|
| / | 1 |
| /methodologie/ | 1 |
| /blog/closely-associated-... | 1 |
| /companies/ | 2 |
| /companies/by-market/us/ | 2 |
| /insiders/by-role/ceo/ | 2 |
Layout-level JSON-LD is a single @graph array, hence the value of 1 on
content pages that do not add their own collection schema.
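To move from this lower bound to an actual per-type count, the JSON-LD can be parsed and walked, descending into @graph arrays and nested nodes. The following is a minimal sketch; `countTypes` is a hypothetical helper that takes the raw contents of each `<script type="application/ld+json">` block.

```typescript
// Counts @type values across JSON-LD blobs, including nodes nested in @graph.
function countTypes(jsonLdBlobs: string[]): Record<string, number> {
  const counts: Record<string, number> = {};
  const visit = (node: unknown): void => {
    if (Array.isArray(node)) {
      node.forEach((child) => visit(child));
      return;
    }
    if (node === null || typeof node !== "object") return;
    const record = node as Record<string, unknown>;
    const type = record["@type"];
    // @type may be a single string or an array of strings.
    const types = Array.isArray(type) ? type : [type];
    for (const t of types) {
      if (typeof t === "string") counts[t] = (counts[t] || 0) + 1;
    }
    // Recurse into every property so nested nodes (author, potentialAction) count too.
    Object.keys(record).forEach((key) => visit(record[key]));
  };
  for (const blob of jsonLdBlobs) visit(JSON.parse(blob));
  return counts;
}
```

Applied to a layout blob with Organization and WebSite (with SearchAction) in one @graph, this reports three typed nodes where a per-blob count reports one.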
7. Titles and descriptions
Spot-checked seven pages — all within length budget, locale-correct, unique.
| Page | Title length | Description length |
|---|---|---|
| / | 72 | not sampled (rendered via metadata) |
| /blog/closely-associated-... | 73 | 159 |
| /methodologie/ | 49 | 153 |
| /companies/ | 56 | n/a |
| /companies/by-market/us/ | 60 | within budget |
| /insiders/by-role/ceo/ | 42 | within budget |
No duplicates detected across the sample.
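The length and uniqueness checks can be automated roughly as below. Note that the audit does not state the exact budgets it used: `MAX_TITLE` and `MAX_DESCRIPTION` are assumed values for illustration, and `auditMeta` is a hypothetical helper.

```typescript
interface PageMeta {
  url: string;
  title: string;
  description?: string;
}

// Assumed budgets; the audit does not state its exact thresholds.
const MAX_TITLE = 75;
const MAX_DESCRIPTION = 160;

// Returns one message per over-budget field or duplicated title.
function auditMeta(pages: PageMeta[]): string[] {
  const problems: string[] = [];
  const seenTitles: Record<string, string> = {};
  for (const page of pages) {
    if (page.title.length > MAX_TITLE) problems.push(page.url + ": title too long");
    if (page.description !== undefined && page.description.length > MAX_DESCRIPTION) {
      problems.push(page.url + ": description too long");
    }
    // Titles are compared case-insensitively after trimming.
    const key = page.title.trim().toLowerCase();
    if (Object.prototype.hasOwnProperty.call(seenTitles, key)) {
      problems.push(page.url + ": duplicate title of " + seenTitles[key]);
    } else {
      seenTitles[key] = page.url;
    }
  }
  return problems;
}
```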
8. Internal anchor text
Random pull from /blog/ listing and /companies/:
all anchors carry the article title or company name; no click here,
more, or read more empty-context anchors.
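A simple detector for the empty-context anchors named above might look like this. It is a sketch: `isGenericAnchor` and `genericAnchors` are hypothetical helpers, the phrase list contains only the three phrases this audit mentions, and the regex extraction is a simplification of real HTML parsing.

```typescript
// Phrases the audit names as empty-context anchors.
const GENERIC_ANCHORS: string[] = ["click here", "more", "read more"];

// Empty text (for example image-only links) is also treated as generic.
function isGenericAnchor(text: string): boolean {
  const normalized = text.replace(/\s+/g, " ").trim().toLowerCase();
  return normalized === "" || GENERIC_ANCHORS.indexOf(normalized) !== -1;
}

// Returns the visible text of every anchor that fails the descriptiveness check.
function genericAnchors(html: string): string[] {
  const pattern = /<a\b[^>]*>([\s\S]*?)<\/a>/gi;
  const found: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(html)) !== null) {
    // Strip nested tags so only the visible text is compared.
    const text = (match[1] || "").replace(/<[^>]+>/g, "");
    if (isGenericAnchor(text)) found.push(text.trim());
  }
  return found;
}
```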
9. Hreflang duplication (FIXED, critical)
Every HTML page in production emitted two identical sets of
<link rel="alternate" hreflang="..."> tags — one from src/app/layout.tsx
and one from each page's generateMetadata alternates.languages. Sample
on /:
<link rel="alternate" hrefLang="en" href="https://insiders-trades.com/"/>
<link rel="alternate" hrefLang="fr" href="https://insiders-trades.com/fr/"/>
<link rel="alternate" hrefLang="x-default" href="https://insiders-trades.com/"/>
<link rel="alternate" hrefLang="en" href="https://insiders-trades.com/"/>
<link rel="alternate" hrefLang="fr" href="https://insiders-trades.com/fr/"/>
<link rel="alternate" hrefLang="x-default" href="https://insiders-trades.com/"/>
Per Google guidance, duplicate hreflang tags are treated as a soft conflict
and may suppress the alternates entirely. The layout-level fallback was
removed; every page already declares its own alternates.languages.
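A regression check for this issue could count repeated (hreflang, href) pairs in the rendered HTML. The sketch below is an assumption about how such a check might be written, not the audit's tooling; `duplicateHreflangs` is a hypothetical helper and the regexes assume double-quoted attributes.

```typescript
// Returns "lang href" keys that appear more than once in the HTML.
function duplicateHreflangs(html: string): string[] {
  const tags = html.match(/<link\b[^>]*\brel="alternate"[^>]*>/gi) || [];
  const seen: Record<string, number> = {};
  for (const tag of tags) {
    // Case-insensitive so both hreflang and React's hrefLang spelling match.
    const lang = /\bhreflang="([^"]*)"/i.exec(tag);
    const href = /\bhref="([^"]*)"/i.exec(tag);
    if (lang === null || href === null) continue;
    const key = lang[1].toLowerCase() + " " + href[1];
    seen[key] = (seen[key] || 0) + 1;
  }
  return Object.keys(seen).filter((key) => seen[key] > 1);
}
```

Run against the pre-fix sample above, this reports three duplicated pairs (en, fr, x-default); after the layout-level tags are removed it should return an empty array.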
10. Top 20 issues + status
| # | Issue | Severity | Status |
|---|---|---|---|
| 1 | Duplicate hreflang tags on every HTML page | High | Fixed |
| 2 | closely-associated-insiders EN blog missing from sitemap | Medium | Open (DB seed) |
| 3 | /blog/?page=N not in sitemap | Medium | Fixed |
| 4 | /companies/by-market/[m]/?page=N not in sitemap | Medium | Fixed |
| 5 | /insiders/by-market/[m]/?page=N not in sitemap | Medium | Fixed |
| 6 | /insiders/by-role/[r]/?page=N not in sitemap | Low | Deferred (needs schema change) |
| 7 | /companies/page/125/ transient 500 | Low | Monitor (retry 200) |
| 8 | robots.txt sitemap URL points to BASE (vercel preview when env unset) | Low | Already handled by env override |
| 9 | Non-LIVE market pages reachable but noindex | None | By design, doc'd |
| 10 | Hreflang attribute uses React hrefLang (camelCase) | None | Browsers accept case-insensitively |
| 11 | No lastmod on per-blog entries when only publishedAt exists | None | Falls back via toDate |
| 12 | Listing pages all carry CollectionPage + BreadcrumbList | None | OK |
| 13 | Layout JSON-LD uses @graph (Org + WebSite) | None | OK |
| 14 | Internal link density >30 on every sampled surface | None | OK |
| 15 | All blog articles carry Article schema with Person author | None | OK |
| 16 | AI scrapers (GPTBot, CCBot, anthropic-ai, Google-Extended) blocked | None | Already in robots.txt |
| 17 | Filter params (?category=) correctly noindex | None | OK |
| 18 | Audits sitemap (sitemap-docs.xml) covers /docs/method-review/* | None | OK |
| 19 | No duplicate titles across sample | None | OK |
| 20 | No click here / empty-context anchors | None | OK |
11. Sitemap delta after this commit
Per-locale URLs added to sitemap-static/[lang]/index.xml:
- /blog/?page=2..ceil(blogCount/12) → roughly 4 URLs per locale (5-page index).
- /companies/by-market/[m]/?page=2..ceil(marketCompanyCount/50) for 14 LIVE markets.
- /insiders/by-market/[m]/?page=2..ceil(marketInsiderCount/60) for 14 LIVE markets.
The total number of new URLs across both locales scales with per-market company and insider counts. The US alone (6759 companies, 50 per page) spans ceil(6759/50) = 136 pages, which adds 135 paginated company URLs per locale (136 total including page 1, already in LANDING_PATHS).
12. Next recommendations
- Seed the closely-associated-insiders-the-hidden-alpha-signal EN row to the production DB or update its status to published.
- Add Insider.primaryRoleBucket as a denormalised column so role pagination can be SQL-counted and enumerated in the sitemap.
- Audit blog Article schema dateModified versus dateCreated: confirm Google sees fresh dateModified on backfills.
- Emit a WebPage @type entry for non-CollectionPage static surfaces (/about/, /how-it-works/, /pricing/) to harden entity coverage.
- Wire up an internal-links footer on every blog article that surfaces the three most relevant cross-articles (boosts orphan resistance for future content).
- Backfill og:image per-route via the /api/og/blog/[slug] and /api/og/* routes already present, and confirm absolute URLs in Open Graph metadata.
- Long-tail: emit a Sitemap: <url> directive in robots.txt for the canonical domain only (currently driven by NEXT_PUBLIC_BASE_URL env).
- CWV: pull CrUX field data per top-10 surface (separate audit, skill seo-google).
- Backlinks: pull referring-domains report (skill seo-backlinks).
- AI overviews: validate Article schema citations for the top five blog posts via the GEO skill.