Disclosure Statement, Sigma Backtest Claims
Status: published 2026-05-17.
Scope: every surface where STRATEGY_PROOF numbers (win rate, average return, Sharpe) are displayed: UI, emails, alerts, API, JSON-LD, MCP server.
Authority: this document is the canonical reference linked from /methodologie/#disclosure. Update here first; update copy on UI surfaces second.
1. What we publish, and what we measure
The headline figures on /methodologie, on the landing page (LandingSigma), in the daily and weekly digest emails, and via the public API at /api/v1/strategy/winning are the following:
| Headline metric | Published value | Source |
|---|---|---|
| Win rate | 77 % (Wilson 95 % CI ±6.3 pts) | STRATEGY_PROOF.winRate |
| Average annual return | +13.2 % | STRATEGY_PROOF.avgReturn |
| Cross-sectional Sharpe | 1.87* | STRATEGY_PROOF.sharpeCrossSectional |
| Annualized portfolio Sharpe (T=4, rf=3 %) | ≈ 0.40 | STRATEGY_PROOF.sharpeAnnualized |
| Deflated Sharpe (Bailey & López de Prado, N = 583,200 trials) | negative → exposed as null + dsrNote | STRATEGY_PROOF.sharpeDeflated |
| Sample n | 196 (filtered subset, AMF-anchored) | STRATEGY_PROOF.filteredSubsetSize |
| Universe N | 23,788 (full priced BUY universe, 17 markets, 2015–2026) | STRATEGY_PROOF.universeSize |
The published values are measured on a filtered subset of 196 trades, which represents 0.82 % of the full universe of insider buys (DIRIGEANTS, BUY, 17 markets, 2015–2026, 23,788 priced trades total). The subset is AMF-only today because non-AMF rows have no signalScore yet; see audits 25 §7 and 30 §9 for the multi-market scoring rollout plan.
The filter is the production Sigma strategy (docs/method-review/01-quant-challenge.md, six filters: cluster, conviction, role, mid-cap, sector momentum, score ≥ 40). It is not a retrospective cherry-pick; it is the portfolio we expose as live signals at /recommendations?mode=winning.
2. Sharpe ratios, three values, not one
We publish three Sharpe figures because they answer three different questions and disagree by an order of magnitude on this dataset. Citing only sharpe = 1.87 is misleading and we no longer do it without the qualifier.
2.1 Cross-sectional Sharpe (1.87), sharpeCrossSectional
Mean / σ of individual 90-day trade returns across the n = 173 retail-realistic universe. Not time-aggregated, not risk-free adjusted, not annualized. High by construction because the dispersion of individual trade returns is wide, while a real portfolio averages them. Useful as a feature-quality smell test, not as a portfolio metric.
When this number is surfaced in UI we mark it with an asterisk and the qualifier (cross-sectional) and we link to this document.
2.2 Annualized portfolio Sharpe (≈ 0.40), sharpeAnnualized
The proper portfolio Sharpe, computed from the four yearly aggregate returns (2022–2025) with a 3 % EUR risk-free rate:
mean(yearlyReturns) = 13.175 %
σ_sample(yearlyReturns) = 25.626 %
SR = (13.175 − 3.0) / 25.626 ≈ 0.397
This is the value to compare against other strategies' published Sharpes. The 95 % CI half-width on T = 4 yearly buckets is ≈ 1.5, so the point estimate is statistically thin on its own.
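The arithmetic above can be reproduced with a short sketch. The helper names (sampleStd, annualizedSharpe, sharpeFromMoments) are hypothetical, and because this document publishes only the mean and sample standard deviation of the four yearly returns, not the yearly values themselves, the published figure is recomputed from those two moments.

```typescript
// Sample standard deviation (n - 1 denominator), matching σ_sample in the text.
function sampleStd(values: number[]): number {
  const n = values.length;
  if (n < 2) throw new Error("need at least two observations");
  const mean = values.reduce((a, b) => a + b, 0) / n;
  const sumSq = values.reduce((acc, v) => acc + (v - mean) * (v - mean), 0);
  return Math.sqrt(sumSq / (n - 1));
}

// Sharpe ratio from a list of aggregate yearly returns, all in percent.
function annualizedSharpe(yearlyReturnsPct: number[], riskFreePct: number): number {
  const mean = yearlyReturnsPct.reduce((a, b) => a + b, 0) / yearlyReturnsPct.length;
  return (mean - riskFreePct) / sampleStd(yearlyReturnsPct);
}

// Same ratio computed directly from a published mean and σ_sample.
function sharpeFromMoments(meanPct: number, stdPct: number, riskFreePct: number): number {
  return (meanPct - riskFreePct) / stdPct;
}

// Values taken from the three lines above; evaluates to roughly 0.397.
const publishedAnnualizedSharpe = sharpeFromMoments(13.175, 25.626, 3.0);
```

Once the four yearly returns are available, annualizedSharpe(yearlyReturns, 3.0) should give the same result as the moment-based call.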
2.3 Deflated Sharpe (Bailey & López de Prado, 2014), sharpeDeflated
We searched 583,200 filter combinations (scripts/grid-search-v2.mjs) before selecting the published strategy. The Bailey & López de Prado conservative deflation:
SR_deflated ≈ SR_observed − √(2 · ln(N_trials) / T)
≈ 1.87 − √(2 · ln(583,200) / 4)
≈ 1.87 − 2.577
≈ −0.71
Negative. We surface null plus a human-readable dsrNote instead of clipping to zero. The empirical reading: out-of-sample Sharpe is expected to be at or below zero on this universe. The annualized 0.40 figure, deflated, also lands negative.
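A minimal sketch of the subtraction above and of the null-plus-note convention follows. It mirrors the simplified conservative form written in this section, not the full probabilistic Deflated Sharpe Ratio from the paper. The function names, the result shape, and the wording of the note are assumptions made for illustration.

```typescript
// Hypothetical result shape: only the field names sharpeDeflated and dsrNote come from the document.
interface DeflatedSharpeResult {
  sharpeDeflated: number | null; // null when the deflated value is negative
  dsrNote: string | null;
}

// Conservative deflation as written in the text: SR_observed - sqrt(2 * ln(N_trials) / T).
function deflateSharpe(srObserved: number, nTrials: number, tPeriods: number): number {
  return srObserved - Math.sqrt((2 * Math.log(nTrials)) / tPeriods);
}

// Surface null plus a note instead of clipping a negative value to zero.
function toPublishedDsr(srObserved: number, nTrials: number, tPeriods: number): DeflatedSharpeResult {
  const raw = deflateSharpe(srObserved, nTrials, tPeriods);
  if (raw < 0) {
    // The note wording is illustrative, not the production string.
    return {
      sharpeDeflated: null,
      dsrNote: `Deflated Sharpe is negative (${raw.toFixed(2)}) after ${nTrials} trials.`,
    };
  }
  return { sharpeDeflated: raw, dsrNote: null };
}

// Inputs from this section: SR = 1.87, N = 583,200 trials, T = 4 yearly buckets.
const publishedDsr = toPublishedDsr(1.87, 583200, 4);
```

Returning null rather than 0 keeps a consumer from reading "no evidence of edge" as "measured edge of exactly zero".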
3. Universe vs filtered subset, the table we always publish next to the numbers
| Metric | Filtered subset (published) | Full universe (re-derivation, 17 markets) | Δ |
|---|---|---|---|
| n | 196 | 23,788 | ×121 |
| Win rate | 77.0 % | 46.8 % | −30.2 pts |
| Avg return T+90 | +13.2 % | +0.78 % | −12.4 pts |
| Cross-sectional Sharpe | 1.87* | 0.027 | −1.84 |
Source: docs/method-review/30-backtest-final-17markets.md §2 + queried 2026-05-17 against the live BacktestResult table.
Per-market BUY-universe slice (T+90, queried 2026-05-17)
| Market | n | Mean T+90 % | WR % | Sharpe xs | t-stat |
|---|---|---|---|---|---|
| FR (AMF) | 16,101 | −0.25 | 44.3 | −0.008 | −1.02 |
| AFM (NL) | 4,330 | +2.61 | 55.2 | +0.126 | +8.28 |
| SEC (US) | 2,391 | +3.47 | 46.8 | +0.117 | +5.70 |
| FSMA (BE) | 668 | +1.23 | 50.0 | +0.084 | +2.18 |
| BAFIN (DE) | 192 | +3.76 | 49.0 | +0.184 | +2.55 |
| OSLO (NO) | 92 | +15.62 | 55.4 | +0.188 | +1.80 |
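
The t-stat column is numerically consistent with a one-sample t-statistic of the mean return against zero, which reduces to the cross-sectional Sharpe times √n. This document does not state how the column was computed, so the sketch below is an assumption checked only against the table; tStatFromSharpe is a hypothetical helper name.

```typescript
// One-sample t-statistic of the mean against zero:
// mean / (σ / √n) = (mean / σ) · √n = Sharpe_xs · √n.
function tStatFromSharpe(sharpeXs: number, n: number): number {
  if (n <= 0) throw new Error("n must be positive");
  return sharpeXs * Math.sqrt(n);
}

// Three rows copied from the per-market table above.
const perMarketRows = [
  { market: "FR (AMF)", n: 16101, sharpeXs: -0.008 },
  { market: "AFM (NL)", n: 4330, sharpeXs: 0.126 },
  { market: "BAFIN (DE)", n: 192, sharpeXs: 0.184 },
];

const perMarketTStats = perMarketRows.map((row) => ({
  market: row.market,
  tStat: tStatFromSharpe(row.sharpeXs, row.n),
}));
```

With the rounded Sharpe values from the table, this gives roughly −1.02, 8.29 and 2.55, against −1.02, +8.28 and +2.55 in the table. The small difference comes from rounding in the Sharpe column.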
3.1 Why we still publish the filtered figures
The Sigma filter is our production strategy, not a backtest gimmick. Publishing only the universe figures (46.8 %, +0.78 %) would be honest about the population but misleading about what we actually run. We therefore publish both side by side, with the explicit reminder that the filtered numbers should be read as upper bounds, not as an out-of-sample promise. Section 2.3 quantifies how much of the gap is attributable to selection bias.
4. Documented biases
4.1 Point-in-time (PIT) bias, pitBiasRisk
The mid-cap filter (one of the six Sigma filters) is currently evaluated against Company.marketCap, which is the current snapshot stored in the Company table, not the market cap at transactionDate. A company that has since moved out of the mid-cap bucket (in either direction) is misclassified relative to a PIT replay.
The CompanyMarketCapSnapshot table provides the data needed for a PIT re-run. Re-running the filter with PIT mcap is expected to narrow the published-vs-universe gap, because the current snapshot is biased towards survivors and post-rally mcap, which flatters the filtered figures.
Until this fix ships, every UI surface displaying the filtered figures carries the PIT disclosure (STRATEGY_PROOF.disclosure.pitBiasRisk).
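A point-in-time lookup could be sketched as below. The row shape, the function names, and the idea of passing the mid-cap bounds as parameters are all assumptions: this document gives neither the columns of CompanyMarketCapSnapshot nor the thresholds of the mid-cap bucket.

```typescript
// Hypothetical row shape: the real CompanyMarketCapSnapshot columns are not given here.
interface MarketCapSnapshotRow {
  companyId: string;
  asOf: string; // ISO date "YYYY-MM-DD", so string comparison orders chronologically
  marketCap: number;
}

// Latest snapshot at or before the transaction date, or null if none exists.
function marketCapAsOf(
  rows: MarketCapSnapshotRow[],
  companyId: string,
  transactionDate: string,
): number | null {
  let best: MarketCapSnapshotRow | null = null;
  for (const row of rows) {
    if (row.companyId !== companyId || row.asOf > transactionDate) continue;
    if (best === null || row.asOf > best.asOf) best = row;
  }
  return best === null ? null : best.marketCap;
}

// Mid-cap test against the PIT value; null means "no snapshot, cannot classify".
function isMidCapAsOf(
  rows: MarketCapSnapshotRow[],
  companyId: string,
  transactionDate: string,
  minCap: number,
  maxCap: number,
): boolean | null {
  const cap = marketCapAsOf(rows, companyId, transactionDate);
  if (cap === null) return null;
  return cap >= minCap && cap <= maxCap;
}
```

Returning null when no snapshot precedes the trade keeps such trades visible as unclassifiable, rather than silently falling back to the current Company.marketCap and reintroducing the bias.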
4.2 Survivorship bias, survivorshipBias
Companies delisted between 2015 and 2026 are absent from the Company table. Their trades are therefore absent from both the universe count (universeSize = 23,788) and the filtered subset. Winners are over-represented by construction because failed companies do not contribute their losing trades.
This is a population-level bias that no in-sample re-fit can correct. It inflates both the universe figures (46.8 %, +0.78 %) and the filtered figures (77 %, +13.2 %) in the same direction (towards over-statement).
4.3 Multiple-testing (data snooping) risk, multipleTestingRisk
The 583,200 grid-search combinations were evaluated on the same dataset. Even under the null hypothesis (zero edge), the best combination is expected to exhibit a positive in-sample Sharpe. The Bailey & López de Prado deflation (§2.3) quantifies this and lands negative on our dataset.
Walk-forward / out-of-sample evidence is tracked in docs/method-review/12-walk-forward.md. Until that document publishes a positive OOS Sharpe with statistical significance, the deflated Sharpe is the number that should be publicized.
5. Where this disclosure surfaces
- /methodologie/#disclosure, full chapter (ChDisclosure.tsx).
- Landing LandingSigma, Sharpe 1.87* with asterisk + link to /methodologie/#strategie.
- Methodology Ch08Strategie#sharpe-footnote, three-Sharpe disclosure card.
- StrategyProofHeadline component, used anywhere outside /methodologie that needs to render the headline trio.
- Daily & weekly digest emails, disclosure paragraph appended to the footer (digestFooterCta) and plain-text part.
- Welcome email, appended after the CTA.
- API /api/v1/strategy/winning, returns the full STRATEGY_PROOF object including disclosure, universeSize, filteredSubsetSize, sharpeAnnualized, sharpeDeflated, dsrNote.
- MCP server, historicalProof payload mirrors the API.
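For API and MCP consumers, the payload could be typed roughly as follows. The field names come from this document; the types, the percent units, the nesting of disclosure, and the placeholder strings are assumptions, and hasDisclosureFields is a hypothetical client-side guard rather than part of the API.

```typescript
// Field names come from this document; types and nesting are assumptions.
interface StrategyProofDisclosure {
  pitBiasRisk: string;
  survivorshipBias: string;
  multipleTestingRisk: string;
}

interface StrategyProof {
  winRate: number; // percent (assumed unit)
  avgReturn: number; // percent (assumed unit)
  sharpeCrossSectional: number;
  sharpeAnnualized: number;
  sharpeDeflated: number | null; // null when the deflated value is negative
  dsrNote: string | null;
  filteredSubsetSize: number;
  universeSize: number;
  disclosure: StrategyProofDisclosure;
}

// Headline values from section 1; the disclosure and note strings are placeholders.
const exampleProof: StrategyProof = {
  winRate: 77,
  avgReturn: 13.2,
  sharpeCrossSectional: 1.87,
  sharpeAnnualized: 0.4,
  sharpeDeflated: null,
  dsrNote: "placeholder note",
  filteredSubsetSize: 196,
  universeSize: 23788,
  disclosure: {
    pitBiasRisk: "placeholder",
    survivorshipBias: "placeholder",
    multipleTestingRisk: "placeholder",
  },
};

// Guard a client could run before rendering any headline number.
function hasDisclosureFields(payload: unknown): boolean {
  if (typeof payload !== "object" || payload === null) return false;
  const record = payload as Record<string, unknown>;
  const required = [
    "disclosure",
    "universeSize",
    "filteredSubsetSize",
    "sharpeAnnualized",
    "sharpeDeflated",
    "dsrNote",
  ];
  return required.every((key) => key in record);
}
```

The guard checks only that the keys are present, so a null sharpeDeflated still passes, which matches the null-plus-dsrNote convention of section 2.3.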
JSON-LD on the homepage and the methodology page describes the platform as a SoftwareApplication / Organization. It intentionally does not assert any Sharpe or return figure as a structured-data performance metric: structured data for performance metrics would carry a regulatory expectation we are not in a position to meet on this dataset.
6. Regulatory framing, CSA "fair, clear, and not misleading"
The disclosure model above aims at the Canadian Securities Administrators' "fair, clear, and not misleading" standard for performance-related communications (also aligned with ESMA Guidelines on marketing communications, 2022/EBA/04). The relevant operational rules we apply:
- Every Sharpe figure is qualified: no bare "Sharpe 1.87" appears on any public surface.
- The filtered-subset size (n) is published next to every aggregate metric.
- The universe-vs-subset delta is one click away from every metric.
- Past-performance disclaimer is appended to every email and on the methodology page.
- Multiple-testing risk is named: the phrase "data snooping" appears in the disclosure card on /methodologie/#disclosure.
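
The first rule in the list above lends itself to an automated copy check. The sketch below is a heuristic under stated assumptions: it treats a figure as qualified only when the number is immediately followed by an asterisk, it only catches the literal pattern "Sharpe <number>", and findBareSharpeMentions is a hypothetical helper, not an existing lint rule in the repository.

```typescript
// Flags "Sharpe 1.87" style mentions whose number is not immediately followed by an asterisk.
// The negative lookahead also blocks partial matches such as "Sharpe 1.8" inside "Sharpe 1.87*".
function findBareSharpeMentions(copy: string): string[] {
  const pattern = /Sharpe\s+-?\d+(?:\.\d+)?(?![\d*]|\.\d)/g;
  return copy.match(pattern) || [];
}

const qualifiedCopy = "Sharpe 1.87* (cross-sectional)";
const bareCopy = "Our Sharpe 1.87 speaks for itself.";

const qualifiedHits = findBareSharpeMentions(qualifiedCopy); // no hits expected
const bareHits = findBareSharpeMentions(bareCopy); // one hit expected
```

A check like this could run over email templates and UI strings in CI. Phrasings such as "Sharpe ratio of 1.87" would need additional patterns.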
This document is the single source of truth; copy on UI surfaces references back here.
7. Changelog
- 2026-05-17, initial publication (fixes C5 + C6 of docs/method-review/28-financial-coherence-audit.md). Added universeSize, filteredSubsetSize, disclosure to STRATEGY_PROOF. New ChDisclosure chapter on /methodologie. New StrategyProofHeadline component. Disclosure paragraphs added to daily, weekly, and welcome emails.