Replay-safe narrative verification for abnormal market moves.
Timestamped evidence, future-source quarantine, deterministic narrative ranking, and bundle-verified replay cases.
Watch the 45-second demo · Run locally · Architecture · Quality gates
NarrativeDesk is a research and education workbench for auditing generated market narratives. It is not the AI that tells people what to think about a stock. It is the verification layer that asks whether an explanation was correctly sourced, timestamp-valid, non-leaky, contradiction-aware, historically comparable, and eventually validated or falsified.
The product question is simple:
Did this explanation win using only what was knowable at the time?
NarrativeDesk ranks competing explanations under a replay lock. Sources after the cutoff are quarantined and cannot affect the winning narrative.
Watch the 45-second replay lock demo
| Layer | Current public state |
|---|---|
| Product surface | Browser workbench with six frozen real-curated replay cases. |
| Event coverage | Earnings/guidance, operational/product incident, regulatory/antitrust shock, and litigation settlement. |
| Replay integrity | Future-dated sources are blocked before event-time scoring and kept separate for validation. |
| Evidence model | Timestamped sources, contradiction links, source quality, clustering, citation QA, and exportable ledgers. |
| Evaluation | Deterministic ranking, baseline comparison, validation windows, corpus quality gate, and bundle verification. |
| Limits | Frozen historical research fixtures only; no live recommendations, brokerage actions, or investment advice. |
The public browser workbench opens six real-curated replay cases. Each case uses timestamped source provenance, replay-locked evidence, blocked future validation evidence, deterministic ranking, and bundle verification. NarrativeDesk ranks the selected explanation above competing narratives using only evidence available at the replay lock.
The workbench includes:
- Case cockpit with abnormal move, peer move, sector move, volume spike, and replay lock.
- Corpus tab with public case breadth, source depth, event-type coverage, bundle status, filtering, and sorting.
- Narrative verification bracket with ranked competing explanations and baseline comparison.
- Evidence, contradiction, citation QA, source reliability, and source clustering inspectors.
- Replay audit showing allowed sources and blocked future sources.
- Future validation panel kept separate from event-time replay evidence.
- Export links for ledger JSON and report Markdown.
The public shell uses frozen real-curated replay bundles, not live market data. Synthetic ORION/AURORA/LYRA fixtures remain in the repo for deterministic regression tests and examples.
- Open the AAPL Q2 2024 replay workbench.
- Read the event strip: AAPL is replayed against frozen market and benchmark bars.
- Check the replay lock: the event-time system is locked at `2024-05-03T10:00:00-04:00`.
- Inspect the audit: future source `SEC-027` is quarantined and removed from replay scoring.
- Compare the verification bracket: capital return reset ranks above Services mix resilience, hardware demand pressure, and Greater China pressure.
- Open the evidence inspector: sources include frozen market bars, SEC EDGAR, MacRumors, and Nasdaq / Business Wire.
- Switch to NVDA, NKE, CRWD, SAVE, or MMM to compare another real-curated replay with the same anti-leakage gates.
- Reveal validation: held-out future evidence supports the rank #1 narrative after the replay lock.
- Export the Markdown report or ledger JSON.
Historical market research can accidentally use information that was not available at the replay timestamp. That makes a system look smarter than it was.
NarrativeDesk treats the replay lock as a first-class constraint. Event-time replay filters out future-dated sources before scoring and ranking. Future validation is loaded separately and clearly labeled as after-the-fact.
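The filtering step described above can be illustrated with a minimal sketch. It is not the kernel's actual code: the `Source` class, its field names, and the choice to treat a source published exactly at the lock as allowed are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Source:
    # Hypothetical shape; the real ledger models live in src/narrativedesk/.
    source_id: str
    published_at: datetime  # expected to be timezone-aware


def partition_by_replay_lock(sources, replay_lock):
    """Split sources into (allowed, quarantined) around the replay lock."""
    if replay_lock.tzinfo is None:
        raise ValueError("replay lock must be timezone-aware")
    allowed, quarantined = [], []
    for source in sources:
        if source.published_at.tzinfo is None:
            raise ValueError(f"{source.source_id}: naive timestamp")
        # Aware datetimes compare by instant, so mixed UTC offsets are safe.
        # Assumption: a source published exactly at the lock is allowed.
        if source.published_at <= replay_lock:
            allowed.append(source)
        else:
            quarantined.append(source)
    return allowed, quarantined


lock = datetime.fromisoformat("2024-05-03T10:00:00-04:00")
sources = [
    Source("SRC-001", datetime.fromisoformat("2024-05-02T16:30:00-04:00")),
    Source("SRC-002", datetime.fromisoformat("2024-05-03T14:30:00+00:00")),
    Source("SRC-003", datetime.fromisoformat("2024-05-06T09:00:00-04:00")),
]
allowed, quarantined = partition_by_replay_lock(sources, lock)
```

Only `allowed` would be handed to scoring in this sketch; `quarantined` is kept for after-the-fact validation. Note that `SRC-002` is quarantined even though its date matches the lock date, because 14:30 UTC falls after 10:00 at UTC-4.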
The repo keeps the deterministic research kernel separate from the product surface.
- `src/narrativedesk/`: typed ledger models, replay filtering, scoring, validation helpers, pipeline, CLI, and report export.
- `apps/api/`: FastAPI service exposing event-ID-based endpoints around the kernel.
- `apps/web/`: Vite React workbench that renders kernel-generated demo artifacts.
- `data/fixtures/`: public real-curated replay bundles plus synthetic regression fixtures.
- `schemas/`: Narrative Ledger, Source Pack, Real Case Config, Validation Fixture, and Replay Bundle Manifest JSON schemas.
- `tests/`: unit and API tests.
The browser demo reads generated JSON from apps/web/public/demo/. Generate those artifacts from the Python kernel with make web-data. Replay bundles stay separate from future validation and evaluation bundles.
The generated example report lives at examples/sample_report.md. Bundled real-curated reports live under data/fixtures/real/*/report.md.
Implemented deterministically:
- Typed Narrative Ledger object.
- Event return, abnormal return, peer median return, sector return, and volume ratio from raw fixture bars.
- Replay-lock checks requiring market snapshot `timestamp` or `as_of` fields.
- Replay timestamp filtering.
- Narrative scoring, ranking, and audit checks.
- Historical analog selection from replay-safe narrative text, mechanism, event type, direction, and held-out validation labels.
- Evaluation checks for Narrative Recall@3, replay rank #1 validation, unsupported-claim penalty, and blocked future sources.
- Deterministic headline-baseline versus NarrativeDesk verification comparison.
- Deterministic ablation comparisons for evidence-only, no-contradiction-penalty, and quality-weighted selection.
- Citation QA checks for replay leakage, support coverage, provenance gaps, and low-quality evidence.
- Source reliability summaries by publisher and source type.
- Deterministic source clustering and derived originality scoring from replay-safe evidence.
- Source-pack registration with fixture integrity and real-curated content-hash checks.
- Real-data source-pack builder for Finnhub candles/news, frozen price CSVs, SEC EDGAR submissions/facts, local transcripts, and frozen estimate revisions.
- Ledger JSON export.
- Markdown report export.
- Replay-safe validation display.
- API endpoints for event, replay, ledger, report, and validation.
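The event metrics named in the list above can be sketched as follows. The README does not give the formulas, so the definitions here are assumptions: simple close-to-close returns, abnormal return as event return minus sector return, and volume ratio as event volume over a trailing average. The bar dictionaries and all numbers are made up for illustration.

```python
from statistics import median


def simple_return(prev_close, close):
    """Close-to-close simple return."""
    return close / prev_close - 1.0


def event_metrics(event, peers, sector, avg_volume):
    """Compute the event-strip metrics from raw bars (hypothetical shapes)."""
    event_ret = simple_return(event["prev_close"], event["close"])
    sector_ret = simple_return(sector["prev_close"], sector["close"])
    peer_median = median(
        simple_return(p["prev_close"], p["close"]) for p in peers
    )
    return {
        "event_return": event_ret,
        "peer_median_return": peer_median,
        "sector_return": sector_ret,
        # Assumption: abnormal return is measured against the sector bar.
        "abnormal_return": event_ret - sector_ret,
        # Assumption: ratio of event-day volume to a trailing average.
        "volume_ratio": event["volume"] / avg_volume,
    }


metrics = event_metrics(
    event={"prev_close": 100.0, "close": 106.0, "volume": 3_000_000},
    peers=[
        {"prev_close": 200.0, "close": 202.0},
        {"prev_close": 80.0, "close": 80.0},
        {"prev_close": 40.0, "close": 41.0},
    ],
    sector={"prev_close": 50.0, "close": 50.5},
    avg_volume=1_000_000,
)
```

With these inputs the event return is about 6%, the sector return about 1%, and the abnormal return about 5%, with volume at three times the assumed average.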
Future work:
- Provider adapters for transcript services, timestamped estimates, and analyst revisions.
- Optional AI agents for source-grounded hypothesis generation and contradiction generation; model output is never evidence by itself.
- Citation QA over larger real timestamped document sets.
- Multi-model arena and benchmark dataset.
Install JavaScript dependencies from the repo root:

    npm ci

Generate demo artifacts:

    npm run web:data

Preview and validate the synthetic source-pack example:

    PYTHONPATH=src python3 -m narrativedesk.cli source-pack-preview examples/source_pack_template.json

Assess whether a source pack is ready to ingest:

    PYTHONPATH=src python3 -m narrativedesk.cli source-pack-readiness .codex-work/real_source_pack.json

Create a self-contained replay bundle from a ready source pack:

    PYTHONPATH=src python3 -m narrativedesk.cli source-pack-bundle .codex-work/real_source_pack.json --out-dir .codex-work/real_case_bundle

Bundles include manifest.json with artifact hashes and replay-integrity metadata.
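To make the idea of artifact hashes concrete, here is a rough sketch of a hash check over a bundle directory. The manifest layout used here (an `artifacts` mapping of file name to SHA-256 hex digest) is an assumption for illustration; the actual schema is defined by the Replay Bundle Manifest JSON schema, and `bundle-verify` remains the supported check.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_file(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def find_hash_mismatches(bundle_dir):
    """List artifacts that are missing or whose hash differs from the manifest."""
    bundle_dir = Path(bundle_dir)
    manifest = json.loads((bundle_dir / "manifest.json").read_text(encoding="utf-8"))
    mismatches = []
    # Hypothetical layout: {"artifacts": {"<file name>": "<sha256 hex>"}}
    for name, expected in sorted(manifest["artifacts"].items()):
        artifact = bundle_dir / name
        if not artifact.is_file() or sha256_file(artifact) != expected:
            mismatches.append(name)
    return mismatches


# Build a throwaway bundle, verify it, then tamper with one artifact.
with tempfile.TemporaryDirectory() as tmp:
    bundle = Path(tmp)
    (bundle / "ledger.json").write_text('{"event_id": "EXAMPLE"}', encoding="utf-8")
    manifest = {"artifacts": {"ledger.json": sha256_file(bundle / "ledger.json")}}
    (bundle / "manifest.json").write_text(json.dumps(manifest), encoding="utf-8")
    clean = find_hash_mismatches(bundle)
    (bundle / "ledger.json").write_text('{"event_id": "TAMPERED"}', encoding="utf-8")
    tampered = find_hash_mismatches(bundle)
```

In this sketch an untouched bundle yields no mismatches, while editing any artifact after the manifest was written makes that artifact show up in the result.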
Verify a replay bundle before sharing or registering it:

    PYTHONPATH=src python3 -m narrativedesk.cli bundle-verify .codex-work/real_case_bundle

Build a real-curated source pack from provider data:

    PYTHONPATH=src python3 -m narrativedesk.cli real-pack-build \
      .codex-work/real_case_config.json \
      --out .codex-work/real_source_pack.json \
      --env-file .env.local

Or build and bundle in one step:

    PYTHONPATH=src python3 -m narrativedesk.cli real-pack-bundle \
      .codex-work/real_case_config.json \
      --out-dir .codex-work/real_case_bundle \
      --env-file .env.local

Start from examples/real_case_config_template.json, fill in a real ticker, event timestamp, peers, and curated narratives, then keep the working config in .codex-work/.
To rehearse live-provider ingestion without committing real claims, fetch raw provider data into scratch space, normalize it into strict source candidates, then draft a curator-ready config:

    npm run real-case:aapl:preflight
    npm run real-case:aapl:rehearse

Those scripts wrap the explicit CLI sequence below:

    PYTHONPATH=src python3 -m narrativedesk.cli real-data-env-check --providers finnhub,sec --env-file .env.local
    PYTHONPATH=src python3 -m narrativedesk.cli real-case-preflight \
      --ticker AAPL --event-date 2024-05-02 \
      --providers finnhub,sec --env-file .env.local \
      --fetch-dir .codex-work/live-fetches/aapl-2024-q2 \
      --draft-dir .codex-work/real-cases/aapl-2024-q2-rehearsal
    PYTHONPATH=src python3 -m narrativedesk.cli real-case-rehearse \
      --ticker AAPL --company-name "Apple Inc." \
      --event-type earnings --event-date 2024-05-02 \
      --from 2024-05-01 --to 2024-05-20 \
      --replay-lock 2024-05-03T10:00:00-04:00 \
      --providers finnhub,sec --include-sec-document-text \
      --env-file .env.local \
      --fetch-dir .codex-work/live-fetches/aapl-2024-q2 \
      --draft-dir .codex-work/real-cases/aapl-2024-q2-rehearsal

Live-provider rehearsal requires FINNHUB_API_KEY and SEC_USER_AGENT; NEWS_API_KEY is optional when using --providers newsapi. Outputs remain scratch until a human adds competing narratives, real-pack-build --require-narratives passes, and the final bundle verifies.
Optional Sonar discovery can scout for source URLs, but it is not evidence. Keep PERPLEXITY_API_KEY local, run real-source-discover into .codex-work/source-discovery/, then run real-source-freeze so NarrativeDesk refetches each URL, hashes page text, and applies the replay lock before any source candidate enters the real-case draft path.
For a scripted real-case path, generate a scratch workflow plan first. It writes only under .codex-work/ and treats model/search output as discovery, never evidence:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-workflow \
      --ticker SAVE --company-name "Spirit Airlines, Inc." \
      --event-type "regulatory/antitrust shock" --event-date 2024-01-16 \
      --from 2024-01-12 --to 2024-03-08 \
      --replay-lock 2024-01-16T16:10:00-05:00 \
      --providers finnhub,sec --env-file .env.local

The workflow writes workflow_status.json and workflow_commands.md with per-stage status, expected outputs, missing outputs, and the next runnable stage.
Promote a private bundle only after bundle verification and the public quality gate pass:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-promote \
      --bundle-dir .codex-work/real-cases/save-2024-regulatory-bundle \
      --public-slug save_2024_regulatory \
      --label "SAVE January 2024 JetBlue merger block real-curated replay"

If you have a frozen, timestamped market CSV from another trusted local source, pass it during draft repair with real-case-draft --market-bars path/to/market_bars.csv; the file is copied into the scratch draft and still goes through the normal readiness and bundle checks.
Before using a frozen price file, inspect it against the case replay lock:

    PYTHONPATH=src python3 -m narrativedesk.cli real-market-bars-check path/to/market_bars.csv \
      --ticker AAPL --replay-lock 2024-05-03T10:00:00-04:00

The rehearsal command also writes curated_narratives.template.json. After curation, apply a separate narrative JSON file without hand-editing source link arrays:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-apply-narratives \
      --draft-dir .codex-work/real-cases/aapl-2024-q2-rehearsal \
      --narratives .codex-work/real-cases/aapl-2024-q2-rehearsal/curated_narratives.template.json

Each curated narrative can include supporting_source_ids, contradicting_source_ids, future_supporting_source_ids, and future_contradicting_source_ids; these helper fields are used to link sources and are omitted from the written config. Replace all TBD values and add source links before applying the template.
To apply curation, build a source pack, write a replay bundle, and verify it in one scratch step:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-curated-bundle \
      --draft-dir .codex-work/real-cases/aapl-2024-q2-rehearsal \
      --narratives .codex-work/real-cases/aapl-2024-q2-rehearsal/curated_narratives.template.json \
      --out-dir .codex-work/real-cases/aapl-2024-q2-bundle

If a validation window has closed, pass a separate held-out fixture with --validation-fixture. It must match the case event ID and can reference only blocked-future source IDs.
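The two constraints on a held-out validation fixture can be sketched as a small check. The fixture shape used here (`event_id`, `rows`, `source_ids`) and the event IDs are hypothetical, chosen only to illustrate the rule; the real shape is defined by the Validation Fixture JSON schema.

```python
def validation_fixture_errors(fixture, case_event_id, blocked_future_ids):
    """Return a list of rule violations; an empty list means the fixture passes."""
    errors = []
    # Rule 1: the fixture must belong to the same case.
    if fixture.get("event_id") != case_event_id:
        errors.append(
            f"event ID mismatch: {fixture.get('event_id')!r} != {case_event_id!r}"
        )
    # Rule 2: every referenced source must already be quarantined as future evidence.
    for row in fixture.get("rows", []):
        for source_id in row.get("source_ids", []):
            if source_id not in blocked_future_ids:
                errors.append(f"{source_id} is not a blocked-future source")
    return errors


blocked = {"SEC-027"}  # the quarantined source ID from the AAPL walkthrough
good = {"event_id": "example-event", "rows": [{"source_ids": ["SEC-027"]}]}
leaky = {
    "event_id": "other-event",
    "rows": [{"source_ids": ["SEC-027", "NEWS-001"]}],
}

good_errors = validation_fixture_errors(good, "example-event", blocked)
leaky_errors = validation_fixture_errors(leaky, "example-event", blocked)
```

The second rule is what keeps validation from quietly pulling replay-time sources into the held-out set, or introducing sources that never passed through the replay audit.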
Check draft, curation, and bundle state at any point:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-status \
      --draft-dir .codex-work/real-cases/aapl-2024-q2-rehearsal \
      --narratives .codex-work/real-cases/aapl-2024-q2-rehearsal/curated_narratives.template.json \
      --bundle-dir .codex-work/real-cases/aapl-2024-q2-bundle

Before promoting a private real bundle, run the quality gate. It checks for a real-curated pack, 3-5 competing narratives, enough replay-time sources, blocked future evidence, contradiction links, and bundle integrity:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-quality \
      --bundle-dir .codex-work/real-cases/aapl-2024-q2-bundle

That gate is for private ingestion quality. Before treating a real case as demo/public-ready, use the stricter gate; it additionally requires peer-market context for abnormal returns, directly linked replay-time evidence, more than one source type/publisher, and at least one held-out validation outcome:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-quality \
      --bundle-dir .codex-work/real-cases/aapl-2024-q2-bundle \
      --require-demo-ready

For public promotion, run the final gate. It keeps provenance-only rehearsals private until replay-time narrative evidence includes non-SEC, non-market sources:

    PYTHONPATH=src python3 -m narrativedesk.cli real-case-quality \
      --bundle-dir .codex-work/real-cases/aapl-2024-q2-bundle \
      --require-public-ready

Inspect local prior-art repos for timestamped manual-source candidates:

    PYTHONPATH=src python3 scripts/inspect_prior_art.py --repo-root citadail=/path/to/citadail --repo-root mktmind-qtm=/path/to/mktmind-qtm --repo-root applecapital=/path/to/applecapital

Extract scratch sector market bars from the local MarketMind prior-art dataset:

    PYTHONPATH=src python3 scripts/extract_prior_art_market_bars.py --tickers XLK --from 2024-05-01 --to 2024-05-07

Check a real-curated config before fetching provider data:

    PYTHONPATH=src python3 -m narrativedesk.cli real-pack-check .codex-work/real_case_config.json --check-files

The real-data builder currently supports:
- Finnhub `stock/candle` for replay-safe event, peer, and sector bars.
- Local frozen CSV price files for replay-safe event, peer, and sector bars.
- Finnhub `company-news` for timestamped company news sources.
- SEC EDGAR `company_tickers`, submissions JSON, and optional filing document text.
- SEC EDGAR XBRL `companyfacts` for reported fundamental facts.
- Local transcript text files for source-backed earnings-call evidence.
- Frozen CSV estimate revisions for replay-time and future validation evidence.
Real-data configs are intentionally curator-led. Provide case_metadata, optional market_data, optional news, optional sec_filings, optional sec_facts, optional transcripts, optional estimate_revisions, optional manual_sources, and optional narratives. Run real-pack-build first, inspect the generated source pack, then add or curate narrative links before source-pack-ingest, which requires ingestion-ready narratives. For intraday replay locks, use intraday candles; daily bars and date-only CSV rows are rejected for pre-close locks unless explicitly marked as post-close safe.
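The daily-bar rule can be read as follows; this is one interpretation sketched under assumptions, not the builder's actual logic. It assumes a 16:00 regular-session close, a replay lock expressed in exchange-local time, that bars from earlier sessions are always usable, and that a curator's post-close-safe flag is trusted as given. The function and parameter names are hypothetical.

```python
from datetime import date, datetime, time

MARKET_CLOSE = time(16, 0)  # assumption: regular-session close, exchange-local


def daily_bar_allowed(bar_date, replay_lock, post_close_safe=False):
    """Decide whether a date-only bar may be used at the given replay lock."""
    lock_date = replay_lock.date()
    if bar_date < lock_date:
        return True   # assumption: an earlier session's close is already known
    if bar_date > lock_date:
        return False  # future bar, never replay-safe
    if replay_lock.time() >= MARKET_CLOSE:
        return True   # lock falls at or after the close of the same session
    # Pre-close lock on the same day: reject unless explicitly marked safe.
    return post_close_safe


aapl_lock = datetime.fromisoformat("2024-05-03T10:00:00-04:00")
save_lock = datetime.fromisoformat("2024-01-16T16:10:00-05:00")

prior_day = daily_bar_allowed(date(2024, 5, 2), aapl_lock)
same_day_pre_close = daily_bar_allowed(date(2024, 5, 3), aapl_lock)
same_day_marked = daily_bar_allowed(date(2024, 5, 3), aapl_lock, post_close_safe=True)
same_day_post_close = daily_bar_allowed(date(2024, 1, 16), save_lock)
```

The reason for the rule is that a daily bar for the lock date contains the closing price, which was not knowable at a 10:00 lock; intraday candles avoid the problem because each candle carries its own timestamp.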
Convert a complete source pack into a replay fixture:

    PYTHONPATH=src python3 -m narrativedesk.cli source-pack-ingest examples/source_pack_template.json --out .codex-work/event_fixture.json --validation-out .codex-work/validation_fixture.json

Validate the generated validation scaffold before registering it:

    PYTHONPATH=src python3 -m narrativedesk.cli validation-validate .codex-work/validation_fixture.json

Register those generated fixtures in a case index:

    PYTHONPATH=src python3 -m narrativedesk.cli case-index-register .codex-work/case_index_seed.json --event-fixture .codex-work/event_fixture.json --validation-fixture .codex-work/validation_fixture.json --label "EXMPL synthetic source-pack example" --out .codex-work/case_index.json

Validate a case index before evaluating it:

    PYTHONPATH=src python3 -m narrativedesk.cli case-index-validate .codex-work/case_index.json

Run the no-network real-data workflow smoke:

    npm run real-pack:smoke

Run the Python smoke export:

    make smoke

Run deterministic evaluation checks across the synthetic case index:

    make evaluate

Run the API locally after installing API dependencies:

    python3 -m pip install -e '.[api]'
    PYTHONPATH=src uvicorn apps.api.main:app --reload --port 8000

Run the browser workbench:

    npm run web:dev

Build the browser workbench:

    npm run web:build

Deploy the static workbench:

    npm run verify:release
    npx vercel --prod

The Vercel build uses npm run web:build and serves apps/web/dist. Do not deploy private .codex-work/ scratch outputs or local .env* files.
Run kernel and API tests:

    make test

Run a CLI smoke check:

    make smoke

If frontend dependencies are installed, build the browser product:

    npm run web:build

Run the browser smoke test:

    npm run web:smoke

Check that the registered public corpus clears the stricter product-quality gate. The gate currently requires six verified real-curated bundles, six tickers, four event types, blocked future evidence per case, clean provenance, and replay rank #1 validation:

    npm run public-corpus:quality

Run the release verification path, including real browser QA:

    npm run verify:release

If the environment cannot launch a browser, run the static artifact smoke:

    npm run web:smoke:static

- `GET /health`
- `GET /api/events`
- `GET /api/evaluations`
- `GET /api/events/{event_id}`
- `POST /api/events/{event_id}/run`
- `GET /api/events/{event_id}/ledger`
- `GET /api/events/{event_id}/report`
- `GET /api/events/{event_id}/report?include_validation=true`
- `GET /api/events/{event_id}/validation`
Replay, ledger, and default report endpoints do not include future validation data.
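As a small client-side illustration of that split, the sketch below builds the two report URLs from the endpoint list. The base URL assumes the local uvicorn command on port 8000; the helper names and the `example-event` ID are hypothetical, and the response format is not assumed beyond being text.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8000"  # assumption: API started locally on port 8000


def report_url(event_id, include_validation=False, base_url=BASE_URL):
    """Build the report URL; validation data is opt-in via a query parameter."""
    url = f"{base_url}/api/events/{quote(event_id, safe='')}/report"
    if include_validation:
        url += "?" + urlencode({"include_validation": "true"})
    return url


def fetch_text(url, timeout=10.0):
    """Fetch a URL and decode the body as UTF-8 (requires a running API)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# "example-event" is a placeholder; list real IDs with GET /api/events.
replay_only = report_url("example-event")
with_validation = report_url("example-event", include_validation=True)
```

With the API running, `fetch_text(replay_only)` would return the event-time report, and only the second URL asks for future validation data.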
- The public workbench defaults to a frozen real-curated AAPL replay bundle and includes five additional verified public cases; it is not live market data.
- Real-data builder output is not automatically an investment thesis; a human still curates source-to-narrative links and uncertainty.
- Scores are transparent heuristics, not learned truth labels.
- Validation rows remain separate from event-time replay and may include pending outcomes.
- The browser demo is a single workbench, not a live terminal.
- No investment recommendations, brokerage integration, or real-money trading exist.
- Add more real historical replay cases with timestamped citations.
- Add transcript, estimate-revision, and analyst-consensus adapters.
- Add optional agent generation grounded in structured source packs, with every generated claim tied back to source IDs.
- Expand case evaluation to T+5, T+20, and T+60 validation windows.
- Build a 100-event benchmark for Validated Narrative Rank@3.
NarrativeDesk is meant to test the layer before P&L: whether a generated market thesis is source-backed, replay-safe, checked against contradicting evidence, ranked against alternatives, and later validated or falsified.
NarrativeDesk is for research and education only. It is not investment advice, a trading system, a broker, or an automated investment adviser.
