Initial setup for the /loop auto-QA initiative on this repo.
Adds auto-qa/PROGRESS.md as the working ledger:
- Methodology + status snapshot
- Full PR ledger #1–#9 (entire repo history): 7 bug-fix, 1 feature,
1 infra. Each bug-fix has hypothesis + ideal test + tools needed.
- Tooling backlog ranked by leverage:
  1. GraphQL passthrough contract test (4/9)
  2. Unified-chart snapshot test (2/9)
  3. Path-prefix dual-form test (1/9, trivial first build)
  4. Field-semantics property walker (#9 + future)
- Open question recorded: pick node --test vs vitest. Default to
node --test for minimal surface.
No production code touched.
First test landed: auto-qa/tests/path-prefix.test.mjs covers PR #1
(the /charts path-prefix-strip middleware). Two cases, both green
against the live api.futarchy.fi:
✔ GET /charts/<path> ≡ GET /<path> (same JSON envelope)
✔ GET /charts/health ≡ GET /health (status 200 from both)
Test runner: node --test (built-in, zero deps).
Glob 'auto-qa/tests/**/*.test.mjs'. Run with: npm run auto-qa:test
The fixture proposal is GIP-150 v2 (0x1a0f209f…) over a pinned
historical time window for reproducibility. Tests skip cleanly when
api.futarchy.fi is unreachable so the suite stays non-flaky.
PROGRESS.md updated: status snapshot bumped (1 test landed-passing,
runner resolved), PR #1 row marked landed-passing, open question on
test framework removed.
No production code touched.
Adds auto-qa/tests/passthrough-contract.test.mjs: 7 cases covering
the GraphQL passthrough behavior fixed in 4 of this repo's 9 PRs:
PR #4: scalar pool: "0x…" filter on candles (chain prefix translated)
PR #7: scalar proposal: "0x…" filter (chain prefix translated)
PR #8: pool_in / proposal_in / id_in array filters
PR #9: periodStartUnix preserved as snapped boundary (asserts
ts % 3600 === 0 for every returned candle)
Cross: response IDs always come back without chain prefix
All 9 auto-qa tests now green (2 path-prefix + 7 passthrough-contract).
Strategy used: each test makes a real HTTP call to api.futarchy.fi
with a stable fixture (GIP-150 v2 = 0x1a0f209f…), checks both that
the call succeeds and that the documented invariant holds. This is
the highest-leverage suite in the auto-qa backlog (4/9 bugs covered
by a single file).
Tests skip cleanly if the live API is unreachable, so the auto-qa
suite stays non-flaky.
PROGRESS.md updated: status snapshot now 9 tests landed-passing,
PRs #4/#7/#8/#9 marked landed-passing in the ledger.
No production code touched.
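The two response-side invariants above (hour-snapped periodStartUnix and prefix-free IDs) can be sketched as a pure checker. This is a minimal illustration, not the test file's actual code: the helper name findCandleViolations and the candle field pool.id are assumptions; only the ts % 3600 rule and the chain-prefix pattern come from the commit messages.

```javascript
// Hypothetical helper; the real suite asserts these inside node --test cases.
const HOUR_SECONDS = 3600;
// Chain-prefixed form as documented for the adapter: "<chainId>-0x<40 hex>".
const CHAIN_PREFIXED = /^\d+-0x[a-fA-F0-9]{40}$/;

// Returns a list of human-readable violations; empty means both invariants hold.
function findCandleViolations(candles) {
  const violations = [];
  for (const candle of candles) {
    const ts = Number(candle.periodStartUnix);
    if (!Number.isInteger(ts) || ts % HOUR_SECONDS !== 0) {
      violations.push(`periodStartUnix ${candle.periodStartUnix} is not on a 3600s boundary`);
    }
    // `pool.id` is an assumed field name for illustration only.
    const poolId = candle.pool && candle.pool.id;
    if (typeof poolId === "string" && CHAIN_PREFIXED.test(poolId)) {
      violations.push(`pool id ${poolId} still carries a chain prefix`);
    }
  }
  return violations;
}

const sampleCandles = [
  { periodStartUnix: "1700002800", pool: { id: "0x" + "ab".repeat(20) } },
  { periodStartUnix: "1700003000", pool: { id: "100-0x" + "ab".repeat(20) } },
];
console.log(findCandleViolations(sampleCandles));
```

Returning a list instead of throwing on the first failure keeps the failure message informative when several candles regress at once.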
…ignment)
Adds auto-qa/tests/unified-chart.test.mjs: 4 cases against
GET /api/v2/proposals/:id/chart for the GIP-150 v2 fixture:
PR #5: both conditional_yes.price_usd and conditional_no.price_usd
are positive numbers (covers the CONDITIONAL > EXPECTED_VALUE >
PREDICTION fallback chain). Also asserts pool_id is a plain address
(not chain-prefixed).
PR #6: company_tokens.base.tokenSymbol is "GNO" and explicitly NOT
"PNK" (the legacy hardcoded fallback). currency.tokenSymbol or
stableSymbol is set.
Bonus: candles.yes / candles.no are non-empty arrays for the pinned
window, every periodStartUnix is snapped to a 3600s boundary, every
close parses to a positive number.
Bonus: volume reported in human units (< 1e15), not raw wei. If unit
normalization regresses, this fires.
All 13 auto-qa tests now green:
- 2 path-prefix.test.mjs (PR #1)
- 7 passthrough-contract.test.mjs (PRs #4/#7/#8/#9)
- 4 unified-chart.test.mjs (PRs #5/#6)
Coverage: 7 of 9 PRs in the repo's history, all of which are bug-fixes.
Remaining: PR #2 (rpc-proxy infra, would need mock upstream) and
PR #3 (passthrough scaffold feature, implicitly covered by all the
filter tests).
No production code touched.
Adds auto-qa/tests/multi-proposal-smoke.test.mjs: iterates over 3
diverse proposal fixtures and asserts the chart endpoint returns a
valid contract envelope for each:
- GIP-150 v2 (GNO/sDAI, fully indexed)
- TSLA Mega Package (TSLAon/USDS)
- Circle native USDC on Gnosis (USDC/sDAI)
Asserts shape only (HTTP 200 + envelope keys present + symbols are
non-empty strings + never the legacy "PNK" fallback). Data quality
(price > 0, candles non-empty) is the unified-chart.test.mjs job for
the canonical fixture.
All 16 api auto-qa tests now green:
- 2 path-prefix
- 7 passthrough-contract
- 4 unified-chart
- 3 multi-proposal-smoke ← NEW
Surfaced (and documented in PROGRESS.md, per /loop directive, NOT
fixed): TSLA Mega Package and CIP-82 both return zero prices and fall
through to the "TOKEN" default base symbol. Likely missing
CONDITIONAL pools; worth investigating in a real fix-pass.
Adds auto-qa/tests/spot-candles.test.mjs — 3 cases pinning the
contract of GET /api/v1/spot-candles, the third major endpoint
on this service that wasn't previously covered:
1. Missing `ticker` query param → 400 with `error` field.
2. Unknown ticker → 200 with `{ spotCandles: [] }` envelope.
3. When any candidate ticker returns data, every candle satisfies
the documented `{ periodStartUnix: <unix-ts>, close: <number> }`
shape.
Test #3 is intentionally tolerant: it tries a few candidate tickers
and skips (not fails) if none return data. The underlying source
(futarchy-spot or GeckoTerminal) varies and pinning a specific
ticker fixture would rot.
API smoke-coverage now spans all three surfaces:
/api/v2/proposals/:id/chart ← unified-chart.test.mjs + multi-proposal-smoke
/candles/graphql ← passthrough-contract.test.mjs (PRs #4/#7/#8/#9)
/api/v1/spot-candles ← NEW
Test counts: 19 (18 passing, 1 skipped — the shape check, by design).
No production code touched.
Adds auto-qa/tests/indexer-freshness.test.mjs — 3 cases that compare
each Checkpoint indexer's head block to the live Gnosis chain tip
and assert the lag stays under a threshold:
candles (gnosis): < 5000 blocks (~7h @ 5s/block)
registry: < 15000 blocks (~21h — registry runs further
behind in normal ops)
Plus a sanity case that both heads are positive numbers > 30M (real
Gnosis blocks).
Snapshot at iteration creation:
- candles gnosis: 870 blocks behind (healthy)
- registry: 7916 blocks behind (degraded but within threshold)
This is a real ops invariant — earlier in this session we caught the
registry indexer drifting to ~5000 blocks behind via the status page.
Now there's a test for that. Skips gracefully if API or Gnosis RPC
unreachable.
All 22 api auto-qa tests now pass (21 pass + 1 skipped by design).
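The lag check described above reduces to a subtraction and a per-indexer threshold. A minimal sketch, assuming the two head blocks and the chain tip have already been fetched; the function name evaluateLag and the result shape are hypothetical, while the thresholds and the 5s block time are the ones quoted in the message.

```javascript
// Thresholds from the commit message; the object and function names are hypothetical.
const MAX_LAG_BLOCKS = { candles: 5000, registry: 15000 };
const GNOSIS_BLOCK_SECONDS = 5;

// Compares an indexer head to the chain tip and reports whether the lag is acceptable.
function evaluateLag(indexer, chainTip, indexerHead) {
  const lag = chainTip - indexerHead;
  const limit = MAX_LAG_BLOCKS[indexer];
  return {
    indexer,
    lag,
    approxHoursBehind: (lag * GNOSIS_BLOCK_SECONDS) / 3600,
    // An unknown indexer has no limit, so the comparison is false and it is reported as not ok.
    ok: lag < limit,
  };
}

// Illustrative chain tip; the lags mirror the snapshot recorded in the message.
const exampleTip = 40_000_000;
console.log(evaluateLag("candles", exampleTip, exampleTip - 870));
console.log(evaluateLag("registry", exampleTip, exampleTip - 7916));
```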
Adds auto-qa/tests/registry-org-shape.test.mjs — 4 cases that validate
the Organization entity shape returned by the registry indexer:
✔ at least one org returned (catastrophic-empty guard — catches
full table wipe / resync failure that would empty the Companies
page upstream of the frontend)
✔ every org has {id, name, owner, metadata}
✔ every non-null metadata is parseable JSON
✔ at least one org metadata uses archived/visibility flags
(PR #61 filter coverage diagnostic — emits counts so we know the
filter is being exercised by real data)
Cross-cutting catch for the bug family that landed as `interface`
PR #61 (Companies page rendering empty after Checkpoint migration).
If the registry indexer ever returns zero orgs again or the
metadata field changes shape, this test fires before users see
the symptom.
Diagnostic at iteration time: 3/7 orgs use archived flag — the
filter IS exercised by real data.
All 26 api auto-qa tests now pass (25 pass + 1 skipped by design).
Adds auto-qa/tests/legacy-v1-prices.test.mjs — 6 cases covering an
API surface that wasn't tested before:
GET /api/v1/market-events/proposals/:id/prices
(the predecessor to /api/v2/.../chart, still used by some clients)
Cases:
✔ HTTP 200 + documented envelope keys
✔ conditional_yes/no expose price_usd + pool_id (positive number,
plain address)
✔ company_tokens.base.tokenSymbol resolves (not legacy "PNK" fallback)
✔ v1 vs v2 cross-check: same base symbol + same YES pool_id
(catches the v1 and v2 paths drifting apart)
✔ response time < 5s for v1 (perf bound — catches warmer regression)
✔ response time < 5s for v2 (perf bound)
The cross-check is the most leveraged test — both endpoints serve
the same underlying market, so any logic-divergence between the
v1 and v2 codepaths trips this test before users see inconsistencies.
All 32 api auto-qa tests now pass (31 pass + 1 skipped by design).
Adds auto-qa/tests/operational-endpoints.test.mjs — 3 cases pinning
the contract of the two operational endpoints not previously covered:
✔ /health returns 200 with {status, timestamp} and timestamp is fresh
(catches edge-cache pinning that would freeze liveness checks)
✔ /warmer returns {active, entries[]} with consistent counts
✔ /health timestamp advances between consecutive calls (1.5s apart)
(catches edge cache regression on the health endpoint)
These are the surfaces status.futarchy.fi and any uptime monitor
depend on — if /health goes stale or /warmer crashes silently, this
test surfaces it before users notice a stale dashboard.
All 35 api auto-qa tests now pass (34 + 1 skipped by design).
Pins the GraphQL passthrough surface itself, independent of any
user-defined schema. Catches failures in a layer that the existing
schema-shape tests would only report indirectly, with a more
confusing error message:
- Cloud Run revision shipped without the route mounted
- Upstream Checkpoint indexer entirely unreachable
- HTTPS termination broken
- Reverse-proxy stripping the request body
- Error envelope shape changes that break clients branching on
`response.errors[0].message`
Coverage:
- { __typename } returns "Query" on both endpoints
- { __schema { queryType { name } } } returns a non-empty type name
- Malformed query yields { errors: [{ message: <string> }] } body
- GET is rejected (POST-only surface, must NOT return 200)
- 12 parallel introspections complete cleanly (no shared mutex)
Surfaced inconsistency (pinned, NOT fixed per directive):
/candles/graphql returns HTTP 502 on parse errors
/registry/graphql returns HTTP 400 on the same input
502 misclassifies a client error as server failure. Pinned in
PARSE_ERROR_STATUS so a deliberate unification surfaces as a
test update.
PR coverage: 7/9 -> 8/9 (only #2 RPC infra remains).
Tests: 35 -> 46 (api), 103 -> 114 cross-repo. All green.
Pins the cross-origin contract every browser-side caller depends on:
the frontend at futarchy.fi, staging frontends, Apollo Client, and the
Snapshot widget at snapshot.box. Not tied to any single PR — defensive
against a class of regressions:
- cors() middleware accidentally dropped from a route
- Stricter origin allowlist excludes futarchy.fi
- Apollo-Require-Preflight no longer in allow-headers
- X-Cache / X-Response-Time stops being exposed (silently zeros
the frontend's cache-hit instrumentation)
Coverage:
- 5 endpoints × 3 representative origins (futarchy.fi, staging,
snapshot.box) preflight matrix
- POST + REST GET responses also carry CORS headers (not just
preflight)
- Allow-Headers includes Content-Type AND Apollo-Require-Preflight
- Allow-Methods includes POST for both passthroughs
- Expose-Headers includes X-Cache and X-Response-Time
- Pinned-policy ratchet: today's policy is wildcard origin, test
fires loudly if we tighten so REPRESENTATIVE_ORIGINS gets updated
Tests: 46 -> 67 (api), 125 -> 146 cross-repo. All green.
Pins the X-Cache observability instrumentation on the chart endpoint
(/api/v2/proposals/:id/chart) — the only hot read path with a cache.
Catches a class of regressions that otherwise only surface as latency
slowly rising in production:
- Cache layer silently disabled (X-Cache header missing → frontend
cache-hit dashboard goes blind, no obvious user impact until p99
latency rises)
- X-Cache returns garbage instead of literal HIT or MISS
- X-Cache-TTL drifts to 0 (every call cold) or unbounded (stale data)
- X-Response-Time format breaks (frontend dashboard math goes NaN)
- Cache key includes a non-deterministic component (back-to-back
requests both MISS, throughput collapses)
- HIT path silently degraded (HIT requests no longer < 100ms)
Today's measurements: TTL=13s, HIT=0ms, MISS≈30-160ms. Test loosens
those bounds to defensible ceilings (TTL <= 24h, HIT <= 100ms).
Tests: 67 → 73 (api), 152 → 158 cross-repo. All green.
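The header checks listed above can be sketched as one validator over a plain header map. This is an illustration under assumptions: the function name checkCacheHeaders is hypothetical, X-Cache-TTL is assumed to be expressed in seconds, and X-Response-Time is assumed to start with a millisecond number (for example "12ms"); neither format is confirmed by the commit message.

```javascript
const MAX_TTL_SECONDS = 24 * 60 * 60; // ceiling quoted in the message (TTL <= 24h)

// `headers` is a plain object with lowercase keys; returns a list of problems.
function checkCacheHeaders(headers) {
  const problems = [];

  const xCache = headers["x-cache"];
  if (xCache !== "HIT" && xCache !== "MISS") {
    problems.push(`X-Cache must be the literal HIT or MISS, got ${JSON.stringify(xCache)}`);
  }

  // Assumption: TTL header is a number of seconds.
  const ttl = Number(headers["x-cache-ttl"]);
  if (!Number.isFinite(ttl) || ttl <= 0 || ttl > MAX_TTL_SECONDS) {
    problems.push(`X-Cache-TTL out of bounds: ${headers["x-cache-ttl"]}`);
  }

  // Assumption: response time starts with a numeric millisecond value.
  const elapsedMs = parseFloat(headers["x-response-time"]);
  if (!Number.isFinite(elapsedMs) || elapsedMs < 0) {
    problems.push(`X-Response-Time does not parse: ${headers["x-response-time"]}`);
  }

  return problems;
}

console.log(checkCacheHeaders({ "x-cache": "HIT", "x-cache-ttl": "13", "x-response-time": "0ms" }));
console.log(checkCacheHeaders({ "x-cache": "BYPASS", "x-cache-ttl": "0" }));
```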
Pins the boundary semantics of GET /api/v2/proposals/:id/chart around
its minTimestamp / maxTimestamp params. Catches a class of quiet bugs
where the endpoint returns data outside the requested window — the
chart silently shows wrong candles with no visible symptom unless
someone manually inspects timestamps.
Coverage:
Degenerate-window graceful handling (must NOT 5xx):
- inverted window (max < min) → 200 + empty candles
- far-future window → 200 + empty candles
- far-past window → 200 + empty candles
- missing both timestamps → 200 + default window applied
- negative timestamps → 200 (defensive)
Window-respect invariants on a known-good window:
- every returned candle satisfies min <= periodStartUnix <= max
- candles strictly ascending by periodStartUnix in each series
- shape contract: {periodStartUnix, close} both parse as numbers
- 1-second window between known candles returns at most 1 per series
(period-snapping invariant from api PR #9)
Defensive against:
- Window predicate flipped (>= ↔ <=) in the indexer query
- Sort order inverted in a refactor
- Default-window logic returning unbounded data on missing params
- Inverted/future/past windows crashing the Checkpoint passthrough
Tests: 73 → 82 (api), 174 → 183 cross-repo. All green.
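The window-respect invariants (in-window, strictly ascending, numeric shape) can be sketched as a single pass over one series. The helper names parseNumeric and checkSeries are hypothetical; the candle fields periodStartUnix and close are the ones named in the coverage list.

```javascript
// Treats null/undefined/"" as non-numeric instead of letting Number() coerce them to 0.
function parseNumeric(value) {
  if (value === null || value === undefined || value === "") return NaN;
  return Number(value);
}

// Checks one candle series against the requested [min, max] window.
function checkSeries(candles, min, max) {
  const problems = [];
  let previous = -Infinity;
  for (const candle of candles) {
    const ts = parseNumeric(candle.periodStartUnix);
    const close = parseNumeric(candle.close);
    if (!Number.isFinite(ts) || !Number.isFinite(close)) {
      problems.push("candle does not match the {periodStartUnix, close} numeric shape");
      continue;
    }
    if (ts < min || ts > max) problems.push(`candle ${ts} is outside [${min}, ${max}]`);
    if (ts <= previous) problems.push(`candle ${ts} is not strictly after ${previous}`);
    previous = ts;
  }
  return problems;
}

const windowMin = 1700000000;
const windowMax = 1700010000;
console.log(checkSeries(
  [{ periodStartUnix: "1700002800", close: "0.52" }, { periodStartUnix: "1700006400", close: "0.55" }],
  windowMin,
  windowMax,
));
```

A flipped window predicate or an inverted sort in the query layer would show up here as a non-empty problem list rather than as a silently wrong chart.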
Pins POST /subgraphs/name/algebra-proposal-candles-v1, a
backward-compat shim that proxies to the same upstream as
/candles/graphql but also injects spotCandles: [] into the response.
Older clients (snapshot-labs/sx-monorepo, pre-Cloud-Run integrations)
still hit this URL pattern; removing it would silently 404 them, the
same class of bug as PR #1 (/charts prefix lost).
Coverage:
- POST { __typename } returns 200 + Query
- GET is rejected (POST-only surface)
- spotCandles injection invariant on the legacy route
- Negative confirmation: modern /candles/graphql does NOT inject
spotCandles (the two routes have distinct contracts; if both start
or both stop injecting, they've drifted)
- Cross-route parity: real candles(...) query returns same shape +
same row count from both routes
- Malformed query yields the standard errors[] envelope
API surface coverage: 3/4 → 4/4.
Tests: 82 → 88 (api), 188 → 194 cross-repo. All green.
Pins the type contract of /api/v2/proposals/:id/chart's market block.
The frontend branches on heterogeneous types — price_usd as number
(JSON-native) vs volume as string (preserves 18-decimal precision
from the indexer) — and any "normalization" refactor that homogenizes
either type breaks parsing.
Coverage:
Type heterogeneity (the subtle invariant):
- market.{conditional_yes,conditional_no}.price_usd → number, finite, > 0
- market.volume.{cy,cn}.volume / volume_usd → string, parses positive
- volume.{cy,cn}.status === "ok" for healthy fixture
Address shape:
- event_id == requested proposal id (exact lowercase match)
- trading_address == event_id (single-trade-address invariant)
- all pool_ids match /^0x[a-f0-9]{40}$/ (chain prefix stripped)
- YES pool_id ≠ NO pool_id (catches pool-resolution collapse)
Timeline + chain:
- timeline.start, end are integer unix ts in 2020-2050 range
- timeline.start <= timeline.end
- timeline.chain_id === 100 (Gnosis pin)
Tokens (sharper than existing PR #6 test):
- company_tokens.base.tokenSymbol non-empty AND not "TOKEN"
fallback (catches pool-resolution priority chain regression
on the canonical healthy fixture)
Tests: 88 → 99 (api), 213 → 224 cross-repo. All green.
Pins how /api/v2/proposals/:id/chart treats four input classes:
canonical lowercase, uppercase (clients sometimes send checksummed),
zero address, and garbage/path-traversal/oversized strings.
Today's behavior is permissive (every input → 200, with empty/fallback
data for non-existent proposals). The test pins that permissiveness
so a future input-validation patch surfaces as a deliberate API
change requiring client coordination.
Coverage:
Case-insensitive lookup (the most important invariant):
- uppercase request returns same data as lowercase (same pool_id,
same token symbol — catches case-sensitive lookup leak)
- event_id normalized to lowercase in response
Zero-address graceful degradation:
- 200 status, prices=0, "TOKEN" fallback symbol
Garbage-input safety (must NOT 5xx):
- non-hex string ("0xnotahexvalue") → 2xx
- too-short string ("shortaddr") → 2xx
- path-traversal payload ("../etc/passwd") → 2xx + JSON body
(defensive: must NOT pass through to upstream as a substring)
- very long id (502 chars) → < 500 status
Tests: 99 → 107 (api), 244 → 252 cross-repo. All green.
Pins the three pure helpers in src/adapters/candles-adapter.js that
underpin the entire Checkpoint passthrough translation:
stripChainPrefix(id) "100-0xabc" → "0xabc"
addChainPrefix(id, chainId=100) "0xabc" → "100-0xabc"
stripPrefixesAndNormalize(value) recursive walker over response objects
CHAIN_PREFIXED_RE /^\d+-0x[a-fA-F0-9]{40}$/
Every PR #4/#7/#8/#9 fix relied on these being correct. A regression
in any one of them returns wrong data for every passthrough query (or
worse: 200 with no data, which looks normal until users complain).
Coverage:
stripChainPrefix:
- "100-<addr>" → bare addr; works for chains 1, 137, etc
- bare addr → unchanged (idempotent)
- null/undefined/"" passed through
- composite IDs ("1-<addr>-3600-<ts>") strip only leading segment
addChainPrefix:
- bare addr → "100-<addr>" (default chain)
- custom chainId
- already-prefixed input NOT double-prefixed (critical idempotency)
- null/undefined/"" passed through
- round-trip: stripChainPrefix(addChainPrefix(addr)) === addr
CHAIN_PREFIXED_RE pattern:
- matches valid forms (mixed-case hex)
- rejects bare addrs, composite IDs, partial matches, non-hex,
wrong-length, leading/trailing extras
stripPrefixesAndNormalize walker:
- top-level object fields
- nested objects (recursion)
- arrays (preserves order + length)
- leaves non-matching strings intact (composite IDs, URLs,
numeric strings)
- handles primitives + nullish at any depth
- idempotent (run twice = same output)
Tests: 107 → 128 (api), 262 → 283 cross-repo. All green.
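The behaviors pinned for the three helpers can be reproduced by a short sketch. This is not the code in src/adapters/candles-adapter.js; it is a minimal reimplementation that satisfies the listed cases, and whatever "normalize" does beyond prefix stripping is not reproduced because the message does not describe it.

```javascript
// Pattern as quoted in the commit message.
const CHAIN_PREFIXED_RE = /^\d+-0x[a-fA-F0-9]{40}$/;

// "100-0xabc" -> "0xabc"; bare addresses and null/undefined/"" pass through.
function stripChainPrefix(id) {
  if (!id || typeof id !== "string") return id;
  return id.replace(/^\d+-(?=0x)/, ""); // only the leading "<chainId>-" segment
}

// "0xabc" -> "100-0xabc"; already-prefixed input is returned unchanged.
function addChainPrefix(id, chainId = 100) {
  if (!id || typeof id !== "string") return id;
  if (/^\d+-0x/.test(id)) return id;
  return `${chainId}-${id}`;
}

// Recursive walker: strips prefixes only from strings that match the full pattern,
// so composite IDs, URLs and numeric strings are left intact.
function stripPrefixesAndNormalize(value) {
  if (Array.isArray(value)) return value.map(stripPrefixesAndNormalize);
  if (value !== null && typeof value === "object") {
    const out = {};
    for (const [key, inner] of Object.entries(value)) out[key] = stripPrefixesAndNormalize(inner);
    return out;
  }
  if (typeof value === "string" && CHAIN_PREFIXED_RE.test(value)) return stripChainPrefix(value);
  return value;
}

const exampleAddr = "0x" + "ab".repeat(20);
console.log(stripPrefixesAndNormalize({
  id: `100-${exampleAddr}`,
  candles: [{ id: `100-${exampleAddr}-3600-1700002800`, pool: { id: `100-${exampleAddr}` } }],
}));
```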
Pins src/utils/token-from-pool.js, the pool-name → company/currency
symbol resolver that PR #6 fixed. The PR-#6 unified-chart test pins
the end-to-end "no PNK leak" property; this test pins the function
directly so a regression to the priority chain or pattern matching
surfaces with a clear message instead of a downstream "TOKEN"
fallback that could be mistaken for an indexer issue.
Coverage:
Empty / invalid:
- empty array, non-array, no-recognized-type all → both null
Each pool type, happy path:
- CONDITIONAL "YES_GNO / YES_sDAI" → company=GNO, currency=sDAI
- NO_ prefix on either side accepted
- EXPECTED_VALUE "YES_GNO / sDAI" → company=GNO, currency=sDAI
- PREDICTION degenerate symmetry "YES_sDAI / sDAI" → company=null,
currency=sDAI
Priority chain (heart of PR #6's fix):
- CONDITIONAL beats EXPECTED_VALUE even when EV is first in array
- EXPECTED_VALUE beats PREDICTION when CONDITIONAL absent
- falls all the way through to PREDICTION when no others present
Defensive:
- pools with no name field → skipped, fallback to next
- null/undefined entries in array → skipped (no throw)
- unrecognized name format → both null
- whitespace tolerance around the slash separator
- symbol \w class permits digits + underscores
Anti-PNK regression check (PR #6's whole point):
- none of the empty/unknown/malformed paths return "PNK"
Tests: 128 → 144 (api), 294 → 310 cross-repo. All green.
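The priority chain can be sketched as two nested loops: types in priority order on the outside, pools on the inside. This is a hedged reconstruction from the listed cases only, not the contents of src/utils/token-from-pool.js; the function name tokensFromPools, the pool fields type and name, and the exact regular expression are all assumptions.

```javascript
const POOL_TYPE_PRIORITY = ["CONDITIONAL", "EXPECTED_VALUE", "PREDICTION"];
// "<[YES_|NO_]SYMBOL> / <[YES_|NO_]SYMBOL>" with optional whitespace around the slash.
const POOL_NAME_RE = /^(?:(YES|NO)_)?(\w+)\s*\/\s*(?:(YES|NO)_)?(\w+)$/;

// Returns { company, currency }; both null when nothing usable is found.
function tokensFromPools(pools) {
  const empty = { company: null, currency: null };
  if (!Array.isArray(pools)) return empty;
  for (const type of POOL_TYPE_PRIORITY) {
    for (const pool of pools) {
      // Skip null entries, other pool types, and pools without a name.
      if (!pool || pool.type !== type || typeof pool.name !== "string") continue;
      const match = POOL_NAME_RE.exec(pool.name.trim());
      if (!match) continue;
      const [, , left, , right] = match;
      // PREDICTION pools pair a currency with itself, so no company symbol exists.
      return { company: type === "PREDICTION" ? null : left, currency: right };
    }
  }
  return empty;
}

console.log(tokensFromPools([
  { type: "EXPECTED_VALUE", name: "YES_AAA / sDAI" },
  { type: "CONDITIONAL", name: "YES_GNO / YES_sDAI" },
]));
```

Iterating by type first is what makes CONDITIONAL win even when an EXPECTED_VALUE pool appears earlier in the array.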
Pins src/utils/cache.js — the in-memory TTL cache backing the
response/registry/candles/spot caches. Subtle behaviors that
regressions can break silently:
- TTL expiry: get() must return undefined AND delete the entry
(not just lazy-skip)
- Hit/miss counters increment exactly once per get(); expired-
entry get counts as MISS
- clear() resets both store AND counters (not just store)
- set() resets the TTL clock for the key
Coverage:
get/set basics:
- set then get returns value
- get on missing key returns undefined
- set overwrites
Counter accuracy:
- hit increments hits, miss increments misses
- interleaved counts independent
TTL expiry (with 30-50ms TTLs for fast tests):
- entry expires after ttlMs
- expired-entry get is a MISS
- expired-entry get DELETES from store
- set() resets the TTL clock
stats() formatting:
- 0% with no calls
- integer percent rounding
- entry count from store.size
clear():
- empties store + resets both counters
cache-config defaults pinned via source-file regex:
- RESPONSE_TTL_SEC default = 13s
- REGISTRY_TTL_SEC default = 300s
- WARMER_INTERVAL_SEC = max(RESPONSE_TTL - 3, 5) formula
Tests: 144 → 161 (api), 357 → 374 cross-repo. All green.
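The subtle cache behaviors above (delete on expired get, expired get counted as a miss, clear resetting counters) fit in a small class. A minimal sketch, not the code in src/utils/cache.js: the class name TtlCache, the injectable clock, and the exact stats() output format are assumptions made so the example is deterministic.

```javascript
class TtlCache {
  // `now` is injectable so expiry can be demonstrated without real timers.
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.store = new Map();
    this.hits = 0;
    this.misses = 0;
  }

  // Overwrites the value and restarts the TTL clock for the key.
  set(key, value) {
    this.store.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key) {
    const entry = this.store.get(key);
    if (!entry) {
      this.misses += 1;
      return undefined;
    }
    if (this.now() >= entry.expiresAt) {
      this.store.delete(key); // expired entries are removed, not just skipped
      this.misses += 1;
      return undefined;
    }
    this.hits += 1;
    return entry.value;
  }

  clear() {
    this.store.clear();
    this.hits = 0;
    this.misses = 0;
  }

  // Assumed output shape: integer-rounded hit rate plus the entry count.
  stats() {
    const total = this.hits + this.misses;
    const hitRate = total === 0 ? "0%" : `${Math.round((this.hits / total) * 100)}%`;
    return { hits: this.hits, misses: this.misses, hitRate, entries: this.store.size };
  }
}

let demoClock = 0;
const demoCache = new TtlCache(50, () => demoClock);
demoCache.set("chart", { ok: true });
demoCache.get("chart"); // hit
demoClock = 60;
demoCache.get("chart"); // expired: counted as a miss and deleted
console.log(demoCache.stats());
```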
Pins src/config/endpoints.js — the single switch routing the api
between Graph Node (legacy AWS, dead) and Checkpoint (the target
after the AWS migration). A regression that flips the default mode
back to graph_node, removes the BROKEN_ footgun prefix, or drifts
localhost ports breaks adapter calls silently.
Coverage:
MODE handling:
- default is "checkpoint" (post-AWS-migration target)
- lowercased so case-insensitive env vars work
- allowlist is exactly graph_node + checkpoint
- unknown MODE warns AND falls through to GRAPH_NODE (the
pre-existing "if not exactly checkpoint then GRAPH_NODE" logic
— pinned so any "fix" is deliberate)
GRAPH_NODE footgun deterrent:
- registry + candles URLs both prefixed with
BROKEN_GRAPH_NODE_DO_NOT_USE:// (post-AWS-migration intentional
breakage so accidental routing fails fast with DNS error)
- legacy AWS CloudFront host pinned in the URL
CHECKPOINT defaults:
- registry localhost port = 3003 (Registry checkpoint per comment)
- candles localhost port is 3001 (prod) or 3004 (staging) per comment
- both read process.env.{REGISTRY_URL,CANDLES_URL} with fallback
Exports:
- ENDPOINTS, IS_CHECKPOINT, MODE all exported
- IS_CHECKPOINT defined as MODE === "checkpoint"
Tests: 161 → 172 (api), 385 → 396 cross-repo. All green.
Pins the LRU eviction + re-registration logic in src/utils/warmer.js's
registerForWarming function. The /warmer endpoint shape is covered by
operational-endpoints.test.mjs but the eviction policy itself is not
tested. A refactor that silently changes the eviction order or breaks
re-registration would cause:
- Re-registration treated as new entry → list bloat + churn
- LRU broken → unbounded growth past WARMER_MAX_ENTRIES
- registeredAt updated on re-register → retention windows reset
Coverage:
Initial registration:
- first call adds entry with params, lastSeen, registeredAt
- lastSeen === registeredAt on first registration
Re-registration of existing key:
- updates lastSeen but NOT size
- registeredAt is preserved (used for RETENTION_DAYS check)
- params are NOT updated (intentional: stale call shouldn't
overwrite params from initial registration)
LRU eviction at maxEntries:
- next registration after capacity evicts oldest by lastSeen
- re-registering an old entry protects it from eviction
- works at degenerate maxEntries=1
- 20 rapid registrations to a max-5 warmer → exactly 5 entries
Config defaults pinned via cache-config.js source regex:
- WARMER_MAX_ENTRIES = 50
- WARMER_RETENTION_DAYS = 7
- ENABLE_WARMER defaults to "true" (enabled)
Tests: 172 → 183 (api), 412 → 423 cross-repo. All green.
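The registration and eviction rules above can be sketched with a Map and a linear scan for the oldest lastSeen. This is an illustration of the pinned behavior, not the code in src/utils/warmer.js: the createWarmer wrapper and the injectable clock are assumptions; registerForWarming and the entry fields come from the message.

```javascript
// Hypothetical factory so each example gets its own entry list and clock.
function createWarmer(maxEntries, now = Date.now) {
  const entries = new Map();

  function registerForWarming(key, params) {
    const ts = now();
    const existing = entries.get(key);
    if (existing) {
      // Re-registration only refreshes lastSeen; params and registeredAt are preserved.
      existing.lastSeen = ts;
      return existing;
    }
    if (entries.size >= maxEntries) {
      // Evict the entry with the oldest lastSeen.
      let oldestKey = null;
      let oldestSeen = Infinity;
      for (const [candidateKey, entry] of entries) {
        if (entry.lastSeen < oldestSeen) {
          oldestSeen = entry.lastSeen;
          oldestKey = candidateKey;
        }
      }
      entries.delete(oldestKey);
    }
    const entry = { params, lastSeen: ts, registeredAt: ts };
    entries.set(key, entry);
    return entry;
  }

  return { registerForWarming, entries };
}

let demoTick = 0;
const demoWarmer = createWarmer(2, () => ++demoTick);
demoWarmer.registerForWarming("a", { id: "a" });
demoWarmer.registerForWarming("b", { id: "b" });
demoWarmer.registerForWarming("a", { id: "ignored" }); // protects "a" from eviction
demoWarmer.registerForWarming("c", { id: "c" }); // evicts "b"
console.log([...demoWarmer.entries.keys()]);
```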
Pins src/services/rate-provider.js — the ERC-4626 rate fetcher used
to convert sDAI rates into base prices throughout the api. Critical
constants where a typo silently returns wrong data:
GET_RATE_SELECTOR keccak256("getRate()")[:4] = 0x679aefce
Drift here → eth_call falls into the catch →
returns 1 (no-conversion fallback) silently
CHAIN_CONFIG[100].defaultRateProvider
The canonical sDAI rate provider on Gnosis
(0x89C8...EceD). Typo → every sDAI conversion
uses 1.0 instead of the real rate
18-decimal scaling Number(rateBigInt) / 1e18. Wrong divisor (1e6
for USDC) scales every rate by 1e12; TVL
dashboards explode
Coverage:
GET_RATE_SELECTOR pinned exact value + still referenced in eth_call payload
CHAIN_CONFIG[1] is Ethereum + null defaultRateProvider (pinned;
adding an Ethereum default surfaces as deliberate change)
CHAIN_CONFIG[100] is Gnosis + canonical sDAI address
CACHE_DURATION = 5 * 60 * 1000 (5 min — sweet spot vs RPC load)
All four error paths return 1 (the no-conversion fallback):
- unknown chain
- missing providerAddress
- RPC error in result
- thrown exception in catch
18-decimal scaling literal pinned
Tests: 183 → 195 (api), 434 → 446 cross-repo. All green.
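The 18-decimal scaling and its fallback can be illustrated in a few lines. A sketch only: the function name scaleRate and the hex-string input are assumptions about how an eth_call result would be handled; the Number(rateBigInt) / 1e18 expression and the fallback value 1 are the ones named in the message.

```javascript
const NO_CONVERSION_RATE = 1; // fallback returned on every error path

// Converts a hex-encoded uint256 eth_call result into a decimal rate.
function scaleRate(hexResult) {
  if (typeof hexResult !== "string" || !/^0x[0-9a-fA-F]+$/.test(hexResult)) {
    return NO_CONVERSION_RATE;
  }
  const rateBigInt = BigInt(hexResult);
  return Number(rateBigInt) / 1e18;
}

// 1.1 scaled by 1e18, encoded as the RPC would return it.
const exampleRateHex = "0x" + (11n * 10n ** 17n).toString(16);
console.log(scaleRate(exampleRateHex));
// Dividing by 1e6 instead would inflate the same value by a factor of 1e12.
console.log(Number(BigInt(exampleRateHex)) / 1e6);
console.log(scaleRate(undefined));
```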
Pins src/services/spot-source.js — the toggle that routes spot price
fetches between the futarchy-spot service and CoinGecko/GeckoTerminal.
Coverage:
Toggle:
- USE_FUTARCHY_SPOT default is empty-string (== falsy → use Gecko)
- .toLowerCase() applied so "TRUE"/"True" both work
- FUTARCHY_SPOT_URL default is http://localhost:3032
URL construction (futarchy-spot endpoint shape):
- calls /api/v1/candles?ticker=...&minTimestamp=...&maxTimestamp=...
- encodeURIComponent on ticker (ticker contains "+", "!", "/")
Reliability:
- 10s AbortSignal timeout
- non-OK response falls back to fetchFromGecko
- try/catch wraps the entire fetch with same fallback
Default-window math:
- minTimestamp = maxTs - (limit * 3600) [hours back, NOT days/min]
- Math.max(0, minTimestamp) clamps to no-negative-unix
- default limit = 500
Surfaced bug (NOT fixed per directive — pinned for ratchet):
src/services/spot-price.js has a hardcoded CoinGecko API key as
`process.env.COINGECKO_API_KEY || '<KEY>'` fallback. Leaked key
in source. Test pins existence (not value) so a removal surfaces
as deliberate fix and any new addition surfaces too.
Plus pinned DEFAULT_CONFIG ticker = 'PNK/WETH+!sDAI/WETH-hour-500-xdai'
Tests: 195 → 208 (api), 458 → 471 cross-repo. All green.
Pins src/services/spot-price.js's parseConfig — the parser that
decodes ticker config strings used throughout the spot-price chain.
Four formats supported:
1. composite::POOL1+POOL2::RATE-interval-limit-network
2. BASE/QUOTE+!OTHER/QUOTE-... (multi-hop, ! inverts)
3. 0xPOOL[::RATE]-interval-limit-network (direct address)
4. BASE[::RATE]/QUOTE-interval-limit-network (base/quote ticker)
Plus trailing -invert flag and URL auto-decoding.
Bug class this catches: a refactor that breaks the format
disambiguation order silently routes "PNK/WETH" through the wrong
branch, returning bad data with no obvious symptom.
Coverage:
Falsy input → null
Format 1 (composite):
- two pools + rate provider
- ! invert prefix on a hop
- missing rate provider
Format 2 (multi-hop):
- "A/B+C/D" parses two hops with base/quote split
- "!" prefix inverts hop, NOT included in base symbol
Format 3 (pool address):
- bare 0x pool, no rate
- "0xPOOL::0xRATE" extracts both
- case-insensitive 0x prefix detection (0X works)
Format 4 (base/quote):
- simple "GNO/sDAI"
- "BASE::RATE/QUOTE" extracts rate from base side
-invert flag:
- case-insensitive (INVERT, Invert, invert)
- stripped from parts before indexing
- default invert=false
Defaults:
- interval="hour", limit=500, network="xdai"
- partial parts use defaults for missing slots
URL decoding:
- auto-decodes when % present
- skips decode when no % (perf shortcut pinned)
Format disambiguation order:
- composite:: takes priority over + and 0x checks
- multi-hop + takes priority over pool-address branch
Tests: 208 → 228 (api), 492 → 512 cross-repo. All green.
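The format-disambiguation order is the part most easily broken by a refactor, so it is worth seeing in isolation. The sketch below only classifies a ticker string; it does not parse hops, rates, or the interval/limit/network slots, and the function name classifyTicker is hypothetical rather than part of src/services/spot-price.js.

```javascript
// Returns which of the four documented formats a ticker string belongs to.
function classifyTicker(raw) {
  if (!raw) return null; // falsy input → null
  // Decode only when a "%" is present (the perf shortcut described above).
  const decoded = raw.includes("%") ? decodeURIComponent(raw) : raw;
  if (decoded.startsWith("composite::")) return "composite"; // wins over + and 0x
  if (decoded.includes("+")) return "multi-hop"; // wins over the pool-address branch
  if (/^0x/i.test(decoded)) return "pool-address"; // case-insensitive prefix
  return "base-quote";
}

for (const ticker of [
  "composite::0xAAA+0xBBB::0xCCC-hour-500-xdai",
  "PNK/WETH+!sDAI/WETH-hour-500-xdai",
  "0Xabc-hour-500-xdai",
  "GNO/sDAI",
  "PNK%2FWETH",
]) {
  console.log(ticker, "=>", classifyTicker(ticker));
}
```

Reordering the first two checks would send a composite ticker through the multi-hop branch, since composite strings also contain "+".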
Pins src/adapters/registry-adapter.js — the on-chain registry lookup
that powers resolveProposalId (normalizes arbitrary proposal IDs to
canonical addresses) and lookupOrgMetadata.
Coverage:
normalizeProposalResult (pure shape-normalizer):
- proposalId + proposalAddress are lowercased
- originalProposalId preserves the input case (display in checksummed
form back to user)
- empty/null proposal yields all-undefined-or-null shape (no throw)
- organization fields extracted (id + name)
- 6 parseInt config fields (closeTimestamp, startCandleUnix,
twapStartTimestamp, twapDurationHours, chain, pricePrecision):
- valid string → integer
- missing → null (NOT 0, NOT NaN)
- empty string → null (the truthy-check; without it parseInt("")
yields NaN throughout downstream)
- 4 string fields with || null fallback (coingeckoTicker etc.)
Pinned canonical constants (three addresses + the RPC default):
- AGGREGATOR_ADDRESS = 0xc5eb43...4fc1 (case-insensitive match
with futarchy-fi/interface DEFAULT_AGGREGATOR — cross-pinned)
- SNAPSHOT_LINK_REGISTRY = 0xa6Bc28...0823
- FACTORY_ADDRESS = 0xa6cB18...0a345
- GNOSIS_RPC default = https://rpc.gnosischain.com
Tests: 228 → 248 (api), 530 → 550 cross-repo. All green.
Pins src/services/spot-price.js's combineHopCandles + NETWORK_MAP +
GECKO endpoint selection logic.
combineHopCandles is the core multi-hop price multiplier — given
candles for each hop in a multi-hop ticker (e.g. PNK/WETH × WETH/sDAI),
produces a single composite series by collecting all unique timestamps,
forward-filling missing prices per hop, and multiplying once ALL hops
have at least one known price. A regression here silently corrupts
every multi-hop spot price.
Coverage:
combineHopCandles:
- empty array → empty
- single-hop → identity (returns SAME array, not copy)
- two hops same timestamps multiply per-timestamp
- three hops multiply all together
- missing timestamp on one hop forward-fills from previous
- skips timestamps before ALL hops are initialized (warmup)
- output sorted by time ascending
- float precision preserved through multiplication
NETWORK_MAP:
- xdai alias (chainId 100, gecko "xdai")
- gnosis alias (synonym for xdai — both must route to chain 100)
- eth alias (chainId 1)
- base alias (chainId 8453)
- all RPC URLs are HTTPS
GECKO endpoint selection (key-conditional URL + headers):
- GECKO_API switches to pro-api.coingecko.com when key set
- Falls back to api.geckoterminal.com (public)
- GECKO_HEADERS adds 'x-cg-pro-api-key' when key set (pro-api requires it)
- Public-headers branch is just {accept} — defensive against leaking
pro key to public terminal endpoint
Tests: 248 → 265 (api), 568 → 585 cross-repo. All green.
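The combine step described above (union of timestamps, per-hop forward fill, multiply once every hop has a price) can be sketched as follows. This is an illustration, not the function in src/services/spot-price.js: the candle shape { time, value } is an assumption, since the message does not state the internal field names.

```javascript
// hops: array of candle arrays, one per hop; each candle is { time, value } (assumed shape).
function combineHopCandles(hops) {
  if (hops.length === 0) return [];
  if (hops.length === 1) return hops[0]; // identity: same array, not a copy

  // All unique timestamps across hops, ascending.
  const times = [...new Set(hops.flatMap((hop) => hop.map((c) => c.time)))].sort((a, b) => a - b);
  const lookups = hops.map((hop) => new Map(hop.map((c) => [c.time, c.value])));
  const lastKnown = new Array(hops.length).fill(null);

  const combined = [];
  for (const time of times) {
    // Forward fill: keep the previous price when a hop has no candle at this time.
    lookups.forEach((lookup, i) => {
      if (lookup.has(time)) lastKnown[i] = lookup.get(time);
    });
    // Warmup: skip until every hop has at least one known price.
    if (lastKnown.some((value) => value === null)) continue;
    combined.push({ time, value: lastKnown.reduce((acc, value) => acc * value, 1) });
  }
  return combined;
}

const hopA = [{ time: 1, value: 2 }, { time: 2, value: 3 }];
const hopB = [{ time: 2, value: 10 }, { time: 3, value: 20 }];
console.log(combineHopCandles([hopA, hopB]));
```

In this example the timestamp 1 is skipped because the second hop has no price yet, and timestamp 3 reuses the first hop's last price.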
iteration 29 (api side). New: auto-qa/tests/algebra-client.test.mjs
(17 cases).
Pins src/services/algebra-client.js — the LEGACY Graph Node-shaped
client still imported by unified-chart.js + market-events.js as the
non-Checkpoint fallback path. Five concerns:
1. ALGEBRA_ENDPOINT === ENDPOINTS.candles (env-driven, not hardcoded
URL). Plus a defensive scan asserting NO http(s):// strings live
outside comments.
2. fetchPoolsForProposal uses GraphQL VARIABLE binding ($proposalId:
String!) NOT inline interpolation — protects against query
injection. Variable type is String! (not BigInt!) — Graph Node
shape, not Checkpoint.
3. getLatestPrice period hardcoded to "3600" (1-hour candles) on
BOTH ternary branches. Drift would silently change chart
sampling rate.
4. getLatestPrice maxTimestamp param defaults to null (not undefined,
not 0); when null, _lte filter is omitted; when set, included.
A 0 default would query "everything <= 0" → zero rows.
5. Default-zero behavior: returns 0 (not null/NaN) when no candle
found — pinned because callers expect numeric. parseFloat (not
parseInt) on candle.close.
Plus pins for the orderBy/orderDirection invariant (must be
periodStartUnix DESC + first 1 to get LATEST, not earliest), the
Graph-Node-only nested selections (token0/token1/proposal), the
GraphQL error-throw guards on both functions, and the module
docstring's pointer to candles-adapter.js for mode-aware code.
Tests: 265 -> 282 (api). All 17 new cases passing.
iteration 30 (api side). New: auto-qa/tests/graphql-passthrough-factory.test.mjs
(20 cases).
First UNIT-level coverage for src/routes/graphql-passthrough.js — the
generic GraphQL passthrough factory wired into /registry/graphql and
/candles/graphql by src/index.js. Existing passthrough-smoke and
passthrough-contract tests exercise the live HTTP endpoint; this file
locks the factory's internal branching with mock req/res + a fetch
stub (no live network).
Branches pinned:
- Factory shape — returns an async (req, res) handler.
- 503 branch — getUpstreamUrl() returning null/undefined/"" emits
`{ errors: [{ message: '[label] upstream URL not configured' }] }`
AND short-circuits BEFORE calling fetch (prevents accidental
fetch("") on env-driven misconfig).
- Happy path — POST + Content-Type: application/json + AbortSignal,
upstream status code forwarded (probed at 200/201/400/502),
upstream content-type forwarded with 'application/json' fallback,
body forwarded VERBATIM (no JSON.parse re-stringify).
- req.body fallback — undefined → "{}", null → "{}" (the ?? operator
in `req.body ?? {}`). Catches a regression where the body becomes
the literal string "undefined".
- Error branches — AbortError → 504 ("[label] upstream timeout
after Nms"); other Error → 502 ("[label] upstream error: <msg>");
error without .message → "unknown" (defensive default).
Plus source-text pins: DEFAULT_TIMEOUT_MS = 15_000 (15s),
AbortController + signal wiring, clearTimeout in `finally` (timer
leak guard under high traffic), and the [${label}] log prefix
invariant for ops triage.
Tests: 282 -> 302 (api). All 20 new cases passing.
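The req.body fallback and the error branches can be sketched as two pure functions. `serializeBody` and `classifyUpstreamError` are hypothetical names introduced for illustration; the real graphql-passthrough.js may inline this logic in the handler. Status codes and message formats follow the commit message above.

```js
// Hypothetical helper (assumption): undefined and null both serialize to "{}".
function serializeBody(body) {
  return JSON.stringify(body ?? {});
}

// Hypothetical helper (assumption): maps a thrown error to the status code
// and GraphQL-style error envelope described in the branch list.
function classifyUpstreamError(err, label, timeoutMs) {
  if (err && err.name === "AbortError") {
    return {
      status: 504,
      body: { errors: [{ message: `[${label}] upstream timeout after ${timeoutMs}ms` }] },
    };
  }
  // Errors without a .message fall back to "unknown".
  const msg = (err && err.message) || "unknown";
  return {
    status: 502,
    body: { errors: [{ message: `[${label}] upstream error: ${msg}` }] },
  };
}
```

Keeping these branches free of I/O is what lets a unit test pin them with a fetch stub and no live network.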
…BoundedByDirect
First true cross-layer count check for the unified-chart endpoint.
apiUnifiedChartShape only validated SHAPE; this asserts the inter-layer
relationship between api yes+no candles and direct indexer total.
Since api filters by proposal pools, api ⊆ direct, so api total ≤
direct total. Catches api filter regression (returns ALL instead of
pool-filtered subset) and transform fabrication.
Cross-layer match family now spans 3 patterns: passthrough match
(apiCandlesMatchesDirect), multi-entity passthrough match
(apiRegistryMatchesDirect), and filtered subset (NEW).
Test fix: previously-passing apiUnifiedChartShape populated 3 candles
but default direct had 1; bumped its candlesCandlesCount to 3 to keep
it happy under the new invariant.
36 invariants total. 109/109 smoke tests pass (was 105). Bridges to the
documented full chartShape invariant: that future iteration extends
count to ID-by-ID pair-wise compare.
…dAbove
Magnitude-upper-bound for swap amounts. Closes the swap-side gap in
the magnitude-sanity family: candle side already had probabilityBounds
+ candlePricesNonNegative; swap side only had > 0 + range checks.
Asserts amountIn AND amountOut < 1e15, which catches raw uint256 leaks
(parseFloat returning 1e18 instead of decimal "1.0") and token-decimal
misalignment that scales values by 1e6x.
Distinct from swapAmountsPositive which only checks sign; raw-int leak
passes that check (1e18 > 0) but fails this one.
37 invariants total. 113/113 smoke tests pass (was 109).
Magnitude-sanity family now SYMMETRIC across candle and swap sides:
each has lower-bound + upper-bound coverage.
First indexer-side enum validation. For all pools (first 50), asserts
type ∈ {CONDITIONAL, PREDICTION, EXPECTED_VALUE} (the set sourced from
unified-chart.js's findPoolByOutcome). Catches:
- Schema migration that adds a 4th type without updating consumers
- Indexer regression returning null type
- Typo'd type values like "PRDICTION"
Distinct from probabilityBounds which treats non-PREDICTION as vacuous
— so a typo'd type silently slips through every existing check while
the api adapter silently drops the pool. New pattern: iterate-all-rows
enum check (vs latest-row or count-only).
38 invariants total. 118/118 smoke tests pass (was 113). Pool-entity
coverage now spans existence + FK + per-pool field validation.
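The iterate-all-rows enum check can be sketched as below. The function name `findInvalidPoolTypes` and the `{ id, type }` row shape are assumptions for illustration; the allowed set and the 50-row cap come from the commit message.

```js
// Allowed pool types, as listed in the commit message above.
const VALID_POOL_TYPES = new Set(["CONDITIONAL", "PREDICTION", "EXPECTED_VALUE"]);

// Hypothetical helper (assumption): returns every pool in the first `limit`
// rows whose `type` is not in the allowed set. null, missing, and typo'd
// values all count as violations.
function findInvalidPoolTypes(pools, limit = 50) {
  return pools.slice(0, limit).filter((p) => !VALID_POOL_TYPES.has(p.type));
}
```

Returning the offending rows, rather than a boolean, lets the failure message name the pool ids that need attention.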
…hyProdAggregator
High-value PINNING check at the registry layer. Asserts the indexer
has the production futarchy aggregator (0xc5eb43d5…d4fc1, hardcoded in
3 api source files: registry-adapter.js, unified-chart.js,
market-events.js; the api literally cannot function without this
aggregator's data).
Registry-side analog of anvilChainId: chain pin proves we forked
Gnosis; this pin proves the indexer was bootstrapped with the right
chain + start_block + contract config.
Distinct from registryHasAggregators (existence): a wrong-block
bootstrap might produce ghost aggregators, passing the existence check
but missing the prod one entirely. Test 4 verifies this gap explicitly.
Fixture: new includeFutarchyProdAggregator knob (default true) appends
prod address. 2 existing tests updated to set knob=false where they
assert exact aggregator counts.
39 invariants total. 121/121 smoke tests pass (was 118).
Hardcoded-address pinning now symmetric across chain (anvilChainId) +
registry (this slice).
…sObservabilityHeaders
40-invariant milestone. First response-HEADER validation in the
catalog — every prior api invariant probed status code or body
shape. This asserts X-Cache ∈ {HIT, MISS} AND X-Response-Time
matches /^\d+ms$/.
The unified-chart handler emits these on every code path (cached
HIT + fresh MISS); ops dashboards consume them. A regression that
drops them is invisible to body checks. Test 3 verifies the gap:
drop X-Cache header → apiUnifiedChartShape STILL passes since body
shape unchanged; only this header probe catches it.
Catches: removal of cache layer instrumentation; addition of third
state ('STALE') without telling ops; format regressions emitting
'NaN ms' or raw integer.
Fixture: chart handler now emits headers unconditionally; new
unifiedChartXCache / unifiedChartXResponseTime knobs.
40 invariants total. 126/126 smoke tests pass (was 121).
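A sketch of the header probe, assuming the harness exposes response headers as a plain object with lower-cased keys (an assumption; `checkObservabilityHeaders` is a hypothetical name). The allowed X-Cache values and the X-Response-Time pattern come from the commit message.

```js
// Hypothetical checker (assumption): returns a list of problems, empty when
// both observability headers are present and well-formed.
function checkObservabilityHeaders(headers) {
  const problems = [];
  const cache = headers["x-cache"];
  if (cache !== "HIT" && cache !== "MISS") {
    problems.push(`X-Cache must be HIT or MISS, got ${JSON.stringify(cache)}`);
  }
  const rt = headers["x-response-time"];
  if (typeof rt !== "string" || !/^\d+ms$/.test(rt)) {
    problems.push(`X-Response-Time must match /^\\d+ms$/, got ${JSON.stringify(rt)}`);
  }
  return problems;
}
```

A third cache state such as 'STALE', a 'NaN ms' value, or a raw integer each produce exactly one problem entry, so the diagnostic points at the header that regressed.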
…nMentionsAnvil
Chain-CLIENT identity pin. Distinct from anvilChainId (chain-NETWORK
pin). Calls web3_clientVersion; asserts response contains "anvil".
Together they pin both layers of "right environment": chain ID for
the network, client version for the EVM impl.
Catches running against a Gnosis fork on geth/erigon where chain ID
matches but anvil_/evm_ extensions for impersonation, snapshots, and
time-warping would silently fail in scenario tests.
41 invariants total. 130/130 smoke tests pass.
…n api side
First staged CI workflow on the api side. Job runs the orchestrator's
130+ smoke-test invariant battery against an in-process node:http
fixture (no docker, no real services, ~1.5s test time + Node setup).
Trigger is workflow_dispatch only for this first version, matching the
conservative roll-out of slices 3a + 3c.
Also added auto-qa/harness/ci/README.md mirroring interface-side
pattern (staging dance explanation + currently-staged table +
promotion command).
Cheapest of the 4 currently-staged CI workflows to promote (no docker,
no Playwright, no GH Actions secrets); recommended first promotion
target for the maintainer.
…bsetOfDirect (42nd invariant)
First cross-layer per-row TIME-PAIR check for the unified-chart
endpoint. Strengthens chartCandleCountsBoundedByDirect (count bound)
into per-row time-membership: every candle time the api surfaces must
appear in the direct candles indexer's time set, otherwise the api is
fabricating data (or mixing another proposal's periods). Uses `time`
not `id` because applyRateToCandles reshapes raw indexer candles and
doesn't expose IDs.
Catches bug classes the count bound MISSES: transform synthesizing
period-start timestamps, cache key mismatch returning wrong proposal's
candles, time-bucket off-by-one, SPOT bleeding into yes/no.
42 invariants now: 11 api-internal + 26 indexer + 5 chain.
134/134 smoke tests pass (4 new + 2 existing tests aligned to
DESCENDING candleTimes so candleTimeMonotonic stays happy).
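The time-membership check reduces to a set lookup. The sketch below assumes api candles carry `time` and direct indexer candles carry `periodStartUnix` (the latter field name is an assumption drawn from the passthrough tests); `findFabricatedTimes` is a hypothetical name.

```js
// Hypothetical helper (assumption): returns every api candle time that does
// not appear in the direct indexer's time set. An empty result means
// api times ⊆ direct times.
function findFabricatedTimes(apiCandles, directCandles) {
  // Normalize to numbers so "3600" and 3600 compare equal.
  const directTimes = new Set(directCandles.map((c) => Number(c.periodStartUnix)));
  return apiCandles
    .map((c) => Number(c.time))
    .filter((t) => !directTimes.has(t));
}
```

A count bound alone would accept an api response with the right number of candles at the wrong timestamps; this check rejects it.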
…ent (43rd invariant)
Sixth chain-layer invariant. Probes the FEE-MARKET state, which can
be independently broken from chain identity / block shape. Asserts
eth_gasPrice returns a 0x-prefixed positive hex value.
Three named failure modes (each with its own diagnostic):
- null → EIP-1559-only mode (legacy gas pricing disabled)
- 0x0 → broken fee market (anvil --gas-price 0 misconfig)
- non-hex → RPC-layer regression (BigInt parsing breaks)
Why this matters for scenarios: most futarchy flows submit
transactions (impersonateAccount + send) which need a working gas
price for estimation. Without this probe, a scenario reports
"transaction failed at step N" with no breadcrumb pointing to the
fee-market issue.
43 invariants now: 11 api-internal + 26 indexer + 6 chain.
139/139 smoke tests pass (5 new). 1 fixture knob added (gasPriceHex),
1 RPC dispatch case added (eth_gasPrice).
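The three failure modes can be sketched as a small classifier. `classifyGasPrice` and its return labels are hypothetical; only the null / 0x0 / non-hex distinction comes from the commit message.

```js
// Hypothetical classifier (assumption) for an eth_gasPrice result.
// Returns "ok" or one of the three named failure modes.
function classifyGasPrice(result) {
  if (result === null || result === undefined) return "null"; // EIP-1559-only mode
  if (typeof result !== "string" || !/^0x[0-9a-fA-F]+$/.test(result)) {
    return "non-hex"; // RPC-layer regression
  }
  // BigInt accepts 0x-prefixed hex strings; the regex above guarantees
  // the input is well-formed before we parse it.
  if (BigInt(result) === 0n) return "zero"; // broken fee market
  return "ok";
}
```

Validating the format before calling BigInt keeps the "non-hex" diagnostic separate from a thrown SyntaxError.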
…acheTtlPresent (44th invariant)
Second response-HEADER probe in the catalog. Sister to
apiUnifiedChartHasObservabilityHeaders (X-Cache +
X-Response-Time); this one covers X-Cache-TTL. Split into a
separate invariant for single-responsibility per probe — ops
dashboards filter on TTL independently of hit/miss.
Scope correction: the original X-Cache+X-Response-Time invariant's
comment said TTL was HIT-only. Inspection of unified-chart.js
shows it's set on BOTH paths (line 74 HIT + line 278 MISS), so
this asserts unconditionally rather than as a conditional check.
The old comment was updated in the same commit.
Format asserted: positive integer string, no unit suffix. Catches
refactor dropping TTL from one path but not the other (sister
probe STILL passes — demonstrates per-header-split value), 'NaN'
/'-1' from timing/env-var bugs, accidental unit suffix ('300s'
silently wrong: parseInt returns 300 by coincidence), header
dropped entirely.
44 invariants now: 12 api-internal + 26 indexer + 6 chain.
144/144 smoke tests pass (5 new). 1 fixture knob added
(unifiedChartXCacheTtl).
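A sketch of the strict format check, assuming the header value arrives as a string (`isValidCacheTtl` is a hypothetical name). It uses a digits-only regex rather than parseInt, because parseInt stops at the first non-digit and would accept a unit suffix.

```js
// Hypothetical validator (assumption): positive integer string, no sign,
// no unit suffix.
function isValidCacheTtl(value) {
  return typeof value === "string" && /^\d+$/.test(value) && Number(value) > 0;
}

// Why parseInt alone is not enough: it silently ignores trailing characters.
const lenient = parseInt("300s", 10); // 300, even though the header is malformed
```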
…onMatchesChainId (45th invariant)
Seventh chain-layer invariant. Chain-RPC-CONSISTENCY check —
asserts net_version (decimal) and eth_chainId (hex) numerically
agree. Both methods should report the same chain ID by spec
(net_version is legacy; eth_chainId is the EIP-695 modern method).
Divergence silently breaks consumers that pick one or the other.
Orthogonal to anvilChainId: that asserts eth_chainId === 0x64 (the
EXPECTED Gnosis value); this asserts net_version === eth_chainId
(CONSISTENCY regardless of WHAT they equal). Demonstrated by the
bare-anvil-31337 test: both methods report 31337, this passes
(consistency intact), anvilChainId fails (wrong network).
Bug shapes caught (NOT caught by anvilChainId alone):
- Fork rebase updates one method but not the other
- Reverse-proxy misconfig routes them to different upstreams
- Mock fixture hardcodes one but not the other
- Anvil version regression where one method reads from a
stale cached config and the other from live state
45 invariants now: 12 api-internal + 26 indexer + 7 chain.
149/149 smoke tests pass (5 new). 1 fixture knob added
(netVersion), 1 RPC dispatch case added (net_version).
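The consistency check compares a decimal string against a hex string, so both are normalized to BigInt first. `chainIdsAgree` is a hypothetical name; the example values (100 / 0x64 for Gnosis, 31337 for bare anvil) come from the commit message.

```js
// Hypothetical helper (assumption): true when net_version (decimal string)
// and eth_chainId (0x-prefixed hex string) denote the same number.
function chainIdsAgree(netVersion, ethChainId) {
  if (!/^\d+$/.test(String(netVersion))) return false;
  if (!/^0x[0-9a-fA-F]+$/.test(String(ethChainId))) return false;
  return BigInt(netVersion) === BigInt(ethChainId);
}
```

This says nothing about which chain is expected: a bare anvil reporting 31337 on both methods passes here and fails the separate anvilChainId pin.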
…nCapabilityPresent (46th invariant)
Eighth chain-layer invariant; first to exercise an ANVIL-SPECIFIC RPC
method (anvil_impersonateAccount) rather than standard JSON-RPC.
Asserts the method is actually callable, not just that the client
*claims* to be anvil; distinct domain from
anvilClientVersionMentionsAnvil.
Several "hardhat-compatible" forks and patched-anvil builds exist that
emit "anvil" in web3_clientVersion but lack the impersonation
extension scenarios depend on. With anvilImpersonationSupported:
false, the capability probe FAILS while the identity probe STILL
passes, proving the two checks are orthogonal.
Why this matters for scenarios: every futarchy flow that mutates state
requires impersonating an account (proposer, trader, resolver).
Without this method, EVERY scenario silently fails to produce state
changes.
46 invariants now: 12 api-internal + 26 indexer + 8 chain.
152/152 smoke tests pass (4 new). 1 fixture knob added
(anvilImpersonationSupported: true | false | 'rpc-error'), 1 RPC
dispatch case added (anvil_impersonateAccount).
…bilityPresent (47th invariant)
Ninth chain-layer invariant; second chain-CAPABILITY probe. Sister to
anvilImpersonationCapabilityPresent. Together they form the MINIMAL
CAPABILITY SET scenarios depend on:
- impersonate → call function as arbitrary account
- snapshot/revert → roll back state between tests
Distinct domain from impersonation: evm_snapshot is part of the
GANACHE LINEAGE (anvil + hardhat both support it; geth/erigon/reth
don't). Failure modes are complementary:
- anvil_* missing → wrong dev client (hardhat instead of anvil)
- evm_* missing → real client (geth/erigon/reth)
- both ok → minimal scenario capability satisfied
Also catches subsystem-broken case: method registered but returns
null/non-hex (calling evm_revert with that silently fails). The
non-hex check distinguishes "registered but broken" from "not
registered at all", which have different diagnostic paths.
47 invariants now: 12 api-internal + 26 indexer + 9 chain.
156/156 smoke tests pass (4 new). 1 fixture knob added
(snapshotResult: '0x1' | false | null | 'rpc-error'), 1 RPC dispatch
case added (evm_snapshot).
…sPositive (48th invariant)
First iterate-all-rows extension on the swap side. Strengthens
swapAmountsPositive (latest-only) into a per-row check across the
first 50 swaps. Mirrors the poolTypeIsValidEnum pattern (iterate-
all-rows enum check at the indexer layer).
Why both invariants exist:
- swapAmountsPositive (LATEST only) — cheap probe; catches
event-decoder bugs uniform across ALL swaps
- THIS one (UP-TO-50 rows) — catches bugs that affect SUBSETS
of swaps without affecting the latest
Bug shapes caught (NOT caught by latest-only):
- Indexer reorg re-processed historical blocks; latest fine,
old rows wrong
- Block-context-dependent decoder bug — reads "decimals" from
pool's CURRENT state instead of swap's block, corrupting
historical swaps from before a decimals change
- Partial-rewrite bug — fix re-emitted only swaps from a
specific block range with the corrupted shape
- Pool-specific decoder bug — only swaps for one pool affected;
latest happens to be a different pool
Fixture extension: buildSwaps now defaults amountIn/amountOut to
'1.0' for non-zero indices (index 0 still uses latestSwap* for
back-compat). New per-row override knobs: swapAmountIns,
swapAmountOuts arrays.
48 invariants now: 12 api-internal + 27 indexer + 9 chain.
160/160 smoke tests pass (4 new).
…e (49th invariant)
First body-shape probe on /health. STRENGTHENS the existing
apiHealth (status-code-only) into a body validation. Production
/health (src/index.js line 54) emits { status: 'ok', timestamp:
<ISO 8601> } — both fields matter to downstream ops.
Why both invariants exist:
- apiHealth (status-code-only) — catches "endpoint dead" outright
- THIS one (body shape) — catches refactors that keep the
endpoint serving 200 but change its body shape, silently
breaking downstream consumers
Bug shapes caught (NOT caught by apiHealth):
- Refactor returns just the string 'ok' (not JSON body)
- status field renamed to 'state'
- status value changed ('healthy' instead of 'ok') — LB string-
match health checks silently fail
- timestamp dropped — ops dashboards parsing 'last-fresh' age
silently break
- timestamp emitted as Unix epoch number instead of ISO 8601
string — every ISO parser breaks
- timestamp is a string but malformed
ISO 8601 validation strategy: Date.parse() rather than a regex —
robust enough to accept the canonical new Date().toISOString()
format and reject typical malformed inputs.
Fixture /health handler now defaults to production shape (was
{ ok: true }, now { status: 'ok', timestamp: <ISO> }). New knobs:
healthStatus, healthTimestamp, healthBody (full-body override).
49 invariants now: 13 api-internal + 27 indexer + 9 chain.
164/164 smoke tests pass (4 new).
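A sketch of the body-shape probe using Date.parse, as described above. `checkHealthBody` is a hypothetical name; the expected shape `{ status: 'ok', timestamp: <ISO 8601> }` comes from the commit message.

```js
// Hypothetical checker (assumption): returns a list of problems, empty when
// the body matches the production /health shape.
function checkHealthBody(body) {
  if (body === null || typeof body !== "object") {
    return ["body is not a JSON object"];
  }
  const problems = [];
  if (body.status !== "ok") {
    problems.push(`status must be "ok", got ${JSON.stringify(body.status)}`);
  }
  if (typeof body.timestamp !== "string") {
    problems.push("timestamp must be an ISO 8601 string");
  } else if (Number.isNaN(Date.parse(body.timestamp))) {
    problems.push("timestamp is a string but does not parse as a date");
  }
  return problems;
}
```

The typeof guard runs before Date.parse so that a Unix epoch number is reported as the wrong type rather than being coerced.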
…bilityPresent (50th invariant) 🎯
50-invariant milestone. Tenth chain-layer invariant; third chain-
CAPABILITY probe. COMPLETES the minimal capability TRIO that
scenarios depend on:
1. impersonate → call function as arbitrary account
2. snapshot/revert → roll back state between tests
3. TIME-WARP → simulate "wait N seconds/days" (this slice)
Without time-warp, ANY scenario involving a time-gated state
transition (resolution after deadline, TWAP window calculation,
vote-weight decay) cannot run at all — wall-clock waits would
make CI runs hours-long.
evm_setNextBlockTimestamp lineage: ganache-original method,
supported by anvil + hardhat + ganache. Same support profile as
evm_snapshot — wrong-fork clients (geth/erigon/reth) lack it.
Bug shapes caught (NOT caught by impersonate / snapshot probes):
- Anvil flag --no-storage-caching disabling time-warp specifically
(snapshot can work while timestamp manipulation is broken)
- RPC method-allowlisting blocking evm_setNextBlockTimestamp
while allowing evm_snapshot/revert
- Anvil version regression with new signature dropping legacy
alias
Side effect: probe sets next-block timestamp to now+86400. No
block mined in the probe; effect only manifests if a scenario
subsequently mines, which can override.
Includes a "trio milestone test" that explicitly verifies all
three capability probes pass on default fixture, documenting
the trio as a coherent set.
50 invariants now: 13 api-internal + 27 indexer + 10 chain.
168/168 smoke tests pass (4 new). 1 fixture knob added
(timeWarpSupported), 1 RPC dispatch case added
(evm_setNextBlockTimestamp).
…e (51st invariant)
Second body-shape probe in the catalog. Sister to apiHealthBodyShape
(just shipped) — both extend a status-code-only invariant with
body-shape validation. Together they cover the two main
observability endpoints (/health + /warmer).
Production /warmer (src/utils/warmer.js getWarmerStatus()) emits:
{ active, maxEntries, refreshIntervalSec, retentionDays, entries[] }
All four numeric fields must be finite numbers. `active` may be 0
(warmer might have no entries yet); the three config fields must
be > 0 (0 means "disabled" — a config regression). `entries`
must be an array.
Bug shapes caught (NOT caught by apiWarmer):
- Refactor renames any of the four numeric fields (e.g.,
active → activeCount) — silent until ops gauges break
- Numeric field emitted as string ('5' instead of 5) —
consumers using strict typeof checks break
- entries field changed to non-array (e.g., object keyed by id)
— consumers iterating with .map() crash with "is not a function"
- Body wrapped in a `data` field by middleware refactor
- Config sentinel hit (refreshIntervalSec=0 = warmer disabled
— silent regression that breaks the entire warming subsystem)
Fixture /warmer handler now defaults to production shape (was
{ status: 'warm', queues: 0 }). New knobs: warmerActive,
warmerMaxEntries, warmerRefreshIntervalSec, warmerRetentionDays,
warmerEntries, warmerBody (full-body override).
51 invariants now: 14 api-internal + 27 indexer + 10 chain.
172/172 smoke tests pass (4 new).
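The /warmer rules above (four finite numbers, three strictly positive config fields, `entries` an array) can be sketched as follows. `checkWarmerBody` is a hypothetical name; the field names come from the commit message.

```js
// Hypothetical checker (assumption): returns a list of problems, empty when
// the body matches the production /warmer shape.
function checkWarmerBody(body) {
  if (body === null || typeof body !== "object") {
    return ["body is not a JSON object"];
  }
  const problems = [];
  for (const field of ["active", "maxEntries", "refreshIntervalSec", "retentionDays"]) {
    const v = body[field];
    if (typeof v !== "number" || !Number.isFinite(v)) {
      problems.push(`${field} must be a finite number`);
    }
  }
  // `active` may be 0; the three config fields must be strictly positive.
  for (const field of ["maxEntries", "refreshIntervalSec", "retentionDays"]) {
    if (typeof body[field] === "number" && body[field] <= 0) {
      problems.push(`${field} must be > 0`);
    }
  }
  if (!Array.isArray(body.entries)) problems.push("entries must be an array");
  return problems;
}
```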
…emaHasRequiredTypes (52nd invariant)
First GraphQL INTROSPECTION probe in the catalog — qualitatively
new dimension. All previous indexer probes query DATA (pools,
swaps, candles); this queries the SCHEMA (__schema { types { name } })
to verify the entity types themselves still exist.
The bug class this catches: schema regeneration renames a type
(Candle → OHLCBar) or drops one entirely. Data probes hitting the
renamed/dropped type return GraphQL errors like "Cannot query
field 'candles'" — surfacing as misleading "indexer empty"
diagnostics. This invariant catches the rename DIRECTLY with a
clear "schema is missing required type(s): Candle" message,
making triage take seconds instead of minutes.
Bug shapes caught (NOT caught by data probes):
- Schema regeneration renamed Pool → LiquidityPool / Candle
→ OHLCBar
- A required type was DROPPED entirely from the schema
- Schema introspection itself was disabled (some production
GraphQL servers disable it for security)
Required types asserted: Pool, Swap, Candle (the three entities
the harness actually queries). Doesn't hard-pin every type —
that would over-couple to the schema; pins ONLY the harness's
actual dependencies.
Fixture extension: candles-direct response now includes
__schema: { types: [...] } by default. Knob candlesSchemaTypes
lets tests override; setting it to null omits __schema entirely
(simulates introspection-disabled servers).
Indexer-side coverage now spans THREE qualitatively distinct
dimensions: connectivity, data, and SCHEMA. Together they
distinguish "indexer down" vs "indexer empty" vs "indexer
schema regression" — three failure modes that previously all
collapsed into "indexer failing somehow".
52 invariants now: 14 api-internal + 28 indexer + 10 chain.
176/176 smoke tests pass (4 new).
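A sketch of the required-types check against a standard `__schema { types { name } }` introspection response. `missingSchemaTypes` is a hypothetical name; returning `null` for "introspection unavailable" is an illustrative choice, not necessarily how the harness reports it.

```js
// Hypothetical helper (assumption): returns the required type names missing
// from an introspection response, or null when __schema is absent
// (for example, when introspection is disabled).
function missingSchemaTypes(response, required) {
  const types =
    response && response.data && response.data.__schema && response.data.__schema.types;
  if (!Array.isArray(types)) return null;
  const present = new Set(types.map((t) => t.name));
  return required.filter((name) => !present.has(name));
}
```

Keeping "missing type" and "no schema at all" as distinct results is what allows separate diagnostics for a rename versus disabled introspection.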
…hemaHasRequiredTypes (53rd invariant)
Second GraphQL INTROSPECTION probe — sister to
candlesIndexerSchemaHasRequiredTypes (just shipped), on the
registry indexer. Symmetrically completes schema-validation
coverage across BOTH indexers.
Why a separate registry probe (not one combined invariant):
- Registry and candles are SEPARATE Checkpoint deployments —
they can be regenerated/migrated independently
- Failure diagnostics stay precise: which indexer's schema
regressed, not "one of them did"
- Different required type names (registry = ProposalEntity/
Organization/Aggregator; candles = Pool/Swap/Candle)
Required types: ProposalEntity, Organization, Aggregator. The
three load-bearing registry entities — each referenced by other
invariants (FK probes, aggregator pinning probe, registry
adapter probes).
Bug shapes caught (NOT caught by data probes):
- Schema regen renamed ProposalEntity → Proposal (data probes
return "Cannot query field 'proposalEntities'" — looks like
indexer-empty)
- Aggregator dropped entirely → aggregator-pinning probes
silently fail with misleading errors
- Introspection disabled on registry side (independent of
candles side; sister probe still passes — demonstrates
per-indexer diagnostic precision)
Fixture extension: registry-direct response now includes
__schema: { types: [...] } by default. Knob registrySchemaTypes
lets tests override; null omits __schema entirely.
Both indexers now have FULL coverage across THREE qualitative
dimensions: connectivity, data, SCHEMA. Failure diagnostics
triage to ONE of three modes (down / empty / schema-regressed)
per indexer.
53 invariants now: 14 api-internal + 29 indexer + 10 chain.
180/180 smoke tests pass (4 new).
…owsNonNegative (54th invariant)
Iterate-all-rows extension on the candle side. Sister to
swapAmountsAllRowsPositive — symmetrically completes the iterate-
all-rows pattern across the two main accumulator-bearing entities
(swap amounts + candle volumes).
Why both candleVolumesNonNegative AND this exist:
- candleVolumesNonNegative (LATEST only) — cheap probe; catches
aggregator bugs uniform across all candles
- THIS one (UP-TO-50 rows) — catches bugs affecting SUBSETS of
candles without affecting the latest
Bug shapes caught (NOT caught by latest-only):
- Indexer reorg re-processed historical periods; latest fine,
old candles corrupted
- Per-period decoder bug (aggregator reads pool token-decimals
from CURRENT state instead of period snapshot, corrupting
historical candles from before a decimals change)
- Partial-rewrite bug — fix re-emitted only candles from a
specific period range
- Pool-specific aggregator bug — only candles for one pool
affected; latest happens to be a different pool
Fixture extension: buildCandles now defaults volumeToken0/1 to
'1.0' for non-zero indices (index 0 still uses latestCandleVolume*
for back-compat). New per-row override knobs: candleVolumeToken0s,
candleVolumeToken1s arrays.
The iterate-all-rows pattern is now SYMMETRIC: swap amounts AND
candle volumes both have latest-only + iterate-all-rows coverage.
Each pattern catches a distinct bug class (uniform-aggregator vs
subset-corruption).
54 invariants now: 14 api-internal + 30 indexer + 10 chain.
184/184 smoke tests pass (4 new).
…Consistent (55th invariant)
Third iterate-all-rows extension. COMPLETES the iterate-all-rows
TRIAD on the indexer's main accumulator entities:
1. swapAmountsAllRowsPositive (8 slices ago)
2. candleVolumesAllRowsNonNegative (last slice)
3. candleOHLCAllRowsConsistent (this slice)
Each accumulator entity now has BOTH latest-only + all-rows
coverage:
| Entity | Latest-only | All-rows |
|--------|-------------|----------|
| swap.amount{In,Out} | swapAmountsPositive | swapAmountsAllRowsPositive |
| candle.volumeToken{0,1} | candleVolumesNonNegative | candleVolumesAllRowsNonNegative |
| candle.{open,high,low,close} | candleOHLCOrdering | candleOHLCAllRowsConsistent (NEW) |
Each pair catches uniform-aggregator bugs (latest) AND subset-
corruption bugs (all-rows).
Bug shapes caught (NOT caught by latest-only):
- Per-period min/max accumulator bug — historical candles
initialized differently (running-min reset to 0 instead of
+Infinity for periods where the first swap > 0)
- Indexer reorg corrupted historical OHLC fields
- Pool-specific aggregator bug — only candles for one pool
affected
- Period-boundary off-by-one — a swap counted in the wrong
period had a price outside the window's bounds
Fixture extension: buildCandles now defaults OHLC to consistent
values on every row (open=close=0.5, high=0.6, low=0.4) for non-
zero indices. Index 0 still uses latestCandle* for back-compat.
New per-row override knobs: candleOpens, candleHighs, candleLows,
candleCloses arrays.
55 invariants now: 14 api-internal + 31 indexer + 10 chain.
188/188 smoke tests pass (4 new).
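A sketch of an all-rows OHLC check. The commit message does not spell out the exact consistency rule, so this assumes the conventional one: `low ≤ open ≤ high` and `low ≤ close ≤ high`, with all four values finite. `findInconsistentCandles` is a hypothetical name.

```js
// Hypothetical helper (assumption): returns candles in the first `limit`
// rows that violate low <= open,close <= high, or that have non-finite
// OHLC values.
function findInconsistentCandles(candles, limit = 50) {
  return candles.slice(0, limit).filter((c) => {
    // Indexer values may arrive as decimal strings; normalize to numbers.
    const [open, high, low, close] = [c.open, c.high, c.low, c.close].map(Number);
    if (![open, high, low, close].every((v) => Number.isFinite(v))) return true;
    return !(low <= open && open <= high && low <= close && close <= high);
  });
}
```

The default fixture row from the commit message (open=close=0.5, high=0.6, low=0.4) passes this rule.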
…er ergonomics script
Tooling slice (no new invariant). At 55 invariants the dry-run
flat catalog is hard to scan. Adds `npm run scenarios:by-layer`
that prints invariants grouped by layer, with both a summary
table (layer + count + bar-chart) AND per-layer detail blocks.
What it answers at a glance:
- "What does the chain layer cover?" → orchestrator↔chain block
lists all 10 chain probes
- "Which probes cross to the candles indexer?" → api↔candles
block lists 4 names
- "Where's the catalog growing fastest?" → bar-chart shows
orchestrator↔candles is densest at 21
Authoritative layer breakdown surfaced (corrects an earlier
inconsistency in status-line bucketing):
api 10
api↔candles 4
api↔registry 2
orchestrator↔candles 21
orchestrator↔chain 10
orchestrator↔registry 8
----
55
Implementation: 35-line scripts/scenarios-by-layer.mjs that
imports INVARIANTS, groups by `layer` field, prints text. No
flags, no colors, deliberately scriptable (pipe into grep/awk
for filtering). Same import style as the existing dry-run output
but reorganized.
Smoke test: 1 new (asserts header line, summary table format,
per-layer detail sections, sanity-check that the chain-CAPABILITY
trio names appear under the chain-layer block).
189/189 smoke tests pass (was 188).
…lForwardsIntrospection (56th invariant)
First api-layer introspection-passthrough probe. Sister to
registryIndexerSchemaHasRequiredTypes (DIRECT-side). Same __schema
query, but routed through the API LAYER instead of direct.
The bug class: production GraphQL proxies (Apollo Gateway, Hasura,
etc.) are often deployed with introspection disabled at the proxy
layer for security, even when the upstream indexer supports it. If a
deploy accidentally turns on that toggle, harness scenarios that
introspect through the api layer silently break, BUT the DIRECT-side
sister still passes, making the actual cause hard to find without
this distinct probe.
Diagnostic-precision pattern (api+direct cross-check):
api ✓ direct ✓ → both layers fine
api ✗ direct ✓ → API PROXY STRIPPED INTROSPECTION
api ✓ direct ✗ → indexer schema regressed (api correctly
forwarded the broken schema)
api ✗ direct ✗ → indexer is root cause
Each combination has a distinct error message so engineers can
read the combined signal without guessing which layer broke.
Fixture extension: api /registry/graphql handler now includes
__schema in its passthrough by default (mirroring direct). New
knob apiRegistryStripsIntrospection (default false) simulates
proxy-layer disablement.
The api↔registry layer (previously thinnest at 2 invariants) is
now at 3.
56 invariants now: 10 api + 4 api↔candles + 3 api↔registry +
21 orchestrator↔candles + 8 orchestrator↔registry + 10
orchestrator↔chain. 193/193 smoke tests pass (4 new).
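The api+direct cross-check is a two-input truth table. A sketch, with `diagnoseIntrospection` and the exact message strings as hypothetical stand-ins for whatever the harness prints:

```js
// Hypothetical helper (assumption): combines the api-side and direct-side
// introspection probe results into one diagnostic string.
function diagnoseIntrospection(apiOk, directOk) {
  if (apiOk && directOk) return "both layers fine";
  if (!apiOk && directOk) return "api proxy stripped introspection";
  if (apiOk && !directOk) {
    return "indexer schema regressed (api correctly forwarded the broken schema)";
  }
  return "indexer is root cause";
}
```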
…ForwardsIntrospection (57th invariant)
Sister to apiRegistryGraphqlForwardsIntrospection (just shipped)
on the candles side. COMPLETES the introspection-coverage MATRIX:
| Direct-side | API-side |
--------------+-------------+----------+
registry | ✓ | ✓ |
candles | ✓ | ✓ | ← this slice
All four probes are now in the catalog. For ANY introspection
failure, the diagnostic combines layer (api/direct) × indexer
(registry/candles) into a precise root-cause statement.
Bug class beyond the registry sister: per-route proxy config drift.
The candles route can be misconfigured INDEPENDENTLY of the registry
route (separate proxy configs are common in production GraphQL
gateways). Pairing the two api-layer probes catches that drift:
apiRegistry ✓ apiCandles ✓ → proxy fine on both routes
apiRegistry ✗ apiCandles ✓ → registry route stripped only
apiRegistry ✓ apiCandles ✗ → candles route stripped only (drift)
apiRegistry ✗ apiCandles ✗ → proxy-wide lockdown
Fixture extension: api /candles/graphql handler now includes
__schema in its passthrough by default (mirroring direct +
registry). New knob apiCandlesStripsIntrospection (separate
from the registry knob so per-route drift is testable).
57 invariants now: 10 api + 5 api↔candles + 3 api↔registry +
21 orchestrator↔candles + 8 orchestrator↔registry + 10
orchestrator↔chain. 197/197 smoke tests pass (4 new).
Sister to interface-side scenarios-catalog. Ships a
committed Markdown index of the orchestrator's 57
invariants — browsable on GitHub without running
anything — plus drift detection against doc rot.
What ships:
- scripts/invariants-catalog.mjs (~85 lines): imports
INVARIANTS, validates {name, description, layer},
groups by layer, emits orchestrator/INVARIANTS.md
- orchestrator/INVARIANTS.md (~110 lines, committed):
title + per-layer summary + per-layer detail tables
- npm script invariants:catalog wired
- tests/smoke-invariants-catalog.test.mjs: drift smoke
test mirroring interface-side smoke-scenarios-catalog
(snapshot → regen → byte-identical assertion → restore
in finally; "added an invariant but forgot to regen"
becomes a CI failure with a fix-command pointer)
Why now: with 57 invariants across 6 layers (api,
api↔candles, api↔registry, orchestrator↔candles,
orchestrator↔chain, orchestrator↔registry), coverage-
by-script is insufficient. Reviewers can now read the
catalog on GitHub without cloning. Future tooling (CI
dashboards, coverage gap reports) gets a readable
machine-friendly source.
Validation: 1/1 new smoke test passes in isolation.
Full suite: pre-existing daemon-dependent flake on
Phase 2 slice 4 port-leak test, no regressions caused
by this slice.
Two new root-level aliases so the catalog scripts shipped in recent
slices are invocable from the repo root, not just from
auto-qa/harness/:
- auto-qa:e2e:scenarios:by-layer (slice 4d-by-layer-script)
- auto-qa:e2e:invariants:catalog (slice 4d-invariants-catalog)
Pure-additive package.json change. Each alias verified to resolve
from the repo root and produce expected output. Pairs with an
analogous interface-side commit wiring auto-qa:e2e:scenarios:by-route
at that repo's root.
…aged CI
The staged api smoke workflow (3e) currently runs only smoke:scenarios
+ a dry-run catalog sanity check; it does NOT cover the new
smoke-invariants-catalog.test.mjs shipped two iterations ago.
Extends auto-qa-harness-smoke.yml.staged with two new steps mirroring
the interface-side scenarios:catalog drift pattern (slice 3a):
- Regenerate invariants catalog (npm run invariants:catalog)
- Verify invariants catalog is in sync (git diff --exit-code)
Without this, an invariant added without regenerating INVARIANTS.md
would silently drift in CI.
Validation: YAML re-parsed clean via js-yaml@4; drift assertion
pre-verified locally (regen + git diff exits 0). Trigger remains
workflow_dispatch.
…link smoke
Phase 0 CHECKLIST item 41 ("Sister-link verified: fresh
checkout of both repos in ~/code/futarchy-fi/") bundles
doc + docker checks. This slice ships the doc-side half.
Adds tests/smoke-architecture-sync.test.mjs that:
- Resolves sister ARCHITECTURE.md at
../interface/auto-qa/harness/ARCHITECTURE.md (4
levels up from the test file)
- Skips cleanly via t.skip() if sister not present
(CI runners, one-repo clones)
- Asserts byte-identical otherwise — fails loudly
if the shared spec drifted between repos
Sister test on the interface side mirrors this in reverse
(looks up futarchy-api-side ARCHITECTURE.md). Baseline
verified byte-identical; both tests pass 1/1 in isolation.
CHECKLIST item 41 gains a sub-bullet recording the doc-
side coverage; docker-side half remains unchecked
(daemon-required).
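The skip-or-assert logic described above can be sketched as a small pure function over the two files' bytes. This is not the shipped test: `compareSpecs` and its return shape are hypothetical, and file reading is left to the caller.

```javascript
// Sketch of the byte-identical check with a clean skip path.
// `local` is a Buffer with this repo's ARCHITECTURE.md bytes.
// `sister` is a Buffer with the sister copy, or null when the
// sister checkout is absent (CI runners, one-repo clones).
function compareSpecs(local, sister) {
  if (sister === null) {
    return { status: 'skip', reason: 'sister repo not present' };
  }
  if (local.equals(sister)) {
    return { status: 'identical' };
  }
  // Locate the first differing byte to make the failure message actionable.
  const limit = Math.min(local.length, sister.length);
  let offset = 0;
  while (offset < limit && local[offset] === sister[offset]) offset += 1;
  return { status: 'drift', offset };
}

console.log(compareSpecs(Buffer.from('spec v1\n'), null));
console.log(compareSpecs(Buffer.from('spec v1\n'), Buffer.from('spec v1\n')));
console.log(compareSpecs(Buffer.from('spec v1\n'), Buffer.from('spec v2\n')));
```

In a `node:test` file, a `skip` result would map to `t.skip()` and a `drift` result to a failing assertion, with the Buffers read from the two ARCHITECTURE.md paths.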
…workflow STAGED
Cross-repo complement to the previous-iteration smoke test
(tests/smoke-architecture-sync.test.mjs). Together they
give complete drift coverage of the shared ARCHITECTURE.md
spec:
- Smoke test: catches dev-with-sibling-clone case at
npm test time. Skips if sister not present.
- Workflow: catches CI-with-one-repo-checked-out case.
Curls sister via raw.githubusercontent.com (public),
diffs against local. Fails loudly on byte mismatch.
Adds auto-qa/harness/ci/auto-qa-harness-architecture-sync.yml.staged
pointing at the interface sister. Optional sister_branch
input (default auto-qa; switch to main post-merge).
Trigger: workflow_dispatch only.
Sister-side workflow on the interface repo mirrors this in
reverse (looks up futarchy-api).
Validation:
- YAML re-parsed clean via js-yaml@4
- Sister raw URL returns HTTP/2 200 (publicly accessible)
- Simulated workflow locally: curl + diff → PASS
ci/README.md staged-table updated (2 rows now); promote
order documented (smoke first, then this).
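For illustration, the workflow's curl + diff step can be approximated in JavaScript as a fetch followed by a byte comparison. Everything here is a sketch under assumptions: the `futarchy-fi` owner and `interface` repo names are guesses based on the paths mentioned earlier, and `rawUrl` / `sisterHasDrifted` are hypothetical helpers. The fetch implementation is injectable so the logic can be exercised without network access.

```javascript
// Sketch of the cross-repo drift check (the real workflow uses curl + diff).

// raw.githubusercontent.com serves files at /<owner>/<repo>/<branch>/<path>.
function rawUrl({ owner, repo, branch, path }) {
  return `https://raw.githubusercontent.com/${owner}/${repo}/${branch}/${path}`;
}

// Resolves to true when the sister copy differs from the local bytes.
// Rejects on a non-2xx response so an unreachable sister is never
// mistaken for "in sync".
async function sisterHasDrifted(localBytes, url, fetchImpl = fetch) {
  const res = await fetchImpl(url);
  if (!res.ok) throw new Error(`sister fetch failed: HTTP ${res.status}`);
  const sisterBytes = Buffer.from(await res.arrayBuffer());
  return !sisterBytes.equals(localBytes);
}

const sisterUrl = rawUrl({
  owner: 'futarchy-fi', // assumption: org name inferred from ~/code/futarchy-fi/
  repo: 'interface', // assumption: sister repo name
  branch: 'auto-qa', // default sister_branch; main after merge
  path: 'auto-qa/harness/ARCHITECTURE.md',
});
console.log(sisterUrl);
```

Calling `sisterHasDrifted(localBytes, sisterUrl)` with the default `fetch` needs Node 18 or newer.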
… in staged CI
Two new daemon-free smoke files shipped earlier this session
(smoke-invariants-catalog.test.mjs,
smoke-architecture-sync.test.mjs) were not exercised by the
api-side staged CI workflow, which runs only smoke:scenarios.
Adds explicit steps for each in
auto-qa-harness-smoke.yml.staged:
- Run invariants-catalog smoke test (unit-level drift
assertion, sister to the workflow-level git-diff check
shipped in slice 3e-extend)
- Run architecture-sync smoke test (Phase 0 doc-side
sister-link check; SKIPS cleanly in CI's single-repo
checkout — the cross-repo workflow handles actual
sister drift)
Why explicit steps vs broadening to npm test: the api
harness has 9 daemon-required smoke files (anvil + docker +
indexers) that would fail in CI without that infra.
Validation: YAML re-validated via js-yaml@4; both tests
pass 1/1 in isolation. Trigger remains workflow_dispatch.
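The "explicit steps instead of npm test" choice amounts to an allowlist of daemon-free files. A rough sketch of that idea follows; `DAEMON_FREE`, `ciTestArgs`, and the `smoke-anvil-fork.test.mjs` file name are hypothetical, while the two allowlisted names come from the commit text above.

```javascript
// Sketch: select only daemon-free smoke files for a CI run.
// The allowlist approach is an illustration, not the workflow's actual mechanism.
const DAEMON_FREE = new Set([
  'smoke-invariants-catalog.test.mjs',
  'smoke-architecture-sync.test.mjs',
]);

// Builds the argument list for `node --test`, keeping only allowlisted files.
function ciTestArgs(allSmokeFiles, testsDir = 'tests') {
  const selected = allSmokeFiles.filter((file) => DAEMON_FREE.has(file)).sort();
  return ['--test', ...selected.map((file) => `${testsDir}/${file}`)];
}

const discovered = [
  'smoke-invariants-catalog.test.mjs',
  'smoke-anvil-fork.test.mjs', // hypothetical daemon-required file
  'smoke-architecture-sync.test.mjs',
];

console.log(ciTestArgs(discovered).join(' '));
```

An allowlist fails safe: a newly added daemon-required file is excluded from CI until someone lists it explicitly.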
Summary
Adds the auto-qa forked-replay harness as a self-contained subdirectory. Built incrementally over many iterations; all changes confined to auto-qa/ except for an additive block of auto-qa:* scripts in root package.json.
- … npm run scenarios:by-layer for a layer-grouped view)
- … auto-qa/harness/ci/auto-qa-harness-smoke.yml.staged (workflow_dispatch only for v1; ~1 min runtime; no docker, no real services)
What's NOT in this PR
- … package.json is touched outside auto-qa/, and the diff is a pure addition of auto-qa:* script aliases (zero modifications to existing scripts or deps).
- … 4a-verify/4b-verify/4c-verify slices in auto-qa/harness/CHECKLIST.md exist exactly to close that gap; they need a Docker daemon (deferred to post-merge maintainer work).
- … .staged because GitHub blocks OAuth Apps without workflow scope from writing .github/workflows/. After merge, copy the .staged file into .github/workflows/ (instructions in auto-qa/harness/ci/README.md).
Layer breakdown (per `scenarios:by-layer`)
Coverage patterns shipped
Test plan
Post-merge tasks (maintainer)
🤖 Generated with Claude Code