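Goal
Cut server startup heap by ~900 MB on databases with 1M+ observations by removing the per-StoreTx/StoreObs ResolvedPath slices. Unblocks #791.
Discussion, profiling data, and full review history: #799 (closed in favor of this spec).
Profile (in brief)
pprof inuse_space on a representative server (388K obs, 630 MB heap):
Load() → unmarshalResolvedPath JSON decode = 33% of heap
ResolvedPath []*string slices = ~1 GB extrapolated to 1.66M obs (#791 user)
Design
Remove
ResolvedPath []*string field from StoreTx and StoreObs. A compile-time guard test ensures it stays gone.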
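One way such a guard could be written in Go relies on selector ambiguity: when two embedded structs declare a field of the same name at the same depth, selecting that field does not compile. The sketch below assumes a simplified StoreTx, and the helper types resolvedPathProbe and storeTxGuard are hypothetical names chosen for illustration. It is not the project's actual test.

```go
package main

import (
	"fmt"
	"reflect"
)

// StoreTx is a simplified stand-in for the real struct (its fields are assumptions).
type StoreTx struct {
	ID   int
	Hash string
}

// resolvedPathProbe declares the field name that must not exist on StoreTx.
type resolvedPathProbe struct{ ResolvedPath struct{} }

// storeTxGuard embeds both types. If StoreTx ever regains a ResolvedPath
// field, the selector below becomes ambiguous and the package stops compiling.
type storeTxGuard struct {
	StoreTx
	resolvedPathProbe
}

var _ = storeTxGuard{}.ResolvedPath

// hasField is a runtime cross-check of the same property using reflection.
func hasField(v any, name string) bool {
	_, ok := reflect.TypeOf(v).FieldByName(name)
	return ok
}

func main() {
	fmt.Println(hasField(StoreTx{}, "ResolvedPath"))
}
```

The ambiguity trick fails the build itself, while the reflection check only fails when the test runs, so the two complement each other.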
Add
| Structure | Purpose | Estimated cost (1M obs) |
| --- | --- | --- |
| resolvedPubkeyIndex map[uint64][]int | xxhash64(pubkey) → []txID. Forward index for "Paths through node X" + collision-safety candidates. | 50–120 MB |
| resolvedPubkeyReverse map[int][]uint64 | txID → []hashes it was indexed under. Required for clean removal on eviction / backfill re-index. | ~40 MB |
| apiResolvedPathLRU (sized 10K, ~200 B each) | Cache for on-demand API decode of resolved_path. Mandatory for live polling path. | ~2 MB |
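The forward and reverse maps could be paired roughly as follows. This is a minimal sketch under stated assumptions: FNV-1a from the standard library stands in for xxhash64 so the example has no external dependency, and the type and method names (pubkeyIndex, add, candidates) are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// hashPubkey stands in for xxhash64(pubkey); FNV-1a keeps the sketch stdlib-only.
func hashPubkey(pubkey string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(pubkey)) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

// pubkeyIndex pairs the two maps described in the table.
type pubkeyIndex struct {
	forward map[uint64][]int // hash(pubkey) -> txIDs
	reverse map[int][]uint64 // txID -> hashes it was indexed under
}

func newPubkeyIndex() *pubkeyIndex {
	return &pubkeyIndex{
		forward: make(map[uint64][]int),
		reverse: make(map[int][]uint64),
	}
}

// add indexes txID under every resolved pubkey and records each hash in the
// reverse map so the entries can be removed later without the original path.
func (ix *pubkeyIndex) add(txID int, pubkeys []string) {
	for _, pk := range pubkeys {
		h := hashPubkey(pk)
		ix.forward[h] = append(ix.forward[h], txID)
		ix.reverse[txID] = append(ix.reverse[txID], h)
	}
}

// candidates returns txIDs that may pass through pubkey; hash collisions
// mean the caller still has to verify each one.
func (ix *pubkeyIndex) candidates(pubkey string) []int {
	return ix.forward[hashPubkey(pubkey)]
}

func main() {
	ix := newPubkeyIndex()
	ix.add(101, []string{"aa11", "bb22"})
	ix.add(102, []string{"bb22", "cc33"})
	fmt.Println(ix.candidates("bb22"))
	fmt.Println(len(ix.reverse[101]))
}
```

Storing hashes rather than full pubkeys in the reverse map is what keeps the per-transaction overhead to a few machine words per hop.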
Decode-window discipline (single rule)
resolved_path JSON is decoded at exactly one place per packet (ingest / Load() row). During that decode window, all consumers are fed in this order, and then the temporary []*string is dropped so that it never lands on the struct:
addToByNode: relay node indexing
touchRelayLastSeen: relay liveness DB updates (#660 / #755)
addTxToPathHopIndex resolved-pubkey branch: byPathHop full-pubkey keys
resolvedPubkeyIndex + resolvedPubkeyReverse insert
WebSocket broadcast map (carries the raw JSON bytes; no struct mutation)
Persist batch (carries the raw JSON bytes for SQL UPDATE)
Enforced by: (a) struct field gone (compile-time), (b) godoc on ingestObservationDecoded() documenting the contract, (c) test that broadcast maps include resolved_path post-refactor.
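The rule above can be illustrated with a small sketch: decode once, hand the temporary slice to each consumer in order, and return only the raw bytes. The names storeObs, consumer, and ingestDecoded are hypothetical, and the real ingestObservationDecoded() presumably takes different arguments.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// storeObs deliberately has no ResolvedPath field.
type storeObs struct {
	ID int
}

// consumer is a hypothetical callback fed during the decode window.
type consumer func(obs *storeObs, resolved []*string)

// ingestDecoded decodes raw exactly once, feeds every consumer in order,
// and lets the decoded slice go out of scope. Only the raw bytes are
// returned, for the broadcast map and the persist batch.
func ingestDecoded(obs *storeObs, raw []byte, consumers []consumer) ([]byte, error) {
	var resolved []*string // JSON null hops decode to nil pointers
	if err := json.Unmarshal(raw, &resolved); err != nil {
		return nil, fmt.Errorf("decode resolved_path for obs %d: %w", obs.ID, err)
	}
	for _, feed := range consumers {
		feed(obs, resolved)
	}
	return raw, nil
}

func main() {
	var relays []string
	hops := 0
	consumers := []consumer{
		func(_ *storeObs, path []*string) { // stand-in for relay indexing
			for _, p := range path {
				if p != nil {
					relays = append(relays, *p)
				}
			}
		},
		func(_ *storeObs, path []*string) { hops = len(path) },
	}
	raw, err := ingestDecoded(&storeObs{ID: 7}, []byte(`["aa11",null,"bb22"]`), consumers)
	fmt.Println(relays, hops, string(raw), err)
}
```

Consumers that need the data after the window closes must copy what they need, since nothing on the struct retains the decoded slice.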
On-demand SQL fetch (cold path)
txToMap / obsToMap API serializers and the eviction byNode/nodeHashes cleanup query SQLite for resolved_path when needed:
SELECT id, resolved_path FROM observations WHERE id IN (?, ?, ?, ...)
Single batched query per request (≤500 rows).
Result cached in apiResolvedPathLRU keyed by obs ID.
LRU cache invalidation: backfill writes call apiResolvedPathLRU.Delete(obsID) after committing the SQL UPDATE.
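A cache-first lookup with a single batched query for the misses might look like the sketch below. It makes several simplifications: a plain map stands in for apiResolvedPathLRU (no size bound or eviction), and fetchFunc is a hypothetical hook in place of a real SQLite call.

```go
package main

import (
	"fmt"
	"strings"
)

const maxBatch = 500 // per-request row cap stated in the spec

// buildQuery renders the batched SELECT with one placeholder per ID.
func buildQuery(ids []int) (string, []any) {
	args := make([]any, len(ids))
	for i, id := range ids {
		args[i] = id
	}
	placeholders := strings.TrimSuffix(strings.Repeat("?, ", len(ids)), ", ")
	return "SELECT id, resolved_path FROM observations WHERE id IN (" + placeholders + ")", args
}

// fetchFunc is a hypothetical stand-in for running the query against SQLite.
type fetchFunc func(query string, args []any) (map[int]string, error)

// resolvedPaths serves hits from the cache and fetches all misses in one query.
func resolvedPaths(ids []int, cache map[int]string, fetch fetchFunc) (map[int]string, error) {
	if len(ids) > maxBatch {
		return nil, fmt.Errorf("batch of %d exceeds %d rows", len(ids), maxBatch)
	}
	out := make(map[int]string, len(ids))
	var misses []int
	for _, id := range ids {
		if v, ok := cache[id]; ok {
			out[id] = v
		} else {
			misses = append(misses, id)
		}
	}
	if len(misses) == 0 {
		return out, nil
	}
	query, args := buildQuery(misses)
	rows, err := fetch(query, args)
	if err != nil {
		return nil, err
	}
	for id, v := range rows {
		cache[id] = v // populate the cache for the next request
		out[id] = v
	}
	return out, nil
}

func main() {
	fake := func(query string, args []any) (map[int]string, error) {
		fmt.Println(query)
		rows := make(map[int]string, len(args))
		for _, a := range args {
			rows[a.(int)] = `["aa11"]`
		}
		return rows, nil
	}
	cache := map[int]string{}
	first, _ := resolvedPaths([]int{1, 2, 3}, cache, fake)
	second, _ := resolvedPaths([]int{1, 2, 3}, cache, fake) // served from cache
	fmt.Println(len(first), len(second), len(cache))
}
```

After a backfill UPDATE commits, removing the obs ID from the cache (the spec's apiResolvedPathLRU.Delete(obsID)) forces the next request to re-read the row.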
Collision safety
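Assuming xxhash64 behaves like an ideal 64-bit hash, the chance that any given pair of keys collides is 2^-64 (about 1 in 1.8 × 10^19); across 1M unique keys, the chance of at least one collision is roughly 1 in 37 million. Because collisions remain possible, when resolvedPubkeyIndex[h] returns candidates, /api/nodes/{pubkey}/paths runs one batched SQL query to verify that the exact pubkey appears in each candidate's resolved path. This uses the same query path as the on-demand SQL fetch, with no separate code.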
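The verification step can be sketched with a deliberately collision-prone hash so that a false candidate actually shows up. Both weakHash and verifyCandidates are hypothetical names, and the paths map stands in for the decoded rows returned by the batched SQL query.

```go
package main

import "fmt"

// weakHash is a deliberately collision-prone stand-in for xxhash64:
// every pubkey of the same length lands in the same bucket.
func weakHash(pubkey string) uint64 { return uint64(len(pubkey)) }

// verifyCandidates keeps only txIDs whose resolved path contains the exact
// pubkey, discarding candidates that matched by hash collision alone.
func verifyCandidates(pubkey string, candidates []int, paths map[int][]string) []int {
	var verified []int
	for _, txID := range candidates {
		for _, hop := range paths[txID] {
			if hop == pubkey {
				verified = append(verified, txID)
				break
			}
		}
	}
	return verified
}

func main() {
	paths := map[int][]string{
		1: {"aa11"},
		2: {"zz99"}, // same length as "aa11", so it collides under weakHash
	}
	index := map[uint64][]int{}
	index[weakHash("aa11")] = append(index[weakHash("aa11")], 1)
	index[weakHash("zz99")] = append(index[weakHash("zz99")], 2)

	candidates := index[weakHash("aa11")]
	fmt.Println(candidates, verifyCandidates("aa11", candidates, paths))
}
```

With a real 64-bit hash the candidate list almost always equals the verified list, so the check costs one batched read and rarely changes the result.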
Backfill refactor
backfillResolvedPathsAsync:
SQL UPDATE (unchanged)
Use reverse map to remove old hash entries for the obs's tx
Insert new hash entries into forward + reverse maps
Update byPathHop resolved-key entries
Invalidate LRU cache for the obs ID
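The step ordering above can be sketched for a single observation as follows. Several things here are assumptions: the index works on precomputed hashes, the cache is a plain map, the update callback stands in for the SQL UPDATE, the byPathHop step is omitted, and the sketch treats the transaction as having one observation (a transaction with several observations would need to re-insert the hashes of all of them).

```go
package main

import "fmt"

// index holds the forward and reverse maps keyed by precomputed hashes.
type index struct {
	forward map[uint64][]int
	reverse map[int][]uint64
}

// remove walks the reverse map to drop txID from every forward bucket.
func (ix *index) remove(txID int) {
	for _, h := range ix.reverse[txID] {
		kept := ix.forward[h][:0]
		for _, id := range ix.forward[h] {
			if id != txID {
				kept = append(kept, id)
			}
		}
		if len(kept) == 0 {
			delete(ix.forward, h)
		} else {
			ix.forward[h] = kept
		}
	}
	delete(ix.reverse, txID)
}

// insert adds txID under each hash in both maps.
func (ix *index) insert(txID int, hashes []uint64) {
	for _, h := range hashes {
		ix.forward[h] = append(ix.forward[h], txID)
		ix.reverse[txID] = append(ix.reverse[txID], h)
	}
}

// backfillOne applies the steps in order: UPDATE, remove old, insert new,
// invalidate cache. In-memory state is untouched when the UPDATE fails.
func backfillOne(ix *index, cache map[int]string, update func(obsID int, raw string) error,
	obsID, txID int, raw string, newHashes []uint64) error {
	if err := update(obsID, raw); err != nil {
		return err
	}
	ix.remove(txID)
	ix.insert(txID, newHashes)
	delete(cache, obsID)
	return nil
}

func main() {
	ix := &index{forward: map[uint64][]int{}, reverse: map[int][]uint64{}}
	ix.insert(1, []uint64{10, 20})
	cache := map[int]string{5: `["old"]`}
	noopUpdate := func(int, string) error { return nil }
	err := backfillOne(ix, cache, noopUpdate, 5, 1, `["new"]`, []uint64{20, 30})
	fmt.Println(err, ix.forward, ix.reverse, len(cache))
}
```

Running the UPDATE first means a database failure leaves the in-memory index and cache exactly as they were.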
Schema
No schema change. SQLite resolved_path column stays — source of truth for ingest-time resolution, on-demand decode, and collision-safety check.
Feature flag
useResolvedPathIndex bool (default true in v3.6.0). The off-path keeps the old per-StoreTx field as a one-release rollback safety net. Removed in v3.7.0.
Consumers (audit)
All Go consumers of ResolvedPath / resolved_path and their post-refactor strategy:
addToByNode
touchRelayLastSeen
pickBestObservation (obs→tx propagation)
txToMap / obsToMap (REST API)
IngestNewObservations / IngestNewFromDB (broadcast + persist)
nodeInResolvedPath
addTxToPathHopIndex (resolved-key branch)
removeTxFromPathHopIndex
mapSliceToStoreTxs / mapSliceToObservations
backfillResolvedPathsAsync
byNode/nodeHashes cleanup
Frontend: no API contract change required. resolved_path remains in broadcast maps and API responses.
Tests
Unit
TestResolvedPubkeyIndex_BuildFromLoad: forward + reverse maps consistent after Load()
TestResolvedPubkeyIndex_HashCollision: crafted-vector collision; SQL safety filters false candidate
TestResolvedPubkeyIndex_IngestUpdate: both maps reflect new ingests; struct has no field
TestResolvedPubkeyIndex_RemoveOnEvict: eviction removes via reverse map; no orphan txIDs
TestResolvedPubkeyIndex_PerObsCoverage: non-best obs's resolved pubkeys are also indexed
TestStoreTx_NoResolvedPathField: compile-time guard
TestAddToByNode_WithoutResolvedPathField: relay nodes still in byNode
TestTouchRelayLastSeen_WithoutResolvedPathField: relay last_seen still updated
TestWebSocketBroadcast_IncludesResolvedPath: broadcast carries resolved_path
TestBackfill_UpdatesIndexAndByPathHop: backfill populates new structures
TestBackfill_RemoveOldOnReBackfill: re-backfill removes old hashes via reverse map
TestBackfill_InvalidatesLRU: LRU cache evicts the obs after backfill UPDATE
TestEviction_ByNodeCleanup_OnDemandSQL: eviction path SQL-fetches resolved_path to clean byNode/nodeHashes
Endpoint
TestPathsThroughNode_PrecisionAfterRefactor: identical results before/after on prefix-collision fixture
TestPathsThroughNode_NilResolvedPathFallback: NULL resolved_path packets still returned via raw-byte fallback
TestPathsThroughNode_CollisionSafety: crafted hash collision filtered by SQL safety check
TestPacketsAPI_OnDemandResolvedPath: /api/packets includes resolved_path for cold packets
TestPacketsAPI_OnDemandResolvedPath_LRUHit: second request hits cache
TestPacketsAPI_OnDemandResolvedPath_Empty: NULL returns null/omitted
TestLivePolling_ResolvedPathFromBroadcast: live poll uses in-flight broadcast cache, no SQL
TestLivePolling_LRUUnderConcurrentIngest: 100 concurrent live polls + ingest writes; p95 < 50 ms
Feature flag
TestFeatureFlag_OffPath_PreservesOldBehavior: with useResolvedPathIndex=false, struct still has field, all existing tests pass
TestFeatureFlag_Toggle_NoStateLeak: toggling the flag at runtime doesn't corrupt state (or document it as restart-only)
Concurrency
TestReverseMap_NoLeakOnPartialFailure: if backfill UPDATE succeeds but index insert panics, recovery doesn't leave the reverse map in an inconsistent state
TestDecodeWindow_LockHoldTimeBounded: measure write-lock duration during ingest decode window; document budget
Integration / regression
TestRepeaterLiveness_StillAccurate: relay-detection counts identical (PR #755 area)
TestSubpathDetail_StillWorks: /api/analytics/subpath-detail identical
TestNodeDetailPage_RelayCount: Node Detail "relayed N packets" matches pre-refactor
Benchmarks
BenchmarkLoad_BeforeAfter: 100K obs fixture; target ≥80% heap reduction
BenchmarkResolvedPubkeyIndex_Memory: at 50K and 500K unique-pubkey distributions; verify within budget
BenchmarkPathsThroughNode_Latency: 5K candidates; equal or faster
BenchmarkPacketsAPI_FirstPage: /api/packets?limit=100; <20 ms regression
BenchmarkLivePolling_UnderIngest: 1 Hz live polling under continuous ingest; p99 < 100 ms
Manual validation
Acceptance criteria
/api/nodes/{pubkey}/paths byte-identical results before/after on regression fixture
BenchmarkLoad ≥80% heap reduction
StoreTx / StoreObs have no ResolvedPath field (compile-time)
Estimated effort
10–14h for a senior Go developer familiar with the codebase:
byNode/nodeHashes cleanup via on-demand SQL
References