Optimize indexer buffer design #558
Open
aditya1702 wants to merge 19 commits into remove-optimized-catchup
Pass pre-fetched txs slice into insertIntoDB so the same slice is shared with unlockChannelAccounts, removing one O(n) allocation per ledger in both live and backfill paths.
These four getters are each called exactly once per ledger and the caller never mutates the returned map. Returning the internal reference directly eliminates 4 × O(n) map allocations per ledger.
Allocate the buffer once before the loop and Clear() at the start of each iteration, matching the pattern already used in backfill mode. This eliminates 12 map allocations per ledger and retains backing array capacity across iterations.
Extract lock-free unsafe methods from PushTrustlineChange, PushAccountChange, PushSACBalanceChange, and PushSACContract. New BatchPushChanges method acquires the write lock once for all four processor results per operation, reducing lock cycles from 4N to N where N is the number of operations in a transaction.
Replace unbounded pond.NewPool(0) with pond.NewPool(runtime.NumCPU()) to prevent goroutine oversubscription on busy ledgers. Transactions now queue in the pool instead of all running simultaneously, reducing context switching and memory pressure.
Reuse per-transaction buffers via sync.Pool instead of allocating a new IndexerBuffer (12 maps) per transaction. Each buffer is taken from the pool with Get(), reset with Clear(), used for processing, merged into the ledger buffer, and then returned with Put(). After warmup this eliminates nearly all per-transaction map allocations. Also removes unnecessary loop variable captures (loop variables have been per-iteration since Go 1.22, so they are not needed on Go 1.24).
These public wrappers have no callers outside the buffer — all production code now uses BatchPushChanges. The unsafe internals remain for use by BatchPushChanges and Merge.
All production code uses BatchPushChanges. The unsafe internal remains for BatchPushChanges and Merge. Tests updated to use BatchPushChanges directly.
The architecture guarantees single-goroutine access at every level: per-transaction buffers are owned by one goroutine from pool Get to Put, ledger buffers are owned by the ingestion loop, and the merge phase runs sequentially after group.Wait(). The mutex was never contended. This removes the RWMutex field, collapses 6 public+unsafe method pairs into direct implementations, and removes concurrent tests that tested scenarios that never occur in production.
PushTransaction, PushOperation, and PushStateChange now accept pointer arguments, eliminating per-call copies of Transaction (~100+ bytes) and Operation (10-50KB+ XDR) structs. Previously each participant triggered a full struct copy that was immediately re-addressed inside the method.
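A small sketch of the pointer-argument change, assuming a hypothetical `Operation` struct whose fixed-size array stands in for a large decoded XDR value. With a pointer parameter, pushing the same operation for several participants stores one shared address instead of copying the struct on every call.

```go
package main

import "fmt"

// Operation is a hypothetical stand-in; Payload imitates a large XDR value.
type Operation struct {
	ID      int64
	Payload [4096]byte
}

// opBuffer is a hypothetical buffer keyed by participant.
type opBuffer struct {
	byParticipant map[string][]*Operation
}

// PushOperation takes a pointer, so the call copies 8 bytes rather than
// the whole Operation value.
func (b *opBuffer) PushOperation(participant string, op *Operation) {
	b.byParticipant[participant] = append(b.byParticipant[participant], op)
}

func main() {
	buf := &opBuffer{byParticipant: make(map[string][]*Operation)}
	op := &Operation{ID: 7}
	for _, p := range []string{"alice", "bob"} {
		buf.PushOperation(p, op)
	}
	// Both participants reference the same Operation value.
	fmt.Println(buf.byParticipant["alice"][0] == buf.byParticipant["bob"][0])
}
```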
golang-set/v2 wraps map[T]struct{} with its own internal sync.RWMutex.
Since IndexerBuffer already guarantees single-goroutine access, this
internal mutex was pure overhead on every Add/Iter/Clone call.
ParticipantSet is a named type over map[string]struct{} (Go does not allow
methods on a plain alias of an unnamed map type) with only the
methods that are actually used: Add, Cardinality, ToSlice, Clone.
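A possible shape for such a set, using only the four method names listed above. The method bodies are assumptions, not the code from this PR, and the type is declared as a defined type because Go only permits methods on defined types.

```go
package main

import "fmt"

// ParticipantSet is a mutex-free set of participant IDs.
type ParticipantSet map[string]struct{}

// Add inserts v; adding an existing element is a no-op.
func (s ParticipantSet) Add(v string) { s[v] = struct{}{} }

// Cardinality reports the number of elements.
func (s ParticipantSet) Cardinality() int { return len(s) }

// ToSlice returns the elements in unspecified order.
func (s ParticipantSet) ToSlice() []string {
	out := make([]string, 0, len(s))
	for v := range s {
		out = append(out, v)
	}
	return out
}

// Clone returns an independent copy of the set.
func (s ParticipantSet) Clone() ParticipantSet {
	c := make(ParticipantSet, len(s))
	for v := range s {
		c[v] = struct{}{}
	}
	return c
}

func main() {
	s := ParticipantSet{}
	s.Add("alice")
	s.Add("alice")
	s.Add("bob")
	c := s.Clone()
	c.Add("carol")
	fmt.Println(s.Cardinality(), c.Cardinality())
}
```

Because there is no internal lock, this type is only safe while a single goroutine owns the set, which is the guarantee the commit message relies on.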
…nsaction
In Merge(), steal participant sets from the source buffer instead of cloning them. This is safe because the source buffer is always Clear()ed after Merge in ProcessLedgerTransactions. Replace set.NewSet + repeated Union() calls with a plain map for counting unique participants, eliminating N intermediate set allocations per transaction.
Covers PushTransaction, PushOperation, Merge (1/10/50 txs), BatchPushChanges, and Clear+reuse cycle.
Instead of allocating per-transaction buffers from a sync.Pool, processing in parallel, then sequentially merging into the ledger buffer, all goroutines now push directly into the shared ledger buffer. The heavy XDR parsing work runs without the lock. Only the brief map inserts acquire the mutex (~200ns per transaction vs ~700ns merge cost). This removes Merge(), sync.Pool, txBufferPool, and ~450 lines of merge logic and tests. The deduplication (highest-OperationID-wins, ADD/REMOVE no-op detection) still works correctly since it happens on push.
Previous benchmarks included fmt.Sprintf and struct allocation in the hot loop, inflating B/op numbers. Pre-allocating all test data in init() shows the true cost: zero allocations for PushTransaction and PushOperation, both single-threaded and under contention.
6d2eab4 to d3b6c89
Introduce TransactionResult and BatchPushTransactionResult to aggregate all data produced when processing a transaction and push it into IndexerBuffer in a single mutex acquisition. Update IndexerBuffer interface and implementation, refactor Indexer.processTransaction to build and pass a TransactionResult (operations, participants, state/contract changes), validate state-change operation IDs with logs, and collect unique participants for metrics. Update tests and benchmarks to cover the new batching API. This reduces mutex contention and consolidates many small buffer writes into one atomic batch write.
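One way to picture the batching API: everything produced for a transaction is collected into a single value and written under one lock. Only the names TransactionResult and BatchPushTransactionResult come from the commit message; every field, the `indexerBuffer` layout, and the string and int64 payload types are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// TransactionResult aggregates what processing one transaction produced.
// All fields here are hypothetical.
type TransactionResult struct {
	Hash         string
	OperationIDs []int64
	Participants []string
	StateChanges []string
}

// indexerBuffer is a hypothetical, reduced buffer.
type indexerBuffer struct {
	mu           sync.Mutex
	opToTx       map[int64]string
	participants map[string]map[string]struct{} // tx hash -> participants
	stateChanges []string
}

func newIndexerBuffer() *indexerBuffer {
	return &indexerBuffer{
		opToTx:       make(map[int64]string),
		participants: make(map[string]map[string]struct{}),
	}
}

// BatchPushTransactionResult writes the whole result in a single
// mutex acquisition.
func (b *indexerBuffer) BatchPushTransactionResult(r *TransactionResult) {
	b.mu.Lock()
	defer b.mu.Unlock()
	for _, id := range r.OperationIDs {
		b.opToTx[id] = r.Hash
	}
	set, ok := b.participants[r.Hash]
	if !ok {
		set = make(map[string]struct{}, len(r.Participants))
		b.participants[r.Hash] = set
	}
	for _, p := range r.Participants {
		set[p] = struct{}{}
	}
	b.stateChanges = append(b.stateChanges, r.StateChanges...)
}

func main() {
	buf := newIndexerBuffer()
	buf.BatchPushTransactionResult(&TransactionResult{
		Hash:         "h1",
		OperationIDs: []int64{1, 2},
		Participants: []string{"alice", "bob", "alice"},
		StateChanges: []string{"credit"},
	})
	fmt.Println(len(buf.opToTx), len(buf.participants["h1"]), len(buf.stateChanges))
}
```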
No description provided.