Cleanup historical ledger query by SirTyson · Pull Request #5221 · stellar/stellar-core

SirTyson · 2026-04-10T22:28:07Z

Description

This is another follow up in the ledger state refactor.

Currently, CompleteConstLedgerState manages historical snapshots. These are only used by captive-core for the RPC query server, but add complexity to the core LedgerState interface. This PR makes it such that the QueryServer is responsible for maintaining it's own historical states, so CompleteConstLedgerState is simplified to just containing the minimum LCL state.

Checklist

Reviewed the contributing document
Rebased on top of master (no merge commits)
Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
Compiles
Ran all tests
If change impacts performance, include supporting evidence per the performance document

The binary search uses t-statistic to ensure the necessary confidence at each step. This results in needing less samples far away from the true value and more samples around it. We still need at least 30 samples though to keep math simple. This works reasonably well, but still doesn't avoid some fundamental variance issues that we have that result in very different and statistically significant performance difference between different runs. That concern should hopefully be addressed separately. Also fix some SAC max TPS benchmark issues, the math got messed up after I've updated it from 1s granularity to 50ms.

Document that getXDRSize and writeOne implement the Record Marking Standard defined in RFC 5531 Section 11, clarifying the fragment header format and the last-fragment flag bit.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Add 3 new benchmark scenarios to improve coverage: - Low conflict (0.1 conflicts/tx, 5 RO, 1 RW) - Medium conflict (0.5 conflicts/tx, 2 RO, 2 RW) - Dense RW-only (5 conflicts/tx, 0 RO, 3 RW) Also add structured output header for easier comparison.

Add mAllConflictTxs BitSet member to Stage that accumulates the union of all successfully-added transactions' conflict sets. getConflictingClusters checks mAllConflictTxs.get(tx.mId) first - if false, avoids the O(C) cluster scan entirely. Benchmark: worst case (conflicts=1,1000,1) improved ~49% (1148ms -> 585ms). Other scenarios within noise.

Add mTxToCluster vector mapping tx id -> cluster pointer, enabling getConflictingClusters to iterate tx.mConflictTxs set bits and look up which cluster each conflicting tx belongs to. This is O(K) where K is the number of conflict tx ids, vs the previous O(C) cluster scan. The mapping is updated after each successful tryAdd by walking the new merged cluster's txIds bitset. For the full bin-packing path we save the new cluster pointer before binPacking() sorts the vector. Benchmark across all scenarios shows 5-16% improvement beyond Step 2. Best improvement: conflicts=20,40,1 now at 480ms (was 675ms baseline, -29%).

Replace createNewClusters (which copied all N shared_ptrs) with in-place mutation of mClusters. Conflicting clusters are saved for rollback, removed from the vector via compaction scan, and the new merged cluster is appended. On failure, rollback restores the original state. For the common no-conflict case this reduces shared_ptr operations from O(N) copies to a single push_back. Benchmark: 20-35% improvement across all scenarios vs Step 3. Combined from baseline: worst case -65% (1148ms -> 401ms), zero-conflict -32.5% (635ms -> 428ms).

- Change txFrames and sorobanCfg from by-value to by-reference capture in the thread lambda, eliminating O(N) shared_ptr copies per thread. Benchmark: modest improvement on conflict-heavy scenarios (conflicts=1,1000,1: 401->381ms, conflicts=10,10,10: 576->524ms).

Replace UnorderedMap<LedgerKey,...> footprint maps with a flat vector of (keyHash, txId, isRW) entries sorted by precomputed hash. This avoids ~400K hash map node allocations and multiple rehashings, providing much better cache behavior through sequential memory access. The sort-based approach: 1. Collects all footprint entries into a single flat vector with precomputed LedgerKey hashes (one allocation). 2. Sorts by hash for cache-friendly grouping. 3. Linear scan finds groups sharing the same key hash. 4. Conflict marking (RW-RW and RO-RW) is done in the same scan, eliminating the separate conflict marking phase. Profile improvement (no-conflict scenario): footprintMaps: 240ms -> 70ms (hashSort) conflictMark: 27ms -> 1ms (conflictScan) Combined: 267ms -> 71ms (3.8x faster) Benchmark improvement from previous step: conflicts=0,0,0: 452.6 -> 179.4ms (-60%) conflicts=0.1,5,1: 424.0 -> 196.3ms (-54%) conflicts=1,1000,1: 381.4 -> 170.5ms (-55%) conflicts=10,10,10: 524.4 -> 294.1ms (-44%)

Worker threads receive read-only references instead of expensive deep-copied std::set trees. Each thread does a simple linear scan over the sorted order, checking anyGreater() to skip oversized txs. It's not great that a bit of logic is duplicated with SurgePricingPriorityQueue now, but on the other hand the parallel tx set nomination is different enough from the mempool prioritization and going forward it might diverge even more, so this duplication doesn't seem too problematic to me (and it's just a few lines in the end). Benchmark (10000 txs, 128 clusters, 1-4 stages, 5 iterations): conflicts=0,0,0: 179.4 -> 194.8ms (noise; +8.6%) conflicts=0.1,5,1: 196.3 -> 150.0ms (-23.6%) conflicts=0.5,2,2: 192.9 -> 142.2ms (-26.3%) conflicts=1,1000,1: 170.5 -> 139.0ms (-18.5%) conflicts=5,0,3: 176.8 -> 156.4ms (-11.5%) conflicts=10,40,1: 231.3 -> 194.0ms (-16.1%) conflicts=20,40,1: 253.8 -> 195.3ms (-23.0%) conflicts=10,10,10: 294.1 -> 250.9ms (-14.7%) conflicts=50,50,5: 265.0 -> 199.6ms (-24.7%) 8/9 scenarios improved, up to 26% faster. Eliminates O(N log N) std::set insertions and 4 expensive tree deep-copies per build.

This mode uses the official Soroswap contracts from mainnet. It executes swaps of different token pairs (to allow for parallelism).

The ChangeLog file is empty and untouched in 12 years. Proposing we remove it as it is not the definitive changelog, but has hits in google search results

* and add ChangeLog to gitignore like the /INSTALL file * remove /AUTHORS as well. it's empty.

sisuresh and others added 30 commits April 10, 2026 15:26

Fix ArchivedStateConsistency invariant for live restore checks

a5d2c8c

Add and use markRestoredFromLiveBucketList

5f06a62

Remove unnecessary copies and have ltxs track their own restores.

e2c315d

Clean up pre-v23 restore meta logic

7ef9fb1

Add comments and assert

d0901ed

Add tests for markRestoredFromLiveBucketList

6a1875e

Refactored ParallelTxReturnVal

5cf6a2d

Bump p26 env

95d429f

Add RFC 5531 Record Marking Standard references to XDRStream

f52b699

Document that getXDRSize and writeOne implement the Record Marking Standard defined in RFC 5531 Section 11, clarifying the fragment header format and the last-fragment flag bit.

Upgrade to C++20

7458f56

Fix msvc build on C++20

e49420e

Residual DR 2237 violation

644fd9c

Remove maintenance

649c7c8

Get the handle to the session once

65b9c69

Address review feedback

3af39f8

Address copilot feedback

2f95071

Use prepared statements for binding queries

0aa8536

Fix non-postgres build

17b2f3e

Implement copilot suggestions

bacf576

use secrets.SUBMODULE_TOKEN in build-private.yml

dd3a98e

Update .github/workflows/build-private.yml

c718e09

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

VS project fix

e454c0e

Fix by-value captures

1403014

- Change txFrames and sorobanCfg from by-value to by-reference capture in the thread lambda, eliminating O(N) shared_ptr copies per thread. Benchmark: modest improvement on conflict-heavy scenarios (conflicts=1,1000,1: 401->381ms, conflicts=10,10,10: 576->524ms).

drebelsky and others added 27 commits April 10, 2026 15:27

Remove extraneous #pragma once

063b2c6

Only test Soroban if Catch2 tests pass

0966791

Remove obsolete docs

8720941

Remove impossible condition

7c050e2

Initialize uninitialized variable

829a96a

Add clang format rule for right alignment

d92bf94

Implement review suggestions

c2374f3

Format new code

6b74a82

Remove LedgerHeader from BucketListSnapshotData

6bc376f

Consolidated state access to LedgerStateSnapshot

c9e6c94

Make LedgerStateSnapshot threadsafe

4a1efd8

Remove BucketSnapshotManager and manage snapshot from LedgerManager

38c0a45

Fix minor race condition in invariant check

5ce6ec6

Small fixes

f304bbc

Fix build issue with SharedMutex

2fa895d

Add Soroswap benchmark mode for apply load.

91f9396

This mode uses the official Soroswap contracts from mainnet. It executes swaps of different token pairs (to allow for parallelism).

More updates to VS Rust build, change detection is tricky on Windows.

a6bd195

Add a script to run a of apply load scenarios for a given build.

60d6470

Fix race on bucket random evicion cache

0a56869

remove unused ChangeLog

9dfd91b

The ChangeLog file is empty and untouched in 12 years. Proposing we remove it as it is not the definitive changelog, but has hits in google search results

update automake

d0573d8

* and add ChangeLog to gitignore like the /INSTALL file * remove /AUTHORS as well. it's empty.

Add randomized testing for snapshot concurrency

b8e4a10

Removed mutable LedgerHeader from BucketSnapshotState

e1615be

Remove BucketSnapshotState and clean up interface

f385597

Rename LedgerSnapshot to LedgerReadView

6dbfbec

Clean up history publish

6258558

Move historical ledger management to query server

bb7e0d6

SirTyson force-pushed the cleanup-historical-ledger-query branch from fb7213d to 66a30bb Compare April 10, 2026 22:30

Added comprehensive historical query tests

3cc6f16

SirTyson force-pushed the cleanup-historical-ledger-query branch from 66a30bb to 3cc6f16 Compare April 10, 2026 22:31

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cleanup historical ledger query#5221

Cleanup historical ledger query#5221
SirTyson wants to merge 125 commits intostellar:masterfrom
SirTyson:cleanup-historical-ledger-query

SirTyson commented Apr 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

Conversation

SirTyson commented Apr 10, 2026

Description

Checklist

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants