Draft
Conversation
The binary search uses t-statistic to ensure the necessary confidence at each step. This results in needing less samples far away from the true value and more samples around it. We still need at least 30 samples though to keep math simple. This works reasonably well, but still doesn't avoid some fundamental variance issues that we have that result in very different and statistically significant performance difference between different runs. That concern should hopefully be addressed separately. Also fix some SAC max TPS benchmark issues, the math got messed up after I've updated it from 1s granularity to 50ms.
Document that getXDRSize and writeOne implement the Record Marking Standard defined in RFC 5531 Section 11, clarifying the fragment header format and the last-fragment flag bit.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Add 3 new benchmark scenarios to improve coverage: - Low conflict (0.1 conflicts/tx, 5 RO, 1 RW) - Medium conflict (0.5 conflicts/tx, 2 RO, 2 RW) - Dense RW-only (5 conflicts/tx, 0 RO, 3 RW) Also add structured output header for easier comparison.
Add mAllConflictTxs BitSet member to Stage that accumulates the union of all successfully-added transactions' conflict sets. getConflictingClusters checks mAllConflictTxs.get(tx.mId) first - if false, avoids the O(C) cluster scan entirely. Benchmark: worst case (conflicts=1,1000,1) improved ~49% (1148ms -> 585ms). Other scenarios within noise.
Add mTxToCluster vector mapping tx id -> cluster pointer, enabling getConflictingClusters to iterate tx.mConflictTxs set bits and look up which cluster each conflicting tx belongs to. This is O(K) where K is the number of conflict tx ids, vs the previous O(C) cluster scan. The mapping is updated after each successful tryAdd by walking the new merged cluster's txIds bitset. For the full bin-packing path we save the new cluster pointer before binPacking() sorts the vector. Benchmark across all scenarios shows 5-16% improvement beyond Step 2. Best improvement: conflicts=20,40,1 now at 480ms (was 675ms baseline, -29%).
Replace createNewClusters (which copied all N shared_ptrs) with in-place mutation of mClusters. Conflicting clusters are saved for rollback, removed from the vector via compaction scan, and the new merged cluster is appended. On failure, rollback restores the original state. For the common no-conflict case this reduces shared_ptr operations from O(N) copies to a single push_back. Benchmark: 20-35% improvement across all scenarios vs Step 3. Combined from baseline: worst case -65% (1148ms -> 401ms), zero-conflict -32.5% (635ms -> 428ms).
- Change txFrames and sorobanCfg from by-value to by-reference capture in the thread lambda, eliminating O(N) shared_ptr copies per thread. Benchmark: modest improvement on conflict-heavy scenarios (conflicts=1,1000,1: 401->381ms, conflicts=10,10,10: 576->524ms).
Replace UnorderedMap<LedgerKey,...> footprint maps with a flat vector of (keyHash, txId, isRW) entries sorted by precomputed hash. This avoids ~400K hash map node allocations and multiple rehashings, providing much better cache behavior through sequential memory access. The sort-based approach: 1. Collects all footprint entries into a single flat vector with precomputed LedgerKey hashes (one allocation). 2. Sorts by hash for cache-friendly grouping. 3. Linear scan finds groups sharing the same key hash. 4. Conflict marking (RW-RW and RO-RW) is done in the same scan, eliminating the separate conflict marking phase. Profile improvement (no-conflict scenario): footprintMaps: 240ms -> 70ms (hashSort) conflictMark: 27ms -> 1ms (conflictScan) Combined: 267ms -> 71ms (3.8x faster) Benchmark improvement from previous step: conflicts=0,0,0: 452.6 -> 179.4ms (-60%) conflicts=0.1,5,1: 424.0 -> 196.3ms (-54%) conflicts=1,1000,1: 381.4 -> 170.5ms (-55%) conflicts=10,10,10: 524.4 -> 294.1ms (-44%)
Worker threads receive read-only references instead of expensive deep-copied std::set trees. Each thread does a simple linear scan over the sorted order, checking anyGreater() to skip oversized txs. It's not great that a bit of logic is duplicated with SurgePricingPriorityQueue now, but on the other hand the parallel tx set nomination is different enough from the mempool prioritization and going forward it might diverge even more, so this duplication doesn't seem too problematic to me (and it's just a few lines in the end). Benchmark (10000 txs, 128 clusters, 1-4 stages, 5 iterations): conflicts=0,0,0: 179.4 -> 194.8ms (noise; +8.6%) conflicts=0.1,5,1: 196.3 -> 150.0ms (-23.6%) conflicts=0.5,2,2: 192.9 -> 142.2ms (-26.3%) conflicts=1,1000,1: 170.5 -> 139.0ms (-18.5%) conflicts=5,0,3: 176.8 -> 156.4ms (-11.5%) conflicts=10,40,1: 231.3 -> 194.0ms (-16.1%) conflicts=20,40,1: 253.8 -> 195.3ms (-23.0%) conflicts=10,10,10: 294.1 -> 250.9ms (-14.7%) conflicts=50,50,5: 265.0 -> 199.6ms (-24.7%) 8/9 scenarios improved, up to 26% faster. Eliminates O(N log N) std::set insertions and 4 expensive tree deep-copies per build.
This mode uses the official Soroswap contracts from mainnet. It executes swaps of different token pairs (to allow for parallelism).
The ChangeLog file is empty and untouched in 12 years. Proposing we remove it as it is not the definitive changelog, but has hits in google search results
* and add ChangeLog to gitignore like the /INSTALL file * remove /AUTHORS as well. it's empty.
fb7213d to
66a30bb
Compare
66a30bb to
3cc6f16
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
This is another follow up in the ledger state refactor.
Currently,
CompleteConstLedgerStatemanages historical snapshots. These are only used by captive-core for the RPC query server, but add complexity to the core LedgerState interface. This PR makes it such that the QueryServer is responsible for maintaining it's own historical states, soCompleteConstLedgerStateis simplified to just containing the minimum LCL state.Checklist
clang-formatv8.0.0 (viamake formator the Visual Studio extension)