fix: bound wallet scan threadpool to half of available cores #70
Open
sn1f3rt wants to merge 5 commits into
Conversation
Build Artifacts
10 succeeded, 0 failed
The wallet refresh pipeline used the global singleton threadpool for all block parsing and transaction processing. On multi-core machines this saturated every hardware thread with crypto work, starving the daemon's RPC server of CPU time and causing two inconsistent scan modes:

1. High CPU / RPC unresponsive / scan completes in minutes
2. Low CPU / RPC responsive / scan takes over an hour

Introduce a per-wallet scan threadpool (m_scan_threadpool) initialised with max(1, hardware_concurrency/2) threads. This preserves parallel block processing while leaving headroom for the daemon RPC threads.

Also fix the macOS DEFAULT_MAX_CONCURRENCY=1 workaround in wallet_args.cpp: the previous is_arg_defaulted guard meant set_max_concurrency(1) was never applied unless the user explicitly passed --max-concurrency. The guard is removed so the default value (1 on macOS, 0 elsewhere) is always applied; calling set_max_concurrency(0) on other platforms is a no-op.

Closes #66
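The two sizing rules described above can be sketched as small pure functions. This is an illustrative sketch, not the actual wallet2 code; the function names are hypothetical, and the "0 means no-op" behaviour is as the description states it.

```cpp
#include <algorithm>

// Scan pool size: max(1, hardware_concurrency / 2), leaving the remaining
// cores free for the daemon's RPC threads. hardware_concurrency() may
// legitimately return 0, so the floor of 1 matters.
unsigned scan_pool_size(unsigned hw_threads)
{
  return std::max(1u, hw_threads / 2);
}

// Applying the per-platform --max-concurrency default (1 on macOS, 0
// elsewhere): a value of 0 leaves the current setting unchanged, which is
// why the default can always be applied without an is_arg_defaulted guard.
unsigned apply_max_concurrency(unsigned current, unsigned requested)
{
  return requested == 0 ? current : requested;  // 0 == "leave unchanged"
}
```

With this shape, the macOS default of 1 always takes effect, while the default of 0 on other platforms is harmless.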
The daemon's getblocks.bin RPC handler returns up to COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT (1000) blocks per request. The wallet held m_daemon_rpc_mutex for the entire duration of each response, meaning one 1000-block fetch (potentially several seconds on Windows) blocked all other wallet RPC operations — balance queries, height updates, etc. — causing the GUI to show timeouts during wallet scan.

Note: the adaptive sync cap introduced in PR #65 applies only to P2P block propagation via get_block_sync_size(); it has no effect on the wallet getblocks.bin endpoint, which always used the 1000-block server default.

Fix: add an optional max_count field to COMMAND_RPC_GET_BLOCKS_FAST request (0 = server default, backwards-compatible). The wallet sets it to BLOCKS_SYNCHRONIZING_DEFAULT_COUNT (100), reducing each mutex hold to a fraction of its previous duration and allowing status calls to interleave between batches.
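The server-side interpretation of the new field might look like the following sketch. The function name is illustrative; the semantics are only what the description guarantees (0 keeps the old behaviour, and the server's own cap is assumed to still apply).

```cpp
#include <algorithm>
#include <cstdint>

// Resolve the effective block count for one getblocks.bin response.
// requested_max == 0 means the client did not set the field (old wallets),
// so the server default applies unchanged; otherwise the request is
// honoured but never allowed to exceed the server cap (e.g. 1000).
uint64_t effective_block_count(uint64_t requested_max, uint64_t server_max)
{
  if (requested_max == 0)
    return server_max;                      // backwards-compatible default
  return std::min(requested_max, server_max); // clamp to the server cap
}
```

A new wallet sending 100 thus gets at most 100 blocks per batch, while an old client sending nothing still gets the 1000-block default.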
3f75e2d to f0298aa
Drops the dedicated m_scan_threadpool (hardware_concurrency/2 threads) and uses a counting semaphore over the global getInstance() pool instead. This avoids the N*1.5 total thread count from having two separate pools, and correctly respects --max-concurrency since the global pool already honours tools::get_max_concurrency(). The semaphore is initialised to max(1, get_max_concurrency()/2) at wallet construction, after wallet_args has called set_max_concurrency() from the CLI flag.
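The mechanism can be sketched in miniature: a counting semaphore caps how many scan tasks run on a shared pool at once. This is a simplified stand-in, not the wallet code — the names (scan_submit, run_scan) are hypothetical, plain threads stand in for threadpool::getInstance(), and the semaphore is hand-rolled because the codebase predates std::counting_semaphore.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore built on mutex + condition_variable.
class semaphore
{
  std::mutex m;
  std::condition_variable cv;
  unsigned count;
public:
  explicit semaphore(unsigned initial) : count(initial) {}
  void acquire()
  {
    std::unique_lock<std::mutex> lk(m);
    cv.wait(lk, [this]{ return count > 0; });
    --count;
  }
  void release()
  {
    { std::lock_guard<std::mutex> lk(m); ++count; }
    cv.notify_one();
  }
};

// Launch `tasks` leaf tasks gated by a semaphore of `cap` slots and return
// the highest number observed running simultaneously.
unsigned run_scan(unsigned cap, unsigned tasks)
{
  semaphore slots(cap);
  std::atomic<unsigned> in_flight{0}, high_water{0};
  auto scan_submit = [&]
  {
    slots.acquire();                                // wait for a free slot
    unsigned cur = ++in_flight;
    unsigned hw = high_water.load();
    while (cur > hw && !high_water.compare_exchange_weak(hw, cur)) {}
    std::this_thread::sleep_for(std::chrono::milliseconds(10)); // "CPU work"
    --in_flight;
    slots.release();
  };
  std::vector<std::thread> pool;                    // stand-in for the pool
  for (unsigned i = 0; i < tasks; ++i)
    pool.emplace_back(scan_submit);
  for (auto &t : pool) t.join();
  return high_water.load();
}
```

Because the pool itself is untouched, RPC work submitted to it is never starved: at most cap of its workers are ever occupied by scan tasks.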
pull_and_parse_next_blocks is an orchestrating task that internally calls scan_submit for CPU-intensive leaf tasks. Wrapping it with the semaphore causes it to hold a slot while submitting children that also need slots — on a single-core machine (semaphore count=1, all execution inline) this deadlocks. The inner leaf tasks are already gated by the semaphore inside their own call sites, so the outer wrapper is redundant and unsafe.
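The single-core deadlock can be demonstrated with a non-blocking probe: if a blocking acquire would wait forever, try_acquire returns false at the same point. This is a hypothetical reduction of the scenario, not the wallet code.

```cpp
#include <mutex>

// Semaphore with a non-blocking probe, for demonstrating the hazard.
class semaphore
{
  std::mutex m;
  unsigned count;
public:
  explicit semaphore(unsigned n) : count(n) {}
  bool try_acquire()
  {
    std::lock_guard<std::mutex> lk(m);
    if (count == 0) return false;
    --count;
    return true;
  }
  void release()
  {
    std::lock_guard<std::mutex> lk(m);
    ++count;
  }
};

// Single-core scenario: the pool has one slot and runs tasks inline. If the
// orchestrator is itself gated, it holds the only slot while its inline leaf
// task asks for one; a blocking acquire here would never return.
bool leaf_would_deadlock_under_gated_orchestrator()
{
  semaphore slots(1);                 // count = max(1, 1/2) = 1
  slots.try_acquire();                // orchestrator takes the only slot
  return !slots.try_acquire();        // leaf task cannot get a slot: deadlock
}

// Ungated orchestrator: the slot is free when the leaf task needs it.
bool leaf_runs_under_ungated_orchestrator()
{
  semaphore slots(1);
  return slots.try_acquire();         // leaf task acquires immediately
}
```

This is why only the CPU-intensive leaf tasks go through scan_submit, and the orchestrator is submitted to the pool directly.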
ngeojiajun
reviewed
May 15, 2026
If f() throws, the previous bare release() call would be skipped, permanently leaking a semaphore slot. Use a scope leave handler so the slot is released regardless of how the task exits.
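The exception-safety point can be sketched with a small RAII releaser standing in for the scope-leave handler (the PR presumably uses an epee-style scope guard; the types here are illustrative only).

```cpp
#include <mutex>
#include <stdexcept>

// Minimal semaphore with a non-blocking probe, enough for the demo.
class semaphore
{
  std::mutex m;
  unsigned count;
public:
  explicit semaphore(unsigned n) : count(n) {}
  bool try_acquire()
  {
    std::lock_guard<std::mutex> lk(m);
    if (count == 0) return false;
    --count;
    return true;
  }
  void release()
  {
    std::lock_guard<std::mutex> lk(m);
    ++count;
  }
};

// RAII releaser: the slot is returned on every exit path, throw or not.
struct slot_release
{
  semaphore &s;
  explicit slot_release(semaphore &sem) : s(sem) {}
  ~slot_release() { s.release(); }
};

// Returns true iff the slot is still available after a task that throws --
// i.e. the guard prevented the leak a bare release() call would have caused.
bool slot_survives_throwing_task()
{
  semaphore slots(1);
  try
  {
    (void)slots.try_acquire();        // take the only slot
    slot_release guard(slots);        // released when this scope unwinds
    throw std::runtime_error("task failed");
  }
  catch (const std::runtime_error &) {}
  return slots.try_acquire();         // true only if the guard gave it back
}
```

With a bare release() after the task body, the throw would skip it and the slot count would shrink permanently, eventually stalling all scan work.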
ngeojiajun
approved these changes
May 16, 2026
Contributor
ngeojiajun
left a comment
There was a problem hiding this comment.
this seems fine, but proper testing is required
Summary
- Replaces the dedicated m_scan_threadpool with a counting semaphore (m_scan_semaphore) over the global threadpool::getInstance() — eliminates the N×1.5 total thread count from having two separate pools
- The semaphore is initialised to max(1, get_max_concurrency() / 2) at wallet construction, after wallet_args has called set_max_concurrency() from --max-concurrency, so the flag is correctly respected
- CPU-intensive leaf tasks are submitted via scan_submit(), which wraps each with semaphore acquire/release to cap concurrent scan work to half the pool capacity; pull_and_parse_next_blocks is submitted without the semaphore since it is an I/O-bound orchestrator that internally dispatches those leaf tasks — wrapping it would cause a slot-hold deadlock on single-core machines
- Adds an optional max_count field to COMMAND_RPC_GET_BLOCKS_FAST::request and sets it to BLOCKS_SYNCHRONIZING_DEFAULT_COUNT (100) in pull_blocks — caps each getblocks.bin response to 100 blocks instead of up to 1000, reducing RPC mutex hold time per batch
- Fixes the macOS DEFAULT_MAX_CONCURRENCY=1 pthread workaround in wallet_args.cpp — the previous is_arg_defaulted guard meant the default was never applied unless the user explicitly passed --max-concurrency

Relationship to #77
#77 (now merged) fixed RPC unresponsiveness by moving refresh() to a dedicated background thread. This PR is independent and addresses the remaining issues #77 did not cover: CPU saturation from uncapped scan parallelism, the 1000-block batch size on the wallet RPC endpoint, and the macOS concurrency default bug.