
fix: bound wallet scan threadpool to half of available cores #70

Open

sn1f3rt wants to merge 5 commits into master from fix/issue-66-wallet-scan-threadpool

Conversation

@sn1f3rt (Member) commented Apr 6, 2026

Summary

  • Replaces the dedicated m_scan_threadpool with a counting semaphore (m_scan_semaphore) over the global threadpool::getInstance(), eliminating the N×1.5 total thread count that came from running two separate pools
  • The semaphore is initialised to max(1, get_max_concurrency() / 2) at wallet construction, after wallet_args has called set_max_concurrency() from --max-concurrency, so the flag is correctly respected
  • CPU-intensive leaf tasks (output checking, block/tx parsing, cache and key-derivation work) go through scan_submit(), which wraps each task with a semaphore acquire/release pair to cap concurrent scan work at half the pool capacity. pull_and_parse_next_blocks is submitted without the semaphore: it is an I/O-bound orchestrator that internally dispatches those leaf tasks, so wrapping it would cause a slot-hold deadlock on single-core machines
  • Adds a max_count field to COMMAND_RPC_GET_BLOCKS_FAST::request and sets it to BLOCKS_SYNCHRONIZING_DEFAULT_COUNT (100) in pull_blocks, capping each getblocks.bin response at 100 blocks instead of up to 1000 and reducing RPC mutex hold time per batch
  • Fixes the macOS DEFAULT_MAX_CONCURRENCY=1 pthread workaround in wallet_args.cpp: the previous is_arg_defaulted guard meant the default was never applied unless the user explicitly passed --max-concurrency

Relationship to #77

#77 (now merged) fixed RPC unresponsiveness by moving refresh() to a dedicated background thread. This PR is independent and addresses the remaining issues #77 did not cover: CPU saturation from uncapped scan parallelism, the 1000-block batch size on the wallet RPC endpoint, and the macOS concurrency default bug.

@sn1f3rt sn1f3rt added the bug Something isn't working label Apr 6, 2026
@sn1f3rt sn1f3rt self-assigned this Apr 6, 2026
github-actions Bot commented Apr 6, 2026

Build Artifacts

Target
nerva-linux-armv7
nerva-linux-armv8
nerva-windows-x32
nerva-linux-i686
nerva-windows-x64
nerva-linux-x86_64
nerva-macos-x64
nerva-macos-armv8
nerva-freebsd-x86_64
nerva-android-armv8

10 succeeded, 0 failed

sn1f3rt added 2 commits May 12, 2026 23:47
The wallet refresh pipeline used the global singleton threadpool for all
block parsing and transaction processing. On multi-core machines this
saturated every hardware thread with crypto work, starving the daemon's
RPC server of CPU time and causing two inconsistent scan modes:

  1. High CPU / RPC unresponsive / scan completes in minutes
  2. Low CPU / RPC responsive / scan takes over an hour

Introduce a per-wallet scan threadpool (m_scan_threadpool) initialised
with max(1, hardware_concurrency/2) threads. This preserves parallel
block processing while leaving headroom for the daemon RPC threads.

Also fix the macOS DEFAULT_MAX_CONCURRENCY=1 workaround in
wallet_args.cpp: the previous is_arg_defaulted guard meant
set_max_concurrency(1) was never applied unless the user explicitly
passed --max-concurrency. The guard is removed so the default value
(1 on macOS, 0 elsewhere) is always applied; calling
set_max_concurrency(0) on other platforms is a no-op.

Closes #66
The daemon's getblocks.bin RPC handler returns up to
COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT (1000) blocks per request. The
wallet held m_daemon_rpc_mutex for the entire duration of each response,
meaning one 1000-block fetch (potentially several seconds on Windows)
blocked all other wallet RPC operations — balance queries, height
updates, etc. — causing the GUI to show timeouts during wallet scan.

Note: the adaptive sync cap introduced in PR #65 applies only to P2P
block propagation via get_block_sync_size(); it has no effect on the
wallet getblocks.bin endpoint, which always used the 1000-block server
default.

Fix: add an optional max_count field to COMMAND_RPC_GET_BLOCKS_FAST
request (0 = server default, backwards-compatible). The wallet sets it
to BLOCKS_SYNCHRONIZING_DEFAULT_COUNT (100), reducing each mutex hold
to a fraction of its previous duration and allowing status calls to
interleave between batches.
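The backwards-compatible semantics of the new field can be sketched as a clamping helper. The function is hypothetical (the real logic lives in the daemon's RPC handler), but the constants match the values quoted above:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Constants as quoted in the commit message above.
constexpr uint64_t COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT = 1000;
constexpr uint64_t BLOCKS_SYNCHRONIZING_DEFAULT_COUNT = 100;

// Illustrative server-side batch selection: 0 means "server default"
// so requests from old clients behave exactly as before; any other
// value is honoured up to the server's hard cap.
uint64_t effective_batch(uint64_t requested_max_count) {
  const uint64_t cap = COMMAND_RPC_GET_BLOCKS_FAST_MAX_COUNT;
  if (requested_max_count == 0)
    return cap; // backwards-compatible: old clients get the 1000-block default
  return std::min(requested_max_count, cap);
}
```

The wallet now sends BLOCKS_SYNCHRONIZING_DEFAULT_COUNT, so each response, and therefore each m_daemon_rpc_mutex hold, covers at most 100 blocks.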
sn1f3rt force-pushed the fix/issue-66-wallet-scan-threadpool branch from 3f75e2d to f0298aa on May 12, 2026 at 18:17
sn1f3rt added 2 commits May 13, 2026 20:16
Drops the dedicated m_scan_threadpool (hardware_concurrency/2 threads)
and uses a counting semaphore over the global getInstance() pool instead.
This avoids the N*1.5 total thread count from having two separate pools,
and correctly respects --max-concurrency since the global pool already
honours tools::get_max_concurrency(). The semaphore is initialised to
max(1, get_max_concurrency()/2) at wallet construction, after
wallet_args has called set_max_concurrency() from the CLI flag.
pull_and_parse_next_blocks is an orchestrating task that internally calls
scan_submit for CPU-intensive leaf tasks. Wrapping it with the semaphore
causes it to hold a slot while submitting children that also need slots —
on a single-core machine (semaphore count=1, all execution inline) this
deadlocks. The inner leaf tasks are already gated by the semaphore inside
their own call sites, so the outer wrapper is redundant and unsafe.
Comment thread on src/wallet/wallet2.cpp (outdated):
If f() throws, the previous bare release() call would be skipped,
permanently leaking a semaphore slot. Use a scope leave handler so
the slot is released regardless of how the task exits.
@ngeojiajun (Contributor) left a comment

This seems fine, but proper testing is required.


Labels

bug Something isn't working


2 participants