Skip to content

Wasm merge timeout enforcement#2000

Draft
cursor[bot] wants to merge 1 commit intofeat/sync-wasm-merge-callbackfrom
cursor/wasm-merge-timeout-enforcement-b92a
Draft

Wasm merge timeout enforcement#2000
cursor[bot] wants to merge 1 commit intofeat/sync-wasm-merge-callbackfrom
cursor/wasm-merge-timeout-enforcement-b92a

Conversation

@cursor
Copy link
Contributor

@cursor cursor bot commented Feb 14, 2026

runtime: Enforce WASM merge function timeout

Description

This PR fixes a high-severity logic bug (bug_id: 381f5661-ec7b-4a91-b616-57c22661471b) where the WASM merge function could block indefinitely, leading to permanent stalls in sync processing. The RuntimeMergeCallback::call_merge_function now executes the WASM merge call in a separate thread with a timeout, using the timeout_ms field and returning MergeError::WasmTimeout if the timeout is exceeded. This prevents untrusted app merge logic from becoming a denial-of-service vector.

Test plan

The following tests were run to verify the changes:

  • cargo test within the crates/runtime directory. All existing tests passed.
  • cargo clippy and cargo fmt were run to ensure code style and catch potential issues.

No new end-to-end tests were added as the change is internal to the runtime's merge callback mechanism and covered by existing unit tests.

Documentation update

No public or internal documentation updates are required for this change.


Implement timeout enforcement for WASM merge callbacks to prevent
buggy or malicious merge functions from blocking sync indefinitely.

Changes:
- Wrap store and instance in Arc to enable thread sharing
- Execute WASM merge calls in background threads with configurable timeout
- Return MergeError::WasmTimeout when timeout expires (default: 5 seconds)
- Remove unused helper methods (get_memory, write_to_wasm, read_from_wasm)
- Remove #[allow(dead_code)] from timeout_ms field (now actively used)

Note: Due to Wasmer limitations, the underlying WASM execution may
continue in the background thread after timeout. This is documented
in the module-level docs.
@cursor
Copy link
Contributor Author

cursor bot commented Feb 14, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

@github-actions
Copy link

Your PR title does not adhere to the Conventional Commits convention:

<type>(<scope>): <subject>

Common errors to avoid:

  1. The title must be in lower case.
  2. Allowed type values are: build, ci, docs, feat, fix, perf, refactor, test.

Copy link

@meroreviewer meroreviewer bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤖 AI Code Reviewer

Reviewed by 3 agents | Quality score: 94% | Review time: 130.8s

🔴 1 critical, 🟡 3 warnings, 💡 2 suggestions. See inline comments.


🤖 Generated by AI Code Reviewer | Review ID: review-f77127a9

local_len = local.len(),
remote_len = remote.len(),
timeout_ms,
"Calling WASM merge function with timeout"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Mutex held indefinitely after timeout causes subsequent calls to block

When timeout occurs, the background thread still holds store.lock() and continues execution; subsequent call_merge_function calls will block on the mutex until the WASM completes, defeating the timeout purpose.

Suggested fix:

Consider using a try_lock with retry, or create a new WASM instance after timeout, or use a per-call store clone if Wasmer supports it.

})?;
debug!(
target: "calimero_runtime::merge",
export_name,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Thread resource leak on timeout - spawned threads never terminate

After timeout, the spawned thread keeps running indefinitely if WASM is in an infinite loop; repeated timeouts will accumulate orphaned threads consuming CPU and memory.

Suggested fix:

Document this limitation clearly; consider tracking active threads with a counter/metric; investigate Wasmer's interrupt mechanisms for future improvement.

&local_owned,
&remote_owned,
);
// Send result; ignore error if receiver dropped (timeout occurred)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 WASM instance state undefined after timeout - subsequent merges may corrupt data

After timeout, the WASM instance continues modifying its memory; subsequent merge calls on the same instance may read partially-written or corrupted state.

Suggested fix:

Mark the RuntimeMergeCallback as poisoned after timeout and return an error on subsequent calls, or create a fresh instance.

timeout_ms,
"Calling WASM merge function with timeout"
);

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Unbounded thread spawning enables resource exhaustion DoS

Each merge call spawns an unbounded thread that continues running after timeout, holding Arc references to Store and Instance; an attacker triggering many slow WASM merge operations could exhaust thread/memory resources.

Suggested fix:

Consider using a bounded thread pool, adding a semaphore to limit concurrent merge calls, or exploring WASM metering/fuel-based cancellation in future iterations.

})?;
debug!(
target: "calimero_runtime::merge",
export_name,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Thread spawning overhead on every merge call

Creating a new OS thread for each merge operation adds overhead; for high-frequency merges this could become a bottleneck.

Suggested fix:

Consider using a thread pool or async task spawning if merge frequency is expected to be high.

}
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Long function could be decomposed for readability

The execute_merge_call function is ~110 lines; extracting helper closures or local functions for memory write/read operations would improve readability.

Suggested fix:

Extract local helper closures like `write_to_memory` and `read_from_memory` within the function to reduce repetition and clarify intent.

@github-actions
Copy link

This pull request has been automatically marked as stale. If this pull request is still relevant, please leave any comment (for example, "bump"), and we'll keep it open. We are sorry that we haven't been able to prioritize reviewing it yet. Your contribution is very much appreciated.

@github-actions github-actions bot added the Stale label Feb 22, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant