perf: parallelize attestation signature verification#274
Conversation
🤖 Kimi Code ReviewOverall: Good optimization moving signature verification to parallel execution. The approach of separating cheap input preparation from expensive verification is correct. However, there are performance and observability regressions to address. Critical Issues
1. Performance regression: Unnecessary Vec clone in hot loop (Line ~1201)verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot)Each attestation clones the entire Fix: Change 2. Loss of timing metrics (Lines ~1196-1207)The original code wrapped verification in: let _timing = metrics::time_pq_sig_aggregated_signatures_verification();This metric is entirely removed. For consensus monitoring, you need visibility into aggregate signature verification latency. Fix: Either:
Code Quality3. Inconsistent error handling patternThe sequential preparation phase uses Suggestion (Line ~1198-1208): .par_iter()
.try_for_each(|(proof_data, public_keys, message, slot)| {
verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot)
.map(|_| metrics::inc_pq_sig_aggregated_signatures_valid())
.map_err(|e| {
metrics::inc_pq_sig_aggregated_signatures_invalid();
StoreError::AggregateVerificationFailed(e)
})
})4. Module import placement (Line ~1197)use rayon::prelude::*;Placing this inside the function body is unconventional. Move to top-level imports for consistency. Security & Correctness5. Verification completeness✅ Good: Using 6. Thread safety✅ Good: The Documentation7.
|
🤖 Claude Code ReviewHere's my review of PR #274: PR Review: perf: parallelize attestation signature verificationSummaryThe two-phase approach (sequential input preparation → parallel verification) is correct and well-motivated. The Issues
// Current: par_iter() borrows, forcing .clone() on each parallel call
verification_inputs
.par_iter()
.try_for_each(|(proof_data, public_keys, message, slot)| {
let result =
verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot);
verification_inputs
.into_par_iter()
.try_for_each(|(proof_data, public_keys, message, slot)| {
let result =
verify_aggregated_signature(proof_data, public_keys, &message, slot);Each XMSS pubkey is 52 bytes; with a full attestation committee this is a non-trivial allocation per attestation that is entirely avoidable. Timing metric dropped without replacement
The RAII guard .try_for_each(|(proof_data, public_keys, message, slot)| {
let _timing = metrics::time_pq_sig_aggregated_signatures_verification();
let result = verify_aggregated_signature(...);Given that the stated motivation is observing a 4× speedup, keeping this metric would let you verify it in production without relying solely on log output.
This import belongs at the top of the file with the other Non-deterministic error attribution on failure Minor: the sequential version always returned the error from the first failing attestation (by index). Docs changes
VerdictApprove with the Automated review by Claude (Anthropic) · sonnet · custom prompt |
Greptile SummaryThis PR parallelizes attestation signature verification in Confidence Score: 5/5Safe to merge — parallelization is correct, error propagation is sound, and all findings are P2 style/observability suggestions. The two-phase approach (sequential prep, parallel verify) is correctly implemented. Prometheus counters are atomic so calling them from rayon threads is safe. try_for_each correctly short-circuits on the first error. The only findings are a dropped timing histogram (observability regression, not a correctness issue) and an in-function import placement. Both are P2. crates/blockchain/src/store.rs — dropped per-attestation timing metric worth restoring.
|
| Filename | Overview |
|---|---|
| crates/blockchain/src/store.rs | Parallel attestation signature verification added via rayon; per-attestation timing metric (time_pq_sig_aggregated_signatures_verification) silently dropped from this path. |
| crates/blockchain/Cargo.toml | Added rayon workspace dependency for parallel iteration. |
| Cargo.toml | Registered rayon = "1.10" in workspace dependencies. |
| README.md | Added Container Releases section and a new roadmap item for formal STF verification. |
| RELEASE.md | Updated tag convention: unstable replaces latest for main-branch builds; latest now tracks the current devnet. |
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[verify_signatures called] --> B{attestation count\n== signature count?}
B -- No --> ERR1[Err: AttestationSignatureMismatch]
B -- Yes --> C[Sequential preparation phase\nbits check + pubkey lookup]
C --> D{Any input error?}
D -- Yes --> ERR2[Err: ParticipantsMismatch /\nInvalidValidatorIndex /\nPubkeyDecodingFailed]
D -- No --> E[verification_inputs Vec collected]
E --> F[rayon par_iter]
F --> G1[Thread 1: verify_aggregated_signature]
F --> G2[Thread 2: verify_aggregated_signature]
F --> G3[Thread N: verify_aggregated_signature]
G1 & G2 & G3 --> H{All OK?}
H -- Any Err --> ERR3[Err: AggregateVerificationFailed]
H -- All Ok --> I[inc valid metrics]
I --> J[verify proposer signature sequentially]
J --> K[Log timing info]
K --> L[Ok]
Prompt To Fix All With AI
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 1199-1214
Comment:
**Per-attestation timing metric dropped**
The `_timing = metrics::time_pq_sig_aggregated_signatures_verification()` guard was removed and nothing replaces it inside the parallel closure. The histogram (`LEAN_PQ_SIG_AGGREGATED_SIGNATURES_VERIFICATION_TIME_SECONDS`) is still registered and still used in the gossip path (line 458), so it will now only reflect gossip-path verifications, silently under-counting block-path verifications. Since `TimingGuard` is a plain RAII struct with no shared state, it's safe to create per-thread inside `par_iter`. Consider adding it back:
```suggestion
verification_inputs
.par_iter()
.try_for_each(|(proof_data, public_keys, message, slot)| {
let _timing = metrics::time_pq_sig_aggregated_signatures_verification();
let result =
verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot);
match result {
Ok(()) => {
metrics::inc_pq_sig_aggregated_signatures_valid();
Ok(())
}
Err(e) => {
metrics::inc_pq_sig_aggregated_signatures_invalid();
Err(StoreError::AggregateVerificationFailed(e))
}
}
})?;
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 1198
Comment:
**Import inside function body**
`use rayon::prelude::*` is placed mid-function. The codebase convention is to keep imports at the top of the file or at least at the top of the function; placing it inline between a comment and the expression it enables makes it easy to miss. Consider moving it to the top of the file alongside the other `use` statements.
```suggestion
use rayon::prelude::*;
// Run expensive signature verification in parallel
verification_inputs
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "perf: parallelize attestation signature ..." | Re-trigger Greptile
Signature verification dominated block processing time (~40-50ms per attestation, sequential). With 36 attestations this meant ~1.4-1.9s per block, causing the target to fall hundreds of slots behind the head. Split into two phases: 1. Prepare inputs sequentially (cheap: bit checks, pubkey lookups) 2. Verify signatures in parallel via rayon's par_iter On a 4-core machine with 36 attestations, expect ~4x speedup (1.4s -> ~350ms).
1a71cbf to
e45113b
Compare
crates/blockchain/src/store.rs
Outdated
| verification_inputs | ||
| .par_iter() | ||
| .try_for_each(|(proof_data, public_keys, message, slot)| { | ||
| let result = | ||
| verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot); | ||
| match result { | ||
| Ok(()) => { | ||
| metrics::inc_pq_sig_aggregated_signatures_valid(); | ||
| Ok(()) | ||
| } | ||
| Err(e) => { | ||
| metrics::inc_pq_sig_aggregated_signatures_invalid(); | ||
| Err(StoreError::AggregateVerificationFailed(e)) | ||
| } | ||
| validators[vid as usize] | ||
| .get_pubkey() | ||
| .map_err(|_| StoreError::PubkeyDecodingFailed(vid)) | ||
| }) | ||
| .collect::<Result<_, _>>()?; | ||
|
|
||
| let verification_result = { | ||
| let _timing = metrics::time_pq_sig_aggregated_signatures_verification(); | ||
| verify_aggregated_signature(&aggregated_proof.proof_data, public_keys, &message, slot) | ||
| }; | ||
| match verification_result { | ||
| Ok(()) => metrics::inc_pq_sig_aggregated_signatures_valid(), | ||
| Err(e) => { | ||
| metrics::inc_pq_sig_aggregated_signatures_invalid(); | ||
| return Err(StoreError::AggregateVerificationFailed(e)); | ||
| } | ||
| } | ||
| } | ||
| })?; |
There was a problem hiding this comment.
Per-attestation timing metric dropped
The _timing = metrics::time_pq_sig_aggregated_signatures_verification() guard was removed and nothing replaces it inside the parallel closure. The histogram (LEAN_PQ_SIG_AGGREGATED_SIGNATURES_VERIFICATION_TIME_SECONDS) is still registered and still used in the gossip path (line 458), so it will now only reflect gossip-path verifications, silently under-counting block-path verifications. Since TimingGuard is a plain RAII struct with no shared state, it's safe to create per-thread inside par_iter. Consider adding it back:
| verification_inputs | |
| .par_iter() | |
| .try_for_each(|(proof_data, public_keys, message, slot)| { | |
| let result = | |
| verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot); | |
| match result { | |
| Ok(()) => { | |
| metrics::inc_pq_sig_aggregated_signatures_valid(); | |
| Ok(()) | |
| } | |
| Err(e) => { | |
| metrics::inc_pq_sig_aggregated_signatures_invalid(); | |
| Err(StoreError::AggregateVerificationFailed(e)) | |
| } | |
| validators[vid as usize] | |
| .get_pubkey() | |
| .map_err(|_| StoreError::PubkeyDecodingFailed(vid)) | |
| }) | |
| .collect::<Result<_, _>>()?; | |
| let verification_result = { | |
| let _timing = metrics::time_pq_sig_aggregated_signatures_verification(); | |
| verify_aggregated_signature(&aggregated_proof.proof_data, public_keys, &message, slot) | |
| }; | |
| match verification_result { | |
| Ok(()) => metrics::inc_pq_sig_aggregated_signatures_valid(), | |
| Err(e) => { | |
| metrics::inc_pq_sig_aggregated_signatures_invalid(); | |
| return Err(StoreError::AggregateVerificationFailed(e)); | |
| } | |
| } | |
| } | |
| })?; | |
| verification_inputs | |
| .par_iter() | |
| .try_for_each(|(proof_data, public_keys, message, slot)| { | |
| let _timing = metrics::time_pq_sig_aggregated_signatures_verification(); | |
| let result = | |
| verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot); | |
| match result { | |
| Ok(()) => { | |
| metrics::inc_pq_sig_aggregated_signatures_valid(); | |
| Ok(()) | |
| } | |
| Err(e) => { | |
| metrics::inc_pq_sig_aggregated_signatures_invalid(); | |
| Err(StoreError::AggregateVerificationFailed(e)) | |
| } | |
| } | |
| })?; |
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 1199-1214
Comment:
**Per-attestation timing metric dropped**
The `_timing = metrics::time_pq_sig_aggregated_signatures_verification()` guard was removed and nothing replaces it inside the parallel closure. The histogram (`LEAN_PQ_SIG_AGGREGATED_SIGNATURES_VERIFICATION_TIME_SECONDS`) is still registered and still used in the gossip path (line 458), so it will now only reflect gossip-path verifications, silently under-counting block-path verifications. Since `TimingGuard` is a plain RAII struct with no shared state, it's safe to create per-thread inside `par_iter`. Consider adding it back:
```suggestion
verification_inputs
.par_iter()
.try_for_each(|(proof_data, public_keys, message, slot)| {
let _timing = metrics::time_pq_sig_aggregated_signatures_verification();
let result =
verify_aggregated_signature(proof_data, public_keys.clone(), message, *slot);
match result {
Ok(()) => {
metrics::inc_pq_sig_aggregated_signatures_valid();
Ok(())
}
Err(e) => {
metrics::inc_pq_sig_aggregated_signatures_invalid();
Err(StoreError::AggregateVerificationFailed(e))
}
}
})?;
```
How can I resolve this? If you propose a fix, please make it concise.| .collect::<Result<_, StoreError>>()?; | ||
|
|
||
| // Run expensive signature verification in parallel | ||
| use rayon::prelude::*; |
There was a problem hiding this comment.
use rayon::prelude::* is placed mid-function. The codebase convention is to keep imports at the top of the file or at least at the top of the function; placing it inline between a comment and the expression it enables makes it easy to miss. Consider moving it to the top of the file alongside the other use statements.
| use rayon::prelude::*; | |
| use rayon::prelude::*; | |
| // Run expensive signature verification in parallel | |
| verification_inputs |
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/blockchain/src/store.rs
Line: 1198
Comment:
**Import inside function body**
`use rayon::prelude::*` is placed mid-function. The codebase convention is to keep imports at the top of the file or at least at the top of the function; placing it inline between a comment and the expression it enables makes it easy to miss. Consider moving it to the top of the file alongside the other `use` statements.
```suggestion
use rayon::prelude::*;
// Run expensive signature verification in parallel
verification_inputs
```
How can I resolve this? If you propose a fix, please make it concise.Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
🤖 Codex Code Review
No fork-choice, justification/finalization, STF, or SSZ logic changed in this PR; the risk is isolated to XMSS aggregate verification behavior. I couldn’t run Automated review by OpenAI Codex · gpt-5.4 · custom prompt |
Also bump rayon to 1.11.
Switch from par_iter() + public_keys.clone() to into_par_iter() which moves ownership of each tuple into the parallel closure, eliminating the per-attestation Vec<ValidatorPublicKey> clone.
Summary
par_iterTest plan
make fmtpassesmake lintpassesmake testpasses (all 101 tests, including signature spec tests)