fix: self-deliver aggregator attestations to gossip_signatures#270
pablodeymo merged 5 commits into `main`
Conversation
Gossipsub does not deliver messages back to the sender. The aggregator subscribes to the attestation subnet and publishes via mesh, but in small networks no other node subscribes, so the publish fails with NoPeersSubscribedToTopic and the aggregator's own attestation never enters gossip_signatures. This means it is excluded from aggregated proofs in blocks. Fix by calling on_gossip_attestation() locally before publishing to gossip, matching the reference spec behavior. Also suppress the expected NoPeersSubscribedToTopic warning for the aggregator's attestation publish using a best-effort publish mode.
🤖 Claude Code Review

**Review: fix: self-deliver aggregator attestations to gossip_signatures**

**Summary:** This fix correctly addresses a consensus-critical bug where the aggregator's own validator signature was excluded from aggregated proofs. The approach matches the reference Python spec. The implementation is clean and well-reasoned overall.

**Correctness**

- Self-delivery ordering: the attestation is delivered locally before it is published.
- Idempotency on re-delivery.
- Full validation preserved: the self-delivery path goes through the same `on_gossip_attestation()` entry point as network-delivered attestations.

**Concerns**
```rust
let result = swarm.behaviour_mut().gossipsub.publish(topic, data);
if !best_effort {
    let _ = result.inspect_err(|err| warn!(%err, "Swarm adapter: publish failed"));
}
```

The PR's intent is to suppress the expected `NoPeersSubscribedToTopic` error, but in best-effort mode this drops every publish error. A narrower filter would be:

```rust
if let Err(err) = result {
    if !best_effort || !matches!(err, PublishError::NoPeersSubscribedToTopic) {
        warn!(%err, "Swarm adapter: publish failed");
    }
}
```

This keeps the rationale precise and avoids silently masking real problems in future.
Minor points
**Overall:** The fix is correct and necessary for consensus progress in small devnets. The self-delivery approach is the right solution for the gossipsub limitation. The main actionable concern is the overly broad error suppression in best-effort publish mode.

*Automated review by Claude (Anthropic) · sonnet · custom prompt*
🤖 Codex Code Review
No other correctness or consensus regressions stood out from the diff. The self-delivery change in lib.rs is consistent with the current aggregation flow and should fix the local-validator exclusion problem without affecting fork-choice accounting. I couldn't run the Rust test suites in this environment.

*Automated review by OpenAI Codex · gpt-5.4 · custom prompt*
🤖 Kimi Code Review

**Overall Assessment:** This PR correctly addresses a critical liveness issue where aggregators were excluding their own attestations from aggregated proofs due to gossipsub not echoing messages back to the sender. The implementation is clean.

**Issues and Concerns:**
Code Quality Suggestions:
Security Considerations:
Positive Notes:
**Recommendation:** Approve with minor fixes for error handling severity and idempotency documentation.

*Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt*
**Greptile Summary**

This PR fixes a critical small-devnet finalization bug: the aggregator now self-delivers its own attestation via `on_gossip_attestation()` before publishing to gossip.
**Confidence Score: 3/5**

Not safe to merge as-is: the BTreeMap→Vec change likely breaks XMSS proof verification in any multi-validator scenario where the aggregator's validator_id is not the lowest among attestors. One P1 finding: removing the BTreeMap ordering guarantee for gossip_signatures silently breaks the producer/verifier pubkey-order contract that was explicitly documented in the removed comment. The self-delivery fix itself is correct and valuable, but the Vec change introduces a regression that would surface whenever more than one attestation signature is aggregated and the aggregator is not the lowest-id validator.

`crates/storage/src/store.rs`: the `GossipDataEntry.signatures` Vec needs a sort step in `iter_gossip_signatures` or `insert_gossip_signature` to restore ascending validator_id order.
| Filename | Overview |
|---|---|
| crates/storage/src/store.rs | Changed GossipDataEntry.signatures from BTreeMap (ascending-ordered) to Vec (insertion-ordered); breaks XMSS aggregation ordering requirement in multi-validator scenarios where the self-delivering aggregator has a non-lowest validator_id. |
| crates/blockchain/src/lib.rs | Adds aggregator self-delivery via on_gossip_attestation before publishing; also simplifies proposer attestation to use pre-import store state instead of post-STF PostBlockCheckpoints — may produce stale source checkpoint when block advances justification. |
| crates/blockchain/src/store.rs | Removes PostBlockCheckpoints and get_attestation_target_with_checkpoints; simplifies produce_block_with_signatures return type; logic change is a cleanup but removes a correctness guard for proposer attestation source. |
| crates/net/p2p/src/gossipsub/handler.rs | Aggregator now uses publish_best_effort for attestation subnet, suppressing expected NoPeersSubscribedToTopic error; non-aggregator path unchanged. |
| crates/net/p2p/src/lib.rs | Threads is_aggregator flag from SwarmConfig through BuiltSwarm into P2PServer cleanly; no logic concerns. |
| crates/net/p2p/src/swarm_adapter.rs | Adds best_effort flag to SwarmCommand::Publish; silently swallows all publish errors when set (not just NoPeersSubscribedToTopic), reducing observability for unexpected failures. |
**Sequence Diagram**

```mermaid
sequenceDiagram
    participant BC as BlockChain (aggregator)
    participant Store as Store (gossip_signatures)
    participant P2P as P2PServer
    participant Swarm as Swarm (gossipsub)
    participant Peers as Other Nodes
    BC->>BC: produce_attestation(slot, validator_id)
    BC->>BC: sign_attestation → signed_attestation
    Note over BC: NEW: self-deliver before publish
    BC->>Store: on_gossip_attestation(signed_attestation)
    Store->>Store: insert_gossip_signature (Vec, insertion order)
    BC->>P2P: publish_attestation(signed_attestation)
    P2P->>Swarm: publish_best_effort(topic, data)
    Swarm-->>Peers: gossip (may fail: NoPeersSubscribedToTopic, silently ignored)
    Peers-->>Store: insert_gossip_signature (other validators, appended)
    Note over Store: Vec order: [aggregator_id, peer0, peer1, ...]
    Note over Store: Verifier expects ascending order [0, 1, 2, ...]
    BC->>Store: aggregate_committee_signatures (interval 2)
    Store->>Store: iter_gossip_signatures → snapshot in Vec order
    Store->>Store: aggregate_signatures(pubkeys, sigs) → proof
    Note over Store: proof built with non-ascending pubkey order
    Store->>Store: verify_signatures → pubkeys from bitfield (ascending)
    Note over Store: mismatch → verification fails if order differs
```
Comments Outside Diff (2)
-
**crates/storage/src/store.rs, line 993-1007: Vec ordering breaks XMSS aggregation for multi-validator devnets**

The `BTreeMap` that was replaced here guaranteed ascending `validator_id` order on iteration. The original code included an explicit comment explaining that XMSS aggregate proofs are order-dependent: verification reconstructs pubkeys from the participation bitfield in low-to-high index order, so aggregation must receive pubkeys in the same ascending order. The `Vec` does not sort on insertion, so when the aggregator self-delivers its attestation first (before network arrivals), its `validator_id` will appear at position 0 in the `Vec` regardless of its numeric value.

In `aggregate_committee_signatures`, `pubkeys` and `sigs` are built by iterating the snapshot in Vec order. In `verify_signatures`, `validator_indices(&attestation.aggregation_bits)` reconstructs pubkeys in ascending bitfield order (0 … max). If the aggregator is validator 3 and delivers first, aggregation uses order `[3, 0, 1, 2]` but verification expects `[0, 1, 2, 3]`; the proof will fail every block with multiple attestors.

A minimal fix is to sort entries by `validator_id` when iterating the snapshot:

```rust
// In iter_gossip_signatures, before mapping:
let mut sigs: Vec<_> = entry.signatures.iter()
    .map(|e| (e.validator_id, e.signature.clone()))
    .collect();
sigs.sort_unstable_by_key(|(vid, _)| *vid);
```

Or sort in `insert_gossip_signature` using `entry.signatures.sort_unstable_by_key(|e| e.validator_id)` after each push.
-
**crates/blockchain/src/lib.rs, line 234-245: Proposer attestation now uses pre-import justification**

The removed `PostBlockCheckpoints` struct captured `post_state.latest_justified` and `post_state.latest_finalized` from inside `build_block`, so the proposer's attestation source reflected the checkpoints after the block's attestations were processed. Now `self.store.latest_justified()` is read from the store before the new block is imported, so if the block's packed attestations cross the 2/3 threshold and advance justification, the proposer's own attestation will carry a stale source checkpoint.

Is this change intentional? The original comment ("the block's attestations may have advanced justification/finalization but the block hasn't been imported into the store yet") described an active correctness concern, not just a caveat.
**crates/net/p2p/src/swarm_adapter.rs, line 143-157: `best_effort` silences all errors, not just `NoPeersSubscribedToTopic`**

When `best_effort: true` the entire publish result is dropped silently. `NoPeersSubscribedToTopic` is expected for the aggregator, but other errors (e.g., message-too-large, duplicate `MessageId`) would also be swallowed with no trace. Consider logging at `debug!` or `trace!` even in best-effort mode so unexpected failures remain observable:
```rust
SwarmCommand::Publish {
topic,
data,
best_effort,
} => {
let result = swarm.behaviour_mut().gossipsub.publish(topic, data);
if best_effort {
let _ = result.inspect_err(|err| {
tracing::debug!(%err, "Swarm adapter: best-effort publish failed (ignored)")
});
} else {
let _ = result.inspect_err(|err| warn!(%err, "Swarm adapter: publish failed"));
}
}
```
Reviews (1): Last reviewed commit: "fix: self-deliver aggregator attestation..."
pablodeymo
left a comment
Style nit (non-blocking): the `if let Err` pattern here works, but the codebase convention prefers `inspect_err` for side-effect-only error handling. One option:

```rust
let result = swarm.behaviour_mut().gossipsub.publish(topic, data);
if let Err(err) = &result {
    if !(ignore_no_peers && matches!(err, PublishError::NoPeersSubscribedToTopic)) {
        warn!(%err, "Swarm adapter: publish failed");
    }
}
```

Purely cosmetic — current version reads fine too.
```rust
// this the aggregator never sees its own validator's signature in
// gossip_signatures and it is excluded from aggregated proofs.
if self.is_aggregator {
    let _ = store::on_gossip_attestation(&mut self.store, signed_attestation.clone())
```
can we avoid this clone? we can do this in another PR
Summary

- Call `on_gossip_attestation()` locally before publishing to gossip, ensuring the aggregator's own validator's attestation enters `gossip_signatures` and is included in aggregated proofs
- Use `publish_best_effort()` to suppress the expected `NoPeersSubscribedToTopic` warning when the aggregator publishes to attestation subnets it subscribes to
- Thread `is_aggregator` from `SwarmConfig` through `BuiltSwarm` into `P2PServer` so the gossipsub handler can use best-effort publishing

Context
Gossipsub does not deliver messages back to the sender. The aggregator subscribes to the attestation subnet and publishes via mesh, but in small devnets no other node subscribes to attestation subnets. This causes the aggregator's own attestation publish to fail silently with `NoPeersSubscribedToTopic`, meaning its vote never enters `gossip_signatures`, is never aggregated, and is never included in blocks.

The reference spec handles this explicitly in `_produce_attestations()`.
_produce_attestations():Test plan
cargo buildpassescargo clippy --workspace -- -D warningspassescargo test --workspace --release --libpasses (all 32 unit tests)NoPeersSubscribedToTopicwarnings on the aggregator node