Advance XMSS key preparation window before signing by pablodeymo · Pull Request #261 · lambdaclass/ethlambda

pablodeymo · 2026-04-07T21:55:27Z

Summary

Expose is_prepared_for(slot) and advance_preparation() on ValidatorSecretKey, delegating to the leansig SignatureSchemeSecretKey trait
Before signing in KeyManager::sign_message(), check if the target slot is within the prepared window and advance if needed
Return a descriptive error if the key's activation interval is fully exhausted
Add a timing test for advance_preparation() (526ms in release mode on Apple Silicon)

Root Cause

XMSS keys use a Top-Bottom Tree Traversal scheme where only two consecutive bottom trees are loaded in memory at any time. Each bottom tree covers sqrt(LIFETIME) = 2^16 = 65,536 slots, so the prepared window spans 131,072 slots (~6 days at 4s/slot).

The leansig library provides advance_preparation() to slide this window forward by computing the next bottom tree, but ethlambda's KeyManager never called it. When the devnet at admin@ethlambda-1 reached slot 131,072, all 4 nodes panicked simultaneously:

Signing: key not yet prepared for this epoch, try calling sk.advance_preparation.

The fix checks the prepared interval before every sign operation and advances the window on demand. This is a lazy approach — advance_preparation is called at signing time rather than proactively in the tick loop — because:

It happens once every ~3 days (65,536 slots)
The computation (one bottom tree of hash leaves) takes ~526ms in release mode
It keeps the change minimal and avoids tick-loop complexity

Test plan

make fmt clean
make lint clean
make test passes (existing tests use small lifetimes or skip verification)
test_advance_preparation_duration passes (cargo test -p ethlambda-types test_advance_preparation_duration --release -- --ignored --nocapture)
Deploy to devnet with fresh genesis and verify it runs past slot 131,072 without panic

…t bottom-tree boundaries The XMSS signing key maintains a sliding window of 2 bottom trees (131,072 slots). When the window is exhausted, advance_preparation() must be called to compute the next bottom tree. Without this, all nodes panic at the window boundary.

github-actions · 2026-04-07T21:57:12Z

🤖 Codex Code Review

crates/blockchain/src/key_manager.rs:108 The new “advance preparation” loop is functionally a single-step check, not a real loop. After one advance_preparation(), line 112 immediately returns an exhaustion error if the slot is still outside the prepared interval, so slots that are more than one bottom-tree ahead can never recover. Example: if a validator is offline long enough to skip more than 65,536 slots, signing will fail even though repeated advances should make the target slot signable. This is a consensus-liveness bug for local validator duties. The fix is to keep advancing until either is_prepared_for(slot) becomes true or you can positively detect that advance_preparation() no longer changes the interval.
crates/blockchain/src/key_manager.rs:113 The new error text says the slot is “beyond the key’s activation interval”, but this branch is also reached whenever the slot is merely still outside the prepared window after the first advance. With the current control flow, that message can be false and will mislead debugging. If the code keeps this API shape, it should distinguish “needs more window advances” from true key exhaustion.

No other issues stood out in the diff itself. I could not fully verify with cargo test because dependency fetching is blocked in this sandbox.

Automated review by OpenAI Codex · gpt-5.4 · custom prompt

greptile-apps · 2026-04-07T21:58:06Z

Greptile Summary

This PR fixes a production panic that occurred at slot 131,072 by checking whether the XMSS key's prepared window covers the target slot before signing, and calling advance_preparation() if it doesn't. The fix is minimal and well-targeted: two new methods are exposed on ValidatorSecretKey in signature.rs, and KeyManager::sign_message uses them before invoking the signer.

Confidence Score: 4/5

Safe to merge for the common case; one logic concern in the advancement loop is worth addressing for robustness.

The while loop contains an inner if !is_prepared_for guard that returns an error immediately after the first advance, making the loop semantically equivalent to a single-shot if. In normal operation (slots advancing monotonically, at most one window behind) this is fine and the PR fully solves the reported panic. The concern only surfaces if a node is offline for more than ~131k slots (~6 days), where two or more advances would be needed but the current code would incorrectly report key exhaustion.

crates/blockchain/src/key_manager.rs — the while loop in the preparation window advancement logic.

Important Files Changed

Filename	Overview
crates/blockchain/src/key_manager.rs	Adds XMSS key preparation window advancement before signing; the while loop logic has a subtle issue where it only ever advances once before erroring, rather than iterating until prepared or truly exhausted.
crates/common/types/src/signature.rs	Exposes `is_prepared_for(slot)` and `advance_preparation()` on `ValidatorSecretKey` by delegating to the leansig `SignatureSchemeSecretKey` trait; straightforward and correct.

Sequence Diagram

sequenceDiagram
    participant BC as BlockChain
    participant KM as KeyManager
    participant VSK as ValidatorSecretKey
    participant LS as leansig (inner)

    BC->>KM: sign_attestation(validator_id, attestation_data)
    KM->>KM: slot = attestation_data.slot as u32
    KM->>KM: sign_message(validator_id, slot, message_hash)
    KM->>VSK: is_prepared_for(slot)
    VSK->>LS: get_prepared_interval().contains(slot as u64)
    LS-->>VSK: bool
    VSK-->>KM: false (slot outside window)
    note over KM: log info — advancing window
    KM->>VSK: advance_preparation()
    VSK->>LS: advance_preparation() — slides window by 65,536 slots
    KM->>VSK: is_prepared_for(slot)
    VSK-->>KM: true
    KM->>VSK: sign(slot, message)
    VSK->>LS: LeanSignatureScheme::sign(&inner, slot, message)
    LS-->>VSK: ValidatorSignature
    VSK-->>KM: Ok(signature)
    KM-->>BC: Ok(XmssSignature)

Prompt To Fix All With AI

This is a comment left during a code review.
Path: crates/blockchain/src/key_manager.rs
Line: 110-118

Comment:
**`while` loop only ever advances once**

The inner `if !is_prepared_for` guard returns an error immediately after the first `advance_preparation()` call, so the `while` never actually iterates a second time. If the slot happens to be two or more windows ahead of the current prepared interval (e.g., after an extended outage), the key would be incorrectly reported as exhausted even though another advance would bring it in range.

To detect true exhaustion (where `advance_preparation()` is a no-op), capture the interval before and after and compare:

```rust
while !secret_key.is_prepared_for(slot) {
    let before = secret_key.get_prepared_interval();
    secret_key.advance_preparation();
    if secret_key.get_prepared_interval() == before {
        // No-op: key activation interval is fully exhausted
        return Err(KeyManagerError::SigningError(format!(
            "XMSS key exhausted for validator {validator_id}: \
             slot {slot} is beyond the key's activation interval"
        )));
    }
}
```

This requires exposing `get_prepared_interval()` on `ValidatorSecretKey`, but correctly distinguishes "key exhausted" from "slot is multiple windows ahead".

How can I resolve this? If you propose a fix, please make it concise.

_{Reviews (1): Last reviewed commit: "Advance XMSS key preparation window befo..." | Re-trigger Greptile}

github-actions · 2026-04-07T21:58:09Z

🤖 Claude Code Review

Now I have everything I need to write the review.

Code Review: Advance XMSS key preparation window before signing

Summary: The PR fixes a real production panic and the overall approach is sound. There's one logic bug in the loop structure and a misleading error message that should be addressed.

Bug: `while` loop is effectively single-iteration (`key_manager.rs:110–118`)

while !secret_key.is_prepared_for(slot) {
    secret_key.advance_preparation();
    if !secret_key.is_prepared_for(slot) {
        return Err(KeyManagerError::SigningError(format!(
            "XMSS key exhausted for validator {validator_id}: \
             slot {slot} is beyond the key's activation interval"
        )));
    }
}

The while suggests it can loop multiple times, but it cannot. After each advance_preparation() call, if the slot is still not covered, the function immediately returns an error—the while condition is never re-evaluated for a second iteration. The structure is equivalent to a plain if:

secret_key.advance_preparation();
if !secret_key.is_prepared_for(slot) { return Err(...); }

When does this matter? Each advance_preparation() slides the window by one bottom tree (65,536 slots). If a node restarts with an un-advanced key at a high-numbered slot—e.g., slot 400,000 with window [0, 131,071]—five advances are required. The current code bails out after the first failed advance and reports "key exhausted" for a key that is not actually exhausted.

A concrete failure case:

Window: [0, 131,071], target slot: 200,000
One advance: [65,536, 196,607] — slot 200,000 still not covered → premature error
Two advances: [131,072, 262,143] — slot 200,000 covered ✓

A correct implementation loops until prepared or until advance_preparation is provably a no-op. One clean approach—add get_prepared_interval() to ValidatorSecretKey and detect stall:

if !secret_key.is_prepared_for(slot) {
    info!(validator_id, slot, "Advancing XMSS key preparation window");
    loop {
        let prev_end = secret_key.prepared_interval_end(); // new helper
        secret_key.advance_preparation();
        if secret_key.is_prepared_for(slot) {
            break;
        }
        if secret_key.prepared_interval_end() == prev_end {
            // advance_preparation was a no-op; key truly exhausted
            return Err(KeyManagerError::SigningError(format!(
                "XMSS key exhausted for validator {validator_id}: \
                 slot {slot} is beyond the key's activation interval"
            )));
        }
    }
}

Alternatively, a bounded loop with a generous MAX_ADVANCES constant (e.g., 64) keeps it simple without needing to expose the interval:

for _ in 0..MAX_ADVANCES {
    if secret_key.is_prepared_for(slot) { break; }
    secret_key.advance_preparation();
}
if !secret_key.is_prepared_for(slot) {
    return Err(...);
}

Minor: Misleading error text

The error message "slot {slot} is beyond the key's activation interval" can fire (due to the single-advance logic above) for slots that are within the activation interval but just need more than one window advance. The message would mislead operators into thinking the key is permanently dead when they may simply need to advance it further. The message should be revised once the loop logic is fixed.

Rest of the PR

signature.rs: The two new delegation methods (is_prepared_for, advance_preparation) are clean and correctly placed. The cast slot as u64 in is_prepared_for is fine (u32 max ≈ 170 years at 4s/slot).
key_manager.rs: The tracing::info! log (line 109) fires once per advance event outside the inner loop, which is the right level of verbosity.
The lazy-advance approach (vs. proactive tick-loop advancing) is a reasonable trade-off as explained in the PR description.
sign_attestation is the only public entry point and routes through sign_message, so the fix covers all signing paths.

Automated review by Claude (Anthropic) · sonnet · custom prompt

greptile-apps · 2026-04-07T21:58:09Z

+            while !secret_key.is_prepared_for(slot) {
+                secret_key.advance_preparation();
+                if !secret_key.is_prepared_for(slot) {
+                    return Err(KeyManagerError::SigningError(format!(
+                        "XMSS key exhausted for validator {validator_id}: \
+                         slot {slot} is beyond the key's activation interval"
+                    )));
+                }
+            }


while loop only ever advances once

The inner if !is_prepared_for guard returns an error immediately after the first advance_preparation() call, so the while never actually iterates a second time. If the slot happens to be two or more windows ahead of the current prepared interval (e.g., after an extended outage), the key would be incorrectly reported as exhausted even though another advance would bring it in range.

To detect true exhaustion (where advance_preparation() is a no-op), capture the interval before and after and compare:

while !secret_key.is_prepared_for(slot) { let before = secret_key.get_prepared_interval(); secret_key.advance_preparation(); if secret_key.get_prepared_interval() == before { // No-op: key activation interval is fully exhausted return Err(KeyManagerError::SigningError(format!( "XMSS key exhausted for validator {validator_id}: \ slot {slot} is beyond the key's activation interval" ))); } }

This requires exposing get_prepared_interval() on ValidatorSecretKey, but correctly distinguishes "key exhausted" from "slot is multiple windows ahead".

Prompt To Fix With AI

This is a comment left during a code review. Path: crates/blockchain/src/key_manager.rs Line: 110-118 Comment: **`while` loop only ever advances once** The inner `if !is_prepared_for` guard returns an error immediately after the first `advance_preparation()` call, so the `while` never actually iterates a second time. If the slot happens to be two or more windows ahead of the current prepared interval (e.g., after an extended outage), the key would be incorrectly reported as exhausted even though another advance would bring it in range. To detect true exhaustion (where `advance_preparation()` is a no-op), capture the interval before and after and compare: ```rust while !secret_key.is_prepared_for(slot) { let before = secret_key.get_prepared_interval(); secret_key.advance_preparation(); if secret_key.get_prepared_interval() == before { // No-op: key activation interval is fully exhausted return Err(KeyManagerError::SigningError(format!( "XMSS key exhausted for validator {validator_id}: \ slot {slot} is beyond the key's activation interval" ))); } } ``` This requires exposing `get_prepared_interval()` on `ValidatorSecretKey`, but correctly distinguishes "key exhausted" from "slot is multiple windows ahead". How can I resolve this? If you propose a fix, please make it concise.

Generates a production-scheme XMSS key with 3 bottom trees and measures the time to slide the preparation window forward. Marked #[ignore] since key generation takes ~1.6s even in release mode.

MegaRedHand

The while loop is redundant

MegaRedHand · 2026-04-07T22:09:08Z


+        // Advance XMSS key preparation window if the slot is outside the current window.
+        // Each bottom tree covers 65,536 slots; the window holds 2 at a time.
+        if !secret_key.is_prepared_for(slot) {


I opened #262 to tackle this later

Got it, thanks for tracking it.

…tages The previous loop returned an error after a single failed advance, but a node that was offline for more than one window (65,536 slots) needs multiple advances. Now detects true key exhaustion by comparing the prepared interval before and after advance_preparation().

greptile-apps bot reviewed Apr 7, 2026

View reviewed changes

Add test for advance_preparation duration (526ms in release mode)

e1e2583

Generates a production-scheme XMSS key with 3 bottom trees and measures the time to slide the preparation window forward. Marked #[ignore] since key generation takes ~1.6s even in release mode.

MegaRedHand mentioned this pull request Apr 7, 2026

Advance XMSS preparation window in the background #262

Open

MegaRedHand requested changes Apr 7, 2026

View reviewed changes

MegaRedHand approved these changes Apr 8, 2026

View reviewed changes

MegaRedHand merged commit 172d9b0 into main Apr 8, 2026
3 checks passed

MegaRedHand deleted the xmss-advance-key-preparation branch April 8, 2026 18:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Advance XMSS key preparation window before signing#261

Advance XMSS key preparation window before signing#261
MegaRedHand merged 3 commits intomainfrom
xmss-advance-key-preparation

pablodeymo commented Apr 7, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Apr 7, 2026

Uh oh!

greptile-apps bot commented Apr 7, 2026

Important Files Changed

Uh oh!

github-actions bot commented Apr 7, 2026

Uh oh!

greptile-apps bot Apr 7, 2026

Uh oh!

MegaRedHand left a comment

Uh oh!

MegaRedHand Apr 7, 2026 •

edited

Loading

Uh oh!

pablodeymo Apr 8, 2026

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

pablodeymo commented Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Root Cause

Test plan

Uh oh!

github-actions bot commented Apr 7, 2026

🤖 Codex Code Review

Uh oh!

greptile-apps bot commented Apr 7, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

github-actions bot commented Apr 7, 2026

🤖 Claude Code Review

Code Review: Advance XMSS key preparation window before signing

Bug: while loop is effectively single-iteration (key_manager.rs:110–118)

Minor: Misleading error text

Rest of the PR

Uh oh!

greptile-apps bot Apr 7, 2026

Choose a reason for hiding this comment

Uh oh!

MegaRedHand left a comment

Choose a reason for hiding this comment

Uh oh!

MegaRedHand Apr 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pablodeymo Apr 8, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

pablodeymo commented Apr 7, 2026 •

edited

Loading

Bug: `while` loop is effectively single-iteration (`key_manager.rs:110–118`)

MegaRedHand Apr 7, 2026 •

edited

Loading