Skip to content

fix(consensus): Inconsistent Safe And Finalized Heads#2163

Open
refcell wants to merge 1 commit intomainfrom
fix/follow-delegated-safe-finalized
Open

fix(consensus): Inconsistent Safe And Finalized Heads#2163
refcell wants to merge 1 commit intomainfrom
fix/follow-delegated-safe-finalized

Conversation

@refcell
Copy link
Copy Markdown
Contributor

@refcell refcell commented Apr 14, 2026

Summary

This change fixes a follow-node-only inconsistency between delegated safe and finalized updates. Before this patch, the L2 delegate actor sent safe and finalized signals as separate fire-and-forget requests, so a restart could leave the engine on a conservative safe head while the delegate path still enqueued a newer finalized target. If the safe forkchoice update returned Syncing, the engine's actual safe head did not advance, but the separate finalize request could still arrive and trip the invariant that finalized blocks must already be safe.

The fix adds a dedicated delegated forkchoice path for follow mode that applies delegated safe and finalized labels together inside the engine. The new task consolidates the delegated safe head first and then derives the finalized target from the engine's real post-consolidation safe head, which prevents follow nodes from finalizing past what the local EL has actually accepted while leaving validator and sequencer behavior unchanged. The PR also adds engine and service regressions covering the restart path and the case where delegated safe data is unavailable.

@refcell refcell added bug Flag: Something isn't working A-consensus labels Apr 14, 2026
@refcell refcell self-assigned this Apr 14, 2026
@cb-heimdall
Copy link
Copy Markdown
Collaborator

cb-heimdall commented Apr 14, 2026

🟡 Heimdall Review Status

Requirement Status More Info
Reviews 🟡 0/1
Denominator calculation
Show calculation
1 if user is bot 0
1 if user is external 0
2 if repo is sensitive 0
From .codeflow.yml 1
Additional review requirements
Show calculation
Max 0
0
From CODEOWNERS 0
Global minimum 0
Max 1
1
1 if commit is unverified 0
Sum 1

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
base Ignored Ignored Preview Apr 14, 2026 4:20pm

Request Review

@refcell refcell mentioned this pull request Apr 14, 2026
@refcell refcell force-pushed the fix/follow-delegated-safe-finalized branch from 43fbd1d to e5de3f7 Compare April 14, 2026 03:16
@refcell refcell requested a review from meyer9 April 14, 2026 03:16
}

// Consolidate and delegated forkchoice share equal priority.
(Self::Consolidate(_) | Self::DelegatedForkchoice(_), _) => Ordering::Equal,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PartialEq and Ord now disagree for Consolidate vs DelegatedForkchoice: eq() returns false (only same-variant pairs match at line 172-173), but cmp() returns Ordering::Equal here. Rust's Ord contract requires a == b iff cmp(a, b) == Equal.

This extends a pre-existing inconsistency (Seal vs GetPayload has the same issue), and since the only consumer is a BinaryHeap (which only calls cmp), there's no practical bug today. But if EngineTask were ever placed in a BTreeSet, used with .dedup() after sorting, or compared via == in code that assumes Ord-consistency, it would silently misbehave.

Non-blocking, but worth a tracking issue. The simplest fix: fn eq(&self, other: &Self) -> bool { self.cmp(other) == Ordering::Equal }.

@meyer9
Copy link
Copy Markdown
Contributor

meyer9 commented Apr 14, 2026

LGTM! Will approve once tested on our nodes

@refcell refcell force-pushed the fix/follow-delegated-safe-finalized branch from 483b612 to ae567e6 Compare April 14, 2026 16:20
@github-actions
Copy link
Copy Markdown
Contributor

Review Summary

Overall: Looks good. The fix correctly addresses the root cause — safe and finalized signals were sent independently, allowing finalized to race ahead of a safe head that the EL hadn't actually accepted yet.

What the PR does well

  • Atomic delegated labels: DelegatedForkchoiceTask::execute consolidates safe first, reads back the actual safe head from engine state, then clamps finalized to remote_finalized.min(actual_safe). This closes the invariant violation.
  • Clean separation: The new task type is scoped to follow-mode only; validator/sequencer paths remain untouched.
  • Good test coverage: Both the engine-level unit test (syncing_safe_update_skips_finalization_beyond_actual_safe) and the integration test (follow_restart_delegated_forkchoice_does_not_finalize_past_actual_safe_head) exercise the exact restart scenario described in the PR body. The safe_payload_unavailable test covers the graceful degradation path.
  • Removal of engine_head: Eliminating the stale engine_head tracking in DelegateL2DerivationActor and clamping to sent_head instead is the right call — it avoids the fresh-node genesis-pinning problem explained in the comments.

Notes on prior inline comments

  • The metrics label comment was already addressed — task_metrics_label() returns DELEGATED_FORKCHOICE_TASK_LABEL.
  • The PartialEq/Ord inconsistency comment identifies a pre-existing issue (also affects Seal vs GetPayload). The new variant follows the same pattern. Non-blocking, but a tracking issue would be worthwhile.

No new findings.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Flag: Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants