HYPERFLEET-733 - fix: sentinel delays first event publish for newly created clusters#77
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
WalkthroughThis change introduces a prioritized three-part decision strategy in the reconciliation engine. It adds a highest-priority "never-processed" check that immediately publishes when Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
internal/engine/decision_test.go (1)
430-447: Add one precedence test where both triggers are true.Please add a case with
lastUpdated = time.Time{}andgeneration > observedGenerationto lock in that never-processed has priority over generation-based reconciliation.Suggested table case
+ { + name: "never processed takes priority over generation mismatch", + createdTime: now.Add(-1 * time.Hour), + lastUpdated: time.Time{}, + ready: true, + wantShouldPublish: true, + wantReasonContains: ReasonNeverProcessed, + description: "When both conditions match, never-processed should win by priority", + },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/engine/decision_test.go` around lines 430 - 447, Add a new test case in the decision_test.go table (the slice of test cases used by the decision evaluation) named e.g. "never processed takes precedence over generation" where lastUpdated is time.Time{} (zero), generation is greater than observedGeneration, ready can be true or false, wantShouldPublish is true and wantReasonContains is ReasonNeverProcessed; this ensures the never-processed branch (lastUpdated == zero) is preferred over a generation-based reconciliation trigger by asserting the decision returns publish=true and the reason contains ReasonNeverProcessed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/sentinel-operator-guide.md`:
- Around line 84-89: Update the intro paragraph that currently calls the model a
“dual-trigger” system to match the three-part reconciliation strategy described
in the section; in the "Sentinel's decision engine" intro replace "dual-trigger"
and the two-trigger list with wording that states a three-part decision strategy
and enumerates the three triggers: Never-Processed Reconciliation, State-Based
Reconciliation, and Time-Based Reconciliation so the intro and the detailed list
are consistent.
---
Nitpick comments:
In `@internal/engine/decision_test.go`:
- Around line 430-447: Add a new test case in the decision_test.go table (the
slice of test cases used by the decision evaluation) named e.g. "never processed
takes precedence over generation" where lastUpdated is time.Time{} (zero),
generation is greater than observedGeneration, ready can be true or false,
wantShouldPublish is true and wantReasonContains is ReasonNeverProcessed; this
ensures the never-processed branch (lastUpdated == zero) is preferred over a
generation-based reconciliation trigger by asserting the decision returns
publish=true and the reason contains ReasonNeverProcessed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a04a5f87-b88b-4286-9da0-646ffa09feee
📒 Files selected for processing (3)
docs/sentinel-operator-guide.mdinternal/engine/decision.gointernal/engine/decision_test.go
b5acc74 to
baafeb8
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/sentinel-operator-guide.md`:
- Line 104: The docs use mixed casing for the same status field—replace all
occurrences of status.LastUpdated and any "LastUpdated" in mermaid diagrams with
snake_case status.last_updated to match the API/CEL conventions (e.g.,
created_time) and update the CEL variables table and diagram labels accordingly
so the field name is consistently last_updated throughout the document.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: eaafd01f-8926-419c-9318-aca242aba0d6
📒 Files selected for processing (3)
docs/sentinel-operator-guide.mdinternal/engine/decision.gointernal/engine/decision_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/engine/decision.go
- internal/engine/decision_test.go
baafeb8 to
d4ac25e
Compare
xueli181114
left a comment
There was a problem hiding this comment.
Clean, well-structured change. The three-part decision priority (never-processed → generation → max-age) is sound and the code reads clearly.
What's good:
- The
LastUpdated.IsZero()check before the generation check is the right call — it eliminates thecreated_timefallback which was a leaky abstraction (mixing API creation metadata with adapter processing state) - Removing the
referenceTimefallback toCreatedTimesimplifies the max-age path and eliminates a subtle bug where newly created resources withReady=truewould wait 30m instead of publishing immediately - Precedence test (
TestDecisionEngine_Evaluate_NeverProcessedPrecedence) locks in the ordering — good defensive testing - Doc updates are thorough and the mermaid diagram is clearer than the previous split diagrams
Minor observations (non-blocking):
-
decision.go:95— ThereferenceTimevariable is now redundant. Since theIsZero()early return guaranteesLastUpdatedis non-zero at this point, you could useresource.Status.LastUpdateddirectly in theAdd(maxAge)call and drop the local variable. Not worth a respin, but worth noting. -
The
CreatedTimefield onResourceis now unused by the decision engine. If no other code path references it for decision-making, consider documenting thatCreatedTimeis purely informational (not used in reconciliation logic) to prevent future developers from re-introducing the fallback pattern.
/lgtm
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ciaranRoche, xueli181114 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
a86043c
into
openshift-hyperfleet:main
Summary
Adds a new first-priority reconciliation check that immediately publishes events for resources that have never been processed by an adapter (status.LastUpdated is zero). This eliminates first-event latency where new clusters would wait for max age intervals (10s for not-ready) instead of publishing within one poll interval (~5s).
Changes
Testing
make test-all0 issue.Summary by CodeRabbit
New Features
Improvements
Documentation
Tests