Skip to content

feat: add revision result summary utility#393

Merged
eqrx merged 1 commit into
package-operator:mainfrom
perdasilva:result-summary
Dec 14, 2025
Merged

feat: add revision result summary utility#393
eqrx merged 1 commit into
package-operator:mainfrom
perdasilva:result-summary

Conversation

@perdasilva
Copy link
Copy Markdown
Contributor

@perdasilva perdasilva commented Dec 5, 2025

Summary

⚠️ Brings the changes proposed in operator-framework/operator-controller#2354 to this project

This effort came from @camilamacedo86. I'm just shepherding it over into the Boxcutter library with some refactoring for the changes in the probe interface (I've added the probe unknown case as well).

This PR introduces utility functions to format boxcutter reconciliation and teardown reports into concise, human-readable summaries. This makes debugging easier without enabling verbose V(1) logging and solves the scenarios reported over huge logs

Example Scenarios

Scenario 1: Object Collision During Reconcile

What happens: A ServiceAccount already exists and is managed by a different controller.

Controller Logs:

{
  "level": "error",
  "ts": "2025-11-20T14:32:18.445Z",
  "logger": "cluster-extension-revision",
  "msg": "revision reconcile failed",
  "error": "apply failed: server rejected request",
  "summary": "phases with issues: deploy: 2 collision(s): ServiceAccount test-ns/argocd-operator-controller-manager, ServiceAccount test-ns/argocd-operator-metrics-reader; status: incomplete",
  "controller": "clusterextensionrevision",
  "controllerGroup": "olm.operatorframework.io",
  "controllerKind": "ClusterExtensionRevision",
  "ClusterExtensionRevision": {
    "name": "argocd-operator-1"
  },
  "namespace": "",
  "name": "argocd-operator-1",
  "reconcileID": "8f7e3d9a-b2c1-4e5f-a6d7-1c8e9f0a2b3d"
}

Key Information:

  • error: The original error from boxcutter
  • summary: Human-readable explanation showing 2 colliding ServiceAccounts
  • Easy to identify what's blocking the installation

Scenario 2: Multiple Collisions (Shows First 3)

What happens: Many resources collide, but we limit output to first 3.

Controller Logs:

{
  "level": "error",
  "ts": "2025-11-20T14:33:45.678Z",
  "logger": "cluster-extension-revision",
  "msg": "revision reconcile failed",
  "error": "apply failed: multiple conflicts detected",
  "summary": "phases with issues: deploy: 5 collision(s) [showing first 3: ConfigMap test-ns/operator-config, ServiceAccount test-ns/operator-sa, Secret test-ns/operator-token]; status: incomplete",
  "controller": "clusterextensionrevision",
  "ClusterExtensionRevision": {
    "name": "my-operator-1"
  }
}

Key Information:

  • Shows total count (5) but only lists first 3
  • Prevents log spam while still being informative

Scenario 3: Probe Failure During Rollout

What happens: Deployment is created but not becoming ready.

Controller Logs:

{
  "level": "error",
  "ts": "2025-11-20T14:35:22.891Z",
  "logger": "cluster-extension-revision",
  "msg": "revision reconcile failed",
  "error": "rollout incomplete: timeout waiting for resources",
  "summary": "phases with issues: deploy: 1 probe failure(s): 'progress' probe for Deployment test-ns/argocd-operator-controller-manager failed; status: in transition",
  "controller": "clusterextensionrevision",
  "controllerGroup": "olm.operatorframework.io",
  "controllerKind": "ClusterExtensionRevision",
  "ClusterExtensionRevision": {
    "name": "argocd-operator-1"
  }
}

Key Information:

  • Identifies which specific Deployment is failing probes
  • Shows the probe name ('progress')
  • Status indicates it's still transitioning

Scenario 4: Probe Unknown During Rollout

What happens: Deployment is created but status could not yet be observed.

Controller Logs:

{
  "level": "error",
  "ts": "2025-11-20T14:35:22.891Z",
  "logger": "cluster-extension-revision",
  "msg": "revision reconcile failed",
  "error": "rollout incomplete: timeout waiting for resources",
  "summary": "phases with issues: deploy: 1 probe unknowns(s): 'progress' probe unknown for Deployment test-ns/argocd-operator-controller-manager; status: in transition",
  "controller": "clusterextensionrevision",
  "controllerGroup": "olm.operatorframework.io",
  "controllerKind": "ClusterExtensionRevision",
  "ClusterExtensionRevision": {
    "name": "argocd-operator-1"
  }
}

Key Information:

  • Identifies which specific Deployment is failing probes
  • Shows the probe name ('progress')
  • Status indicates it's still transitioning

Scenario 5: Validation Error

What happens: Preflight validation fails (e.g., CRD upgrade safety check).

Controller Logs:

{
  "level": "error",
  "ts": "2025-11-20T14:37:10.234Z",
  "logger": "cluster-extension-revision",
  "msg": "revision reconcile failed",
  "error": "validation failed",
  "summary": "validation error: CRD validation failed for CustomResourceDefinition mycrds.example.com; phases with issues: preflight: validation error",
  "controller": "clusterextensionrevision",
  "ClusterExtensionRevision": {
    "name": "example-operator-2"
  }
}

Key Information:

  • Shows validation error at the top level
  • Indicates which phase encountered the validation issue

Scenario 6: Teardown Blocked by Finalizer

What happens: Trying to teardown but resource has blocking finalizer.

Controller Logs:

{
  "level": "error",
  "ts": "2025-11-20T14:38:45.123Z",
  "logger": "cluster-extension-revision",
  "msg": "revision teardown failed",
  "error": "teardown incomplete: resources still exist",
  "summary": "waiting on phases: deploy, cleanup; incomplete phases: deploy; 1 phase(s) completed",
  "controller": "clusterextensionrevision",
  "controllerGroup": "olm.operatorframework.io",
  "controllerKind": "ClusterExtensionRevision",
  "ClusterExtensionRevision": {
    "name": "argocd-operator-1"
  }
}

Key Information:

  • Lists which phases are waiting
  • Shows which phases are incomplete vs completed
  • Helps diagnose stuck teardowns

Scenario 7: Successful Teardown (Info Level)

What happens: Teardown completes successfully.

Controller Logs:

{
  "level": "info",
  "ts": "2025-11-20T14:40:12.567Z",
  "logger": "cluster-extension-revision",
  "msg": "teardown report",
  "report": "teardown completed successfully",
  "controller": "clusterextensionrevision",
  "controllerGroup": "olm.operatorframework.io",
  "controllerKind": "ClusterExtensionRevision",
  "ClusterExtensionRevision": {
    "name": "argocd-operator-1"
  }
}

Key Information:

  • Logged at INFO level (not ERROR)
  • Confirms successful teardown
  • Clean and concise

Scenario 8: Mixed Issues in Multiple Phases

What happens: Multiple phases have different types of issues.

Controller Logs:

{
  "level": "error",
  "ts": "2025-11-20T14:42:30.890Z",
  "logger": "cluster-extension-revision",
  "msg": "revision reconcile failed",
  "error": "multiple failures during rollout",
  "summary": "phases with issues: preflight: validation error, deploy: 1 collision(s): ServiceAccount test-ns/operator-sa, configure: 1 failed object(s): ConfigMap test-ns/config (action: update); status: incomplete",
  "controller": "clusterextensionrevision",
  "ClusterExtensionRevision": {
    "name": "complex-operator-1"
  }
}

Motivation:

https://issues.redhat.com/browse/OCPBUGS-62964

Change Type

New Feature

Check List Before Merging

  • This PR passes all pre-commit hook validations.
  • This PR is fully tested and regression tests are included.
  • Relevant documentation has been updated.

Additional Information

Co-authored-by: Camila Macedo <7708031+camilamacedo86@users.noreply.github.com>
Co-authored-by: Per Goncalves da Silva <pegoncal@redhat.com>
Signed-off-by: Per Goncalves da Silva <pegoncal@redhat.com>
@perdasilva perdasilva requested a review from a team as a code owner December 5, 2025 08:41
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 5, 2025

Walkthrough

Introduces a new Go utility module for generating concise, human-readable summaries from boxcutter reconciliation and teardown results. Adds functions to summarize revision results, phases, objects, probe outcomes, and teardown operations, with comprehensive unit tests covering various scenarios including validation errors, object collisions, probe failures, and in-transition states.

Changes

Cohort / File(s) Summary
Summary Utility Module
util/summary.go
Adds 7 functions for generating human-readable status messages: SummarizeRevisionResult() for reconciliation summaries, summarizePhases() for phase aggregation, summarizeObjects() for object/collision/probe reporting, getObjectInfo() for object descriptors, allProbesSuccessful() for probe validation, SummarizeRevisionTeardownResult() for teardown summaries, and summarizeTeardownPhases() for teardown phase reporting.
Summary Tests
util/summary_test.go
Introduces comprehensive test coverage with mock implementations (mockRevisionResult, mockPhaseResult, mockObjectResult, etc.) and 20+ test cases validating nil handling, successful reconciliation, validation errors, collision reporting (with limits), probe failures/unknowns, in-transition messaging, teardown completion, and object info formatting.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Collision limiting logic: Verify the "first 3 collisions" filtering and count summary are correctly implemented
  • Probe result handling: Ensure all probe status types (success, failure, unknown) are properly summarized and propagated
  • Nil and edge case handling: Confirm nil checks and empty collections are handled consistently across all functions
  • String aggregation correctness: Review conditional logic for appending status hints, phase summaries, and teardown phase details to ensure appropriate message composition in all branches
  • Test mock completeness: Validate that mock implementations satisfy all required machinery interfaces and cover important edge cases

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: add revision result summary utility' is clear, specific, and directly reflects the main change: introduction of new utility functions for summarizing revision results.
Description check ✅ Passed The PR description comprehensively addresses all required template sections: Summary is detailed with context and motivation, Change Type is clearly marked as 'New Feature', and the Check List is present.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 77.93103% with 32 lines in your changes missing coverage. Please review.
✅ Project coverage is 68.82%. Comparing base (728d9e9) to head (dccfd42).
⚠️ Report is 13 commits behind head on main.

Files with missing lines Patch % Lines
util/summary.go 77.93% 17 Missing and 15 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #393      +/-   ##
==========================================
+ Coverage   68.33%   68.82%   +0.49%     
==========================================
  Files          34       35       +1     
  Lines        2691     2836     +145     
==========================================
+ Hits         1839     1952     +113     
- Misses        728      745      +17     
- Partials      124      139      +15     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
util/summary.go (3)

1-9: Package name and imports look appropriate for this utility module.

The static analysis hint about "util" being a meaningless package name is valid from a purist standpoint, but this is a common and accepted pattern for general-purpose utility code. If the project prefers more descriptive names, consider renaming to something like summary or report.


121-136: Redundant call to ProbeResults() - store the result in a variable.

obj.ProbeResults() is called twice in this block (line 123 in the for-range and again at line 132). Store the result once and reuse it.

 		default:
+			probes := obj.ProbeResults()
 			// Check probe results
-			for probeType, probeResult := range obj.ProbeResults() {
+			for probeType, probeResult := range probes {
 				switch probeResult.Status {
 				case types.ProbeStatusFalse:
 					probeFailures = append(probeFailures, fmt.Sprintf("%q probe for %s failed", probeType, objInfo))
 				case types.ProbeStatusUnknown:
 					probeUnknowns = append(probeUnknowns, fmt.Sprintf("%q probe for %s unknown", probeType, objInfo))
 				}
 			}

-			probes := obj.ProbeResults()
 			if len(probes) == 0 || allProbesSuccessful(probes) {
 				successCount++
 			}

141-172: Consider extracting the "limit to first 3" logic into a helper function.

The pattern for limiting displayed items to first 3 is repeated three times. A helper function would reduce duplication:

func formatLimitedList(items []string, singular string, limit int) string {
    if len(items) > limit {
        return fmt.Sprintf("%d %s(s) [showing first %d: %s]", len(items), singular, limit, strings.Join(items[:limit], ", "))
    }
    return fmt.Sprintf("%d %s(s): %s", len(items), singular, strings.Join(items, ", "))
}
util/summary_test.go (1)

86-119: Good use of interface assertion.

The compile-time interface assertion at line 86 is a Go best practice. Consider adding similar assertions for the other mock types for consistency.

+var _ machinery.RevisionResult = mockRevisionResult{}
+var _ machinery.PhaseResult = mockPhaseResult{}
 var _ machinery.ObjectResult = &mockObjectResult{}
+var _ machinery.RevisionTeardownResult = mockRevisionTeardownResult{}
+var _ machinery.PhaseTeardownResult = mockPhaseTeardownResult{}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

📥 Commits

Reviewing files that changed from the base of the PR and between 728d9e9 and dccfd42.

📒 Files selected for processing (2)
  • util/summary.go (1 hunks)
  • util/summary_test.go (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
util/summary.go (6)
machinery/revision.go (1)
  • RevisionResult (58-74)
machinery/phases.go (2)
  • PhaseResult (213-231)
  • PhaseTeardownResult (67-81)
machinery/results.go (3)
  • ObjectResult (14-34)
  • Action (278-278)
  • ActionCollision (293-293)
machinery/objects.go (1)
  • Object (60-63)
probing/probe.go (1)
  • Status (17-17)
machinery/types/probing.go (4)
  • ProbeStatusFalse (24-24)
  • ProbeStatusUnknown (26-26)
  • ProbeResultContainer (38-38)
  • ProbeStatusTrue (22-22)
🪛 golangci-lint (2.5.0)
util/summary.go

[warning] 1-1: var-naming: avoid meaningless package names

(revive)

util/summary_test.go

[warning] 1-1: var-naming: avoid meaningless package names

(revive)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: lint-unit-int
🔇 Additional comments (10)
util/summary.go (4)

11-49: LGTM!

The SummarizeRevisionResult function is well-structured with proper nil handling, clear logic flow, and appropriate default messaging.


51-97: LGTM!

The phase summarization logic correctly categorizes phases and provides a clear summary. The fallback to "unnamed" for empty phase names is a good defensive measure.


187-218: LGTM!

The getObjectInfo and allProbesSuccessful helper functions are clean, handle nil inputs appropriately, and follow Go idioms.


220-283: LGTM!

The teardown summarization functions are well-structured with proper nil handling and clear logic. The fallback message "teardown in progress" when no specific details are available is a good default.

util/summary_test.go (6)

16-85: LGTM!

The mock implementations for mockRevisionResult and mockPhaseResult are well-structured and implement the necessary interface methods. The String() methods returning placeholder text is appropriate for mocks.


121-183: LGTM!

The teardown mock implementations are adequate for the current test coverage. The Gone() and Waiting() methods aren't used by summarizeTeardownPhases, so their simplified logic doesn't affect test validity.


184-253: LGTM!

The revision result tests are comprehensive, covering nil input, success, validation errors, collisions, and probe states. Good use of t.Parallel() for concurrent test execution.


339-378: Good test for the "showing first 3" truncation behavior.

This test correctly validates that when there are more than 3 collisions, the summary shows the count and indicates truncation.


390-430: LGTM!

The teardown tests cover the key scenarios: complete, waiting phases, and incomplete phases. Assertions correctly validate the expected output strings.


432-474: LGTM!

The getObjectInfo tests thoroughly cover namespace-scoped objects, cluster-scoped objects, and nil handling.

@camilamacedo86
Copy link
Copy Markdown
Contributor

@thetechnick @erdii @eqrx

Could this one be accepted?
We have an OCPBUG open that would be fixed with it.

c/c @perdasilva

@eqrx eqrx merged commit 938880b into package-operator:main Dec 14, 2025
4 checks passed
@eqrx
Copy link
Copy Markdown
Member

eqrx commented Dec 14, 2025

Looks good to me, thank you for the ping :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants