Skip to content

CNTRLPLANE-3548: Add v2 e2e test framework documentation#8641

Merged
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
bryan-cox:make-v2-docs
Jun 4, 2026
Merged

CNTRLPLANE-3548: Add v2 e2e test framework documentation#8641
openshift-merge-bot[bot] merged 1 commit into
openshift:mainfrom
bryan-cox:make-v2-docs

Conversation

@bryan-cox

@bryan-cox bryan-cox commented May 29, 2026

Copy link
Copy Markdown
Member

Summary

  • Adds a 5-page documentation section under how-to/ci/v2-testing/ explaining the v2 Ginkgo-based E2E test framework
  • Covers architecture, writing tests, CI pipeline configuration, debugging failures, and v1→v2 migration
  • Uses Azure self-managed as the reference implementation throughout; framework is platform-agnostic via PlatformConfig interface
  • Updates mkdocs.yml nav with new "V2 E2E Testing" subsection under CI

Pages

Page Content
index.md Architecture overview, two-repo model (hypershift + release), test execution flow, key concepts
writing-tests.md Framework patterns, Ginkgo labels (two-layer model), TestContext, env vars, assertions, running locally
ci-pipeline.md Step registry anatomy, four CI binaries, when to create clusters, adding tests to jobs, new platforms
debugging.md JUnit artifact naming, mapping failures to clusters, Ginkgo output, Informing test skips
migration.md V1→V2 architectural shift, helper refactoring prerequisite, porting checklist, common pitfalls

Reference PRs

Test plan

  • Verify docs render correctly via cd docs && mkdocs serve
  • Check mermaid diagrams render in all pages
  • Verify cross-links between pages work
  • Verify nav shows "V2 E2E Testing" subsection under CI

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
    • Added comprehensive V2 E2E Test Framework docs: overview of the v2 CI model, end-to-end execution flow, pipeline/job configuration knobs, and guidance for adding tests and jobs.
    • Added authoring, migration, and debugging guides for v2 tests, including test structure, label/filter guidance, failure triage, and local run instructions; updated site navigation to surface the new V2 testing resources.

@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 29, 2026
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 29, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@bryan-cox: This pull request explicitly references no jira issue.

Details

In response to this:

Summary

  • Adds a 5-page documentation section under how-to/ci/v2-testing/ explaining the v2 Ginkgo-based E2E test framework
  • Covers architecture, writing tests, CI pipeline configuration, debugging failures, and v1→v2 migration
  • Uses Azure self-managed as the reference implementation throughout; framework is platform-agnostic via PlatformConfig interface
  • Updates mkdocs.yml nav with new "V2 E2E Testing" subsection under CI

Pages

Page Content
index.md Architecture overview, two-repo model (hypershift + release), test execution flow, key concepts
writing-tests.md Framework patterns, Ginkgo labels (two-layer model), TestContext, env vars, assertions, running locally
ci-pipeline.md Step registry anatomy, four CI binaries, when to create clusters, adding tests to jobs, new platforms
debugging.md JUnit artifact naming, mapping failures to clusters, Ginkgo output, Informing test skips
migration.md V1→V2 architectural shift, helper refactoring prerequisite, porting checklist, common pitfalls

Reference PRs

Test plan

  • Verify docs render correctly via cd docs && mkdocs serve
  • Check mermaid diagrams render in all pages
  • Verify cross-links between pages work
  • Verify nav shows "V2 E2E Testing" subsection under CI

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a new V2 E2E testing documentation section and mkdocs navigation: an index/overview, a writing-tests guide, CI pipeline and step-registry documentation, a debugging guide for CI failures, and a migration guide from V1 to V2. Pages document v2 binaries (create-guests, run-tests, dump-guests, destroy-guests), step execution order, TestContext usage and labeling, CI job/platform wiring, local run instructions, and CI regeneration steps.

Suggested reviewers

  • csrwng
  • sdminonne

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error)

Check name Status Explanation Resolution
Stable And Deterministic Test Names ❌ Error Documentation example uses Context(workload.Name, ...) with dynamic variable in test hierarchy, violating requirement for stable test names. Refactor the canonical test pattern example to use static test names (e.g., Context with static descriptions) rather than embedding workload.Name directly in test block creation.
✅ Passed checks (10 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Test Structure And Quality ✅ Passed PR is documentation-only (v2 test guides + mkdocs nav). No Ginkgo test code added/modified, so test code review check is not applicable.
Topology-Aware Scheduling Compatibility ✅ Passed PR contains only documentation (5 markdown files + mkdocs.yml navigation). No deployment manifests, operator code, controllers, or scheduling constraints present. Check not applicable.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR adds documentation only (5 markdown files, 1 config file). No Go test code files were added. The check applies to actual Ginkgo test implementations, not documentation.
No-Weak-Crypto ✅ Passed PR is documentation-only (5 markdown files + 1 YAML navigation config). Zero matches for MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB, custom crypto, or insecure comparisons across all changed files.
Container-Privileges ✅ Passed Pull request is documentation-only (5 new markdown files + mkdocs.yml nav update); no container/K8s manifests present, so container-privileges check is not applicable.
No-Sensitive-Data-In-Logs ✅ Passed All 5 documentation files (1,051 lines) scanned for sensitive data: no passwords, tokens, API keys, PII, session IDs, internal hostnames, or customer data found.
Title check ✅ Passed The title 'CNTRLPLANE-3548: Add v2 e2e test framework documentation' clearly and specifically summarizes the main change: adding comprehensive documentation for the v2 end-to-end test framework.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the area/documentation Indicates the PR includes changes for documentation label May 29, 2026
@openshift-ci

openshift-ci Bot commented May 29, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/needs-area labels May 29, 2026

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/content/how-to/ci/v2-testing/index.md (1)

1-100: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Run docs spell-check before merge.

Please run make verify-codespell for these markdown additions and fix any flagged typos before landing.

As per coding guidelines, **/*.md: For markdown files, use make verify-codespell to catch spelling errors.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/content/how-to/ci/v2-testing/index.md` around lines 1 - 100, Run the
repository spell-check target and fix any markdown typos flagged in this new
documentation: run make verify-codespell (or the project’s verify-codespell
target) and correct all issues reported for
docs/content/how-to/ci/v2-testing/index.md (and any other **/*.md files
flagged), then re-run the check until it passes and commit the corrected
file(s); ensure the final commit contains the typo fixes referenced by the
spell-checker.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/content/how-to/ci/v2-testing/ci-pipeline.md`:
- Around line 218-223: The PlatformConfig interface in the docs is incorrect:
update the example to match the real v2 lifecycle contract by using the correct
method signatures and types; replace the current methods ClusterSpecs(),
TestMatrix(), and PostCreate(ctx context.Context, cluster ClusterSpec) error
with the actual v2 signatures (use the real parameter and return types used in
the implementation—e.g., correct context usage, cluster parameter type/name, and
return values for PostCreate) and ensure referenced types ClusterSpec and
TestGroup align with their true definitions so readers implement the correct
shape for PlatformConfig.

In `@docs/content/how-to/ci/v2-testing/debugging.md`:
- Around line 30-33: The three fenced code blocks that contain console/test
output (e.g., the blocks showing "Running public tests against
public-a1b2c3d4e5..." / "[RUNNING] Control Plane Workloads..." / "[SKIP]
informing test failure...") are missing language identifiers and trigger MD040;
update each triple-backtick fence to include a language tag such as ```text for
those output blocks, and repeat the same fix for the other occurrences noted
(around the other snippets at 43-45 and 102-104); after updating the fences, run
the markdown checks and make verify-codespell as per the repo guidelines to
ensure linting and spelling pass.

In `@docs/content/how-to/ci/v2-testing/index.md`:
- Around line 80-88: The fenced directory-structure block in docs content lacks
a language hint; update the code fence that surrounds the tree (the block
starting with ``` and ending with ```) to include a language identifier (e.g.,
change ``` to ```text) so the markdown linter MD040 is satisfied and the snippet
is treated as plain text.

In `@docs/content/how-to/ci/v2-testing/migration.md`:
- Around line 26-30: Replace the legacy call to framework.NewTestContext(ctx,
cancel) with the v2 TestContext initialization pattern: construct and assign the
v2 TestContext (type TestContext) using the v2 constructor/initializer defined
in the v2 test contract and update the ValidateHostedCluster invocation to call
the v2-compatible API (pass the v2 TestContext or call the v2 helper) instead of
framework.ValidateHostedCluster; ensure the local variable testCtx is the v2
TestContext instance and matches the signatures used throughout the v2 tests
(e.g., the constructors and methods declared in the v2 TestContext type).
- Around line 132-139: The example register function is outdated; update the
documented signature to match current v2 suites by changing the zero-arg example
to accept the shared context getter (e.g. use RegisterMyFeatureTests(getTestCtx
internal.TestContextGetter)) and mention wiring shared context via the provided
getTestCtx inside the Describe block; reference the symbol TestContextGetter and
the RegisterMyFeatureTests function so readers adopt the correct pattern.
- Line 130: Replace invalid Go build constraint occurrences of the literal
string "//go:build e2ev2,backuprestore" with a proper boolean expression, e.g.
"//go:build e2ev2 && backuprestore" (or use "||" if an OR is intended); update
every instance (the ones noted in the docs and the similar occurrence in
writing-tests) so the build tag becomes a valid boolean expression.

---

Outside diff comments:
In `@docs/content/how-to/ci/v2-testing/index.md`:
- Around line 1-100: Run the repository spell-check target and fix any markdown
typos flagged in this new documentation: run make verify-codespell (or the
project’s verify-codespell target) and correct all issues reported for
docs/content/how-to/ci/v2-testing/index.md (and any other **/*.md files
flagged), then re-run the check until it passes and commit the corrected
file(s); ensure the final commit contains the typo fixes referenced by the
spell-checker.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 8fc89443-c94e-4d95-8fef-385828726770

📥 Commits

Reviewing files that changed from the base of the PR and between 988f2be and 35b5621.

📒 Files selected for processing (6)
  • docs/content/how-to/ci/v2-testing/ci-pipeline.md
  • docs/content/how-to/ci/v2-testing/debugging.md
  • docs/content/how-to/ci/v2-testing/index.md
  • docs/content/how-to/ci/v2-testing/migration.md
  • docs/content/how-to/ci/v2-testing/writing-tests.md
  • docs/mkdocs.yml

Comment thread docs/content/how-to/ci/v2-testing/ci-pipeline.md
Comment thread docs/content/how-to/ci/v2-testing/debugging.md Outdated
Comment thread docs/content/how-to/ci/v2-testing/index.md Outdated
Comment thread docs/content/how-to/ci/v2-testing/migration.md
Comment thread docs/content/how-to/ci/v2-testing/migration.md Outdated
Comment thread docs/content/how-to/ci/v2-testing/migration.md
@bryan-cox

Copy link
Copy Markdown
Member Author

/retest

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8641 June 2, 2026 15:10 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/content/how-to/ci/v2-testing/writing-tests.md`:
- Around line 278-284: Guard against a nil Replicas pointer before dereferencing
in the BeforeEach block: in the BeforeEach that calls getNodePool(), check
nodePool.Spec.Replicas for nil and only dereference it if non-nil (e.g., set
originalReplicas = *nodePool.Spec.Replicas), otherwise assign a safe default
(e.g., 0) so the test example using getNodePool, nodePool and originalReplicas
does not panic.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5b6c8e55-c8de-44ad-a3e5-bfe8746180bc

📥 Commits

Reviewing files that changed from the base of the PR and between 35274b6 and e2306a5.

⛔ Files ignored due to path filters (1)
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
📒 Files selected for processing (6)
  • docs/content/how-to/ci/v2-testing/ci-pipeline.md
  • docs/content/how-to/ci/v2-testing/debugging.md
  • docs/content/how-to/ci/v2-testing/index.md
  • docs/content/how-to/ci/v2-testing/migration.md
  • docs/content/how-to/ci/v2-testing/writing-tests.md
  • docs/mkdocs.yml
✅ Files skipped from review due to trivial changes (3)
  • docs/mkdocs.yml
  • docs/content/how-to/ci/v2-testing/ci-pipeline.md
  • docs/content/how-to/ci/v2-testing/migration.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/content/how-to/ci/v2-testing/debugging.md

Comment thread docs/content/how-to/ci/v2-testing/writing-tests.md
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8641 June 2, 2026 15:30 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
docs/content/how-to/ci/v2-testing/writing-tests.md (1)

280-284: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a nil guard before dereferencing nodePool.Spec.Replicas.

The example still does originalReplicas = *nodePool.Spec.Replicas directly. If Replicas is nil, this panics and propagates a crash-prone pattern in docs.

Proposed fix
 var _ = Describe("NodePool Lifecycle", Label("lifecycle", "nodepool-lifecycle"), func() {
     var originalReplicas int32
     BeforeEach(func() {
         // Capture original state
         nodePool := getNodePool()
+        if nodePool.Spec.Replicas == nil {
+            Skip("nodepool replicas is nil")
+        }
         originalReplicas = *nodePool.Spec.Replicas
     })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/content/how-to/ci/v2-testing/writing-tests.md` around lines 280 - 284,
The test's BeforeEach currently dereferences nodePool.Spec.Replicas without a
nil check which can panic; update the BeforeEach that calls getNodePool() to
guard before dereferencing: verify nodePool and nodePool.Spec and
nodePool.Spec.Replicas are non-nil, and only then set originalReplicas =
*nodePool.Spec.Replicas, otherwise set originalReplicas to a safe default (e.g.,
0) or handle the nil case explicitly so no dereference occurs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@docs/content/how-to/ci/v2-testing/writing-tests.md`:
- Around line 280-284: The test's BeforeEach currently dereferences
nodePool.Spec.Replicas without a nil check which can panic; update the
BeforeEach that calls getNodePool() to guard before dereferencing: verify
nodePool and nodePool.Spec and nodePool.Spec.Replicas are non-nil, and only then
set originalReplicas = *nodePool.Spec.Replicas, otherwise set originalReplicas
to a safe default (e.g., 0) or handle the nil case explicitly so no dereference
occurs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 627e7890-a2e2-408e-841b-5da25ce6cc7c

📥 Commits

Reviewing files that changed from the base of the PR and between e2306a5 and 0989f1c.

⛔ Files ignored due to path filters (1)
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
📒 Files selected for processing (6)
  • docs/content/how-to/ci/v2-testing/ci-pipeline.md
  • docs/content/how-to/ci/v2-testing/debugging.md
  • docs/content/how-to/ci/v2-testing/index.md
  • docs/content/how-to/ci/v2-testing/migration.md
  • docs/content/how-to/ci/v2-testing/writing-tests.md
  • docs/mkdocs.yml
✅ Files skipped from review due to trivial changes (4)
  • docs/content/how-to/ci/v2-testing/debugging.md
  • docs/content/how-to/ci/v2-testing/migration.md
  • docs/mkdocs.yml
  • docs/content/how-to/ci/v2-testing/ci-pipeline.md

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8641 June 2, 2026 15:36 Inactive

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
docs/content/how-to/ci/v2-testing/writing-tests.md (1)

280-284: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard nodePool.Spec.Replicas before dereferencing in the lifecycle example.

This sample still dereferences *nodePool.Spec.Replicas without a nil guard, which can panic when copied into tests.

Proposed fix
     BeforeEach(func() {
         // Capture original state
         nodePool := getNodePool()
+        if nodePool.Spec.Replicas == nil {
+            Skip("nodepool replicas is nil")
+        }
         originalReplicas = *nodePool.Spec.Replicas
     })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/content/how-to/ci/v2-testing/writing-tests.md` around lines 280 - 284,
The BeforeEach block dereferences nodePool.Spec.Replicas without a nil check
which can panic; update the BeforeEach that calls getNodePool() so it guards
nodePool.Spec.Replicas before dereferencing: check if nodePool.Spec != nil and
nodePool.Spec.Replicas != nil, and if so set originalReplicas =
*nodePool.Spec.Replicas, otherwise set originalReplicas to a safe default (e.g.,
0) or handle the nil case appropriately to avoid a panic.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@docs/content/how-to/ci/v2-testing/writing-tests.md`:
- Around line 280-284: The BeforeEach block dereferences nodePool.Spec.Replicas
without a nil check which can panic; update the BeforeEach that calls
getNodePool() so it guards nodePool.Spec.Replicas before dereferencing: check if
nodePool.Spec != nil and nodePool.Spec.Replicas != nil, and if so set
originalReplicas = *nodePool.Spec.Replicas, otherwise set originalReplicas to a
safe default (e.g., 0) or handle the nil case appropriately to avoid a panic.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 351a2bd3-f1eb-48b9-abf5-b86e452e1782

📥 Commits

Reviewing files that changed from the base of the PR and between 0989f1c and f4d9be4.

⛔ Files ignored due to path filters (1)
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
📒 Files selected for processing (6)
  • docs/content/how-to/ci/v2-testing/ci-pipeline.md
  • docs/content/how-to/ci/v2-testing/debugging.md
  • docs/content/how-to/ci/v2-testing/index.md
  • docs/content/how-to/ci/v2-testing/migration.md
  • docs/content/how-to/ci/v2-testing/writing-tests.md
  • docs/mkdocs.yml
✅ Files skipped from review due to trivial changes (4)
  • docs/mkdocs.yml
  • docs/content/how-to/ci/v2-testing/debugging.md
  • docs/content/how-to/ci/v2-testing/migration.md
  • docs/content/how-to/ci/v2-testing/ci-pipeline.md

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8641 June 2, 2026 15:48 Inactive
@bryan-cox bryan-cox marked this pull request as ready for review June 2, 2026 15:56
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 2, 2026
@openshift-ci openshift-ci Bot requested review from muraee and sdminonne June 2, 2026 15:56

@cblecker cblecker left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good set of docs — the architecture overview, the label taxonomy, and the migration guide are all things that would have saved me time when ramping up on the framework. A few things to address before this lands:

Broken links (will silently fail in MkDocs):

  • writing-tests.md#labels in ci-pipeline.md → should be #labels-two-layer-model
  • #add-registertests-export-function in migration.md → anchors only work on headings, not list items; the existing canonical-test-pattern link already covers this

Factual errors:

  • PR #8527 reference says "AWS" but it's the Azure self-managed implementation
  • The NewPlatformConfig diff shows a "before" state that doesn't match the actual code (the "" case is already there)

Code example issue:

  • The lifecycle BeforeEach dereferences *nodePool.Spec.Replicas without a nil check — since this is a copy-paste teaching example, it should model the safe pattern (Convention #11)

A few smaller things inline — missing table entries for HostedClusterConfigured and GetHostedClusterRESTConfig(), a note on --ginkgo.focus for local debugging, and a suggestion to link the duplicated convention examples back to AGENTS.md rather than maintaining them in two places.

BeforeEach(func() {
// Capture original state
nodePool := getNodePool()
originalReplicas = *nodePool.Spec.Replicas

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nodePool.Spec.Replicas is *int32 and can be nil — dereferencing it here without a nil check will panic if the field isn't set. Since this is a teaching example that contributors will copy, it should model the correct pattern. AGENTS.md Convention #11 applies even in documentation.

BeforeEach(func() {
    nodePool := getNodePool()
    Expect(nodePool.Spec.Replicas).NotTo(BeNil(),
        "nodePool %s/%s should have replicas set", nodePool.Namespace, nodePool.Name)
    originalReplicas = *nodePool.Spec.Replicas
})

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added Expect(nodePool.Spec.Replicas).NotTo(BeNil()) guard before dereference — teaching examples should model Convention #11.


AI-assisted response via Claude Code


Regenerate CI config with `make jobs WHAT=openshift/hypershift` from the release repo root.

For a complete example, see the AWS v2 implementation in [openshift/hypershift#8527](https://github.com/openshift/hypershift/pull/8527).

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #8527 is the Azure self-managed lifecycle test port, not an AWS implementation. Should be "the Azure self-managed v2 implementation".

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Fixed to "Azure self-managed v2 implementation".


AI-assisted response via Claude Code


**`Sequential`** groups run their `Steps` one after another on the same cluster. If any step fails, remaining steps in that group are skipped. Use sequential groups for ordered workflows like upgrade → validate → downgrade.

See [Labels](writing-tests.md#labels) for how to control which tests run in each group.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broken anchor — the header in writing-tests.md is ## Labels (two-layer model), which MkDocs renders as #labels-two-layer-model. migration.md step 6 already uses the correct anchor. Should be [Labels](writing-tests.md#labels-two-layer-model).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Fixed anchor to #labels-two-layer-model to match the actual MkDocs heading ID.


AI-assisted response via Claude Code

```

!!! note "Simplified example"
This example omits the `RegisterXxxTests` + `TestContextGetter` pattern used by real tests. See [Step 8](#add-registertests-export-function) and the [canonical test pattern](writing-tests.md#canonical-test-pattern) for the full structure.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MkDocs only generates anchor IDs for headings, not numbered list items, so #add-registertests-export-function won't resolve. Since the sentence already links to writing-tests.md#canonical-test-pattern (which is the right landing point), the [Step 8] internal anchor link can just be dropped — "See the canonical test pattern for the full structure" covers it.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Removed the dead anchor link — the canonical test pattern link already covers it.


AI-assisted response via Claude Code

```diff
func NewPlatformConfig(platform, sharedDir string) (PlatformConfig, error) {
switch platform {
- case "azure":

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "before" state shown here (case "azure":) doesn't match what's actually in the code — lifecycle/platform.go already has case "azure", "":. The diff makes it look like adding "" is part of onboarding a new platform, which would confuse someone trying to follow these steps. The starting side should show the current actual state:

 func NewPlatformConfig(platform, sharedDir string) (PlatformConfig, error) {
     switch platform {
     case "azure", "":
         return NewAzurePlatformConfig(sharedDir), nil
+    case "my-platform":
+        return NewMyPlatformConfig(sharedDir), nil
     default:
         return nil, fmt.Errorf("unsupported platform %q (supported: azure)", platform)
     }
 }

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Updated diff to show the correct current state (case "azure", "" already present) and used a generic my-platform example.


AI-assisted response via Claude Code

| `ArtifactDir` | Directory for test artifacts, from the `ARTIFACT_DIR` environment variable |
| `ValidateHostedCluster()` | Skips if no cluster configured; panics if fetch fails |
| `ValidateHostedClusterClient()` | Calls `ValidateHostedCluster()`, then panics if the hosted cluster client is nil (e.g., kubeconfig not ready) |
| `Context` | Embedded `context.Context` — use for all API calls |

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two exported members are missing from this table:

  • HostedClusterConfigured (bool) — set to true when both ClusterName and ClusterNamespace are populated. This is what ValidateHostedCluster() checks internally to decide between skip and fetch.
  • GetHostedClusterRESTConfig() — returns the raw *rest.Config for the hosted cluster, lazy-loaded via sync.Once. Useful when a test needs a kubernetes.Interface client or a custom REST client beyond what GetHostedClusterClient() provides.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added HostedClusterConfigured and GetHostedClusterRESTConfig() to the TestContext table.


AI-assisted response via Claude Code

E2E_SHOW_ENV_HELP=1 bin/test-e2e-v2
```

## Assertions and gotchas

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Several subsections here duplicate AGENTS.md nearly verbatim, which creates a maintenance split — if a convention changes, it has to be updated in two places and they'll drift:

The tutorial value here comes from things like the Eventually timeout guidance and the lifecycle worked example, which genuinely extend what AGENTS.md covers. For the straight duplicates, linking out would be cleaner:

For pointer safety, vacuous pass prevention, IPv6 URL construction, and the non-lifecycle vs. lifecycle mutation rule, see AGENTS.md conventions #11, #13, #15, #16.

This keeps AGENTS.md as the authoritative source for coding conventions and writing-tests.md as the tutorial for framework concepts and patterns.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Replaced duplicated convention examples with a reference to AGENTS.md conventions #11, #13, #15, #16. Kept the Eventually guidance and lifecycle worked example — those add tutorial value beyond what AGENTS.md covers.


AI-assisted response via Claude Code

export E2E_HOSTED_CLUSTER_NAMESPACE=clusters

make e2ev2
bin/test-e2e-v2 --ginkgo.label-filter="hosted-cluster-health" --ginkgo.v

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Worth adding --ginkgo.focus here — it's probably the first thing someone reaches for when a single test is failing and they want to iterate quickly:

# Run a single test by name (regex match against the full description path)
bin/test-e2e-v2 --ginkgo.focus="should not indicate rapid rollouts" --ginkgo.v

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Added --ginkgo.focus example for single-test debugging.


AI-assisted response via Claude Code

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8641 June 3, 2026 10:01 Inactive
@hypershift-jira-solve-ci

Copy link
Copy Markdown

I now have the complete root cause. Here is the analysis:

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

docs/content/reference/aggregated-docs.md: needs update
Process completed with exit code 1.

Summary

The Verify job failed at the "dirty tree" check (git update-index --refresh) because the committed docs/content/reference/aggregated-docs.md in the PR is stale. The make generate update step re-ran the docs-aggregator tool against the merge commit (PR branch + current main), which now includes spot-instances documentation from PR #8485 (merged to main at 09:02 UTC on June 3, ~57 minutes before this CI run started at 09:59 UTC). The PR's version of aggregated-docs.md was generated against an older main that did not include those docs, so the regenerated file differs from the committed one, causing the dirty-tree check to fail.

Root Cause

The docs-aggregator tool (hack/tools/docs-aggregator/main.go) walks all .md files under docs/content/ and aggregates them into a single file at docs/content/reference/aggregated-docs.md. This file is checked into the repo and must stay in sync with the source docs.

The CI workflow (verify-reusable.yaml) checks out the merge commit of the PR against current main, runs make generate update (which re-runs the docs-aggregator), and then verifies no files changed. The sequence of events:

  1. May 29: PR CNTRLPLANE-3548: Add v2 e2e test framework documentation #8641 was created. The author ran the docs-aggregator and committed aggregated-docs.md based on main as of that date, which included the 5 new v2-testing docs.
  2. June 2–3: PR OCPSTRAT-1677: Add Spot instances documentation #8485 (OCPSTRAT-1677: Add Spot instances documentation) was merged to main at 09:02 UTC on June 3, adding docs/content/how-to/automated-machine-management/spot-instances.md and updating aggregated-docs.md accordingly.
  3. June 3 at 09:59 UTC: CI for PR CNTRLPLANE-3548: Add v2 e2e test framework documentation #8641 runs. It checks out the merge commit (PR + current main). Now main contains the new spot-instances doc. The docs-aggregator re-generates aggregated-docs.md including the spot-instances content (300 total docs), but the PR's committed version lacks it → file differs → git update-index --refresh reports "needs update" → exit code 1.

This is a stale generated file problem, not a code bug. The PR's branch simply needs to be rebased on current main and the aggregated-docs.md regenerated.

Recommendations
  1. Rebase the PR branch on current main and regenerate the aggregated docs:

    git fetch origin main
    git rebase origin/main
    make generate update
    git add docs/content/reference/aggregated-docs.md
    git commit --amend --no-edit
    git push --force-with-lease
  2. Alternatively, just re-run the generator without a full rebase (merge main into the branch):

    git fetch origin main
    git merge origin/main
    make generate update
    git add docs/content/reference/aggregated-docs.md
    git commit -m "docs: regenerate aggregated-docs.md"
    git push
  3. For future PRs that touch docs/content/: Always regenerate aggregated-docs.md immediately before pushing, especially if other doc PRs have merged to main since the branch was created.

Evidence
Evidence Detail
Failed step Step 7: git update-index --refresh in verify-reusable.yaml
Error message docs/content/reference/aggregated-docs.md: needs update
Docs-aggregator output Successfully aggregated 300 documentation files to docs/content/reference/aggregated-docs.md
Root cause PR PR #8485 (Spot instances docs) merged to main at 09:02 UTC on June 3, 2026
CI run start 09:59 UTC on June 3, 2026 — 57 minutes after #8485 merged
PR creation date May 29, 2026 — before #8485 merged
Files added by #8485 docs/content/how-to/automated-machine-management/spot-instances.md (new doc not in PR #8641's aggregated-docs.md)
Generator tool hack/tools/docs-aggregator/main.go — walks all .md under docs/content/
CI workflow .github/workflows/verify-reusable.yaml — runs make generate update then checks for dirty files

Add a 5-page documentation section under how-to/ci/v2-testing/
explaining the v2 Ginkgo-based E2E test framework:

- index.md: Architecture overview, two-repo model, test execution flow
- writing-tests.md: Framework patterns, labels, TestContext, assertions
- ci-pipeline.md: Release repo config, CI binaries, adding tests/platforms
- debugging.md: JUnit artifacts, failure tracing, Informing test skips
- migration.md: V1 to V2 porting guide, helper refactoring, pitfalls

Azure self-managed is the reference implementation. The framework is
platform-agnostic via the PlatformConfig interface.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@bryan-cox bryan-cox changed the title NO-JIRA: Add v2 e2e test framework documentation CNTRLPLANE-3548: Add v2 e2e test framework documentation Jun 3, 2026
@openshift-ci-robot

openshift-ci-robot commented Jun 3, 2026

Copy link
Copy Markdown

@bryan-cox: This pull request references CNTRLPLANE-3548 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary

  • Adds a 5-page documentation section under how-to/ci/v2-testing/ explaining the v2 Ginkgo-based E2E test framework
  • Covers architecture, writing tests, CI pipeline configuration, debugging failures, and v1→v2 migration
  • Uses Azure self-managed as the reference implementation throughout; framework is platform-agnostic via PlatformConfig interface
  • Updates mkdocs.yml nav with new "V2 E2E Testing" subsection under CI

Pages

Page Content
index.md Architecture overview, two-repo model (hypershift + release), test execution flow, key concepts
writing-tests.md Framework patterns, Ginkgo labels (two-layer model), TestContext, env vars, assertions, running locally
ci-pipeline.md Step registry anatomy, four CI binaries, when to create clusters, adding tests to jobs, new platforms
debugging.md JUnit artifact naming, mapping failures to clusters, Ginkgo output, Informing test skips
migration.md V1→V2 architectural shift, helper refactoring prerequisite, porting checklist, common pitfalls

Reference PRs

Test plan

  • Verify docs render correctly via cd docs && mkdocs serve
  • Check mermaid diagrams render in all pages
  • Verify cross-links between pages work
  • Verify nav shows "V2 E2E Testing" subsection under CI

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Documentation
  • Added comprehensive V2 E2E Test Framework docs: overview of the v2 CI model, end-to-end execution flow, pipeline/job configuration knobs, and guidance for adding tests and jobs.
  • Added authoring, migration, and debugging guides for v2 tests, including test structure, label/filter guidance, failure triage, and local run instructions; updated site navigation to surface the new V2 testing resources.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@csrwng csrwng left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue with mermaid diagram rendering. Otherwise lgtm
Really good stuff.

### Step Execution Order

```mermaid
flowchart LR

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mermaid diagrams using ["..."] node syntax aren't rendering text labels correctly with the current mermaid2 plugin + mermaid@11.2.0 setup — node shapes appear but text is empty. This affects both diagrams in this file and the one in index.md.

Consider switching to unquoted or parenthesis syntax, e.g.:

Suggested change
flowchart LR
flowchart LR
A(create-selfmanaged-guests) --> B(run-e2e-v2-selfmanaged)
B --> C(dump-selfmanaged-guests)
C --> D(destroy-selfmanaged-guests)

You'll want to test all three diagrams (this one, the decision tree at line 164, and the architecture diagram in index.md) with mkdocs serve after fixing.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image Screenshot From 2026-06-03 16-22-57

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Cesar! The mermaid diagrams use the same mermaid2 plugin configuration as all other docs pages (e.g., agentic-sdlc.md, ci-infrastructure.md). The CI "Build Docs" step passes with mkdocs build --strict and the plugin correctly detects all diagrams. The rendering issue in the screenshots looks like a client-side JS loading issue on the Cloudflare Pages preview — the mermaid JS is loaded from unpkg.com and can sometimes fail to execute on first load. A page refresh usually resolves it. If you're seeing it persistently, it may be worth opening a separate issue to investigate the mermaid2 plugin + Cloudflare Pages interaction.


AI-assisted response via Claude Code

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @bryan-cox !
Indeed, the issue was with my browser. I switched to a different browser and cleared the cache and the diagrams render just fine.

@csrwng

csrwng commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 4, 2026
@openshift-merge-bot

Copy link
Copy Markdown
Contributor

Pipeline controller notification

No second-stage tests were triggered for this PR.

This can happen when:

  • The changed files don't match any pipeline_run_if_changed patterns
  • All files match pipeline_skip_if_only_changed patterns
  • No pipeline-controlled jobs are defined for the main branch

Use /test ? to see all available tests.

@bryan-cox

Copy link
Copy Markdown
Member Author

/verified by GHA deployment

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jun 4, 2026
@openshift-ci-robot

Copy link
Copy Markdown

@bryan-cox: This PR has been marked as verified by GHA deployment.

Details

In response to this:

/verified by GHA deployment

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci

openshift-ci Bot commented Jun 4, 2026

Copy link
Copy Markdown
Contributor

@bryan-cox: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit b07ae90 into openshift:main Jun 4, 2026
19 checks passed
@bryan-cox bryan-cox deleted the make-v2-docs branch June 4, 2026 19:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/documentation Indicates the PR includes changes for documentation jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants