feat: add INFRASTRUCTURE classification for environmental failures #4732

Open
rnetser wants to merge 4 commits into RedHatQE:main from rnetser:feat/add-infrastructure-classification

Conversation

@rnetser
Collaborator

rnetser commented May 4, 2026

Problem

The AI analysis prompt only allows CODE ISSUE and PRODUCT BUG classifications. When a failure is caused by an environmental/infrastructure issue (broken node, OCI hook failure, storage outage, etc.), the AI is forced to mislabel it as one of those two labels and attach disclaimer text like:

CLASSIFICATION CONTEXT: This is an environmental blocker with low confidence. PRODUCT BUG is used as a forced binary-label fallback because the schema lacks an ENVIRONMENTAL category...

The rootcoz backend already supports INFRASTRUCTURE in OverrideClassificationLiteral, but the prompt never tells the AI it can use it.

Solution

Add INFRASTRUCTURE as a third allowed classification value in the prompt:

  • CODE ISSUE — Test framework, test code, or test-owned configuration problem
  • PRODUCT BUG — Actual product defect
  • INFRASTRUCTURE — Environmental blocker, lab/cluster infrastructure failure, or external dependency outage
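
As a rough illustration of the three-label scheme, the sketch below models the allowed values as a Python `Literal`. The name `OverrideClassificationLiteral` comes from the description above, but its real definition in rootcoz is not shown in this PR, so the alias `Classification` and the helper `parse_classification` are hypothetical.

```python
from typing import Literal, get_args

# Hypothetical alias; the real OverrideClassificationLiteral in rootcoz may differ.
Classification = Literal["CODE ISSUE", "PRODUCT BUG", "INFRASTRUCTURE"]

ALLOWED_CLASSIFICATIONS = get_args(Classification)


def parse_classification(raw: str) -> str:
    """Normalize a label from the AI output and reject anything outside the allowed set."""
    # Collapse whitespace and upper-case so " infrastructure " still matches.
    label = " ".join(raw.strip().upper().split())
    if label not in ALLOWED_CLASSIFICATIONS:
        raise ValueError(f"unsupported classification: {raw!r}")
    return label
```

With a check like this, a label such as `ENVIRONMENTAL` would be rejected instead of being folded into one of the other labels.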

Changes

  • Add INFRASTRUCTURE to the allowed classification values in Section 2
  • Update all environmental blocker guidance to use INFRASTRUCTURE instead of vague "environmental" wording
  • Add container runtime failures and missing operators to the environmental blockers list
  • Remove workaround guidance about forced binary-label fallbacks (no longer needed)
  • Update pattern guidance throughout (VM lifecycle, migration, SSH, storage, cleanup, etc.) to use INFRASTRUCTURE label

Assisted-by: Claude <noreply@anthropic.com>

Summary by CodeRabbit

  • Documentation
    • Enhanced failure classification with new infrastructure category to distinguish environment/lab blockers from product bugs and code issues
    • Introduced systematic analysis procedures for detecting test teardown failures and determining their impact on downstream test results

The AI analysis prompt only allowed CODE ISSUE and PRODUCT BUG
classifications, forcing environmental/infrastructure failures to be
mislabeled as CODE ISSUE with disclaimer text. The rootcoz backend
already supports INFRASTRUCTURE in OverrideClassificationLiteral.

Changes:
- Add INFRASTRUCTURE as a third allowed classification value
- Update all environmental blocker guidance to use INFRASTRUCTURE
- Add container runtime failures and missing operators to the
  environmental blockers list
- Remove workaround guidance about forced binary-label fallbacks

Signed-off-by: rnetser <rnetser@redhat.com>
@coderabbitai
Contributor

coderabbitai Bot commented May 4, 2026

Walkthrough

This PR expands the failure classification prompt system from a two-label binary (CODE ISSUE or PRODUCT BUG) to a three-label system that introduces INFRASTRUCTURE as a distinct classification. It adds a mandatory pre-classification check for test teardown failures and updates pattern guidance throughout to reflect this distinction and refine when each label applies.

Changes

Failure Classification System Expansion

| Layer | File(s) | Summary |
| --- | --- | --- |
| Classification Framework | JOB_INSIGHT_PROMPT.md | Three-label system introduced (CODE ISSUE, PRODUCT BUG, INFRASTRUCTURE) with allowed confidence levels; binary-only constraint removed. |
| Teardown History Gate | JOB_INSIGHT_FAILURE_HISTORY_ANALYSIS_PROMPT.md | New mandatory pre-classification check detects whether a previous test's failed teardown caused the current failure, with detection steps, cluster impact matching, and explicit exclusion cases. |
| INFRASTRUCTURE Classification Rules | JOB_INSIGHT_PROMPT.md | Defines INFRASTRUCTURE for operator-missing, lab/cluster blockers, container runtime failures, and dependent-operator prerequisites; replaces "environmental" terminology. |
| Exception & Pattern Guidance | JOB_INSIGHT_PROMPT.md | Refines guidance for common failure modes (VM lifecycle, live migration, SSH, DataVolume/CDI, cleanup blockers) to differentiate infrastructure outages from product defects. |
| Quarantine Rules | JOB_INSIGHT_PROMPT.md | Preserves rule that quarantined tests are not product defects unless failure mode differs from quarantine issue. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding INFRASTRUCTURE as a third classification type to the failure analysis prompt. |
| Description check | ✅ Passed | The description provides clear problem statement, solution overview, and specific changes with good structure covering the why and what, though slightly light on reviewer-specific notes. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@openshift-virtualization-qe-bot-3
Contributor

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: A tracking issue is created for this PR and will be closed when the PR is merged or closed
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: Enabled categories: branch, can-be-merged, cherry-pick, has-conflicts, hold, needs-rebase, size, verified, wip

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest verify-bugs-are-open - Run the verify-bugs-are-open check
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. LGTM Count: Minimum 2 /lgtm from reviewers
  3. Status Checks: All required status checks must pass
  4. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  5. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • dshchedr
  • myakove
  • rnetser
  • vsibirsk

Reviewers:

  • RoniKishner
  • dshchedr
  • rnetser
  • vsibirsk

Available Labels

  • hold
  • verified
  • wip
  • lgtm
  • approve

AI Features

  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 4, 2026
@rnetser
Collaborator Author

rnetser commented May 4, 2026

/approve
/verified

@openshift-virtualization-qe-bot-3
Contributor

@coderabbitai

Test execution plan request details

CRITICAL: You MUST post an inline review comment on the first changed line of the first file.
The inline comment should contain the Test Execution Plan with the smoke tests recommendation.
Do NOT submit a formal review - just post the inline comment directly.

As an expert software testing engineer, analyze all modified files in this PR and create a targeted test execution plan.
You will post an inline review comment with the test execution plan on the first changed file.
If you fail to run or post a comment, retry.

Analysis Requirements:

  1. Examine code changes in each modified file
  2. Identify affected code paths, functions, and classes
  3. Analyze pytest-specific elements: fixtures (scope, dependencies), parametrization, markers, conftest changes
  4. Trace test dependencies through imports, shared utilities, and fixture inheritance
  5. Detect new tests introduced in the PR
  6. Smoke Test Impact Analysis: Determine if any changes could affect smoke tests by checking:
    • Changes to files/functions used by tests marked with @pytest.mark.smoke
    • Changes to fixtures or utilities imported by smoke tests
    • Changes to conftest.py files that may affect smoke test execution
    • Changes to core infrastructure code (utilities/, libs/) that smoke tests depend on

Your deliverable:
Your change request comment will be based on the following requirements:

Test Execution Plan

  • Run smoke tests: True / False (Based on your smoke test impact analysis - True if any changes may affect smoke tests)
  • path/to/test_file.py - When the entire test file needs verification
  • path/to/test_file.py::TestClass::test_method - When specific test(s) needed
  • path/to/test_file.py::test_function - When specific test(s) needed
  • -m marker - When specific marker(s) can be used to cover multiple cases.

Guidelines:

  • Include only tests directly affected by the changes
  • Use a full file path only if ALL tests in that file require verification
  • Use file path + test name if only specific tests are needed
  • If a test marker can cover multiple files/tests, provide the marker
  • Balance coverage vs over-testing - Keep descriptions minimal

Hardware-Related Checks (SR-IOV, GPU, DPDK):

When PR modifies fixtures for hardware-specific resources:

  • Collection Safety: Fixtures MUST have existence checks (return None when hardware unavailable)
  • Test Plan: MUST verify both WITH and WITHOUT hardware:
    • Run affected tests on cluster WITH hardware
    • Verify collection succeeds on cluster WITHOUT hardware

CRITICAL WORKFLOW COMPLETION RULES:

When responding to this test execution plan request, you MUST follow these rules EXACTLY:

  1. YOUR ONLY DELIVERABLE: Post an inline review comment containing the test execution plan on the first changed line
  2. THEN STOP IMMEDIATELY - Do NOT generate any additional response
  3. FALLBACK ONLY: If submitting the review fails after retrying, post as a regular PR comment
  4. SILENCE = SUCCESS: After successfully submitting the review, your task is complete. No confirmation needed.

ABSOLUTE PROHIBITIONS (violating these creates empty/meaningless reviews):

  • ❌ Do NOT post acknowledgment messages like "Test execution plan posted", "Review posted successfully", "I've successfully posted"
  • ❌ Do NOT mention review IDs, URLs, or confirmation of posting in the PR thread
  • ❌ Do NOT add any follow-up comments after submitting the review
  • ❌ Do NOT reply to confirm task completion
  • ❌ Do NOT explain what you did - just do it and stop

Remember: The pull request review is visible to users. Additional comments are redundant noise.

CRITICAL: Before setting "Run smoke tests: True", you MUST use shell scripts to:
- Trace the actual fixture dependency chain from smoke tests to changed fixtures
- Verify that smoke tests actually import/use changed utilities or functions
- Confirm the dependency path exists; do NOT assume based on scope or semantics
- Be conservative: Session-scoped fixtures or infrastructure-sounding names do NOT
automatically mean smoke test impact. Only flag smoke test impact when you can
demonstrate a concrete dependency path.
WRONG: "This session-scoped storage fixture might affect smoke tests"
RIGHT: "Smoke test X uses fixture Y, which depends on the changed fixture Z"

Comment thread JOB_INSIGHT_PROMPT.md
@coderabbitai
Contributor

coderabbitai Bot commented May 4, 2026

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the list of changed files and their first changed lines
gh pr view 4732 --repo RedHatQE/openshift-virtualization-tests --json files -q '.files[].path'

Length of output: 22


🏁 Script executed:

#!/bin/bash
# Get the diff to find the first changed line
gh pr diff 4732 --repo RedHatQE/openshift-virtualization-tests | head -100

Length of output: 5455


🏁 Script executed:

#!/bin/bash
# Get the PR head commit SHA
gh pr view 4732 --repo RedHatQE/openshift-virtualization-tests --json headRefOid -q '.headRefOid'

Length of output: 41


🏁 Script executed:

#!/bin/bash
# Post an inline review comment on the first changed line of JOB_INSIGHT_PROMPT.md
gh api \
  --method POST \
  -H "Accept: application/vnd.github+json" \
  /repos/RedHatQE/openshift-virtualization-tests/pulls/4732/comments \
  -f commit_id="850008d76e68128f4896f6dbeee594f0d536e04f" \
  -f path="JOB_INSIGHT_PROMPT.md" \
  -f line=44 \
  -f side="RIGHT" \
  -f body="## Test Execution Plan

- **Run smoke tests: False**

> **Rationale:** This PR modifies only \`JOB_INSIGHT_PROMPT.md\`, a markdown AI-analysis prompt file. No Python source files, pytest fixtures, test utilities, or \`conftest.py\` files were changed. There is no dependency path from any smoke test to this file. No test execution is required to validate this change."

Length of output: 676


🏁 Script executed:

#!/bin/bash
# Use -F for integer fields
gh api \
  --method POST \
  -H "Accept: application/vnd.github+json" \
  /repos/RedHatQE/openshift-virtualization-tests/pulls/4732/comments \
  -f commit_id="850008d76e68128f4896f6dbeee594f0d536e04f" \
  -f path="JOB_INSIGHT_PROMPT.md" \
  -F line=44 \
  -f side="RIGHT" \
  -f body="## Test Execution Plan

- **Run smoke tests: False**

> **Rationale:** This PR modifies only \`JOB_INSIGHT_PROMPT.md\`, a markdown AI-analysis prompt file. No Python source files, pytest fixtures, test utilities, or \`conftest.py\` files were changed. There is no dependency path from any smoke test to this file. No test execution is required to validate this change."

Length of output: 3297
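
The retry above changes the `line` parameter from `-f` to `-F`. Per the `gh api` documentation, `-f/--raw-field` always sends the value as a string, while `-F/--field` converts literal `true`, `false`, `null`, and integer values to their JSON types. The Python sketch below only imitates that conversion to show how the request body differs; `typed_field` is a hypothetical helper and not part of `gh`.

```python
import json
import re


def typed_field(value: str):
    """Imitate typed-field conversion: true/false/null and integers become JSON types."""
    literals = {"true": True, "false": False, "null": None}
    if value in literals:
        return literals[value]
    if re.fullmatch(r"-?\d+", value):
        return int(value)
    return value  # anything else stays a string


# Body roughly as sent with `-f line=44` (line is a JSON string).
raw_body = json.dumps({"line": "44", "side": "RIGHT"})
# Body roughly as sent with `-F line=44` (line is a JSON integer).
typed_body = json.dumps({"line": typed_field("44"), "side": "RIGHT"})
```

The only difference between the two bodies is whether `line` is quoted, which is consistent with the script comment "Use -F for integer fields".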


1 similar comment

dshchedr
dshchedr previously approved these changes May 4, 2026
@rnetser
Collaborator Author

rnetser commented May 5, 2026

/verified cancel

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@JOB_INSIGHT_FAILURE_HISTORY_ANALYSIS_PROMPT.md`:
- Around line 1-7: The markdown headings are indented and the file lacks a
top-level heading at the start; remove leading spaces and left-align all
headings so they begin at column 0, convert the intended main title to a
top-level heading "# Pre-Classification Check: Did a Previous Test Break the
Cluster?" (remove the surrounding HTML comment block so the title is real), and
ensure the subsequent heading "## MANDATORY: Before classifying any failure,
answer this question first:" and the other flagged headings (the ones matching
the same heading texts elsewhere) are also left-aligned with no leading
indentation to satisfy MD041/MD023 lint rules.
- Around line 35-40: Update the teardown-root-cause guidance so the root
teardown classification can be CODE ISSUE, PRODUCT BUG, or INFRASTRUCTURE:
modify the sentence under "The test whose teardown failed" to include
INFRASTRUCTURE as a valid classification (e.g., "Classify based on why its
teardown failed — CODE ISSUE, PRODUCT BUG, or INFRASTRUCTURE"), and ensure the
later rule for "All other tests that failed after it" still instructs using the
same classification and reason format ("Caused by [test_name] teardown failure —
[resource] was not reverted.") so INFRASTRUCTURE is propagated the same way as
the other labels.

In `@JOB_INSIGHT_PROMPT.md`:
- Around line 44-46: The PR description is missing required PR template headers
causing validation failures; update the PR body to restore the mandatory
sections exactly as specified: add the headers "##### What this PR does / why we
need it:", "##### Which issue(s) this PR fixes:", "##### Special notes for
reviewer:", and "##### jira-ticket:" (each present even if empty), and populate
"What this PR does / why we need it:" with meaningful content summarizing the
change; ensure these exact header strings appear in the PR description so the
repo's PR Template Validation rules recognize them.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: d8c15a18-a5b3-44af-a58b-b1d543a692e3

📥 Commits

Reviewing files that changed from the base of the PR and between 267f965 and f34bb96.

📒 Files selected for processing (2)
  • JOB_INSIGHT_FAILURE_HISTORY_ANALYSIS_PROMPT.md
  • JOB_INSIGHT_PROMPT.md

Comment on lines +1 to +7
<!-- To be used with https://github.com/myk-org/rootcoz
Complements the main analysis prompt — history-aware classification.
-->

# Pre-Classification Check: Did a Previous Test Break the Cluster?

## MANDATORY: Before classifying any failure, answer this question first:
Contributor

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

LOW: Fix markdown heading structure to satisfy lint and rendering consistency

Static analysis flags heading formatting (MD041, MD023): headings are indented and the first line is not a top-level heading. Left-aligned headings improve predictable rendering and keep CI lint clean.

Proposed fix
-   # Pre-Classification Check: Did a Previous Test Break the Cluster?
+# Pre-Classification Check: Did a Previous Test Break the Cluster?
...
-   ## MANDATORY: Before classifying any failure, answer this question first:
+## MANDATORY: Before classifying any failure, answer this question first:
...
-   ## How to Check
+## How to Check
...
-   ## Classification
+## Classification
...
-   ## When This Check Does NOT Apply
+## When This Check Does NOT Apply

Also applies to: 14-14, 33-33, 42-42

🧰 Tools
🪛 markdownlint-cli2 (0.22.1)

[warning] 1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)


[warning] 5-5: Headings must start at the beginning of the line

(MD023, heading-start-left)


[warning] 7-7: Headings must start at the beginning of the line

(MD023, heading-start-left)

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@JOB_INSIGHT_FAILURE_HISTORY_ANALYSIS_PROMPT.md` around lines 1 - 7, The
markdown headings are indented and the file lacks a top-level heading at the
start; remove leading spaces and left-align all headings so they begin at column
0, convert the intended main title to a top-level heading "# Pre-Classification
Check: Did a Previous Test Break the Cluster?" (remove the surrounding HTML
comment block so the title is real), and ensure the subsequent heading "##
MANDATORY: Before classifying any failure, answer this question first:" and the
other flagged headings (the ones matching the same heading texts elsewhere) are
also left-aligned with no leading indentation to satisfy MD041/MD023 lint rules.
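
A minimal sketch of the left-alignment part of this fix, assuming the headings are ATX style (`#` prefixes) and that content inside fenced code blocks should be left alone. The function name `left_align_headings` is hypothetical, and the sketch strips any amount of leading whitespace, which is a simplification of how Markdown treats deeply indented lines.

```python
import re

# Leading spaces or tabs followed by 1-6 '#' characters and whitespace.
HEADING = re.compile(r"^[ \t]+(#{1,6}\s)")
# Opening or closing fence made of three or more backticks or tildes.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")


def left_align_headings(markdown: str) -> str:
    """Strip leading whitespace from ATX headings, leaving fenced code blocks untouched."""
    in_fence = False
    fixed = []
    for line in markdown.splitlines():
        if FENCE.match(line):
            in_fence = not in_fence
        elif not in_fence:
            line = HEADING.sub(r"\1", line)
        fixed.append(line)
    # splitlines() drops the final newline, so restore it when the input had one.
    return "\n".join(fixed) + ("\n" if markdown.endswith("\n") else "")
```

This addresses MD023 only; satisfying MD041 still depends on what the first line of the file is.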

Comment on lines +35 to +40
- **The test whose teardown failed**: Classify based on why its teardown failed —
CODE ISSUE if the cleanup logic is wrong, PRODUCT BUG if the product blocked
the revert.
- **All other tests that failed after it**: Use the **same classification** as the
root-cause test. In the reason, state: "Caused by [test_name] teardown failure —
[resource] was not reverted."
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: Root teardown classification still excludes INFRASTRUCTURE

Lines 35–37 limit the failed-teardown root cause to CODE ISSUE or PRODUCT BUG, which conflicts with the new 3-label model. Teardown can fail due to cluster/lab outages too, and forcing a binary classification there would undo the behavior this PR adds.

Proposed fix
-- **The test whose teardown failed**: Classify based on why its teardown failed —
-  CODE ISSUE if the cleanup logic is wrong, PRODUCT BUG if the product blocked
-  the revert.
+- **The test whose teardown failed**: Classify based on why its teardown failed —
+  `CODE ISSUE` if cleanup logic is wrong, `PRODUCT BUG` if product behavior blocks
+  revert under valid conditions, or `INFRASTRUCTURE` if cluster/lab conditions
+  (API outage, node failure, missing operators, storage/network outage) prevent cleanup.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@JOB_INSIGHT_FAILURE_HISTORY_ANALYSIS_PROMPT.md` around lines 35 - 40, Update
the teardown-root-cause guidance so the root teardown classification can be CODE
ISSUE, PRODUCT BUG, or INFRASTRUCTURE: modify the sentence under "The test whose
teardown failed" to include INFRASTRUCTURE as a valid classification (e.g.,
"Classify based on why its teardown failed — CODE ISSUE, PRODUCT BUG, or
INFRASTRUCTURE"), and ensure the later rule for "All other tests that failed
after it" still instructs using the same classification and reason format
("Caused by [test_name] teardown failure — [resource] was not reverted.") so
INFRASTRUCTURE is propagated the same way as the other labels.
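
The propagation rule described above can be sketched as follows. This is an illustration only, assuming a simple in-memory model: the `Verdict` dataclass and `propagate_teardown_verdict` are hypothetical names, and only the reason format is taken from the quoted prompt text.

```python
from dataclasses import dataclass


@dataclass
class Verdict:
    test_name: str
    classification: str  # CODE ISSUE, PRODUCT BUG, or INFRASTRUCTURE
    reason: str


def propagate_teardown_verdict(
    root: Verdict, resource: str, downstream_tests: list[str]
) -> list[Verdict]:
    """Give every downstream failure the root test's classification and the quoted reason format."""
    # \u2014 is the dash character used in the prompt's reason format.
    reason = f"Caused by {root.test_name} teardown failure \u2014 {resource} was not reverted."
    return [Verdict(name, root.classification, reason) for name in downstream_tests]
```

Because the downstream verdicts copy `root.classification` verbatim, an `INFRASTRUCTURE` root cause is propagated the same way as the other two labels.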

Comment thread JOB_INSIGHT_PROMPT.md
Comment on lines +44 to +46
Your goal is to classify each failure as `CODE ISSUE`, `PRODUCT BUG`, or
`INFRASTRUCTURE` based on the available evidence. Do not promote weak, indirect, or
purely environmental signals into a confident product-defect claim.
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

HIGH: Restore required PR template sections in the PR description

The PR body does not include the required template headers, especially ##### What this PR does / why we need it: (with meaningful content), plus ##### Which issue(s) this PR fixes:, ##### Special notes for reviewer:, and ##### jira-ticket:. Please restore them to satisfy repo policy and reviewer workflow expectations.

As per coding guidelines: "## PR Template Validation ... Required sections (must be present, even if empty)."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@JOB_INSIGHT_PROMPT.md` around lines 44 - 46, The PR description is missing
required PR template headers causing validation failures; update the PR body to
restore the mandatory sections exactly as specified: add the headers "##### What
this PR does / why we need it:", "##### Which issue(s) this PR fixes:", "#####
Special notes for reviewer:", and "##### jira-ticket:" (each present even if
empty), and populate "What this PR does / why we need it:" with meaningful
content summarizing the change; ensure these exact header strings appear in the
PR description so the repo's PR Template Validation rules recognize them.
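
A minimal sketch of a presence check for the four headers listed above. The repository's actual PR Template Validation logic is not shown in this thread, so `missing_template_headers` is a hypothetical helper; it only checks that each exact header string appears on its own line and does not judge whether the content under it is meaningful.

```python
REQUIRED_HEADERS = (
    "##### What this PR does / why we need it:",
    "##### Which issue(s) this PR fixes:",
    "##### Special notes for reviewer:",
    "##### jira-ticket:",
)


def missing_template_headers(pr_body: str) -> list[str]:
    """Return the required headers that do not appear as a line in the PR body."""
    lines = {line.strip() for line in pr_body.splitlines()}
    return [header for header in REQUIRED_HEADERS if header not in lines]
```

An empty result means all four headers are present, including any that were left empty.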


## Classification

- **The test whose teardown failed**: Classify based on why its teardown failed —
Contributor

Don't we need to add INFRASTRUCTURE here? A teardown can fail due to infrastructure as well.

Comment thread JOB_INSIGHT_PROMPT.md
4. **Prefer direct evidence over wrapper location.**
File paths like `tests/`, `utilities/`, `libs/`, and `conftest.py` are useful clues,
but they are not verdicts. Those modules often wrap product, cluster, or node state.
5. **Separate test-owned, product-owned, and environment-owned problems.**
Contributor

INFRASTRUCTURE has no entry in the decision order. Per this PR's logic, it should.

Comment thread JOB_INSIGHT_PROMPT.md
- NodeNetworkConfigurationPolicy status:
`oc get nncp -o yaml`

### For environmental blockers, suggest collecting
Contributor

Suggested change
### For `INFRASTRUCTURE`, suggest collecting

?

8 participants