
feat: add /qualify AI qualification workflow #474

Open
myakove wants to merge 10 commits into main from feature/qualify-workflow


@myakove
Collaborator

@myakove myakove commented May 6, 2026

Summary

Adds the /qualify AI qualification workflow: an automated end-to-end pipeline (with the human checkpoints listed below) that takes a feature design doc or bug report and produces verified tests with cluster proof.

What's Included

Prompt Template

  • llm/qualify/prompts/qualify.md: Main /qualify command that orchestrates all five phases (Phase 0 through Phase 4)

Agents

  • llm/qualify/agents/test-planner.md — Reads feature/bug docs → produces structured test plans
  • llm/qualify/agents/cluster-verifier.md — Independently verifies OpenShift cluster state after test execution

Skill

  • llm/qualify/skills/proof-generator/SKILL.md — Assembles proof.md reports with test results + cluster evidence

Templates

  • llm/qualify/templates/test-plan-template.md — Test plan skeleton
  • llm/qualify/templates/proof-template.md — Proof report skeleton

Documentation

  • llm/qualify/README.md — Full usage guide with setup instructions for pi, Claude Code, Cursor, and other AI CLIs
  • llm/qualify/workflow-diagrams.md — Mermaid flowcharts (workflow, components, sequence diagram)

Workflow Overview

/qualify --type feature --source <url> --cluster ~/kubeconfig
  1. Phase 0: Parse args, validate cluster, collect versions (OCP/MTV/CNV)
  2. Phase 1: AI reads source → produces test plan → human reviews
  3. Phase 2: AI writes tests → runs on real cluster → cluster-verifier independently validates
  4. Phase 3: 3 parallel code reviewers → pre-commit → PR
  5. Phase 4: Generates self-contained proof.md with verdict (QUALIFIED / NOT QUALIFIED / BUG FIXED / BUG NOT FIXED)
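
The argument contract of Phase 0 can be sketched in Python. This is only an illustration: the real /qualify command is a prompt interpreted by an AI CLI, not a Python program. The flag names come from this PR; which flags are required, the `possible_verdicts` helper, and the pairing of verdicts with `--type` values are assumptions, and the source URL is a placeholder.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in this PR."""
    parser = argparse.ArgumentParser(prog="qualify")
    parser.add_argument("--type", choices=["feature", "bug"], required=True)
    parser.add_argument("--source", required=True, help="design doc or bug report")
    # Shown as optional because the sequence diagram writes it as [--cluster].
    parser.add_argument("--cluster", type=Path, help="path to a kubeconfig file")
    parser.add_argument("--name", help="optional name for the output directory")
    return parser


def possible_verdicts(kind: str) -> tuple:
    """Verdict pair Phase 4 could emit for a given --type (assumed mapping)."""
    if kind == "feature":
        return ("QUALIFIED", "NOT QUALIFIED")
    return ("BUG FIXED", "BUG NOT FIXED")


# Mirrors the invocation shown above, with a placeholder URL.
args = build_parser().parse_args(
    ["--type", "feature", "--source", "https://example.com/design-doc",
     "--cluster", "~/kubeconfig"]
)
kubeconfig = args.cluster.expanduser()  # resolve "~" before handing it to oc/pytest
```

Restricting `--type` with `choices` lets the parser reject unknown values before any cluster access is attempted.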

Human Checkpoints

  • Test plan review (Phase 1)
  • Bug: permanent test or verify-only? (Phase 0)
  • AI stuck after 3 retries (Phase 2)
  • PR review (Phase 3)

Do NOT auto-merge. This needs human review.

Summary by CodeRabbit

  • New Features

    • Introduces an end-to-end "qualify" workflow to generate tests, run them on a real cluster when explicitly invoked, perform independent cluster verification, and produce a self-contained qualification proof.
  • Documentation

    • Adds comprehensive guides, agent/skill and prompt specifications, test-plan and proof templates, workflow diagrams, CLI argument details, and operational/setup instructions; clarifies real-cluster pytest runs only occur with explicit user invocation and credentials.
  • Chores

    • Updated ignore rules to exclude qualification output artifacts.


@coderabbitai
Contributor

coderabbitai Bot commented May 6, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2fb66760-8857-463a-8958-3847b39fc252

📥 Commits

Reviewing files that changed from the base of the PR and between f83f6ab and f382175.

📒 Files selected for processing (1)
  • llm/qualify/prompts/qualify.md

Walkthrough

Adds an end-to-end /qualify workflow: orchestrator prompt, agent and skill specifications (test-planner, cluster-verifier, proof-generator), templates, workflow diagrams, docs, and a .qualify/ gitignore entry for generated artifacts.

Changes

AI Qualification Workflow for MTV API Tests

| Layer | File(s) | Summary |
| --- | --- | --- |
| Configuration and Policy | .gitignore, CLAUDE.md | Adds .qualify/ to .gitignore and documents that /qualify may run pytest on a real cluster when explicitly invoked with cluster credentials (pointer to llm/qualify/README.md). |
| Documentation Overview | README.md, llm/qualify/README.md | Adds the /qualify command overview, five-phase sequence, CLI args (--type, --source, --cluster, --name), usage examples, human checkpoints, AI-CLI setup instructions, expected directory structure, outputs, and prerequisites. |
| Workflow Architecture | llm/qualify/workflow-diagrams.md | Adds detailed Mermaid diagrams and sequence flows describing Phases 0–4, bug verify-only branching, component relationships, and key takeaways. |
| Main Orchestration Prompt | llm/qualify/prompts/qualify.md | Defines Phase 0 (arg parsing, cluster validation, version collection), Phase 1 (test-plan generation with human approval), Phase 2 (test writing, pytest execution, cluster verification with retry loop and verify-only path), Phase 3 (code review, pre-commit, PR creation), Phase 4 (proof assembly and verdict), and critical gating rules. |
| Test Planner Agent | llm/qualify/agents/test-planner.md | Specifies Phase 1 agent: required inputs/prereqs, exact test-plan.md structure (overview, scenarios, step→test mappings, expected outcomes, verification points, VM config, tests_params, pytest markers), quality checklist, and output behavior. |
| Templates | llm/qualify/templates/test-plan-template.md, llm/qualify/templates/proof-template.md | Adds Markdown templates for test-plan and proof reports: placeholders for metadata, prerequisites, scenarios, test configuration (tests_params), pytest marker guidance, test result tables with full output blocks, cluster verification evidence sections, and risk assessment. |
| Cluster Verifier Agent | llm/qualify/agents/cluster-verifier.md | Specifies independent oc-based verification: connectivity gating (oc whoami/oc cluster-info), mandatory version capture (OCP/MTV/CNV), migration verification checklist (VM readiness, disks/DVs/PVCs, networks, StorageMap/NetworkMap, Plans/Migrations), per-check evidence capture rules, structured Markdown report, bug verification mode (BUG FIXED / BUG NOT FIXED), and failure handling. |
| Proof Generator Skill | llm/qualify/skills/proof-generator/SKILL.md | Defines exact proof.md template and rules: Summary, Environment (versions), Test Execution Results (collapsible pytest output), Cluster Verification (verification table + raw evidence), Qualification Decision/verdict wording, evidence redaction rules, and output path conventions. |
| Orchestration: Write & Verify | llm/qualify/prompts/qualify.md | Describes delegation to test writers, running pytest on a real cluster (permanent test vs verify-only), logging pytest outputs, invoking cluster-verifier, retry/escalation loops, and branch to PR creation only for permanent tests. |

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant Orchestrator
  participant TestPlannerAgent
  participant PythonWriter
  participant RealCluster
  participant ClusterVerifier
  participant ProofGenerator
  participant Git
  User->>Orchestrator: invoke /qualify --type --source [--cluster]
  Orchestrator->>TestPlannerAgent: generate test-plan.md from source
  Orchestrator->>User: request human approval of test-plan.md
  Orchestrator->>PythonWriter: write tests & config (branch for permanent tests)
  PythonWriter->>RealCluster: run pytest (capture output -> test-output.log)
  RealCluster-->>Orchestrator: pytest stdout/stderr + exit code
  Orchestrator->>ClusterVerifier: run independent cluster verification
  ClusterVerifier-->>Orchestrator: cluster verification report
  Orchestrator->>ProofGenerator: assemble proof.md (tests + verification + versions)
  Orchestrator->>Git: create PR (permanent-test path) with proof reference

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add /qualify AI qualification workflow' clearly and concisely summarizes the main change: introducing a new end-to-end AI-driven qualification pipeline for the MTV project. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@rh-bot-1

rh-bot-1 commented May 6, 2026

Report bugs in Issues

Welcome! 🎉

This pull request will be automatically processed with the following features:

🔄 Automatic Actions

  • Reviewer Assignment: Reviewers are automatically assigned based on the OWNERS file in the repository root
  • Size Labeling: PR size labels (XS, S, M, L, XL, XXL) are automatically applied based on changes
  • Issue Creation: Disabled for this repository
  • Branch Labeling: Branch-specific labels are applied to track the target branch
  • Auto-verification: Auto-verified users have their PRs automatically marked as verified
  • Labels: All label categories are enabled (default configuration)

📋 Available Commands

PR Status Management

  • /wip - Mark PR as work in progress (adds WIP: prefix to title)
  • /wip cancel - Remove work in progress status
  • /hold - Block PR merging (approvers only)
  • /hold cancel - Unblock PR merging
  • /verified - Mark PR as verified
  • /verified cancel - Remove verification status
  • /reprocess - Trigger complete PR workflow reprocessing (useful if webhook failed or configuration changed)
  • /regenerate-welcome - Regenerate this welcome message

Review & Approval

  • /lgtm - Approve changes (looks good to me)
  • /approve - Approve PR (approvers only)
  • /automerge - Enable automatic merging when all requirements are met (maintainers and approvers only)
  • /assign-reviewers - Assign reviewers based on OWNERS file
  • /assign-reviewer @username - Assign specific reviewer
  • /check-can-merge - Check if PR meets merge requirements

Testing & Validation

  • /retest tox - Run Python test suite with tox
  • /retest build-container - Rebuild and test container image
  • /retest conventional-title - Validate commit message format
  • /retest all - Run all available tests

Container Operations

  • /build-and-push-container - Build and push container image (tagged with PR number)
    • Supports additional build arguments: /build-and-push-container --build-arg KEY=value

Cherry-pick Operations

  • /cherry-pick <branch> - Schedule cherry-pick to target branch when PR is merged
    • Multiple branches: /cherry-pick branch1 branch2 branch3

Label Management

  • /<label-name> - Add a label to the PR
  • /<label-name> cancel - Remove a label from the PR

✅ Merge Requirements

This PR will be automatically approved when the following conditions are met:

  1. Approval: /approve from at least one approver
  2. Status Checks: All required status checks must pass
  3. No Blockers: No wip, hold, has-conflicts labels and PR must be mergeable (no conflicts)
  4. Verified: PR must be marked as verified

📊 Review Process

Approvers and Reviewers

Approvers:

  • myakove
  • solenoci

Reviewers:

  • krcmarik
  • myakove
  • solenoci
Available Labels
  • hold
  • verified
  • wip
  • lgtm
  • approve
  • automerge
AI Features
  • Conventional Title: Mode: fix (claude/claude-opus-4-6[1m])
  • Cherry-Pick Conflict Resolution: Enabled (claude/claude-opus-4-6[1m])
  • Test Oracle: Triggers: approved (cursor/gpt-5.4-xhigh-fast); /test-oracle can be used anytime

💡 Tips

  • WIP Status: Use /wip when your PR is not ready for review
  • Verification: The verified label is removed on new commits unless the push is detected as a clean rebase
  • Cherry-picking: Cherry-pick labels are processed when the PR is merged
  • Container Builds: Container images are automatically tagged with the PR number
  • Permission Levels: Some commands require approver permissions
  • Auto-verified Users: Certain users have automatic verification and merge privileges

For more information, please refer to the project documentation or contact the maintainers.

@rh-bot-1

Clean rebase detected — no code changes compared to previous head (e42aac5).

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/qualify/agents/cluster-verifier.md`:
- Around line 146-150: Add mandatory redaction rules to the evidence collection
section that currently instructs to store "The full output" and raw evidence
into proof.md: update the cluster-verifier.md checklist to require automatic
redaction before persisting raw logs (e.g., mask secrets, API keys, tokens,
emails, IPs, and YAML anchors), provide a short canonical redaction policy and
example regex patterns, and add a one-liner command or reference to the
redaction utility to run prior to saving evidence so any step that calls out
"The full output" or writes to proof.md enforces redaction.
- Line 35: The document currently has contradictory failure semantics ("stop
immediately" vs "continue checking other items"); update the verification
guidance to clearly distinguish connectivity/authentication failures from
individual check failures: state that failure of any cluster-level connectivity
or authentication check (the sentence currently saying "stop immediately") must
abort the entire verification and report no-trust, whereas non-connectivity
per-check failures should be logged and verification should continue to collect
all failures (the area currently saying "continue checking other items"); change
the two conflicting sentences so the first explicitly names
"connectivity/authentication checks" as abort conditions and the later paragraph
(around the per-check rules) explicitly documents that per-check failures do not
abort but are aggregated as partial failures, and ensure the same clarified rule
text replaces the existing lines referenced in the doc.

In `@llm/qualify/prompts/qualify.md`:
- Around line 136-139: The doc has a conflict between the temporary test
location and the verify-only execution command: update the verify-only example
to run pytest directly against the temporary test file path
(/tmp/qualify-<name>/<test_file>.py) instead of using the repo test selector
(tests/<path>::<TestClass>); change the command shown (the "uv run pytest ..."
example) to point to /tmp/qualify-<name>/<test_file>.py, keep the same pytest
flags (--tc-file, --tc-format, -p no:xdist), and continue piping output to
.qualify/<type>/<name>/test-output.log so verify-only remains isolated from repo
state.

In `@llm/qualify/skills/proof-generator/SKILL.md`:
- Around line 130-132: Update SKILL.md's evidence policy for proof.md so raw
YAML/logs are required to be sanitized before inclusion: change the bullet that
mandates raw evidence in collapsible <details> to require "sanitized, redacted
evidence" and add a short checklist in SKILL.md/proof.md that instructs
redacting secrets/PII (API keys, tokens, passwords, private IPs, certs),
replacing values with placeholders like <REDACTED> and documenting what was
removed; ensure the collapsible details still include enough context (file
names, non-sensitive fields, and diffs) and add a statement on how to mark
omitted sections (e.g., "[…redacted…]") so reviewers know evidence was
intentionally redacted.
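
The redaction step requested in these comments could look roughly like the following Python sketch. The `<REDACTED>` placeholder comes from the review text; the regex patterns, the `redact` helper, and the sample input are illustrative assumptions, not the project's redaction utility, and a real policy would need review against actual cluster output.

```python
import re

# Illustrative patterns only; order matters (Bearer tokens are handled first).
REDACTION_PATTERNS = [
    # Bearer tokens in headers or logs
    (re.compile(r"(?i)\bbearer\s+[A-Za-z0-9._~+/-]+=*"), "Bearer <REDACTED>"),
    # key: value or key=value pairs whose key looks sensitive
    (re.compile(r"(?i)\b(password|passwd|token|secret|api[_-]?key)(\s*[:=]\s*)\S+"),
     r"\1\2<REDACTED>"),
    # E-mail addresses
    (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), "<REDACTED>"),
    # Private IPv4 ranges (10/8, 172.16/12, 192.168/16)
    (re.compile(
        r"\b(?:10\.\d{1,3}\.\d{1,3}\.\d{1,3}"
        r"|192\.168\.\d{1,3}\.\d{1,3}"
        r"|172\.(?:1[6-9]|2\d|3[01])\.\d{1,3}\.\d{1,3})\b"), "<REDACTED>"),
]


def redact(text: str) -> tuple:
    """Return (sanitized_text, number_of_replacements)."""
    total = 0
    for pattern, replacement in REDACTION_PATTERNS:
        text, count = pattern.subn(replacement, text)
        total += count
    return text, total


sample = (
    "password: hunter2\n"
    "host: 10.0.0.5\n"
    "owner: dev@example.com\n"
    "Authorization: Bearer abc.def"
)
clean, replaced = redact(sample)
```

Returning the replacement count makes it possible to document in proof.md how much was removed. Quoted values containing spaces and multi-line secrets such as certificates are not covered by these patterns.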

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b340adfa-9cf1-4110-860e-e91ae4f91673

📥 Commits

Reviewing files that changed from the base of the PR and between 922e9a8 and 077b62d.

📒 Files selected for processing (15)
  • .gitignore
  • CLAUDE.md
  • README.md
  • llm/qualify/README.md
  • llm/qualify/agents/.gitkeep
  • llm/qualify/agents/cluster-verifier.md
  • llm/qualify/agents/test-planner.md
  • llm/qualify/prompts/.gitkeep
  • llm/qualify/prompts/qualify.md
  • llm/qualify/skills/proof-generator/.gitkeep
  • llm/qualify/skills/proof-generator/SKILL.md
  • llm/qualify/templates/.gitkeep
  • llm/qualify/templates/proof-template.md
  • llm/qualify/templates/test-plan-template.md
  • llm/qualify/workflow-diagrams.md

Comment thread llm/qualify/agents/cluster-verifier.md Outdated
Comment thread llm/qualify/agents/cluster-verifier.md Outdated
Comment thread llm/qualify/prompts/qualify.md
Comment thread llm/qualify/skills/proof-generator/SKILL.md
@myakove
Collaborator Author

myakove commented May 11, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 11, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/qualify/prompts/qualify.md`:
- Around line 42-53: After the "Collect environment versions" block in
qualify.md, add explicit failure-handling instructions: if any of the three oc
commands (OCP, MTV, CNV) fail (missing CSV/namespace or RBAC denied), record
that specific version as `UNKNOWN` and capture the command's error message;
state that the proof report will include these `UNKNOWN` entries and that
missing versions will cause proof-generator/SKILL.md's qualification logic to
mark the report NOT QUALIFIED, mirroring the behavior described in
cluster-verifier.md.
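
A minimal Python sketch of the `UNKNOWN` plus diagnostics pattern described in this comment. The `collect_version` helper and its result shape are assumptions; the demo uses the Python interpreter as a stand-in command so the sketch runs without a cluster, and the printed version string is a placeholder.

```python
import subprocess
import sys


def collect_version(component: str, command: list, timeout: int = 60) -> dict:
    """Run one version command; on any failure record UNKNOWN plus the error text."""
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Binary missing, not executable, or the call hung.
        return {"component": component, "version": "UNKNOWN", "diagnostics": str(exc)}

    output = result.stdout.strip()
    if result.returncode != 0 or not output:
        diagnostics = result.stderr.strip() or (
            f"exit code {result.returncode}, empty output"
        )
        return {"component": component, "version": "UNKNOWN", "diagnostics": diagnostics}
    return {"component": component, "version": output, "diagnostics": ""}


# Stand-in commands; against a real cluster each command would be an `oc` call.
ok = collect_version("OCP", [sys.executable, "-c", "print('4.x.y')"])
failed = collect_version(
    "MTV", [sys.executable, "-c", "import sys; sys.exit('forbidden')"]
)
```

Treating empty output as a failure avoids recording a blank version when a query succeeds but matches nothing. Whether an `UNKNOWN` entry forces a NOT QUALIFIED verdict is decided downstream by the proof generator, as the comment describes.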

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b19bc28d-78cd-4c42-8328-1589dd033318

📥 Commits

Reviewing files that changed from the base of the PR and between 077b62d and 6453247.

📒 Files selected for processing (3)
  • llm/qualify/agents/cluster-verifier.md
  • llm/qualify/prompts/qualify.md
  • llm/qualify/skills/proof-generator/SKILL.md

Comment thread llm/qualify/prompts/qualify.md Outdated
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/qualify/prompts/qualify.md`:
- Around line 68-71: Add explicit bug-ID extraction/validation in Phase 0
immediately after argument parsing: when type=bug, parse the --source (e.g.,
Jira/GitHub URLs or issue text) to extract a canonical bug identifier and use
that for the `.qualify/bugs/<id>/` directory; if --name is provided, validate it
against accepted patterns (e.g., JIRA-\d+, BZ-\d+, #\d+) and reject or prompt
when it does not match the extracted ID; if extraction fails and --name is
absent or invalid, prompt the user to supply a bug ID via --name and abort
directory creation until a valid ID is given so `.qualify/bugs/<id>/` always
contains a proper bug identifier.
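
The extraction and validation logic requested here might be sketched as below. The accepted shapes follow the examples in the comment (`JIRA-\d+`, `BZ-\d+`, `#\d+`), generalized to any uppercase project key, which is an assumption. The helper names, URLs, and issue numbers are hypothetical. Rejecting a `--name` that disagrees with the extracted ID is one reading of the comment; the follow-up commit message describes `--name` as an override instead.

```python
import re
from typing import Optional

# Hypothetical patterns; "[A-Z][A-Z0-9]+-\d+" covers keys such as BZ-123.
JIRA_KEY = re.compile(r"\b([A-Z][A-Z0-9]+-\d+)\b")
GITHUB_ISSUE = re.compile(r"github\.com/[^/\s]+/[^/\s]+/(?:issues|pull)/(\d+)")
HASH_REF = re.compile(r"(?<!\w)#(\d+)\b")
VALID_ID = re.compile(r"[A-Z][A-Z0-9]+-\d+|#\d+")


def extract_bug_id(source: str) -> Optional[str]:
    """Return a canonical bug ID found in a URL or free text, else None."""
    match = GITHUB_ISSUE.search(source)
    if match:
        return f"#{match.group(1)}"
    match = JIRA_KEY.search(source)
    if match:
        return match.group(1)
    match = HASH_REF.search(source)
    if match:
        return f"#{match.group(1)}"
    return None


def resolve_bug_id(source: str, name: Optional[str] = None) -> str:
    """Pick the ID for .qualify/bugs/<id>/ or raise so the caller can prompt."""
    extracted = extract_bug_id(source)
    if name is not None:
        if not VALID_ID.fullmatch(name):
            raise ValueError(f"--name {name!r} is not a recognised bug ID")
        if extracted is not None and extracted != name:
            raise ValueError(f"--name {name!r} does not match {extracted!r}")
        return name
    if extracted is None:
        raise ValueError("no bug ID found in --source; pass one via --name")
    return extracted
```

Raising instead of guessing keeps directory creation blocked until a valid ID exists, which is the behaviour the comment asks for.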

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1bd31b3f-26b8-41c3-afcc-cf2e657b0de0

📥 Commits

Reviewing files that changed from the base of the PR and between f1af204 and 252c98e.

📒 Files selected for processing (15)
  • .gitignore
  • CLAUDE.md
  • README.md
  • llm/qualify/README.md
  • llm/qualify/agents/.gitkeep
  • llm/qualify/agents/cluster-verifier.md
  • llm/qualify/agents/test-planner.md
  • llm/qualify/prompts/.gitkeep
  • llm/qualify/prompts/qualify.md
  • llm/qualify/skills/proof-generator/.gitkeep
  • llm/qualify/skills/proof-generator/SKILL.md
  • llm/qualify/templates/.gitkeep
  • llm/qualify/templates/proof-template.md
  • llm/qualify/templates/test-plan-template.md
  • llm/qualify/workflow-diagrams.md

Comment thread llm/qualify/prompts/qualify.md Outdated
Add explicit step in Phase 0 to extract canonical bug identifiers
from --source URLs (Jira/GitHub patterns), support --name override,
and prompt user on extraction failure.
@myakove
Collaborator Author

myakove commented May 19, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@myakove
Collaborator Author

myakove commented May 19, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@llm/qualify/prompts/qualify.md`:
- Around line 68-78: Update the docs and any referenced paths to consistently
use the normalized bug identifier placeholder (<id>) for bug flows instead of
<name>: ensure step 4 clearly states that --name, if provided, overrides
extraction and that the extracted/provided bug ID is normalized using the same
slug rules from step 1 (lowercase, safe chars, max 63) and then used as the <id>
in the output path `.qualify/bugs/<id>/`; change all occurrences that currently
reference `.qualify/<type>/<name>/...` in the bug flow (the instances noted that
read/write paths) to `.qualify/bugs/<id>/...` and update any descriptive text
referencing `--source`/`--name` handling so they all point to the single
canonical identifier variable `<id>`.
- Around line 49-53: The OCP version step currently runs a direct `oc get
clusterversion` call and lacks the failure pattern used for MTV/CNV; update the
OCP version collection so it captures raw stdout/stderr and, on any non-zero
exit, records the version as `UNKNOWN` plus the diagnostics/error output (same
pattern used for MTV/CNV collection). Locate the "OCP version" command in the
"Collect environment versions" section and change its invocation to capture and
save both output and error, and ensure the resulting recorded entry follows the
`UNKNOWN + diagnostics` format used by the MTV/CNV entries.
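
The normalization rule mentioned in the first comment (lowercase, safe characters, at most 63 characters) and the single canonical output path can be sketched as follows. The exact safe-character set, the `features` directory name, and both helper names are assumptions; only `.qualify/bugs/<id>/` is stated in the review.

```python
import re
from pathlib import Path

MAX_KEY_LENGTH = 63  # limit quoted in the review comment


def normalize_key(raw: str) -> str:
    """Lowercase, collapse unsafe characters to '-', and cap the length."""
    slug = re.sub(r"[^a-z0-9]+", "-", raw.lower()).strip("-")
    slug = slug[:MAX_KEY_LENGTH].rstrip("-")
    if not slug:
        raise ValueError(f"{raw!r} contains no usable characters")
    return slug


def artifact_dir(kind: str, raw_key: str, root: Path = Path(".qualify")) -> Path:
    """Build the one directory every phase reads from and writes to."""
    # "bugs" appears in the review; "features" is a guess for the other branch.
    subdir = {"bug": "bugs", "feature": "features"}[kind]
    return root / subdir / normalize_key(raw_key)


bug_path = artifact_dir("bug", "MTV-1234")  # hypothetical bug ID
```

Deriving every path from one function is a simple way to avoid the `<name>` versus `<id>` drift the comment points out.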

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7c131416-be93-4d26-a2ef-62aa8f4b9024

📥 Commits

Reviewing files that changed from the base of the PR and between 252c98e and f83f6ab.

📒 Files selected for processing (1)
  • llm/qualify/prompts/qualify.md

Comment thread llm/qualify/prompts/qualify.md Outdated
Comment thread llm/qualify/prompts/qualify.md Outdated
- Add UNKNOWN+diagnostics pattern for OCP version retrieval,
  matching the existing MTV/CNV error-capture approach
- Introduce artifact_key concept (name for features, id for bugs)
  and unify all downstream path references to use it consistently
@myakove
Collaborator Author

myakove commented May 19, 2026

@coderabbitai review

@coderabbitai
Contributor

coderabbitai Bot commented May 19, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.


5 participants