kaizen: AgentRunnerの失敗分類について、command_missing/timeout/rate_limitedの未網羅ケースを追加しました。 (#76) by s-hiraoku · Pull Request #86 · kaizen-agents-org/builder-agent

s-hiraoku · 2026-07-05T01:44:32Z

Closes #76

Summary

AgentRunnerの失敗分類について、command_missing/timeout/rate_limitedの未網羅ケースを追加しました。
fallbackOnが対象分類を含む場合のフォールバック成功と、含まない場合の停止を検証しています。

Builder notes

検証: npm test / npm run validate:json / 共有skillファイル存在確認がすべて成功。保護パスの変更なし。

Provider evidence:

codex: exitCode=0, status=selected, failureClass=none, fallbackReason=none, payloadSource=last-message
Selected backend: codex
Final payload source: last-message

Verification

npm test
npm run validate:json
test -f skills/gh-link-issue-pr/SKILL.md && test -f skills/kaizen-bug-router/SKILL.md && test -f skills/pr-guardian/SKILL.md

Verifier

verifier: open_pr_with_warning
summary: Open PR with warning and 2 should_fix item(s); risk is medium.
notes: evidence_grade=reported
risk=medium
confidence=60
should_fix=Verification output contains a non-blocking risk signal.; Verification output contains a non-blocking risk signal.

Kaizen risk policy

Verifier cleared PR with warning: Open PR with warning and 2 should_fix item(s); risk is medium.

Changed files: 1
Changed lines: 143

Summary by CodeRabbit

Tests
- Expanded fallback coverage for failed command execution, timeouts, and rate-limiting scenarios.
- Added validation for both allowed fallback handling and blocked failure outcomes, improving confidence in CLI behavior under common failure conditions.

…ケースを追加しました。 (#76)

coderabbitai · 2026-07-05T01:44:40Z

Warning

Review limit reached

@s-hiraoku, you've reached your PR review limit, so we couldn't start this review.

Next review available in: 53 minutes

Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available.
You're only billed for reviews past your plan's rate limits ($0.25/file).

How can I continue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews.

How do review limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window.

Please refer docs for additional details.

Review details

⚙️ Run configuration

Configuration used: Repository: kaizen-agents-org/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: b4b8ac38-689d-4ae9-a639-fc5f9de5562c

📥 Commits

Reviewing files that changed from the base of the PR and between fdcdbcb and aba2d84.

📒 Files selected for processing (1)

test/agents/AgentRunner.test.ts

📝 Walkthrough

Walkthrough

This PR adds test coverage for three previously untested failure classification categories (command_missing, timeout, rate_limited) in the builder-agent test suite, including a shared fixture helper and CLI tests validating fallback behavior based on fallbackOn configuration.

Changes

Failure Classification Fallback Tests

Layer / File(s)	Summary
Failure case matrix and fixture runner `test/builder-agent.test.js`	Introduces `failureClassificationCases` covering command_missing (ENOENT, "command not found"), timeout (two raw variants), and rate_limited (429, rate limit reached, too many requests, quota exceeded), each with expected evidence regexes. Adds `runFailureClassificationFixture` to set up a temp workspace, optionally omit the provider command, run the CLI with a Hermes provider and `fallbackOn` list, and return parsed `build-result.json`, stdout, and any thrown error.
CLI tests validating classification-driven fallback `test/builder-agent.test.js`	Adds tests iterating over `failureClassificationCases` to verify `fixed` outcome with fallback+selection evidence when `fallbackOn` includes the failure class, versus `blocked` outcome with non-zero exit and stopped evidence when `fallbackOn` is empty.

Estimated code review effort: 2 (Simple) | ~10 minutes

Possibly related PRs

kaizen-agents-org/builder-agent#30: Both PRs modify builder-agent fallback tests around fallbackOn-gated behavior for classified provider failures and recorded fallback/evidence fields.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title matches the main change: adding missing AgentRunner failure-classification test coverage for command_missing, timeout, and rate_limited cases.
Linked Issues check	✅ Passed	The added tests cover the requested command_missing, timeout, and rate_limited variants and verify fallbackOn behavior in both fallback and stop cases.
Out of Scope Changes check	✅ Passed	The changes stay within the test file and focus on the linked classification coverage without introducing unrelated code or public API changes.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch kaizen/issue-76-add-missing-test-coverage-for-command-missing-ti

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 371d2fc43d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…r-command-missing-ti

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/builder-agent.test.js`:
- Around line 1072-1102: The two `it()` blocks in `builder-agent.test.js` repeat
the same `failureClassificationCases` loop and most assertions, so factor the
shared scaffolding into a helper near `runFailureClassificationFixture` or the
test block. Keep the helper focused on the common evidence checks and pass in
the expected outcome so the fallback-on and fallback-off cases only differ by
their specific assertions.
- Around line 1427-1431: The temp directory created in
runFailureClassificationFixture is never removed, so each fixture run leaks
files. Add cleanup for the mkdtemp-created dir in
runFailureClassificationFixture, ideally using a try/finally around the fixture
setup and execution so the directory is always deleted even if the test fails.
Use the existing runFailureClassificationFixture helper and its
dir/binDir/resultPath setup to locate the cleanup point.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: kaizen-agents-org/coderabbit/.coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro Plus

Run ID: 3c64e066-b193-4bea-b6c3-3729f195d549

📥 Commits

Reviewing files that changed from the base of the PR and between 185c9c6 and fdcdbcb.

📒 Files selected for processing (1)

test/builder-agent.test.js

…issing-test-coverage-for-command-missing-ti # Conflicts: # test/builder-agent.test.js

s-hiraoku · 2026-07-05T04:11:05Z

pr-guardian follow-up completed.

Resolved the merge conflict by merging current main into this branch and moving the Add missing test coverage for command_missing/timeout/rate_limited failure classification #76 fallback classification coverage into the split TypeScript test suite.
Addressed the outstanding CodeRabbit threads: shared evidence assertions are factored through assertClassifiedProviderEvidence, and the fixture temp directory is cleaned up with try/finally + rm.
Verified locally with npm test and npm run validate:json.
GitHub checks are passing: CI test and CodeRabbit.
Review threads are resolved; PR state is CLEAN / MERGEABLE.

kaizen: AgentRunnerの失敗分類について、command_missing/timeout/rate_limitedの未網羅…

371d2fc

…ケースを追加しました。 (#76)

s-hiraoku mentioned this pull request Jul 5, 2026

Add missing test coverage for command_missing/timeout/rate_limited failure classification #76

Closed

chatgpt-codex-connector Bot reviewed Jul 5, 2026

View reviewed changes

Comment thread test/builder-agent.test.js Outdated

s-hiraoku added 2 commits July 5, 2026 11:26

test: assert fallback success exits cleanly

75c2aa7

Merge branch 'main' into kaizen/issue-76-add-missing-test-coverage-fo…

fdcdbcb

…r-command-missing-ti

coderabbitai Bot reviewed Jul 5, 2026

View reviewed changes

Comment thread test/builder-agent.test.js Outdated

Comment thread test/builder-agent.test.js Outdated

s-hiraoku added 2 commits July 5, 2026 13:08

Merge remote-tracking branch 'origin/main' into kaizen/issue-76-add-m…

ddab3ed

…issing-test-coverage-for-command-missing-ti # Conflicts: # test/builder-agent.test.js

test: clean up fallback classification fixtures

aba2d84

s-hiraoku merged commit 84a2967 into main Jul 5, 2026
2 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

kaizen: AgentRunnerの失敗分類について、command_missing/timeout/rate_limitedの未網羅ケースを追加しました。 (#76)#86

kaizen: AgentRunnerの失敗分類について、command_missing/timeout/rate_limitedの未網羅ケースを追加しました。 (#76)#86
s-hiraoku merged 5 commits into
mainfrom
kaizen/issue-76-add-missing-test-coverage-for-command-missing-ti

s-hiraoku commented Jul 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented Jul 5, 2026 •

edited

Loading

Review limit reached

Walkthrough

Changes

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

s-hiraoku commented Jul 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

s-hiraoku commented Jul 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Builder notes

Verification

Verifier

Kaizen risk policy

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jul 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review limit reached

Walkthrough

Changes

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

s-hiraoku commented Jul 5, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

s-hiraoku commented Jul 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jul 5, 2026 •

edited

Loading