Skip to content

Enforce VM size minimum during node selection#875

Open
simple-agent-manager[bot] wants to merge 10 commits intomainfrom
sam/possible-opinion-bug-terms-01kqje
Open

Enforce VM size minimum during node selection#875
simple-agent-manager[bot] wants to merge 10 commits intomainfrom
sam/possible-opinion-bug-terms-01kqje

Conversation

@simple-agent-manager
Copy link
Copy Markdown
Contributor

@simple-agent-manager simple-agent-manager Bot commented May 1, 2026

Summary

  • Treat requested VM size as a minimum capacity during node reuse.
  • Allow larger nodes to satisfy smaller requests, while preventing smaller nodes from satisfying larger requests.
  • Add shared VM size satisfaction helper, behavioral selector/TaskRunner regression tests, task tracking, post-mortem, and a process rule update.

Validation

  • pnpm lint
  • pnpm typecheck
  • pnpm test
  • Additional validation run (if applicable)

Additional validation:

  • pnpm --filter @simple-agent-manager/shared test -- vm-sizes
  • pnpm --filter @simple-agent-manager/api test -- durable-objects/task-runner-node-selection services/node-selector
  • pnpm --filter @simple-agent-manager/shared typecheck
  • pnpm --filter @simple-agent-manager/api typecheck
  • git diff --check

Staging Verification (REQUIRED for all code changes — merge-blocking)

  • Staging deployment greenDeploy Staging workflow triggered manually and passed for this branch
  • Live app verified via Playwright — staging workflow smoke-tests logged into app.sammy.party and passed
  • Existing workflows confirmed working — staging workflow smoke-tests passed against the live app
  • New feature/fix verified on staging — deployment/smoke path verified on staging; VM size minimum behavior verified by behavioral regression tests rather than live VM provisioning
  • Infrastructure verification completed — N/A: no infrastructure paths changed
  • Mobile and desktop verification notes added for UI changes — N/A: no runtime UI changes

Staging Verification Evidence

  • Deploy Staging workflow passed: https://github.com/raphaeltm/simple-agent-manager/actions/runs/25232215432
  • Deploy to Cloudflare job passed.
  • Staging smoke-tests job passed.
  • The staging workflow deployed code commit 2f89ce1a; later commits 148fdd36, af7412f1, and 95097497 only refreshed PR/task evidence and added a backlog deferral for audit persistence.
  • Local one-off browser smoke reached token-login successfully, then stopped because this workspace does not have the Playwright Chromium binary installed. The GitHub staging smoke-tests job is the browser-backed live staging evidence.
  • Full live VM provisioning was not run to avoid staging node quota/cost; the VM-size behavior is covered by behavioral tests for helper semantics, standalone selector warm/existing reuse, and TaskRunner preferred/warm/existing reuse.

UI Compliance Checklist (Required for UI changes)

N/A: no runtime UI changes. packages/ui/tests/ButtonGroup.test.tsx was updated only to tolerate equivalent CSS zero-value serialization (0 / 0px).

End-to-End Verification (Required for multi-component changes)

  • Data flow traced from user input to final outcome with code path citations (see .claude/rules/10-e2e-verification.md)
  • Capability test exercises the complete happy path across system boundaries
  • All spec/doc assumptions about existing behavior verified against code (not just "read the code")
  • If any gap exists between automated test coverage and full E2E, manual verification steps documented below

Data Flow Trace

  1. Project/task VM size resolves before TaskRunner start:
    apps/api/src/routes/tasks/submit.ts resolves explicit task value, then agent profile value, then project.defaultVmSize, then platform default.
  2. TaskRunner receives the resolved minimum size:
    apps/api/src/services/task-runner-do.ts:startTaskRunnerDO() forwards TaskRunConfig.vmSize.
  3. Preferred-node reuse rejects undersized nodes:
    apps/api/src/durable-objects/task-runner/node-steps.ts:handleNodeSelection() reads nodes.vm_size and calls canSatisfyVmSize(node.vm_size, state.config.vmSize).
  4. Warm-node reuse rejects undersized nodes:
    apps/api/src/durable-objects/task-runner/node-steps.ts:tryClaimWarmNode() filters warm candidates with canSatisfyVmSize.
  5. Existing-node reuse rejects undersized nodes:
    apps/api/src/durable-objects/task-runner/node-steps.ts:findNodeWithCapacity() filters running candidates with canSatisfyVmSize.
  6. Standalone selector follows the same rule:
    apps/api/src/services/node-selector.ts:selectNodeForTaskRun() filters warm and existing candidates with canSatisfyVmSize.

Untested Gaps

Full live VM provisioning was not run to avoid staging node quota/cost. Automated coverage exercises helper semantics, standalone selector warm/existing paths, and TaskRunner preferred/warm/existing paths with mixed VM sizes.

Post-Mortem (Required for bug fix PRs)

What broke

Node reuse treated requested VM size as a soft preference. A larger request could reuse a smaller node when no exact-size node was selected first.

Root cause

Standalone and TaskRunner node selection sorted by exact VM-size match but did not reject undersized candidates before reuse.

Class of bug

Selection logic treated a hard compatibility constraint as a ranking preference.

Why it wasn't caught

Existing tests covered helper-like behavior and source structure, but did not execute mixed-size node selection paths with incompatible candidates.

Process fix included in this PR

.claude/rules/10-e2e-verification.md now requires behavioral tests for compatibility constraints in selection logic, including incompatible candidates that would otherwise rank well.

Post-mortem file

docs/notes/2026-05-01-vm-size-minimum-selection-postmortem.md

Specialist Review Evidence (Required for agent-authored PRs)

  • All dispatched reviewers completed and findings addressed before merge
  • If any reviewer did NOT complete: needs-human-review label added and merge deferred to human — resolved after successful retry
Reviewer Status Outcome
cloudflare-specialist PASS No blocking findings; D1/DO usage remains parameterized and no migration/binding changes required.
constitution-validator ADDRESSED Initial duplicate VM_SIZE_RANK finding fixed by deriving ordering from DEFAULT_VM_SIZE_VCPUS; re-check passed.
test-engineer ADDRESSED Initial behavioral coverage findings fixed with standalone selector and TaskRunner tests; re-check passed.
task-completion-validator PASS Retry found one uncovered research finding about audit persistence; fixed by adding tasks/backlog/2026-05-01-persist-task-requested-vm-size.md and linking the explicit deferral from the active task. Final re-check passed with no blocking findings.

Exceptions (If any)

  • Scope: UI visual audit
  • Rationale: only UI test assertion changed; no runtime UI component/source change.
  • Expiration: This PR only.

Agent Preflight (Required)

  • Preflight completed before code changes

Classification

  • external-api-change
  • cross-component-change
  • business-logic-change
  • public-surface-change
  • docs-sync-change
  • security-sensitive-change
  • ui-change
  • infra-change

External References

N/A: no external API behavior changed. Investigation used repo code and staging D1 state.

Codebase Impact Analysis

Affected paths:

  • packages/shared/src/constants/vm-sizes.ts
  • apps/api/src/services/node-selector.ts
  • apps/api/src/durable-objects/task-runner/node-steps.ts
  • API/shared tests for VM size minimum semantics
  • Task tracking and bug-fix process docs

Documentation & Specs

Updated:

  • tasks/active/2026-05-01-vm-size-minimum-selection.md
  • docs/notes/2026-05-01-vm-size-minimum-selection-postmortem.md
  • .claude/rules/10-e2e-verification.md

Constitution & Risk Check

Checked Principle XI. Initial review found a duplicate hardcoded VM-size rank table; fixed by deriving ordering from the existing DEFAULT_VM_SIZE_VCPUS source of truth.

@simple-agent-manager simple-agent-manager Bot added the needs-human-review Agent could not complete all review gates — human must approve before merge label May 1, 2026
@simple-agent-manager simple-agent-manager Bot removed the needs-human-review Agent could not complete all review gates — human must approve before merge label May 1, 2026
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 1, 2026

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant