Add environment feasibility gates #70

Merged
NianJiuZst merged 9 commits into NianJiuZst:main from kesmeey:feat/environment-feasibility-gates
Jun 17, 2026

Conversation

@kesmeey kesmeey commented Jun 15, 2026

Contributor

Add environment feasibility gates

kesmeey and others added 5 commits June 12, 2026 18:41
Introduce cross-platform hardware and toolchain detection as a
non-blocking capability layer. The environment is collected
asynchronously early in the agent flow so it can later power
feasibility gates without blocking issue discovery or scoring.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Detect virtualized environments (Hyper-V, VirtualBox, VMware,
KVM, QEMU, Parallels, WSL), Docker containers, and CI platforms
(GitHub Actions, GitLab CI, Jenkins, etc.) as part of the local
capability profile.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
kesmeey commented Jun 15, 2026

Contributor Author

Summary

  • add local environment capability detection for OS, tools, hardware, virtualization, and services
  • add scout feasibility hints to downrank risky issues before selection
  • add LLM execution feasibility gate before patch generation with full/partial/static-only/blocked outcomes
  • update tests and fixtures for the new feasibility flow

Validation

  • bunx tsc --noEmit
  • bun run typecheck
  • bun test test/agent-orchestrator.test.ts test/agent-run.test.ts test/init-orchestrator.test.ts test/machine-agent.test.ts test/machine-commands.test.ts test/scout-targeting.test.ts test/feasibility-hints.test.ts test/llm.test.ts test/contracts.test.ts
  • bun run build
  • bunx biome check src/infra/environment.ts src/orchestration/analyze.ts test/agent-orchestrator.test.ts test/agent-run.test.ts test/init-orchestrator.test.ts test/machine-agent.test.ts test/machine-commands.test.ts test/scout-targeting.test.ts

@NianJiuZst

Owner

Thanks for your contribution. But you need to provide more information about this PR.

kesmeey commented Jun 15, 2026

Contributor Author

> Thanks for your contribution. But you need to provide more information about this PR.

This PR adds a local environment detection layer so the agent can understand what the current
machine is capable of before selecting and executing issues.

The main motivation is to avoid discovering infeasible tasks too late. For example, some issues require CUDA/GPU,
Docker, browser E2E, mobile SDKs, native build tools, or local services. Without an environment profile, the agent may
only realize the machine cannot validate or run the task after it has already selected the issue and started working
on it.

This PR introduces:

  • Local capability detection for OS, CPU, memory, GPU/CUDA, disks, common tools, package managers, Docker, browsers,
    databases/services, WSL, VM/container, and CI context.
  • A scout-time feasibility hint layer that uses repo/issue signals plus local capabilities to softly adjust issue
    ranking. This is intentionally conservative and does not hard-block issues during ranking.
  • An execution-time feasibility gate before patch generation. This gate uses LLM + repo context + local environment
    context to decide whether the issue can proceed as full, partial, static-only, or should be blocked.
  • Tests covering the new feasibility contracts, LLM prompt path, and scout feasibility behavior.
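As a rough illustration of the CI, WSL, and container part of that detection, here is a minimal sketch. It is not the code from this PR: the `RuntimeContext` shape and the `detectRuntimeContext` name are hypothetical. The environment variables checked (`GITHUB_ACTIONS`, `GITLAB_CI`, `JENKINS_URL`, `CI`, `WSL_DISTRO_NAME`) are ones those platforms set, and the presence of `/.dockerenv` is a common but imperfect heuristic for Docker.

```typescript
// Hypothetical profile fragment; the PR's real types are not shown in this thread.
interface RuntimeContext {
  ci: string | null; // detected CI platform, or null when none is recognized
  isWsl: boolean;
  isDocker: boolean;
}

type Env = Record<string, string | undefined>;

// Pure function: the caller passes in the environment and the result of the
// "/.dockerenv exists" check, so nothing here touches the real machine.
function detectRuntimeContext(env: Env, dockerEnvFileExists: boolean): RuntimeContext {
  let ci: string | null = null;
  if (env["GITHUB_ACTIONS"] === "true") ci = "github-actions";
  else if (env["GITLAB_CI"] === "true") ci = "gitlab-ci";
  else if (env["JENKINS_URL"]) ci = "jenkins";
  else if (env["CI"] === "true") ci = "unknown-ci"; // generic fallback

  return {
    ci,
    isWsl: Boolean(env["WSL_DISTRO_NAME"]),
    isDocker: dockerEnvFileExists,
  };
}
```

In a Node or Bun process this could be called with `process.env` and the result of `existsSync("/.dockerenv")`. Keeping the function pure makes it easy to cover with fixtures.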

The intended workflow is:

  1. Detect local environment capabilities.
  2. Use lightweight scout hints to downrank risky issues before selection.
  3. After an issue is selected but before patching, run a stricter feasibility assessment.
  4. Proceed only if the agent has a reasonable validation path, or fall back to partial/static-only modes when
    appropriate.
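Steps 2 and 4 of this workflow can be sketched roughly as follows. Only the four outcome names (full, partial, static-only, blocked) come from this PR. The `scoutPenalty` and `planForOutcome` helpers, the 5-point penalty, and the mapping from outcome to validation mode are assumptions for illustration. The real step 3 is LLM-based and is not modeled here.

```typescript
// Outcome names are from the PR summary; everything else is hypothetical.
type FeasibilityOutcome = "full" | "partial" | "static-only" | "blocked";

interface LocalCapabilities {
  gpu: boolean;
  docker: boolean;
  browser: boolean;
}
type Requirement = keyof LocalCapabilities;

// Step 2: a soft penalty per missing capability. It lowers the ranking score
// but never removes an issue from the candidate list.
function scoutPenalty(required: Requirement[], caps: LocalCapabilities): number {
  const missing = required.filter((r) => !caps[r]).length;
  return 5 * missing; // assumed weight, not taken from the PR
}

interface ExecutionPlan {
  generatePatch: boolean;
  validation: "full-suite" | "targeted" | "static-checks" | "none";
}

// Step 4: map the gate's outcome to what the agent is allowed to do next.
function planForOutcome(outcome: FeasibilityOutcome): ExecutionPlan {
  switch (outcome) {
    case "full":
      return { generatePatch: true, validation: "full-suite" };
    case "partial":
      return { generatePatch: true, validation: "targeted" };
    case "static-only":
      return { generatePatch: true, validation: "static-checks" };
    case "blocked":
      return { generatePatch: false, validation: "none" };
  }
}
```

Under these assumptions, an issue that needs a GPU on a machine without one is only downranked at scout time. Whether it proceeds is decided later, when the stricter gate returns one of the four outcomes.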

This should make the agent more reliable on machines with limited hardware or missing runtime dependencies, especially
for projects involving GPU, Docker, browser E2E, native builds, or external services.

@NianJiuZst

Owner

This is a PR worthy of merging. Let me take a moment to review it first.

@NianJiuZst NianJiuZst left a comment

Owner

I think there is still one threshold mismatch to fix in the preset flow before this lands.

In src/orchestration/agent.ts the preset gate still checks rankedIssues[0]?.opportunity.overallScore >= config.automation.minMatchScore (both in the machine path and the interactive path), but issueRankingService.selectIssueForAutomation() now evaluates candidates with the feasibility-adjusted score via getAdjustedOverallScore().

That means a preset issue can become selectable after feasibility adjustment, but the flow never reaches selectIssueForAutomation() because presetIssueFlowAllowed is decided from the pre-adjustment score and short-circuits to repository-analysis fallback instead. A simple example would be an issue at 73 that gets adjusted to 76 against a threshold of 75.

Can we make this gate use the same adjusted-score semantics as selectIssueForAutomation() (or defer the threshold decision to that method entirely) so preset mode actually honors the new feasibility ranking?
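To make the mismatch concrete, here is a small sketch of the 73 / 76 / 75 example. The `RankedIssue` shape and the additive `adjustedScore` helper are assumptions standing in for the real opportunity type and `getAdjustedOverallScore()`, whose signatures are not shown in this thread.

```typescript
// Hypothetical stand-ins for the real ranking types.
interface RankedIssue {
  overallScore: number; // raw score before feasibility hints
  feasibilityAdjustment: number; // assumed additive adjustment
}

// Stand-in for getAdjustedOverallScore(); the additive model is an assumption.
function adjustedScore(issue: RankedIssue): number {
  return issue.overallScore + issue.feasibilityAdjustment;
}

const minMatchScore = 75;
const presetIssue: RankedIssue = { overallScore: 73, feasibilityAdjustment: 3 };

// Gate as described in the review: compares the raw score, so it fails.
const gateOnRawScore = presetIssue.overallScore >= minMatchScore;

// Gate aligned with the selection logic: compares the adjusted score, so it passes.
const gateOnAdjustedScore = adjustedScore(presetIssue) >= minMatchScore;

console.log({ gateOnRawScore, gateOnAdjustedScore });
```

With the raw-score gate the preset flow falls back before selection ever runs, even though selection would have accepted the issue at 76.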

@NianJiuZst

Owner

@kesmeey

kesmeey commented Jun 17, 2026

Contributor Author

> I think there is still one threshold mismatch to fix in the preset flow before this lands.

> In src/orchestration/agent.ts the preset gate still checks rankedIssues[0]?.opportunity.overallScore >= config.automation.minMatchScore (both in the machine path and the interactive path), but issueRankingService.selectIssueForAutomation() now evaluates candidates with the feasibility-adjusted score via getAdjustedOverallScore().

> That means a preset issue can become selectable after feasibility adjustment, but the flow never reaches selectIssueForAutomation() because presetIssueFlowAllowed is decided from the pre-adjustment score and short-circuits to repository-analysis fallback instead. A simple example would be an issue at 73 that gets adjusted to 76 against a threshold of 75.

> Can we make this gate use the same adjusted-score semantics as selectIssueForAutomation() (or defer the threshold decision to that method entirely) so preset mode actually honors the new feasibility ranking?

Thanks, good catch. I updated the preset issue gate in both the machine and interactive paths to use the same
feasibility-adjusted score semantics as selectIssueForAutomation().

I also updated preset issue aggregation to sort/dedupe by adjusted score, so multi-repo presets do not accidentally
put a lower adjusted-score issue first. Added regression coverage for both interactive and machine flows where a raw
73 issue becomes selectable after adjustment to 76 against a 75 threshold.
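The sort/dedupe behavior described here might look roughly like the sketch below. The `PresetIssue` shape and the `aggregatePresetIssues` name are hypothetical and are not taken from the PR's diff. The idea is to key each issue by repository and number, keep the copy with the highest adjusted score, and order the result by adjusted score.

```typescript
// Hypothetical shape for an issue collected from one of several preset repos.
interface PresetIssue {
  repo: string;
  number: number;
  rawScore: number;
  adjustedScore: number; // feasibility-adjusted score
}

function aggregatePresetIssues(issues: PresetIssue[]): PresetIssue[] {
  const best = new Map<string, PresetIssue>();
  for (const issue of issues) {
    const key = `${issue.repo}#${issue.number}`;
    const seen = best.get(key);
    // Keep only the highest adjusted-score copy of each issue.
    if (!seen || issue.adjustedScore > seen.adjustedScore) {
      best.set(key, issue);
    }
  }
  // Highest adjusted score first, regardless of raw score.
  return Array.from(best.values()).sort((a, b) => b.adjustedScore - a.adjustedScore);
}
```

Sorting on the adjusted score means an issue with raw 73 and adjusted 76 is placed ahead of one with raw 80 and adjusted 70, which matches the ordering the selection step uses.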

@NianJiuZst

Owner

Thanks for your contributions. Merged.

@NianJiuZst NianJiuZst merged commit 135b4a0 into NianJiuZst:main Jun 17, 2026
1 check passed
2 participants