diff --git a/AGENTS.md b/AGENTS.md index df71663..9e1c489 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -2,10 +2,19 @@ ## Project Structure & Module Organization - `src/`: core action runtime. - - `index.js`: orchestration entrypoint (PR file loading, planner/sub-agent loop, publish). - - `agents.js`: OpenAI Agents setup and structured-output execution. - - `diff-map.js`, `aggregate.js`, `globs.js`, `publish.js`, `config.js`: focused utility modules. -- `test/`: Node built-in test suite (`*.test.js`) for diff mapping, aggregation, globs, and publish helpers. + - `index.js`: orchestration entrypoint (PR file loading, planner/sub-agent parallel loop, publish). + - `agents.js`: planner and reviewer agent setup with structured-output schemas. + - `model-runtime.js`: AI SDK structured-output execution with repair retry. + - `provider.js`: multi-provider factory (OpenAI, Anthropic, Google, Mistral, OpenAI-compatible). + - `config.js`: action input parsing, validation, and defaults. + - `diff-map.js`: diff hunk parsing and inline comment line resolution. + - `aggregate.js`: finding normalization, deduplication, and sorting. + - `globs.js`: include/exclude glob filtering. + - `publish.js`: PR review and summary comment publishing with marker-based dedup. + - `inline-key.js`: stable inline comment key generation for cross-run dedup. + - `public-error.js`: error message sanitization for public-facing outputs. + - `repo-guidance.js`: project guidance loader (AGENTS.md / AGENT.md / CLAUDE.md). +- `test/`: Node built-in test suite (`*.test.js`) for all modules above. - `.github/workflows/`: CI and self-test workflows. - `examples/`: non-executed example workflow templates for consumers. - `action.yml`: public action contract (inputs/outputs/defaults). @@ -14,6 +23,7 @@ - `npm ci`: install locked dependencies exactly (same as CI). - `npm run check`: syntax check for `src/index.js`. - `npm test`: run all unit tests via `node --test`. +- `npm run test:schema-support`: run local compatibility check script. - `npm start`: run action entrypoint locally (requires GitHub Actions env context to be meaningful). Use Node `22` (see CI and `package.json` engines). @@ -27,13 +37,22 @@ Use Node `22` (see CI and `package.json` engines). - constants: `UPPER_SNAKE_CASE`. - Keep modules small and single-purpose; add brief comments only when logic is non-obvious. +## Architecture Notes +- Multi-provider support via AI SDK: OpenAI, Anthropic, Google, Mistral, and OpenAI-compatible endpoints. +- Parallel execution within each round: batches run concurrently via `Promise.allSettled` with a shared semaphore (`max_concurrency`); within each batch, `general` dimension runs first, then remaining dimensions run in parallel. +- Round-level execution remains serial (each round depends on updated `coverageState`). +- Budget pre-allocation with 2x multiplier to account for structured-output repair retries. + ## Testing Guidelines - Framework: Node built-in `node:test` + `node:assert/strict`. - Test files: `test/.test.js`. - Add/extend tests whenever changing: - diff line resolution (`LEFT`/`RIGHT`), - finding normalization/deduplication/sorting, - - publish/update marker behavior. + - publish/update marker behavior, + - config parsing or validation, + - provider creation or model instantiation, + - concurrency control (semaphore) behavior. - Run locally before PR: `npm run check && npm test`. ## Commit & Pull Request Guidelines @@ -48,6 +67,7 @@ Use Node `22` (see CI and `package.json` engines). - any behavior changes for fork PR/security/token handling. ## Security & Configuration Tips -- Required secrets/permissions for real runs: `OPENAI_API_KEY`, `pull-requests: write`, `issues: write`. +- Required secrets/permissions for real runs: `api_key` (or provider-specific env), `pull-requests: write`, `issues: write`. - `GITHUB_TOKEN` is runtime-injected by Actions; do not hardcode tokens. -- Keep `OPENAI_API_BASE` optional and compatibility-safe. +- Keep `api_base` optional and enforce HTTPS-only with hostname allowlist. +- Legacy `OPENAI_API_KEY` / `OPENAI_API_BASE` env vars are still supported as fallbacks. diff --git a/README.md b/README.md index 221e6ce..f3f65f8 100644 --- a/README.md +++ b/README.md @@ -2,12 +2,12 @@ [![Build and Test](https://github.com/TiyAgents/code-review-agent-action/actions/workflows/self-test-current-branch.yml/badge.svg)](https://github.com/TiyAgents/code-review-agent-action/actions/workflows/self-test-current-branch.yml) -Reusable GitHub Action for automated Pull Request code review using an OpenAI-compatible structured runtime. +Reusable GitHub Action for automated Pull Request code review with multi-provider AI support (OpenAI, Anthropic, Google, Mistral, OpenAI-compatible) via [AI SDK](https://sdk.vercel.ai/). This action: - Runs on `pull_request` events. - Reviews all changed files that match `include`/`exclude` filters. -- Uses planner + subagents (general/security/performance/testing) in multi-round batches for large diffs. +- Uses planner + subagents (general/security/performance/testing) in multi-round batches for large diffs, with **parallel batch and dimension execution** within each round. - Publishes: - one PR Review (`pulls.createReview`) with inline comments (`LEFT`/`RIGHT`), and - one updatable summary issue comment (marker-based, no spam). @@ -20,11 +20,14 @@ This action: Simple flow explanation: - `Planner` decides each round's batches under `max_rounds`, `max_model_calls`, and `max_files_per_batch`. - `SubAgent(general)` always runs first for each batch, and can dynamically request extra dimensions (`security/performance/testing`). +- Within each round, batches execute in parallel (controlled by `max_concurrency`); within each batch, remaining dimensions run in parallel after `general` completes. - All sub-agent outputs are aggregated, normalized, deduplicated, then mapped to inline-commentable diff lines. - The publisher writes one review + one updatable summary, with historical dedupe and best-effort outdated comment minimization. ## Features +- **Multi-provider AI support**: OpenAI, Anthropic, Google, Mistral, and OpenAI-compatible endpoints via AI SDK. +- **Parallel execution**: batches and dimensions within each round run concurrently, controlled by `max_concurrency` (default 4). Set `max_concurrency=1` for serial execution. - Full coverage target over filtered file set, including no-patch/binary files as file-level review entries. - Structured schema output validation with one repair retry. - Degradation mode: if structured output still fails after repair, posts summary-only with explicit reason. @@ -34,8 +37,6 @@ Simple flow explanation: - Stage 2: auto-minimize outdated historical inline comments (GraphQL best-effort). - Confidence/evidence gating and semantic deduplication to reduce repeated/low-quality findings. - Configurable review language via `review_language` (default `English`). -- Supports custom OpenAI base URL via `OPENAI_API_BASE` or `openai_api_base` input. -- Automatically falls back across `responses_json_schema`, `chat_json_schema`, `chat_json_object`, and prompt-only JSON modes for broader third-party compatibility. - Enforces `openai_api_base` safety: HTTPS only, no URL credentials, and hostname allowlist (default `api.openai.com`). - Automatically loads project guidance from `AGENTS.md`, `AGENT.md`, or `CLAUDE.md` (priority order) and passes it to review agents. - General-first routing: batch review starts with `general`, and only `general` can dynamically request extra dimensions for that batch. @@ -87,6 +88,7 @@ jobs: coverage_first_round_primary_only: true auto_minimize_outdated_comments: true max_rounds: 8 + max_concurrency: 4 max_model_calls: 128 # example override (default: 40) max_files_per_batch: 8 max_context_chars: 256000 # example override (default: 128000) @@ -110,7 +112,6 @@ jobs: | `exclude` | no | empty | Exclude globs (comma/newline separated) | | `planner_model` | no | `gpt-5.3-codex` | Planner model | | `reviewer_model` | no | `gpt-5.3-codex` | Subagent model | -| `llm_compatibility_mode` | no | `auto` | **Deprecated**: AI SDK handles compatibility automatically | | `review_dimensions` | no | `general,security,performance,testing` | Subagent dimensions | | `review_language` | no | `English` | Preferred language for review comments and summary | | `min_finding_confidence` | no | `0.72` | Keep only findings at or above this confidence (0-1) | @@ -119,6 +120,7 @@ jobs: | `coverage_first_round_primary_only` | no | `true` | Round 1 runs only primary dimension for faster file coverage | | `auto_minimize_outdated_comments` | no | `true` | Best-effort GraphQL minimize for outdated historical inline comments from this action | | `max_rounds` | no | `8` | Max planning/review rounds | +| `max_concurrency` | no | `4` | Max concurrent API calls within a round (batch + dimension parallelism) | | `max_model_calls` | no | `40` | Hard cap for model calls | | `max_files_per_batch` | no | `8` | Batch size cap | | `max_context_chars` | no | `128000` | Per-batch context cap | @@ -127,7 +129,7 @@ jobs: ## Budget Sizing (Rough Estimate) -This action spends model calls by **rounds × batches × dimensions**. +This action spends model calls by **rounds × batches × dimensions**. With parallel execution (`max_concurrency > 1`), wall-clock time decreases but total call count stays the same. Approximation: @@ -200,7 +202,6 @@ The script performs planner/reviewer checks across the supported compatibility m It also includes a seeded-bug probe (`bug_probe`) to gauge defect detection capability: - By default, bug probe is non-blocking (reported as PASS/FAIL). - Set `BUG_PROBE_REQUIRED=true` to make bug probe failure exit non-zero. -- For third-party providers, use the reported recommended mode as the safest explicit `llm_compatibility_mode` override. ## Implementation Notes diff --git a/action.yml b/action.yml index c09fa29..dff19f5 100644 --- a/action.yml +++ b/action.yml @@ -83,6 +83,10 @@ inputs: description: Auto-minimize outdated historical inline comments posted by this action. required: false default: "true" + max_concurrency: + description: Maximum concurrent API calls within a single round (controls batch and dimension parallelism). + required: false + default: "4" max_rounds: description: Maximum planning/review rounds. required: false diff --git a/src/config.js b/src/config.js index ae731cf..099da0c 100644 --- a/src/config.js +++ b/src/config.js @@ -171,6 +171,7 @@ function loadConfig() { coverageFirstRoundPrimaryOnly: parseBooleanInput('coverage_first_round_primary_only', true), autoMinimizeOutdatedComments: parseBooleanInput('auto_minimize_outdated_comments', true), maxRounds: parsePositiveIntInput('max_rounds', 8), + maxConcurrency: parsePositiveIntInput('max_concurrency', 4), maxModelCalls: parsePositiveIntInput('max_model_calls', 40), maxFilesPerBatch: parsePositiveIntInput('max_files_per_batch', 8), maxContextChars: parsePositiveIntInput('max_context_chars', 128000), diff --git a/src/index.js b/src/index.js index a37bc67..b7ced39 100644 --- a/src/index.js +++ b/src/index.js @@ -158,6 +158,36 @@ function getErrorMessage(error, fallback = 'unknown_error') { return String(error?.message || error || fallback); } +/** + * Create a simple concurrency limiter (semaphore). + * Returns { run(asyncFn) → Promise } that ensures at most `concurrency` + * async functions execute simultaneously. + */ +function createSemaphore(concurrency) { + let active = 0; + const queue = []; + + function tryNext() { + while (queue.length > 0 && active < concurrency) { + const { fn, resolve, reject } = queue.shift(); + active += 1; + Promise.resolve().then(fn).then( + (value) => { active -= 1; resolve(value); tryNext(); }, + (error) => { active -= 1; reject(error); tryNext(); } + ); + } + } + + return { + run(fn) { + return new Promise((resolve, reject) => { + queue.push({ fn, resolve, reject }); + tryNext(); + }); + } + }; +} + function shouldUseSummaryOnlyMode({ hadStructuredOutputFailure, hasSuccessfulReviewerOutput }) { return Boolean(hadStructuredOutputFailure && !hasSuccessfulReviewerOutput); } @@ -393,6 +423,256 @@ function formatSummaryMarkdown({ ].join('\n'); } +/** + * Run a single dimension review for a batch. Returns a result object or null on skip. + */ +async function runDimension({ + dimension, round, batchIndex, batchCount, batchFiles, reviewerAgents, + config, projectGuidance +}) { + const tag = `Round ${round} [B${batchIndex + 1}/${batchCount}] SubAgent(${dimension})`; + + const agent = reviewerAgents[dimension] || createReviewerAgent({ + dimension, + model: config.reviewerModel, + language: config.reviewLanguage, + projectGuidance + }); + reviewerAgents[dimension] = agent; + + const reviewInput = buildBatchReviewInput({ + dimension, + round, + batchFiles, + maxContextChars: config.maxContextChars, + availableDimensions: config.reviewDimensions + }); + + if (reviewInput.selectedPaths.length === 0) { + core.warning(`${tag}: selectedPaths=0 due to context budget, skipped.`); + return null; + } + + const agentStart = Date.now(); + core.info(`${tag} start: files=${reviewInput.selectedPaths.length}`); + + const reviewResult = await runStructuredWithRepair(agent, reviewInput.prompt, { + allowRepair: true, + maxTurns: 10 + }); + const elapsedMs = Date.now() - agentStart; + + if (!reviewResult.ok) { + const reviewerErrorMessage = getErrorMessage(reviewResult.error); + core.warning( + `${tag} failed: calls=${reviewResult.calls} elapsed_ms=${elapsedMs} error=${reviewerErrorMessage}` + ); + return { + ok: false, + dimension, + calls: reviewResult.calls, + degradedReason: `reviewer_structured_output_failed_round_${round}_${dimension}: ${sanitizePublicErrorDetail(reviewerErrorMessage)}` + }; + } + + const findingCount = (reviewResult.output.findings || []).length; + const fileConclusionCount = (reviewResult.output.fileConclusions || []).length; + const suggestionCount = (reviewResult.output.actionableSuggestions || []).length; + core.info( + `${tag} done: calls=${reviewResult.calls} elapsed_ms=${elapsedMs} findings=${findingCount} file_conclusions=${fileConclusionCount} suggestions=${suggestionCount}` + ); + + return { + ok: true, + dimension, + calls: reviewResult.calls, + selectedPaths: reviewInput.selectedPaths, + output: reviewResult.output + }; +} + +/** + * Execute a single batch with parallel dimension execution. + * general dimension runs first (to collect recommendedExtraDimensions), + * then remaining dimensions run in parallel via the shared semaphore. + */ +async function executeBatch({ + batch, batchIndex, batchCount, round, roundDimensions, + pendingFileByPath, reviewerAgents, config, projectGuidance, semaphore +}) { + const tag = `Round ${round} [B${batchIndex + 1}/${batchCount}]`; + + // Build result accumulator + const result = { + executed: false, + calls: 0, + reviewerCalls: 0, + subAgentRuns: 0, + hadStructuredOutputFailure: false, + hasSuccessfulReviewerOutput: false, + degradedReasons: [], + findings: [], + actionableSuggestions: [], + potentialRisks: [], + testSuggestions: [], + fileConclusions: [], + coveredPaths: [], + dimensionResults: [] + }; + + const batchFiles = batch.filePaths + .map((path) => pendingFileByPath.get(path)) + .filter(Boolean); + + if (batchFiles.length === 0) { + core.warning(`${tag}: no resolvable files after mapping, skipped.`); + return result; + } + + result.executed = true; + core.info(`${tag}: focus=${batch.focus} files=${batchFiles.length}`); + + // Build execution dimensions: general first + const executionDimensions = []; + if (roundDimensions.includes('general')) { + executionDimensions.push('general'); + } + for (const dimension of roundDimensions) { + if (dimension !== 'general') { + executionDimensions.push(dimension); + } + } + if (executionDimensions.length === 0 && roundDimensions.length > 0) { + executionDimensions.push(roundDimensions[0]); + } + + // Helper to process a dimension result into the batch accumulator + function collectDimensionResult(dimResult) { + if (!dimResult) return; + result.calls += dimResult.calls; + result.reviewerCalls += dimResult.calls; + result.subAgentRuns += 1; + + if (!dimResult.ok) { + result.hadStructuredOutputFailure = true; + result.degradedReasons.push(dimResult.degradedReason); + return; + } + + result.hasSuccessfulReviewerOutput = true; + result.dimensionResults.push({ + dimension: dimResult.dimension, + selectedPaths: dimResult.selectedPaths + }); + + for (const finding of dimResult.output.findings || []) { + result.findings.push({ + ...finding, + category: finding.category || dimResult.dimension, + sourceDimension: dimResult.dimension + }); + } + + for (const suggestion of dimResult.output.actionableSuggestions || []) { + result.actionableSuggestions.push(suggestion); + } + + for (const risk of dimResult.output.potentialRisks || []) { + result.potentialRisks.push(risk); + } + + for (const testSuggestion of dimResult.output.testSuggestions || []) { + result.testSuggestions.push(testSuggestion); + } + + for (const fc of dimResult.output.fileConclusions || []) { + result.fileConclusions.push({ + path: fc.path, + conclusion: fc.conclusion, + risks: fc.risks || [], + testSuggestions: fc.testSuggestions || [], + note: fc.note || '' + }); + } + + for (const p of dimResult.selectedPaths) { + result.coveredPaths.push(p); + } + } + + // Phase 1: run general dimension first (if present) to get extra dimension recommendations + const hasGeneral = executionDimensions[0] === 'general'; + let parallelDimensions = executionDimensions; + + if (hasGeneral) { + const generalResult = await runDimension({ + dimension: 'general', + round, + batchIndex, + batchCount, + batchFiles, + reviewerAgents, + config, + projectGuidance + }); + collectDimensionResult(generalResult); + + // Check for recommended extra dimensions from general + if (generalResult?.ok) { + const recommended = normalizeDimensionNames(generalResult.output.recommendedExtraDimensions || []) + .filter((x) => x !== 'general') + .filter((x) => config.reviewDimensions.includes(x)); + const appendable = recommended.filter((x) => !executionDimensions.includes(x)); + const recommendationReason = String(generalResult.output.recommendationReason || '').trim(); + + if (appendable.length > 0) { + executionDimensions.push(...appendable); + core.info( + `${tag}: general requested extra dimensions=${appendable.join(', ')}${recommendationReason ? ` reason=${recommendationReason}` : ''}` + ); + } else if (recommendationReason) { + core.info(`${tag}: general recommendation note=${recommendationReason}`); + } + } + + // Remaining dimensions (everything except general) + parallelDimensions = executionDimensions.filter((d) => d !== 'general'); + } + + // Phase 2: run remaining dimensions in parallel via semaphore + if (parallelDimensions.length > 0) { + const dimPromises = parallelDimensions.map((dimension) => + semaphore.run(() => runDimension({ + dimension, + round, + batchIndex, + batchCount, + batchFiles, + reviewerAgents, + config, + projectGuidance + })) + ); + + const dimSettled = await Promise.allSettled(dimPromises); + for (let i = 0; i < dimSettled.length; i += 1) { + const outcome = dimSettled[i]; + if (outcome.status === 'rejected') { + core.warning(`${tag} SubAgent(${parallelDimensions[i]}) unexpected error: ${getErrorMessage(outcome.reason)}`); + result.subAgentRuns += 1; + continue; + } + collectDimensionResult(outcome.value); + } + } + + if (!result.hasSuccessfulReviewerOutput && result.executed) { + core.warning(`${tag}: no successful sub-agent outputs.`); + } + + return result; +} + async function listPullFiles(octokit, owner, repo, pullNumber) { return octokit.paginate(octokit.rest.pulls.listFiles, { owner, @@ -619,187 +899,120 @@ async function runAction() { let roundProgress = false; - for (let batchIndex = 0; batchIndex < plannedBatches.length; batchIndex += 1) { - const batch = plannedBatches[batchIndex]; - if (modelCalls >= config.maxModelCalls) { - break; - } + // --- Parallel batch execution --- + const semaphore = createSemaphore(config.maxConcurrency); + const batchCount = plannedBatches.length; + + // Pre-allocate budget: each batch may use up to roundDimensions.length calls, + // with a 2x multiplier to account for potential structured-output repair retries. + const estimatedCallsPerBatch = roundDimensions.length * 2; + let budgetAllowedBatches = batchCount; + if (estimatedCallsPerBatch > 0) { + const budgetRemaining = config.maxModelCalls - modelCalls; + budgetAllowedBatches = Math.min(batchCount, Math.floor(budgetRemaining / estimatedCallsPerBatch)); + } - const batchFiles = batch.filePaths - .map((path) => pendingFileByPath.get(path)) - .filter(Boolean); + if (budgetAllowedBatches <= 0) { + core.info(`Round ${round}: insufficient budget for any batch (remaining=${config.maxModelCalls - modelCalls}), skipping.`); + break; + } - if (batchFiles.length === 0) { - core.warning(`Round ${round} Batch ${batchIndex + 1}: no resolvable files after mapping, skipped.`); - continue; - } - executedBatchCount += 1; + const batchesToRun = plannedBatches.slice(0, budgetAllowedBatches); + if (batchesToRun.length < batchCount) { core.info( - `Round ${round} Batch ${batchIndex + 1}/${plannedBatches.length}: focus=${batch.focus} files=${batchFiles.length}` + `Round ${round}: budget limits parallel batches to ${batchesToRun.length}/${batchCount}` ); + } - const batchResultByDimension = []; - let batchSuccessful = false; - const executionDimensions = []; - if (roundDimensions.includes('general')) { - executionDimensions.push('general'); - } - for (const dimension of roundDimensions) { - if (dimension !== 'general') { - executionDimensions.push(dimension); - } - } - if (executionDimensions.length === 0 && roundDimensions.length > 0) { - executionDimensions.push(roundDimensions[0]); - } + const batchPromises = batchesToRun.map((batch, batchIndex) => + executeBatch({ + batch, + batchIndex, + batchCount, + round, + roundDimensions, + pendingFileByPath, + reviewerAgents, + config, + projectGuidance, + semaphore + }) + ); - for (let dimIndex = 0; dimIndex < executionDimensions.length; dimIndex += 1) { - const dimension = executionDimensions[dimIndex]; - if (modelCalls >= config.maxModelCalls) { - break; - } + const batchSettled = await Promise.allSettled(batchPromises); - const agent = reviewerAgents[dimension] || createReviewerAgent({ - dimension, - model: config.reviewerModel, - language: config.reviewLanguage, - projectGuidance - }); - reviewerAgents[dimension] = agent; - - const reviewInput = buildBatchReviewInput({ - dimension, - round, - batchFiles, - maxContextChars: config.maxContextChars, - availableDimensions: config.reviewDimensions - }); - - if (reviewInput.selectedPaths.length === 0) { - core.warning( - `Round ${round} Batch ${batchIndex + 1} SubAgent(${dimension}): selectedPaths=0 due to context budget, skipped.` - ); - continue; - } + // --- Merge batch results into global state --- + for (let i = 0; i < batchSettled.length; i += 1) { + const outcome = batchSettled[i]; + if (outcome.status === 'rejected') { + hadStructuredOutputFailure = true; + core.warning(`Round ${round} Batch ${i + 1}: unexpected error: ${getErrorMessage(outcome.reason)}`); + continue; + } - const agentStart = Date.now(); - core.info( - `Round ${round} Batch ${batchIndex + 1} SubAgent(${dimension}) start: files=${reviewInput.selectedPaths.length}` - ); - const reviewResult = await runStructuredWithRepair(agent, reviewInput.prompt, { - allowRepair: true, - maxTurns: 10 - }); - modelCalls += reviewResult.calls; - reviewerCalls += reviewResult.calls; - subAgentRuns += 1; - const elapsedMs = Date.now() - agentStart; - - if (!reviewResult.ok) { - const reviewerErrorMessage = getErrorMessage(reviewResult.error); - hadStructuredOutputFailure = true; - addPublicDegradedReason( - degradedReasons, - `reviewer_structured_output_failed_round_${round}_${dimension}`, - reviewerErrorMessage - ); - core.warning( - `Round ${round} Batch ${batchIndex + 1} SubAgent(${dimension}) failed: calls=${reviewResult.calls} elapsed_ms=${elapsedMs} error=${reviewerErrorMessage}` - ); - continue; - } - hasSuccessfulReviewerOutput = true; + const result = outcome.value; + modelCalls += result.calls; + reviewerCalls += result.reviewerCalls; + subAgentRuns += result.subAgentRuns; - const findingCount = (reviewResult.output.findings || []).length; - const fileConclusionCount = (reviewResult.output.fileConclusions || []).length; - const suggestionCount = (reviewResult.output.actionableSuggestions || []).length; - core.info( - `Round ${round} Batch ${batchIndex + 1} SubAgent(${dimension}) done: calls=${reviewResult.calls} elapsed_ms=${elapsedMs} findings=${findingCount} file_conclusions=${fileConclusionCount} suggestions=${suggestionCount}` - ); + if (result.executed) { + executedBatchCount += 1; + } - if (dimension === 'general') { - const recommended = normalizeDimensionNames(reviewResult.output.recommendedExtraDimensions || []) - .filter((x) => x !== 'general') - .filter((x) => config.reviewDimensions.includes(x)); - const appendable = recommended.filter((x) => !executionDimensions.includes(x)); - const recommendationReason = String(reviewResult.output.recommendationReason || '').trim(); - - if (appendable.length > 0) { - executionDimensions.push(...appendable); - core.info( - `Round ${round} Batch ${batchIndex + 1}: general requested extra dimensions=${appendable.join(', ')}${recommendationReason ? ` reason=${recommendationReason}` : ''}` - ); - } else if (recommendationReason) { - core.info( - `Round ${round} Batch ${batchIndex + 1}: general recommendation note=${recommendationReason}` - ); - } - } + if (result.hadStructuredOutputFailure) { + hadStructuredOutputFailure = true; + } + if (result.hasSuccessfulReviewerOutput) { + hasSuccessfulReviewerOutput = true; + } - batchSuccessful = true; - batchResultByDimension.push({ - dimension, - selectedPaths: reviewInput.selectedPaths, - output: reviewResult.output - }); + for (const reason of result.degradedReasons) { + degradedReasons.push(reason); } - if (!batchSuccessful) { - core.warning(`Round ${round} Batch ${batchIndex + 1}: no successful sub-agent outputs.`); - continue; + for (const finding of result.findings) { + rawFindings.push(finding); } - for (const result of batchResultByDimension) { - for (const finding of result.output.findings || []) { - rawFindings.push({ - ...finding, - category: finding.category || result.dimension, - sourceDimension: result.dimension - }); - } + for (const suggestion of result.actionableSuggestions) { + actionableSuggestionsSet.add(suggestion); + } - for (const suggestion of result.output.actionableSuggestions || []) { - actionableSuggestionsSet.add(suggestion); - } + for (const risk of result.potentialRisks) { + potentialRisksSet.add(risk); + } - for (const risk of result.output.potentialRisks || []) { - potentialRisksSet.add(risk); - } + for (const testSuggestion of result.testSuggestions) { + testSuggestionsSet.add(testSuggestion); + } - for (const testSuggestion of result.output.testSuggestions || []) { - testSuggestionsSet.add(testSuggestion); + for (const fc of result.fileConclusions) { + if (targetPathSet.has(fc.path)) { + fileConclusionsByPath.set(fc.path, fc); } + } - for (const fileConclusion of result.output.fileConclusions || []) { - if (!targetPathSet.has(fileConclusion.path)) { - continue; - } - - fileConclusionsByPath.set(fileConclusion.path, { - path: fileConclusion.path, - conclusion: fileConclusion.conclusion, - risks: fileConclusion.risks || [], - testSuggestions: fileConclusion.testSuggestions || [], - note: fileConclusion.note || '' + for (const coveredPath of result.coveredPaths) { + if (coverageState.get(coveredPath)?.status === 'pending') { + coverageState.set(coveredPath, { + status: 'covered', + reason: `reviewed_round_${round}` }); - } - - for (const path of result.selectedPaths) { - if (coverageState.get(path)?.status === 'pending') { - coverageState.set(path, { - status: 'covered', - reason: `reviewed_round_${round}` - }); - roundProgress = true; - } + roundProgress = true; } } - core.info( - `Round ${round} Batch ${batchIndex + 1}: merged_dimensions=${batchResultByDimension.length} total_model_calls=${modelCalls}/${config.maxModelCalls}` - ); + if (result.dimensionResults.length > 0) { + core.info( + `Round ${round} [B${i + 1}/${batchCount}]: merged_dimensions=${result.dimensionResults.length} batch_calls=${result.calls}` + ); + } } + core.info( + `Round ${round}: all batches settled. total_model_calls=${modelCalls}/${config.maxModelCalls}` + ); + if (plannerRequestedStop) { plannerStoppedEarly = true; core.info(`Planner requested stop after round ${round}.`); @@ -1070,6 +1283,7 @@ module.exports = { chunk, addPublicDegradedReason, getErrorMessage, + createSemaphore, shouldUseSummaryOnlyMode, sanitizePlannedBatches, summarizePlannerBatchesForLog, diff --git a/test/config.test.js b/test/config.test.js index e1553df..be57b61 100644 --- a/test/config.test.js +++ b/test/config.test.js @@ -271,3 +271,33 @@ test('loadConfig rejects api_base when allowlist resolves to empty after normali /allowlist is empty/ ); }); + +test('loadConfig defaults maxConcurrency to 4', () => { + const config = loadConfigWithMockedInputs({ + github_token: 'ghs_xxx', + api_key: 'sk-test' + }); + + assert.equal(config.maxConcurrency, 4); +}); + +test('loadConfig parses custom max_concurrency', () => { + const config = loadConfigWithMockedInputs({ + github_token: 'ghs_xxx', + api_key: 'sk-test', + max_concurrency: '8' + }); + + assert.equal(config.maxConcurrency, 8); +}); + +test('loadConfig rejects invalid max_concurrency', () => { + assert.throws( + () => loadConfigWithMockedInputs({ + github_token: 'ghs_xxx', + api_key: 'sk-test', + max_concurrency: '0' + }), + /max_concurrency must be a positive integer/ + ); +}); diff --git a/test/index.test.js b/test/index.test.js index 6f3accf..5495fe4 100644 --- a/test/index.test.js +++ b/test/index.test.js @@ -8,6 +8,7 @@ const { chunk, sanitizePlannedBatches, shouldUseSummaryOnlyMode, + createSemaphore, formatConfidenceValue, buildInlineBody, buildReviewBody, @@ -269,3 +270,68 @@ test('unknown confidence count stays consistent between summary and review body' assert.match(summary, /Findings with unknown confidence \(N\/A\): 2/); assert.match(reviewBody, /Findings with unknown confidence: 2/); }); + +// --- createSemaphore tests --- + +test('createSemaphore limits concurrency', async () => { + const sem = createSemaphore(2); + let active = 0; + let maxActive = 0; + + const task = () => sem.run(async () => { + active += 1; + maxActive = Math.max(maxActive, active); + await new Promise((r) => setTimeout(r, 20)); + active -= 1; + }); + + await Promise.all([task(), task(), task(), task(), task()]); + assert.equal(maxActive, 2); + assert.equal(active, 0); +}); + +test('createSemaphore with concurrency=1 runs sequentially', async () => { + const sem = createSemaphore(1); + let active = 0; + let maxActive = 0; + + const task = () => sem.run(async () => { + active += 1; + maxActive = Math.max(maxActive, active); + await new Promise((r) => setTimeout(r, 10)); + active -= 1; + }); + + await Promise.all([task(), task(), task()]); + assert.equal(maxActive, 1); +}); + +test('createSemaphore propagates errors without blocking others', async () => { + const sem = createSemaphore(2); + const results = []; + + const tasks = [ + sem.run(async () => { results.push('a'); return 'a'; }), + sem.run(async () => { throw new Error('boom'); }), + sem.run(async () => { results.push('c'); return 'c'; }) + ]; + + const settled = await Promise.allSettled(tasks); + assert.equal(settled[0].status, 'fulfilled'); + assert.equal(settled[0].value, 'a'); + assert.equal(settled[1].status, 'rejected'); + assert.equal(settled[1].reason.message, 'boom'); + assert.equal(settled[2].status, 'fulfilled'); + assert.equal(settled[2].value, 'c'); + assert.deepEqual(results, ['a', 'c']); +}); + +test('createSemaphore returns values from async functions', async () => { + const sem = createSemaphore(3); + const results = await Promise.all([ + sem.run(async () => 1), + sem.run(async () => 2), + sem.run(async () => 3) + ]); + assert.deepEqual(results, [1, 2, 3]); +});