From 3db7b27d9e59cac009af62f4b6690563bacfbe22 Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 23 Apr 2026 16:49:07 +0000 Subject: [PATCH 1/7] pr-review: enable sub-agent delegation by default MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Change the `use-sub-agents` input default from `'false'` to `'true'` so that file-level sub-agent delegation is enabled out of the box. Changes: - action.yml: default `'false'` → `'true'` - pr-review-by-openhands.yml: explicitly pass `use-sub-agents: 'true'` - README.md: update the inputs table to reflect the new default Co-authored-by: openhands --- plugins/pr-review/README.md | 2 +- plugins/pr-review/action.yml | 2 +- plugins/pr-review/workflows/pr-review-by-openhands.yml | 1 + 3 files changed, 3 insertions(+), 2 deletions(-) diff --git a/plugins/pr-review/README.md b/plugins/pr-review/README.md index c5f9ebc9..87204fc3 100644 --- a/plugins/pr-review/README.md +++ b/plugins/pr-review/README.md @@ -142,7 +142,7 @@ PR reviews are automatically triggered when: | `llm-base-url` | No | `''` | Custom LLM endpoint URL | | `review-style` | No | `roasted` | **[DEPRECATED]** Previously chose between `standard` and `roasted` review styles. Now ignored — the styles have been merged into a single unified skill. | | `require-evidence` | No | `'false'` | Require the reviewer to enforce an `Evidence` section in the PR description with end-to-end proof: screenshots/videos for frontend work, commands and runtime output for backend or scripts, and an agent conversation link when applicable. Test output alone does not qualify. | -| `use-sub-agents` | No | `'false'` | **(Experimental)** Enable sub-agent delegation for file-level reviews. The main agent acts as a coordinator that delegates per-file review work to `file_reviewer` sub-agents via the SDK TaskToolSet, then consolidates findings into a single PR review. Useful for large PRs with many changed files. | +| `use-sub-agents` | No | `'true'` | **(Experimental)** Enable sub-agent delegation for file-level reviews. The main agent acts as a coordinator that delegates per-file review work to `file_reviewer` sub-agents via the SDK TaskToolSet, then consolidates findings into a single PR review. Useful for large PRs with many changed files. | | `extensions-repo` | No | `OpenHands/extensions` | Extensions repository | | `extensions-version` | No | `main` | Git ref (tag, branch, or SHA) | | `llm-api-key` | Yes | - | LLM API key | diff --git a/plugins/pr-review/action.yml b/plugins/pr-review/action.yml index 48211754..19842997 100644 --- a/plugins/pr-review/action.yml +++ b/plugins/pr-review/action.yml @@ -33,7 +33,7 @@ inputs: When true, the agent gets the TaskToolSet and decides at runtime whether to delegate based on diff size and complexity. required: false - default: 'false' + default: 'true' extensions-repo: description: GitHub repository for extensions (owner/repo) required: false diff --git a/plugins/pr-review/workflows/pr-review-by-openhands.yml b/plugins/pr-review/workflows/pr-review-by-openhands.yml index 744de8a6..1b236579 100644 --- a/plugins/pr-review/workflows/pr-review-by-openhands.yml +++ b/plugins/pr-review/workflows/pr-review-by-openhands.yml @@ -45,6 +45,7 @@ jobs: llm-base-url: https://llm-proxy.app.all-hands.dev # [DEPRECATED] review-style is no longer used; standard and roasted are merged # review-style: roasted + use-sub-agents: 'true' llm-api-key: ${{ secrets.LLM_API_KEY }} github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC }} lmnr-api-key: ${{ secrets.LMNR_SKILLS_API_KEY }} From 46e84dbf22949c511ac0d1bfd9e4e4d1df37c2f1 Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 23 Apr 2026 17:00:35 +0000 Subject: [PATCH 2/7] fix: sync .github/workflows copy with plugin workflow The test_workflow_files_are_in_sync test requires .github/workflows/ and plugins/pr-review/workflows/ copies to be identical. Copy the updated pr-review-by-openhands.yml to .github/workflows/. Co-authored-by: openhands --- .github/workflows/pr-review-by-openhands.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/pr-review-by-openhands.yml b/.github/workflows/pr-review-by-openhands.yml index 744de8a6..1b236579 100644 --- a/.github/workflows/pr-review-by-openhands.yml +++ b/.github/workflows/pr-review-by-openhands.yml @@ -45,6 +45,7 @@ jobs: llm-base-url: https://llm-proxy.app.all-hands.dev # [DEPRECATED] review-style is no longer used; standard and roasted are merged # review-style: roasted + use-sub-agents: 'true' llm-api-key: ${{ secrets.LLM_API_KEY }} github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC }} lmnr-api-key: ${{ secrets.LMNR_SKILLS_API_KEY }} From e2682bfe277f9148cdb06b3d84cfe7c695a973c0 Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 23 Apr 2026 17:04:06 +0000 Subject: [PATCH 3/7] chore: address PR review feedback (#205) - Remove 'experimental' label from use-sub-agents since it is now the default - Add opt-out guidance ('false' to restore single-agent behavior) in action.yml description, README inputs table, and Known Limitations section - Add clarity comment on the explicit workflow setting - Sync .github/workflows/ copy Co-authored-by: openhands --- .github/workflows/pr-review-by-openhands.yml | 2 +- plugins/pr-review/README.md | 8 ++++---- plugins/pr-review/action.yml | 3 ++- plugins/pr-review/workflows/pr-review-by-openhands.yml | 2 +- 4 files changed, 8 insertions(+), 7 deletions(-) diff --git a/.github/workflows/pr-review-by-openhands.yml b/.github/workflows/pr-review-by-openhands.yml index 1b236579..fdaa5d4d 100644 --- a/.github/workflows/pr-review-by-openhands.yml +++ b/.github/workflows/pr-review-by-openhands.yml @@ -45,7 +45,7 @@ jobs: llm-base-url: https://llm-proxy.app.all-hands.dev # [DEPRECATED] review-style is no longer used; standard and roasted are merged # review-style: roasted - use-sub-agents: 'true' + use-sub-agents: 'true' # Matches the default; kept for clarity llm-api-key: ${{ secrets.LLM_API_KEY }} github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC }} lmnr-api-key: ${{ secrets.LMNR_SKILLS_API_KEY }} diff --git a/plugins/pr-review/README.md b/plugins/pr-review/README.md index 87204fc3..38e70ca6 100644 --- a/plugins/pr-review/README.md +++ b/plugins/pr-review/README.md @@ -24,7 +24,7 @@ Then configure the required secrets (see [Installation](#installation) below). - **A/B Testing**: Support for testing multiple LLM models - **Review Context Awareness**: Considers previous reviews and unresolved threads - **Evidence Enforcement**: Optional check that PR descriptions include concrete end-to-end proof the code works, not just test output -- **Sub-Agent Delegation** *(Experimental)*: Split large PR reviews across multiple sub-agents, one per file, then consolidate findings (see [Known Limitations](#known-limitations-sub-agent-delegation)) +- **Sub-Agent Delegation**: Split large PR reviews across multiple sub-agents, one per file, then consolidate findings (see [Known Limitations](#known-limitations-sub-agent-delegation)) - **Observability**: Optional Laminar integration for tracing and evaluation ## Plugin Contents @@ -142,7 +142,7 @@ PR reviews are automatically triggered when: | `llm-base-url` | No | `''` | Custom LLM endpoint URL | | `review-style` | No | `roasted` | **[DEPRECATED]** Previously chose between `standard` and `roasted` review styles. Now ignored — the styles have been merged into a single unified skill. | | `require-evidence` | No | `'false'` | Require the reviewer to enforce an `Evidence` section in the PR description with end-to-end proof: screenshots/videos for frontend work, commands and runtime output for backend or scripts, and an agent conversation link when applicable. Test output alone does not qualify. | -| `use-sub-agents` | No | `'true'` | **(Experimental)** Enable sub-agent delegation for file-level reviews. The main agent acts as a coordinator that delegates per-file review work to `file_reviewer` sub-agents via the SDK TaskToolSet, then consolidates findings into a single PR review. Useful for large PRs with many changed files. | +| `use-sub-agents` | No | `'true'` | Enable sub-agent delegation for file-level reviews. The main agent acts as a coordinator that delegates per-file review work to `file_reviewer` sub-agents via the SDK TaskToolSet, then consolidates findings into a single PR review. Useful for large PRs with many changed files. To restore the previous single-agent behavior, set to `'false'`. | | `extensions-repo` | No | `OpenHands/extensions` | Extensions repository | | `extensions-version` | No | `main` | Git ref (tag, branch, or SHA) | | `llm-api-key` | Yes | - | LLM API key | @@ -165,14 +165,14 @@ Python dependency caching is **disabled by default**. `uv run --with ...` re-dow ## Known Limitations: Sub-Agent Delegation -The `use-sub-agents` feature is **experimental** and has the following known constraints: +The `use-sub-agents` feature has the following known constraints: - **LLM-driven JSON parsing**: The coordinator agent relies on the LLM to parse and merge JSON responses from sub-agents. There is no code-level validation of sub-agent output, so malformed responses may cause incomplete reviews. - **Potential information loss during consolidation**: When merging findings from multiple sub-agents, the coordinator may lose or deduplicate findings imperfectly, especially for cross-file issues. - **No integration tests yet**: Current test coverage verifies prompt formatting only. End-to-end validation of the delegation flow requires manual workflow testing. - **Sub-agents have read-only tools**: File reviewer sub-agents have access to `terminal` and `file_editor` for inspecting full source files and surrounding context, but they cannot query the GitHub API or post reviews — only the coordinator handles GitHub interaction. -These limitations are acceptable for an opt-in experimental feature and will be addressed as the feature matures. +These limitations will be addressed as the feature matures. To opt out, set `use-sub-agents: 'false'` in your workflow. ## A/B Testing Multiple Models diff --git a/plugins/pr-review/action.yml b/plugins/pr-review/action.yml index 19842997..3f302d42 100644 --- a/plugins/pr-review/action.yml +++ b/plugins/pr-review/action.yml @@ -29,9 +29,10 @@ inputs: default: 'false' use-sub-agents: description: > - Enable sub-agent delegation for file-level reviews (experimental). + Enable sub-agent delegation for file-level reviews. When true, the agent gets the TaskToolSet and decides at runtime whether to delegate based on diff size and complexity. + Set to 'false' to restore the previous single-agent behavior. required: false default: 'true' extensions-repo: diff --git a/plugins/pr-review/workflows/pr-review-by-openhands.yml b/plugins/pr-review/workflows/pr-review-by-openhands.yml index 1b236579..fdaa5d4d 100644 --- a/plugins/pr-review/workflows/pr-review-by-openhands.yml +++ b/plugins/pr-review/workflows/pr-review-by-openhands.yml @@ -45,7 +45,7 @@ jobs: llm-base-url: https://llm-proxy.app.all-hands.dev # [DEPRECATED] review-style is no longer used; standard and roasted are merged # review-style: roasted - use-sub-agents: 'true' + use-sub-agents: 'true' # Matches the default; kept for clarity llm-api-key: ${{ secrets.LLM_API_KEY }} github-token: ${{ secrets.OPENHANDS_BOT_GITHUB_PAT_PUBLIC }} lmnr-api-key: ${{ secrets.LMNR_SKILLS_API_KEY }} From c6a5a2bf57fd6add5ae244461b5aea067603635a Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 23 Apr 2026 17:08:44 +0000 Subject: [PATCH 4/7] =?UTF-8?q?chore:=20address=20PR=20review=20feedback?= =?UTF-8?q?=20=E2=80=94=20add=20migration=20guide,=20fix=20maturity=20lang?= =?UTF-8?q?uage?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add a 'Migration: Sub-Agent Delegation Now Enabled by Default' section with what changed, how to opt out, and behavioral differences - Remove 'No integration tests yet' bullet (contradicts enabling by default) - Remove 'will be addressed as the feature matures' closing (maturity contradiction) - Keep Known Limitations factual without implying the feature is immature Co-authored-by: openhands --- plugins/pr-review/README.md | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/plugins/pr-review/README.md b/plugins/pr-review/README.md index 38e70ca6..16418570 100644 --- a/plugins/pr-review/README.md +++ b/plugins/pr-review/README.md @@ -163,16 +163,35 @@ Python dependency caching is **disabled by default**. `uv run --with ...` re-dow **Self-hosted runners:** Consider mounting a host-level uv cache volume (e.g. `/home/runner/.cache` as a Docker volume) instead of — or in addition to — this option. A local volume is faster than a round trip to GHA cache storage and does not cross any trust boundary. +## Migration: Sub-Agent Delegation Now Enabled by Default + +As of this release, `use-sub-agents` defaults to `'true'`. Previously it defaulted to `'false'`. + +**What changed:** The main review agent now acts as a coordinator that delegates per-file review work to `file_reviewer` sub-agents via the SDK TaskToolSet, then consolidates findings into a single PR review. This improves review depth for large PRs. + +**How to restore the previous behavior:** Set `use-sub-agents: 'false'` in your workflow: + +```yaml +- uses: OpenHands/extensions/plugins/pr-review@main + with: + use-sub-agents: 'false' + # ... other inputs +``` + +**Behavioral differences to expect:** +- Reviews may take slightly longer due to sub-agent coordination overhead +- File-level analysis is more thorough, especially on large multi-file PRs +- The coordinator merges findings from sub-agents, which may surface more issues + ## Known Limitations: Sub-Agent Delegation The `use-sub-agents` feature has the following known constraints: - **LLM-driven JSON parsing**: The coordinator agent relies on the LLM to parse and merge JSON responses from sub-agents. There is no code-level validation of sub-agent output, so malformed responses may cause incomplete reviews. - **Potential information loss during consolidation**: When merging findings from multiple sub-agents, the coordinator may lose or deduplicate findings imperfectly, especially for cross-file issues. -- **No integration tests yet**: Current test coverage verifies prompt formatting only. End-to-end validation of the delegation flow requires manual workflow testing. - **Sub-agents have read-only tools**: File reviewer sub-agents have access to `terminal` and `file_editor` for inspecting full source files and surrounding context, but they cannot query the GitHub API or post reviews — only the coordinator handles GitHub interaction. -These limitations will be addressed as the feature matures. To opt out, set `use-sub-agents: 'false'` in your workflow. +To opt out, set `use-sub-agents: 'false'` in your workflow. ## A/B Testing Multiple Models From cf0a8b1e32c1e1149a6ecb8afb2b94a6e2e4f83f Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 23 Apr 2026 17:21:52 +0000 Subject: [PATCH 5/7] chore: align limitations section with maturity claim MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add tradeoff justification to Known Limitations — these constraints are acceptable because worst-case is a less thorough review (not data loss or security risk), and the single-agent fallback is always available. Co-authored-by: openhands --- plugins/pr-review/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/plugins/pr-review/README.md b/plugins/pr-review/README.md index 16418570..f5a317fa 100644 --- a/plugins/pr-review/README.md +++ b/plugins/pr-review/README.md @@ -185,7 +185,7 @@ As of this release, `use-sub-agents` defaults to `'true'`. Previously it default ## Known Limitations: Sub-Agent Delegation -The `use-sub-agents` feature has the following known constraints: +The following are known constraints of the sub-agent delegation feature. These are acceptable tradeoffs for the improved review depth it provides, and none result in data loss or security risk — in the worst case a review may be less thorough than expected, which the single-agent fallback (`use-sub-agents: 'false'`) addresses. - **LLM-driven JSON parsing**: The coordinator agent relies on the LLM to parse and merge JSON responses from sub-agents. There is no code-level validation of sub-agent output, so malformed responses may cause incomplete reviews. - **Potential information loss during consolidation**: When merging findings from multiple sub-agents, the coordinator may lose or deduplicate findings imperfectly, especially for cross-file issues. From 5b0c3bf7731e194bd478a6ca96fa0086930f9912 Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 23 Apr 2026 17:25:50 +0000 Subject: [PATCH 6/7] chore: fix wording inconsistency in Known Limitations Changed 'none result in data loss' to 'none pose a security risk' to avoid contradicting the 'potential information loss during consolidation' bullet. Co-authored-by: openhands --- plugins/pr-review/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/plugins/pr-review/README.md b/plugins/pr-review/README.md index f5a317fa..44929afe 100644 --- a/plugins/pr-review/README.md +++ b/plugins/pr-review/README.md @@ -185,7 +185,7 @@ As of this release, `use-sub-agents` defaults to `'true'`. Previously it default ## Known Limitations: Sub-Agent Delegation -The following are known constraints of the sub-agent delegation feature. These are acceptable tradeoffs for the improved review depth it provides, and none result in data loss or security risk — in the worst case a review may be less thorough than expected, which the single-agent fallback (`use-sub-agents: 'false'`) addresses. +The following are known constraints of the sub-agent delegation feature. These are acceptable tradeoffs for the improved review depth it provides, and none pose a security risk — in the worst case a review may be less thorough than expected, which the single-agent fallback (`use-sub-agents: 'false'`) addresses. - **LLM-driven JSON parsing**: The coordinator agent relies on the LLM to parse and merge JSON responses from sub-agents. There is no code-level validation of sub-agent output, so malformed responses may cause incomplete reviews. - **Potential information loss during consolidation**: When merging findings from multiple sub-agents, the coordinator may lose or deduplicate findings imperfectly, especially for cross-file issues. From 4b5beda22acc2a78501b486f4d8bb3ce3ad5df6c Mon Sep 17 00:00:00 2001 From: openhands Date: Thu, 23 Apr 2026 17:29:32 +0000 Subject: [PATCH 7/7] =?UTF-8?q?chore:=20remove=20migration=20section=20?= =?UTF-8?q?=E2=80=94=20feature=20not=20yet=20publicly=20released?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: openhands --- plugins/pr-review/README.md | 20 -------------------- 1 file changed, 20 deletions(-) diff --git a/plugins/pr-review/README.md b/plugins/pr-review/README.md index 44929afe..6eee790d 100644 --- a/plugins/pr-review/README.md +++ b/plugins/pr-review/README.md @@ -163,26 +163,6 @@ Python dependency caching is **disabled by default**. `uv run --with ...` re-dow **Self-hosted runners:** Consider mounting a host-level uv cache volume (e.g. `/home/runner/.cache` as a Docker volume) instead of — or in addition to — this option. A local volume is faster than a round trip to GHA cache storage and does not cross any trust boundary. -## Migration: Sub-Agent Delegation Now Enabled by Default - -As of this release, `use-sub-agents` defaults to `'true'`. Previously it defaulted to `'false'`. - -**What changed:** The main review agent now acts as a coordinator that delegates per-file review work to `file_reviewer` sub-agents via the SDK TaskToolSet, then consolidates findings into a single PR review. This improves review depth for large PRs. - -**How to restore the previous behavior:** Set `use-sub-agents: 'false'` in your workflow: - -```yaml -- uses: OpenHands/extensions/plugins/pr-review@main - with: - use-sub-agents: 'false' - # ... other inputs -``` - -**Behavioral differences to expect:** -- Reviews may take slightly longer due to sub-agent coordination overhead -- File-level analysis is more thorough, especially on large multi-file PRs -- The coordinator merges findings from sub-agents, which may surface more issues - ## Known Limitations: Sub-Agent Delegation The following are known constraints of the sub-agent delegation feature. These are acceptable tradeoffs for the improved review depth it provides, and none pose a security risk — in the worst case a review may be less thorough than expected, which the single-agent fallback (`use-sub-agents: 'false'`) addresses.