From 35f7c257d105e00409c18626a082de5301df60fc Mon Sep 17 00:00:00 2001
From: Nelson Spence
Date: Fri, 3 Apr 2026 21:13:48 -0500
Subject: [PATCH 1/2] docs(security): add extraction budget and per-field cap section

Document the per-field extraction cap introduced in
OpenHands/software-agent-sdk#2709. Explains the starvation vector, the
fix, and the remaining boundaries.

Coding-Agent: claude-code
Model: claude-opus-4-6
---
 sdk/guides/security.mdx | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/sdk/guides/security.mdx b/sdk/guides/security.mdx
index ab1a6d3da..a4d4ae323 100644
--- a/sdk/guides/security.mdx
+++ b/sdk/guides/security.mdx
@@ -604,6 +604,29 @@ replacement for either.
 | Content past 30k chars is invisible | Hard cap prevents regex denial-of-service | Raise the cap (increases ReDoS exposure) |
 | `thinking_blocks` not scanned | Scanning model reasoning risks false positives on deliberation | Separate injection-only CoT scan |
 
+#### Extraction budget and per-field cap
+
+The 30k-character extraction cap protects against regex denial-of-service,
+but it creates a secondary risk: fields are extracted in order (`tool_name`
+→ `tool_call.name` → `tool_call.arguments`), so an oversized early field
+can consume the entire budget and make later fields invisible to scanning.
+
+Since `tool_name` has no length validation in the SDK, and the
+non-function-calling code path allows arbitrary-length names, this is a
+real starvation vector, not just a theoretical concern.
+
+A **per-field cap** (`_FIELD_CAP = _EXTRACT_HARD_CAP // 2`) ensures no
+single field can consume more than 50% of the budget. With this cap, an
+oversized `tool_name` is truncated at 15k characters, leaving at least
+15k for `tool_call.arguments`.
+
+**Remaining boundaries** (documented as strict xfails in the test suite):
+- Two adversarially large fields can still collectively exhaust the budget
+- Multiple thought entries under the cap can collectively starve `summary`
+
+Both require reserved per-field budgets to fix, which needs usage data on
+real field size distributions.
+
 Ready-to-run example:
 [examples/01_standalone_sdk/47_defense_in_depth_security.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/47_defense_in_depth_security.py)
 

From 3d2a05f488dfb5b3f532c51a509e7ef7c8747981 Mon Sep 17 00:00:00 2001
From: Nelson Spence
Date: Fri, 22 May 2026 08:44:38 -0500
Subject: [PATCH 2/2] docs(security): correct extraction section to match shipped ordering

The "extraction budget and per-field cap" section described a proposed
`_FIELD_CAP` design that never shipped. Rewrite it to match the merged
mechanism (#2709): one shared 30k budget per corpus consumed in priority
order (arguments first for exec, summary first for reasoning), with no
per-field cap. Both previously listed "remaining boundaries" are closed
by the ordering; the only real residual is a single-field payload past
30k, already covered in the limitations table.
---
 sdk/guides/security.mdx | 48 ++++++++++++++++++++++-------------------
 1 file changed, 26 insertions(+), 22 deletions(-)

diff --git a/sdk/guides/security.mdx b/sdk/guides/security.mdx
index a4d4ae323..1b921606c 100644
--- a/sdk/guides/security.mdx
+++ b/sdk/guides/security.mdx
@@ -604,28 +604,32 @@ replacement for either.
 | Content past 30k chars is invisible | Hard cap prevents regex denial-of-service | Raise the cap (increases ReDoS exposure) |
 | `thinking_blocks` not scanned | Scanning model reasoning risks false positives on deliberation | Separate injection-only CoT scan |
 
-#### Extraction budget and per-field cap
-
-The 30k-character extraction cap protects against regex denial-of-service,
-but it creates a secondary risk: fields are extracted in order (`tool_name`
-→ `tool_call.name` → `tool_call.arguments`), so an oversized early field
-can consume the entire budget and make later fields invisible to scanning.
-
-Since `tool_name` has no length validation in the SDK, and the
-non-function-calling code path allows arbitrary-length names, this is a
-real starvation vector, not just a theoretical concern.
-
-A **per-field cap** (`_FIELD_CAP = _EXTRACT_HARD_CAP // 2`) ensures no
-single field can consume more than 50% of the budget. With this cap, an
-oversized `tool_name` is truncated at 15k characters, leaving at least
-15k for `tool_call.arguments`.
-
-**Remaining boundaries** (documented as strict xfails in the test suite):
-- Two adversarially large fields can still collectively exhaust the budget
-- Multiple thought entries under the cap can collectively starve `summary`
-
-Both require reserved per-field budgets to fix, which needs usage data on
-real field size distributions.
+#### Extraction budget and primary-surface-first ordering
+
+The 30k-character cap is applied per scanning corpus, not per field: every
+field competes for one shared budget (the `_BoundedSegments` buffer in
+`defense_in_depth/utils.py`). That creates a secondary risk: a single
+oversized field could consume the whole budget and leave higher-value
+fields unscanned. `tool_name` has no length validation in the SDK, so a 30k
+hallucinated name is a real starvation vector, not just a theoretical one.
+
+The analyzer addresses this by **extraction order**, not a per-field cap:
+the primary attack surface is added first, so it always receives budget
+even when a later field is adversarially large.
+
+- Executable corpus: `tool_call.arguments` (the primary prompt-injection
+  surface) → `tool_name` → `tool_call.name`.
+- Reasoning corpus: `summary` (what the agent is about to do) →
+  `reasoning_content` → `thought`.
+
+The two corpora are extracted with separate budgets and concatenated
+without a second outer cap, so a budget-filling `arguments` payload cannot
+crowd `summary` out of the injection scan.
+
+**Remaining boundary** (a strict xfail in the test suite): a payload past
+30k characters *within a single field* is still truncated and invisible.
+That is the deliberate ReDoS trade-off already listed above; extraction
+order does not change it.
 
 Ready-to-run example:
 [examples/01_standalone_sdk/47_defense_in_depth_security.py](https://github.com/OpenHands/software-agent-sdk/blob/main/examples/01_standalone_sdk/47_defense_in_depth_security.py)
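
Note on the shipped ordering: the shared-budget behaviour that PATCH 2/2 documents can be sketched in a few lines of Python. This is an illustration only, not the SDK's code. `BoundedSegments`, `extract_in_order`, `exec_corpus`, and `reasoning_corpus` are hypothetical names invented here; the real `_BoundedSegments` buffer in `defense_in_depth/utils.py` may differ in its interface and in how it counts separators. The sketch assumes a buffer that keeps whatever still fits in a 30,000-character budget and silently drops the rest.

```python
from dataclasses import dataclass, field

EXTRACT_HARD_CAP = 30_000  # shared character budget per corpus, per the docs


@dataclass
class BoundedSegments:
    """Hypothetical stand-in for the SDK buffer; not the real class."""

    budget: int = EXTRACT_HARD_CAP
    segments: list = field(default_factory=list)

    def add(self, text: str) -> None:
        # Keep only what still fits; later fields get whatever is left.
        if self.budget <= 0 or not text:
            return
        kept = text[: self.budget]
        self.segments.append(kept)
        self.budget -= len(kept)

    def text(self) -> str:
        # Separators are not charged against the budget in this sketch.
        return "\n".join(self.segments)


def extract_in_order(fields, cap: int = EXTRACT_HARD_CAP) -> str:
    """Feed fields into one shared budget, in the order given."""
    buf = BoundedSegments(budget=cap)
    for value in fields:
        buf.add(value)
    return buf.text()


def exec_corpus(arguments: str, tool_name: str, call_name: str) -> str:
    # Primary injection surface first, so it is budgeted before the names.
    return extract_in_order([arguments, tool_name, call_name])


def reasoning_corpus(summary: str, reasoning_content: str, thoughts) -> str:
    # Summary first, then the lower-priority reasoning fields.
    return extract_in_order([summary, reasoning_content, *thoughts])


if __name__ == "__main__":
    args = '{"command": "rm -rf /"}'
    huge_name = "A" * 40_000  # oversized, hallucinated tool name

    # Shipped order: arguments survive even with an oversized tool_name.
    print(args in exec_corpus(args, huge_name, "bash"))

    # Name-first order (the starvation vector): arguments are never scanned.
    print(args in extract_in_order([huge_name, "bash", args]))
```

Because each corpus gets its own `BoundedSegments` instance, a budget-filling `arguments` value leaves the `summary` untouched when the two results are concatenated. A single field longer than the cap is still truncated in this sketch, which matches the remaining boundary the patch describes.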