fix: surface stale CDK construct synth errors and validate gateway name length#1652
Open
aidandaly24 wants to merge 1 commit into
Open
fix: surface stale CDK construct synth errors and validate gateway name length#1652aidandaly24 wants to merge 1 commit into
aidandaly24 wants to merge 1 commit into
Conversation
…me length
Adding a gateway then deploying could fail two opaque ways:
1. CDK synth rejected `unknown keys (remove): "protocolType"` when the
project's bundled @aws/agentcore-cdk (frozen in node_modules by a
create-time lockfile) was older than the CLI, which now writes
protocolType into the gateway config. The deploy flow only runs
`npm run build`, never refreshing the dependency, so the error gave
no hint that the bundled construct was stale.
2. Gateway names were only validated by AWS at deploy, failing mid-deploy
with an opaque CloudFormation CREATE_FAILED when the composed
`${projectName}-${gatewayName}` exceeded the 48-char service limit.
Fix A: in synthesizeCdk, detect an "unknown keys (remove)" synth failure
and rewrite it to a StaleCdkConstructError that names the rejected keys,
reports the installed construct version, and tells the user to run
`npm update @aws/agentcore-cdk`. This is provable rather than presumptuous:
the CLI strict-validates agentcore.json against its own schema in preflight
before synth, so a genuine bad key fails earlier — any unknown-keys failure
reaching synth is necessarily version skew. The original error is preserved
as the cause (visible via AGENTCORE_DEBUG). Unrelated errors pass through.
Fix B: validate the composed gateway name against the 48-char limit at
`add gateway` (fails immediately) and in deploy preflight via
validateGatewayNames (backstop for imported/hand-edited configs),
mirroring the existing validateRuntimeNames. Imported gateways with an
explicit resourceName are skipped.
Closes #1651
Contributor
|
Claude Security Review: no high-confidence findings. (run) |
Contributor
Package TarballHow to installgh release download pr-1652-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.21.1.tgz |
agentcore-cli-automation
approved these changes
Jun 26, 2026
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
LGTM. Two small, focused fixes that turn previously-opaque deploy failures into clear, actionable errors.
Things I checked:
- Version-skew rewrite is provably narrow.
AgentCoreGatewaySchemais.strict()and is parsed by the CLI in preflight before synth, so anyunknown keys (remove)reaching synth necessarily implies the bundled construct's parser is stricter (i.e. older) than the CLI's — the comment inrewriteIfStaleCdkConstructaccurately captures this. - Regex against the construct's actual output.
UNKNOWN_KEYS_PATTERNmatches the exact format produced by@aws/agentcore-cdksrc/lib/errors/config.ts(${path}: unknown keys (remove): "a", "b"). The lazy.+?with\n|$boundary correctly captures the full key list on one line and stops at the next line (handles theCaused by:wrapping fromformatError). - Error preservation. Original synth error is kept as
{ cause }and remains visible withAGENTCORE_DEBUG=1viaformatError's cause chain. Non-Error throws are normalized. - Telemetry instrumentation.
StaleCdkConstructErroris added to theErrorNameenum incommon-shapes.ts, soclassifyErrorwill emit it viacli.command_runrather than rolling up toUnknownError. ✅ - Gateway length check placement.
validateGatewayNamesmirrorsvalidateRuntimeNames, runs in deploy preflight as a backstop, andcreateGatewayFromWizardfails the moment the user runsadd gateway.resourceName(imported gateways) is correctly skipped. The newadd gatewayfailure path returns throughadd()'stoError(err)and surfaces as JSONsuccess: falsewith the expected message — verified against the e2e test. - Tests follow established patterns. The new preflight tests use the same hoisted mocks the existing suite already established (mocks at the I/O boundary —
ConfigIO, AWS credentials,LocalCdkProject— not at every fs call).extractUnknownKeys/rewriteIfStaleCdkConstructare tested with pure functions and realErrorinstances (including the realistic nested cause shape). Theadd gatewaye2e test uses a real temp project viarunCLI. No excessive mocking. MAX_RUNTIME_NAME_LENGTHwas promoted from a file-local const toconstants.tsalongside the newMAX_GATEWAY_NAME_LENGTH. Value/behavior preserved.
No blocking issues.
Contributor
Coverage Report
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Adding a gateway then deploying could fail two opaque ways. This PR turns both into clear, actionable errors raised before AWS/CDK ever sees them.
1. Stale bundled CDK construct → baffling synth error. A project created with an older CLI pins
@aws/agentcore-cdkinagentcore/cdk/node_modulesvia a create-time lockfile. The deploy flow only runsnpm run build(tsc) and never refreshes that dependency. Once the CLI is upgraded to a version that writesprotocolTypeinto the gateway config, the frozen older construct's strict schema rejects the key at synth withunknown keys (remove): "protocolType"— with no hint that the cause is a stale bundled construct.synthesizeCdknow detects anunknown keys (remove)synth failure and rewrites it to aStaleCdkConstructErrorthat names the rejected key(s), reports the installed construct version, and tells the user to runnpm update @aws/agentcore-cdkinagentcore/cdk/. This is provable, not presumptuous: the CLI strict-validatesagentcore.jsonagainst its own schema in preflight before synth, so a genuine bad/typo'd key fails earlier — anyunknown keysfailure reaching synth is necessarily version skew. The original synth error is preserved as{ cause }(still visible viaAGENTCORE_DEBUG=1), and unrelated synth errors pass through untouched.2. Gateway name length only validated by AWS at deploy. The deployed gateway resource name is composed as
${projectName}-${gatewayName}and AWS caps it at 48 chars (([0-9a-zA-Z][-]?){1,48}). A too-long name passedadd gatewayand synth, then failed mid-deploy with an opaque CloudFormationCREATE_FAILED. The composed name is now validated against the limit atadd gateway(fails the moment the command is run) and in deploy preflight viavalidateGatewayNames(backstop for imported/hand-edited configs), mirroring the existingvalidateRuntimeNames. Imported gateways carrying an explicitresourceNameare skipped.No changes to
@aws/agentcore-cdkorsrc/assets/are required — the latest construct already supportsprotocolType; the bug is purely that deploy never refreshes the frozen install.Related Issue
Closes #1651
Documentation PR
N/A — no user-facing docs change; both are clearer error messages on existing flows.
Type of Change
Testing
How have you tested the change?
npm run test:unitandnpm run test:integnpm run typechecknpm run lintsrc/assets/, I rannpm run test:update-snapshotsand committed the updated snapshots (N/A — nosrc/assets/changes)Added 12 tests:
extractUnknownKeys/rewriteIfStaleCdkConstruct— single/multiple keys, nested cause chain (the real propagation shape), unrelated errors pass through, non-Error normalization.validateGatewayNamesviavalidateProject— over-limit rejection (the reporter's exact 51-char name), exactly-at-limit acceptance, imported-gateway skip.add gatewaye2e — composed name over 48 chars is rejected with the expected message.Also reproduced Fix B end-to-end with the reporter's exact inputs:
Note on Fix A test coverage: a pure downgrade-only live repro isn't possible from the registry — every published construct old enough to reject gateway
protocolType(≤ alpha.36) is too old to compile against the current vended template (missingHarnessSpecSchema), and every version that compiles (alpha.38+) already acceptsprotocolType. Fix A is covered by unit tests feeding the exact construct error string from the reporter's log plus verification that it surfaces through the full deploy → display path; the original is preserved as the error cause.Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.