fix(schema): auto-migrate pre-v0.4.0 agentcore.json legacy keys (#719)#1649
fix(schema): auto-migrate pre-v0.4.0 agentcore.json legacy keys (#719)#1649aidandaly24 wants to merge 4 commits into
Conversation
…ws#719) Renames legacy keys at parse time via a Zod preprocess on AgentCoreProjectSpecSchema: top-level `agents` -> `runtimes` (PR aws#706) and credential discriminator `type` -> `authorizerType` (PR aws#709), so pre-v0.4.0 projects keep validating, deploying, and running without manual edits. Mirrors the legacy-aware preprocess in primitives/harness. Refs aws#719
Package TarballHow to installgh release download pr-1649-tarball --repo aws/agentcore-cli --pattern "*.tgz" --dir /tmp/pr-tarball
npm install -g /tmp/pr-tarball/aws-agentcore-0.21.0.tgz |
|
Claude Security Review: no high-confidence findings. (run) |
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Nice, targeted fix that mirrors the existing legacy-aware preprocess pattern. The migration logic itself is straightforward, immutable on the input, and correctly handles the ambiguous-both-keys case (leaves the strict schema to error).
A few things worth addressing before merge:
- Test does not cover the canonical issue #719 shape. The reported error came from configs that included
type: "AgentCoreRuntime"on each agent entry and used OAuth/Payment credentials — the test omits both. See inline comment. - JSON Schema generation may be silently degraded. Wrapping the top-level export in
z.preprocesschanges its Zod kind fromZodObjecttoZodPipe/ZodEffects;scripts/generate-schema.mjscallsz.toJSONSchema(AgentCoreProjectSpecSchema, ...)andschemas/agentcore.schema.v1.jsonisn't regenerated in this PR. Worth runningnpm run build && npm run build:schemaand diffing the output before merging to confirm the published JSON Schema is unchanged. See inline comment. - No telemetry / no user-visible signal. The migration is silent, so we have no way to see how many users actually have legacy projects, and the on-disk file stays in the old shape forever (until something else triggers
writeProjectSpec). A telemetry signal (and ideally a one-time deprecation warning) would let us measure adoption and eventually remove this shim. See inline comment.
Requesting changes mainly for (1); (2) and (3) are open for discussion.
| expect(result.data.runtimes).toHaveLength(1); | ||
| expect(result.data.runtimes[0]!.name).toBe('MyAgent'); | ||
| expect('agents' in result.data).toBe(false); | ||
| expect(result.data.credentials[0]!.authorizerType).toBe('ApiKeyCredentialProvider'); |
There was a problem hiding this comment.
This test doesn't actually exercise the shape from issue #719. The original bug report has two characteristics the test omits:
- Each entry in
agentscarries a legacytype: "AgentCoreRuntime"field — the PR description and code comment both claim the migration "strips runtime type", butmigrateLegacyProjectSpecnever removes it. It only works today becauseAgentEnvSpecSchemahappens to be non-strict and silently drops unknown keys. If anyone later tightensAgentEnvSpecSchemato.strict()/strictObject()(which is plausible — the top-level schema is already strict), legacy projects regress and no test catches it. - Only
ApiKeyCredentialProvideris tested. The bug report explicitly showsOAuthCredentialProvider(which has a realdiscoveryUrland a defaultvendor), andPaymentCredentialProvideris the other arm of the discriminated union — neither is exercised.
Please extend the test (or add a second case) that mirrors the "Before" example in #719 verbatim:
agents: [
{
type: 'AgentCoreRuntime', // <-- legacy field, must be tolerated
name: 'MyAgent',
build: 'CodeZip',
entrypoint: 'main.py',
codeLocation: 'app/MyAgent/',
runtimeVersion: 'PYTHON_3_12',
networkMode: 'PUBLIC',
modelProvider: 'Bedrock',
protocol: 'HTTP',
},
],
credentials: [
{
type: 'OAuthCredentialProvider',
name: 'my-oauth',
discoveryUrl: 'https://idp.example.com/.well-known/openid-configuration',
vendor: 'CustomOauth2',
},
],And assert both that the legacy type is gone from the migrated runtime entry and that authorizerType === 'OAuthCredentialProvider'. Without this, the canonical reproducer from the linked issue isn't covered.
There was a problem hiding this comment.
Done. Added a dedicated test (auto-migrates the canonical issue #719 shape (runtime type stripped, OAuth credential)) that mirrors the #719 "Before" example verbatim — each agent entry carries type: "AgentCoreRuntime" and the credential is an OAuthCredentialProvider with discoveryUrl/vendor. It asserts the legacy type is gone from the migrated runtime entry and that authorizerType === 'OAuthCredentialProvider' (discoveryUrl + default vendor preserved). Also added a PaymentCredentialProvider case and a case where the array already uses runtimes but still carries the legacy runtime type.
Importantly, migrateLegacyProjectSpec now explicitly strips type: "AgentCoreRuntime" from runtime entries rather than relying on AgentEnvSpecSchema being non-strict — so legacy projects stay valid even if that schema is later tightened to .strict()/strictObject().
| return spec; | ||
| } | ||
|
|
||
| export const AgentCoreProjectSpecSchema = z.preprocess(migrateLegacyProjectSpec, AgentCoreProjectSpecBaseSchema); |
There was a problem hiding this comment.
Wrapping AgentCoreProjectSpecSchema in z.preprocess changes its Zod kind from a ZodObject (with .strict() + .superRefine()) to a ZodPipe/effects wrapper. That has two downstream implications worth checking:
- JSON Schema generation.
scripts/generate-schema.mjscallsz.toJSONSchema(AgentCoreProjectSpecSchema, { target: 'draft-07' })and the result is committed toschemas/agentcore.schema.v1.json. Zod v4'stoJSONSchemahandles pipes by emitting the input side by default — i.e. it inspects the inner schema — but the exact output (e.g. whetheradditionalProperties: falseis preserved, whethersuperRefinecross-field rules survive) can shift. This PR doesn't regenerate the committed JSON Schema; please runnpm run build && npm run build:schemaand either commit the (hopefully unchanged) regenerated file or confirm there's no diff, so we don't ship a silently-degradedagentcore.schema.v1.json. - Type identity. Anything that consumed
AgentCoreProjectSpecSchemaas aZodObject(e.g..shape,.extend,.merge,.pick,.omit) will no longer compile. I grepped and didn't find any current callers, but worth knowing for future consumers — consider exportingAgentCoreProjectSpecBaseSchemaas well (or naming the preprocess wrapper something likeAgentCoreProjectSpecParser) so the underlying object schema stays accessible if needed.
There was a problem hiding this comment.
Confirmed and addressed both points:
-
JSON Schema generation — ran
npm run build && npm run build:schemaafter thez.preprocesswrap:schemas/agentcore.schema.v1.jsonregenerates byte-identical to the committed file (verified withdiff), so the published draft-07 schema is unchanged. Zod v4'stoJSONSchemaemits the input side of the pipe (the innerZodObject), soadditionalProperties: falseand thesuperRefinecross-field rules are preserved exactly as before. No diff to commit. -
Type identity — exported
AgentCoreProjectSpecBaseSchema(the underlying strictZodObject, before the preprocess wrapper) so future consumers needing.shape/.extend/.pick/etc. still have access; the publicAgentCoreProjectSpecSchemastays the migrating parser.
| return entry; | ||
| }); | ||
| } | ||
| return spec; |
There was a problem hiding this comment.
Two UX/observability gaps with a fully-silent migration:
- No telemetry signal. Per
src/cli/telemetry/README.mdwe instrument behavior changes so we can measure them. Without an event here we have no way to know how many users still have legacy projects and therefore no data-driven trigger for eventually removing this shim. Consider emitting a metric (e.g.cli.legacy_project_migratedwith attributes for which legacy keys were detected:had_agents_key,had_credential_type_key) when this preprocess actually rewrites something — wired in via the loader atsrc/lib/schemas/io/config-io.ts#readProjectSpec, not from inside the pure schema, so the schema stays side-effect-free. - No user-visible signal. The on-disk
agentcore.jsonstays in the old shape indefinitely; nothing tells the user to update their file. A one-timeconsole.warn/CLI warning along the lines of "agentcore.json uses pre-v0.4.0 keys (agents,typeon credentials). These are auto-migrated for now but support will be removed in a future release — please rename toruntimesandauthorizerType." would (a) let users update at their convenience and (b) make it possible to actually drop the shim later without re-breaking the same users.
Neither needs to live in the schema file itself — both belong in the loader — but at least one should land alongside this fix so we aren't stuck supporting the legacy shape forever with no measurement.
There was a problem hiding this comment.
Both landed, wired in the loader (not the pure schema) exactly as suggested:
-
Telemetry — new
cli.legacy_project_migratedmetric (registered inschemas/registry.ts, attrs incommon-shapes.ts) emitted withhad_agents_key,had_credential_type_key, andhad_runtime_type_keyso we can measure legacy-project adoption and have a data-driven trigger to eventually drop the shim. -
One-time deprecation notice — a
process.stderrwarning telling users theiragentcore.jsonuses pre-v0.4.0 keys that are auto-migrated for now but will lose support, with the rename guidance.
Both are driven from ConfigIO.readProjectSpec via a new detectsLegacyProjectKeys(raw) check + a lib-clean setLegacyProjectMigrationReporter hook, so src/lib stays free of any CLI/telemetry import and the schema stays side-effect-free. The CLI registers the reporter once at startup (registerLegacyProjectMigrationReporter in cli.ts). Reporter errors are swallowed so observability can never break a config read (covered by a test).
…elemetry - strip the legacy per-runtime `type: "AgentCoreRuntime"` discriminator explicitly in migrateLegacyProjectSpec (robust even if AgentEnvSpecSchema is later tightened to .strict()) and add a test mirroring the issue aws#719 "Before" example verbatim (legacy runtime type + OAuth/Payment credentials) - export AgentCoreProjectSpecBaseSchema so consumers needing ZodObject methods keep access after the z.preprocess wrap; regenerated schema confirmed byte-identical - emit `cli.legacy_project_migrated` telemetry plus a one-time deprecation notice when a pre-v0.4.0 agentcore.json is auto-migrated, wired in the loader via a lib-clean reporter hook so src/lib stays free of CLI/telemetry imports Refs aws#719
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Nice iteration — the canonical #719 shape is now covered, the JSON Schema regeneration is verified byte-identical, and the telemetry / deprecation-notice plumbing is wired through the loader cleanly. Two follow-up issues worth addressing before merge.
| .catch(() => { | ||
| // Telemetry is best-effort and must never affect CLI behavior. | ||
| }); | ||
| printDeprecationNotice(); |
There was a problem hiding this comment.
The deprecation notice fires synchronously on process.stderr the first time readProjectSpec() is called — which during interactive use happens inside the TUI render loop (e.g. useCreateFlow, useDeployFlow, useAddAgent, ValidateScreen, RunEvalFlow, …). At that point we're in the alt-screen buffer (renderTUI writes ENTER_ALT_SCREEN before render()), so a stderr write at this moment will either:
- be drawn over by the next Ink frame and effectively invisible to the user (so the "one-time" notice is consumed silently and never shown again), or
- briefly scribble on the rendered UI before being repainted.
Either way the deprecation signal is lost to TUI users — which is the majority of legacy-project users since they're upgrading from a working install. The telemetry metric still fires correctly; only the user-visible nudge is the problem.
A couple of options:
- Defer the print until after the TUI/command exits. Track a "had legacy migration" flag in the reporter, and have
printPostCommandNotices(or a sibling) print it afterrenderTUI/program.parseAsyncreturns and the alt-screen buffer is restored. This mirrors howprintTelemetryNoticeandprintUpdateNotificationare already deferred toprintPostCommandNoticesincli.ts. - Surface it inside the TUI instead of via stderr. A small toast / status-line banner from the TUI layer when the legacy info arrives — heavier but more discoverable.
- Print to stderr only when not in alt-screen / not under Ink. Cheapest but leaves TUI users with no nudge at all, which defeats the purpose.
(1) seems closest to the existing notice patterns in the codebase.
There was a problem hiding this comment.
Good catch — fixed with option (1). The reporter no longer prints synchronously; it just arms a migrationObserved flag (and emits telemetry). The one-time deprecation notice is now flushed from printPostCommandNotices via printLegacyProjectMigrationNotice(), which both the TUI path (renderTUI -> printPostCommandNotices at render.ts:112, after the alt-screen buffer is restored) and the CLI path (cli.ts after program.parseAsync) already funnel through. This mirrors how printTelemetryNotice/printUpdateNotification are deferred, so the notice is visible to TUI users instead of being painted over. Added a test asserting the reporter does not write synchronously and the notice prints exactly once after the deferred flush.
| let noticePrinted = false; | ||
|
|
||
| /** Reset the one-time-notice latch. Test-only. */ | ||
| export function resetLegacyProjectMigrationNotice(): void { |
There was a problem hiding this comment.
This new CLI-layer module has no direct unit test. The lib-layer tests in config-io.test.ts verify that the reporter is invoked with the right LegacyProjectMigrationInfo, but nothing exercises what this file actually does with that info, specifically:
- The camelCase → snake_case attribute mapping (
hadAgentsKey→had_agents_key, etc.) on thecli.legacy_project_migratedemit. TypeScript catches a missing/renamed attribute, but it won't catch a value going to the wrong attribute (e.g.had_agents_key: info.hadCredentialTypeKey). A one-line assertion againstInMemorySinkwould lock this down. - The "one-time" semantics of
printDeprecationNotice— that the second invocation of the reporter is a no-op for the notice.resetLegacyProjectMigrationNoticeis exported as test-only specifically for this, but is currently unused (dead export). - That a
TelemetryClientAccessor.get()rejection doesn't propagate (the.catch(() => {})contract).
The existing src/cli/telemetry/__tests__/client.test.ts is a good template — it spies TelemetryClientAccessor.get to return a TelemetryClient(InMemorySink) and asserts on sink.metrics. A few similar cases here would (a) actually use the resetLegacyProjectMigrationNotice export and (b) prevent silent regressions in the attribute mapping / one-time latch.
There was a problem hiding this comment.
Added src/cli/telemetry/__tests__/legacy-project-migration.test.ts covering exactly these gaps, using the TelemetryClientAccessor.get + InMemorySink template from client.test.ts:
- Attribute mapping — asserts
cli.legacy_project_migratedis emitted with{ had_agents_key, had_credential_type_key, had_runtime_type_key }each carrying its own value (e.g.hadCredentialTypeKey: falsemaps tohad_credential_type_key: 'false'), which would catch a wrong-attribute wiring bug. - One-time latch — invokes the reporter, then calls
printLegacyProjectMigrationNotice()twice and assertsprocess.stderr.writefires exactly once. This now uses theresetLegacyProjectMigrationNoticeexport inbeforeEach/afterEach(no longer a dead export). - Deferred-print contract — asserts the reporter does not write to stderr synchronously (the notice is only flushed later).
- Rejection swallowing — mocks
TelemetryClientAccessor.get()to reject and asserts the reporter doesn't throw and no metric is recorded.
…the reporter - the deprecation notice was written to stderr synchronously from the reporter, which during interactive use lands inside the Ink alt-screen buffer and is repainted over (lost to TUI users). Arm a flag in the reporter instead and flush the one-time notice from printPostCommandNotices after the alt-screen is restored, mirroring the existing telemetry/update-notification deferral - add direct unit tests for the CLI reporter module: camelCase->snake_case attr mapping on cli.legacy_project_migrated, the one-time notice latch (exercising the previously-unused resetLegacyProjectMigrationNotice export), the deferred-print contract, and that a TelemetryClientAccessor.get() rejection is swallowed Refs aws#719
|
Claude Security Review: no high-confidence findings. (run) |
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
All prior review feedback has been addressed in this iteration:
- Canonical issue #719 shape covered —
auto-migrates the canonical issue #719 shape (runtime type stripped, OAuth credential)mirrors the bug report verbatim and asserts the legacytypeis gone from runtime entries (not just silently dropped by a non-strict schema).PaymentCredentialProviderand theruntimes-with-legacy-type cases are also covered. - JSON Schema generation verified — author confirmed byte-identical regeneration of
schemas/agentcore.schema.v1.jsonafter thez.preprocesswrap.AgentCoreProjectSpecBaseSchemais exported for any future consumer that needsZodObjectmethods. - Telemetry + deprecation notice wired in the loader, not the schema —
cli.legacy_project_migratedis registered correctly (description, attributes,MetricAttrsbranch) and thesrc/liblayer stays free of any CLI/telemetry import via thesetLegacyProjectMigrationReporterhook. - Deferred-print past the alt-screen — the reporter only arms a flag, and
printLegacyProjectMigrationNotice()is flushed fromprintPostCommandNotices(called by bothrenderTUIand the CLI tail), so TUI users actually see the notice. - Reporter unit tests — attribute mapping, one-time latch, deferred-print contract, and
TelemetryClientAccessor.get()rejection-swallowing are all covered.
The migration logic is pure and immutable on its input, the detector and migrator are consistent ('agents' in spec && !('runtimes' in spec) gate, credential migration gated on 'type' in entry && !('authorizerType' in entry)), and reporter errors are swallowed so observability can't break a config read. LGTM.
Description
#719 — agentcore.json: 'unknown keys (remove): agents' after upgrading to v0.4.0 — migration guide for agents→runtimes
Any AgentCore project created before the v0.4.0 schema rename (agents->runtimes, credential type->authorizerType) fails to validate, deploy, or run ANY project-aware command (deploy/add/invoke/import/export/validate) after upgrading. The CLI emits a cryptic Zod error ("root: unknown keys (remove): "agents"") with zero hint that the field was renamed and no auto-migration. Bounded to users upgrading existing pre-v0.4.0 projects; new projects are unaffected.
Fix
DX gap, not a schema defect (the breaking change was intentional and announced in CHANGELOG.md:598-604). Best fix: a Zod .preprocess on AgentCoreProjectSpecSchema that auto-renames legacy keys at parse time (agents->runtimes, credential type->authorizerType, strip runtime type) so old projects keep working with zero user action — this mirrors the existing legacy-aware .preprocess pattern in src/schema/schemas/primitives/harness.ts:374. Cheaper alternative: a targeted friendly error mirroring the httpGateways .max(0, '...') deprecation pattern at agentcore-project.ts:545-551, e.g. add an
agentsfield that errors with "The 'agents' field was renamed to 'runtimes' in v0.4.0; rename it and change credential 'type' to 'authorizerType'." At minimum, make the cryptic root message name the rename instead of only living in GitHub issue #719. A fullagentcore migratecommand is the heaviest option and likely unnecessary given the migration is 3 deterministic edits.This fix was produced by an automated triage+fix run and validated locally (build + unit suite + symptom reproduction). It is being opened as a draft for CI and human review.
Related Issue
Closes #719
Documentation PR
N/A — bug fix; no devguide change required.
Type of Change
Testing
How have you tested the change?
npm run test:unitandnpm run test:integnpm run typechecknpm run lintsrc/assets/, I rannpm run test:update-snapshotsand committed the updated snapshots — N/A, nosrc/assets/changesChecklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.