fix(add): support sessionTimeoutMinutes and filters for online eval config#1166
fix(add): support sessionTimeoutMinutes and filters for online eval config#1166aidandaly24 wants to merge 6 commits into
Conversation
Coverage Report
|
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Thanks for picking this up. I have one blocking concern and one usability concern before this can ship.
Blocking: the deployed CFN resource will not reflect these fields.
The CLI side persists sessionTimeoutMinutes and filters into agentcore.json, but the consuming CDK side in aws/agentcore-l3-cdk-constructs is not updated in lockstep:
OnlineEvaluationConfigSchemainagentcore-l3-cdk-constructs/src/schema/schemas/primitives/online-evaluation-config.tsdoes not declaresessionTimeoutMinutesorfilters.ConfigIO.readProjectSpec()usesz.object.safeParse, which strips unknown keys — so when CDK reads the project spec, both values are silently dropped.AgentCoreOnlineEvaluationConfig(src/cdk/constructs/components/primitives/evaluator/AgentCoreOnlineEvaluationConfig.ts, around line 193) still hardcodessessionTimeoutMinutes: 5and never passesfiltersintoCfnOnlineEvaluationConfig'sruleblock.
Net effect: the wizard collects input, the schema validates it, the file stores it — and then the deployed AWS::BedrockAgentCore::OnlineEvaluationConfig ignores it. The PR description claims "sessionTimeoutMinutes was already wired (default 5)" and that "the rule block now conditionally forwards filters", but neither is true on main of agentcore-l3-cdk-constructs, and this PR only touches agentcore-cli.
This needs a companion PR on agentcore-l3-cdk-constructs that:
- adds
sessionTimeoutMinutesandfilters(with matching filter/operator/value subschemas) toOnlineEvaluationConfigSchema, and - forwards both to
CfnOnlineEvaluationConfig(replace the hardcoded5withconfig.sessionTimeoutMinutes ?? 5, and add...(config.filters?.length && { filters: config.filters.map(...) })insiderule).
Ideally that lands and is released first, with this PR pinning the matching @aws/agentcore-cdk version, or the two land together. As-is, merging this gives users a knob that silently does nothing at deploy time.
Filter input format is too lossy — see inline comment on the filter parser. The way values get auto-typed means users cannot express stringValue: "true", stringValue: "false", or any string that happens to look numeric (e.g. "123", IDs, versions). Given filter keys are service-defined attribute names, this is likely to come up for real users. Suggested options in the inline comment.
| /** Optional description for the online eval config */ | ||
| description: z.string().max(200).optional(), | ||
| /** Session idle timeout in minutes (1-1440). Default: 5 */ | ||
| sessionTimeoutMinutes: z.number().int().min(1).max(1440).optional(), |
There was a problem hiding this comment.
This field is persisted here but never read by the CDK side — OnlineEvaluationConfigSchema in agentcore-l3-cdk-constructs doesn't declare it, so ConfigIO.readProjectSpec strips it before the construct ever sees it, and AgentCoreOnlineEvaluationConfig still hardcodes sessionTimeoutMinutes: 5. Same applies to filters below. Needs a companion PR on agentcore-l3-cdk-constructs; see the main review comment.
| if (rawValue === 'true' || rawValue === 'false') { | ||
| value = { booleanValue: rawValue === 'true' }; | ||
| } else if (/^-?\d+(\.\d+)?$/.test(rawValue)) { | ||
| value = { doubleValue: parseFloat(rawValue) }; | ||
| } else { | ||
| value = { stringValue: rawValue }; | ||
| } |
There was a problem hiding this comment.
The value auto-typing here is lossy: "true"/"false" always become booleans and anything matching -?\d+(\.\d+)? always becomes a double. Users cannot produce:
stringValue: "true"orstringValue: "false"stringValue: "123",stringValue: "v1.2", or any numeric-looking identifier
Since filter keys are service-defined attributes, this will come up in real configs (e.g. a trace attribute whose value is the literal string "true", or an ID like "12345").
A couple of options:
- Explicit type prefix: require
s:,d:, orb:prefixes, e.g.model Equals s:claude-3,latencyMs LessThan d:1000,success Equals b:true. Falls back to string if no prefix. - Quoted strings force string type: e.g.
key Equals "true"→stringValue: "true", baretrue/false→ boolean, bare numeric → double, otherwise string. - Per-filter wizard step with a value-type select (string / number / boolean) followed by a typed value prompt. More keystrokes but unambiguous, and avoids the
;/whitespace-in-values limitations the current parser has.
Whichever you pick, please also document the behavior in the <Text dimColor> help above so users aren't surprised.
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Thanks for the follow-up on the filter value typing — the quoted-string escape hatch is a clean fix for that concern.
However, the blocking issue from round 1 is still outstanding, and I found a couple of related gaps while re-reviewing. One change requested.
Blocking: the CDK constructs companion change still isn't in place.
This PR's description claims:
CDK construct (
AgentCoreOnlineEvaluationConfig) — Theruleblock now conditionally forwardsfilterstoCfnOnlineEvaluationConfigwhen at least one is configured.sessionTimeoutMinuteswas already wired (default 5).
Neither part is true against aws/agentcore-l3-cdk-constructs@main as of this review:
OnlineEvaluationConfigSchema(src/schema/schemas/primitives/online-evaluation-config.ts) still does not declaresessionTimeoutMinutesorfilters.ConfigIO.readProjectSpec()usesz.object.safeParse, which strips unknown keys, so both fields are silently dropped the moment CDK reads the project spec.AgentCoreOnlineEvaluationConfig(src/cdk/constructs/components/primitives/evaluator/AgentCoreOnlineEvaluationConfig.ts~line 193) still hardcodessessionTimeoutMinutes: 5and never passesfiltersintoCfnOnlineEvaluationConfig.rule.
Net effect is the same as round 1: the wizard collects the input, the schema validates it, agentcore.json stores it — and then the deployed AWS::BedrockAgentCore::OnlineEvaluationConfig ignores it. Shipping this as-is gives users a knob that silently does nothing at deploy time, which is worse than not having the knob.
A couple of options:
- Open a companion PR on
agentcore-l3-cdk-constructsthat (a) addssessionTimeoutMinutes+filters(with matching operator/value subschemas) toOnlineEvaluationConfigSchema, and (b) forwards both intoCfnOnlineEvaluationConfig.rule(replace the hardcoded5withconfig.sessionTimeoutMinutes ?? 5, add...(config.filters?.length && { filters: config.filters.map(...) })). Land and release it first, then bump the pinned@aws/agentcore-cdkversion here. - Land the constructs PR and this PR together in the same release, with the version bump included here.
Also please correct the PR description — the two CDK-related bullets describe changes that aren't in either repo.
Also worth addressing before merge:
parseFiltersInputhas no unit tests. Given it's the whole contract for filter input (regex parsing, quoted vs bare, boolean/numeric coercion, operator validation, segment splitting, multi-word value handling), it should have its own test coverage independent of the TUI. Suggest exporting it (or putting it in a sibling module) and adding tests for the cases documented in its JSDoc plus the failure modes.online-eval-config.test.tsdoesn't exercise the new schema fields at all. Worth adding cases forsessionTimeoutMinutes(accepts 1 and 1440, rejects 0/1441/non-integer) andfilters(accepts a valid filter, rejects invalid operator, rejects zero or two-plus ofstringValue/doubleValue/booleanValue— theOnlineEvalFilterValueSchema.refineis non-trivial and currently untested).
Non-blocking observation (call out in code for a future fix if you don't pick it up here):
toOnlineEvalConfigSpec in src/cli/commands/import/import-online-eval.ts and GetOnlineEvalConfigResult in src/cli/aws/agentcore-control.ts do not round-trip sessionTimeoutMinutes or filters from the AWS API response. Once the CDK side is wired up, importing an existing online eval config that has either configured will silently drop them from the local spec, reverting them to defaults on the next deploy. Not a regression introduced by this PR, but it's the same feature area and likely the next bug report.
| /** Optional description for the online eval config */ | ||
| description: z.string().max(200).optional(), | ||
| /** Session idle timeout in minutes (1-1440). Default: 5 */ | ||
| sessionTimeoutMinutes: z.number().int().min(1).max(1440).optional(), |
There was a problem hiding this comment.
Reinforcing the round-1 inline comment since nothing has changed here or on the CDK side: this field (and filters below) is persisted into agentcore.json, but agentcore-l3-cdk-constructs@main still (a) doesn't declare sessionTimeoutMinutes/filters in OnlineEvaluationConfigSchema (they'll be stripped by safeParse when CDK reads the spec) and (b) hardcodes sessionTimeoutMinutes: 5 / never passes filters in AgentCoreOnlineEvaluationConfig's rule block. Needs the companion CDK PR before this can ship — see the main review comment for two sequencing options.
| .refine( | ||
| v => [v.stringValue, v.doubleValue, v.booleanValue].filter(x => x !== undefined).length === 1, | ||
| 'Exactly one of stringValue, doubleValue, or booleanValue must be set' | ||
| ); |
There was a problem hiding this comment.
The XOR .refine here is non-trivial (must be exactly one of three) and currently has no test coverage. Please add cases to online-eval-config.test.ts for:
- zero values set (should fail)
- exactly one of each of
stringValue/doubleValue/booleanValue(should pass) - two or more set simultaneously (should fail)
- invalid operator (e.g.
"equals"lowercase) (should fail) - valid
sessionTimeoutMinutesboundaries (1, 1440) and invalid (0, 1441,1.5, negative)
These are the validation rules the wizard is relying on; worth pinning down.
| </Text> | ||
| <TextInput | ||
| key="filters" | ||
| prompt='Filters (e.g. "model Equals claude-3; id Equals "12345"; success Equals true")' |
There was a problem hiding this comment.
The example inside this prompt has unbalanced-looking nested double quotes — as rendered to the user it reads:
Filters (e.g. "model Equals claude-3; id Equals "12345"; success Equals true")
where the "12345" appears to close the outer string and then reopen it. Users copying this literally won't get what they think, and it's confusing as a hint. Suggest either (a) dropping the outer quotes around the example, or (b) using single quotes / backticks for the outer wrapper so the inner "12345" reads cleanly. Something like:
prompt="Filters (e.g. model Equals claude-3; id Equals \"12345\"; success Equals true)"
| * | ||
| * Returns undefined if any segment is malformed. | ||
| */ | ||
| function parseFiltersInput(input: string): OnlineEvalFilter[] | undefined { |
There was a problem hiding this comment.
parseFiltersInput is the whole contract for filter input — regex-based segment parsing, quoted vs bare value handling, boolean/numeric coercion, operator validation — and it's untested. Given the failure modes are silent (customValidation just re-runs the parser), please add unit tests covering at minimum:
- single vs multi-segment input
- quoted string preserves
"true","false","12345"asstringValue(the fix for the round-1 concern — worth locking in) - bare
true/false→booleanValue - bare
-?\d+(\.\d+)?→doubleValue, including negatives and decimals - bare non-numeric/non-boolean →
stringValue - multi-word unquoted values (current regex allows them via
(.+?)\s*$, so it's testable behavior worth pinning) - invalid operator →
undefined - missing value or operator →
undefined - stray
;and empty segments don't break parsing
Extracting this into a sibling helper module (e.g. ./filter-input.ts) would make it trivial to test and keep the screen component lean.
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Thanks for adding the parser tests and schema test coverage — both gaps from round 2 are now closed, and the schema tests do a nice job of pinning down the OnlineEvalFilterValueSchema.refine XOR behavior.
One blocking issue still outstanding, and one round-1 item that was never addressed.
Still blocking: CDK companion change is not in place.
This is the same issue raised in rounds 1 and 2. Re-verified against aws/agentcore-l3-cdk-constructs@main as of this review:
OnlineEvaluationConfigSchema(src/schema/schemas/primitives/online-evaluation-config.ts) still declares onlyname,agent,evaluators,samplingRate,description,enableOnCreate,tags. NosessionTimeoutMinutes, nofilters. SinceConfigIO.readProjectSpec()usesz.object.safeParse, both fields are silently dropped the moment CDK reads the project spec.AgentCoreOnlineEvaluationConfig.tsline 193 still readssessionTimeoutMinutes: 5(literal), and theruleblock never forwardsfiltersintoCfnOnlineEvaluationConfig.src/assets/cdk/package.jsonhere still pins@aws/agentcore-cdk: ^0.1.0-alpha.19, and no newer alpha with these fields appears to have been released.
Net effect remains unchanged: shipping this PR gives users two wizard knobs that validate locally, persist to agentcore.json, and then get silently discarded at deploy time. That's a worse user experience than not having the knobs.
Two ways forward:
- Land a companion PR on
agentcore-l3-cdk-constructsfirst that (a) addssessionTimeoutMinutes+filters(with matching operator/value subschemas) toOnlineEvaluationConfigSchemaand (b) replaces the hardcoded5withconfig.sessionTimeoutMinutes ?? 5and adds...(config.filters?.length && { filters: config.filters.map(...) })torule. Release it, then bump@aws/agentcore-cdkinsrc/assets/cdk/package.jsonon this PR. - Land the constructs PR and this PR together in the same release, with the version bump included here.
Also, the PR description's claims under Schema ("sessionTimeoutMinutes… was already on the CDK schema") and CDK construct ("The rule block now conditionally forwards filters…"; "sessionTimeoutMinutes was already wired") are both incorrect against main — please correct them so reviewers (and future readers) don't look for changes that aren't there.
Round-1 follow-up not addressed: the filters prompt example still renders with unbalanced quotes.
See inline comment on AddOnlineEvalScreen.tsx line 284. Round 1 flagged this and it looks like it was missed; happy to treat it as a small fix alongside the CDK work.
Non-blocking from round 2: toOnlineEvalConfigSpec / GetOnlineEvalConfigResult still don't round-trip the two new fields. Once the CDK side is wired, importing an existing online eval config will drop these locally and revert them to defaults on next deploy. Fine to handle in a follow-up, just flagging it's still on the table.
| </Text> | ||
| <TextInput | ||
| key="filters" | ||
| prompt='Filters (e.g. "model Equals claude-3; id Equals "12345"; success Equals true")' |
There was a problem hiding this comment.
Round-1 follow-up: this line was flagged previously and it doesn't look like it was picked up. The JS single-quoted string literal contains raw nested double quotes, so the prompt renders to the user as:
Filters (e.g. "model Equals claude-3; id Equals "12345"; success Equals true")
where the "12345" visually closes the outer example string and re-opens it. It's confusing as a hint, and users copy-pasting the example verbatim won't get what they think (the outer quotes become part of the first segment's key).
Pick whichever of these reads best:
- Drop the outer quotes around the example:
prompt="Filters (e.g. model Equals claude-3; id Equals \"12345\"; success Equals true)" - Swap the outer delimiter to backticks in the hint text and drop outer quotes in the prompt; the help text above already uses
`"..."`to reference quoted values, so this stays consistent. - If you want the example to show the quoted-string escape hatch, escape the inner quotes:
prompt={'Filters (e.g. model Equals claude-3; id Equals \\"12345\\"; success Equals true)'}
agentcore-cli-automation
left a comment
There was a problem hiding this comment.
Thanks for the iteration — the parser extraction and schema/parser tests from rounds 2 and 3 all look solid, the XOR refinement coverage is thorough, and filter-parser.test.ts covers every operator, every value-typing rule, and the failure modes. The round-1 prompt-quoting concern on the filters hint also looks resolved on AddOnlineEvalScreen.tsx L284 — the JS single-quoted literal renders cleanly without an outer-quote wrapper. Nice work.
Two remaining blockers before this can ship, both concrete and mechanical.
1. @aws/agentcore-cdk@^0.1.0-alpha.29 is not published. See inline comment on src/assets/cdk/package.json. Latest published version on npm is 0.1.0-alpha.28 (https://registry.npmjs.org/@aws/agentcore-cdk). With ^0.1.0-alpha.29 pinned, every agentcore create will fail at npm install in the generated CDK app with No matching version found. That's strictly worse than the round-3 "silent no-op at deploy time" concern because it breaks project bootstrap.
Side note on scope: inspecting the alpha.28 tarball on npm, OnlineEvaluationConfigSchema already declares sessionTimeoutMinutes and AgentCoreOnlineEvaluationConfig already forwards config.sessionTimeoutMinutes ?? 5 into rule.sessionConfig. So that half of the CDK work is live on alpha.28 (even though it isn't on agentcore-l3-cdk-constructs@main — the release branch has drifted ahead of main). filters is still absent from the schema and rule block on alpha.28, matching main, so that half still needs aws/agentcore-l3-cdk-constructs#211 to merge and release before this can land.
A few ways forward:
a. Wait for #211 to merge and alpha.29 to be released, then rebase. The pin here already matches that plan.
b. Land the constructs PR and this PR together as a coordinated release (bump the version here once the constructs release is cut).
c. Split this PR: ship sessionTimeoutMinutes against alpha.28 now (bump to ^0.1.0-alpha.28, drop the filters wizard step + schema additions), and follow up with filters once the constructs work lands. Gives users the smaller of the two knobs immediately.
2. Asset snapshot wasn't regenerated after the version bump.
src/assets/__tests__/__snapshots__/assets.snapshot.test.ts.snap line 360 still reads "@aws/agentcore-cdk": "^0.1.0-alpha.19", while this PR bumps src/assets/cdk/package.json to ^0.1.0-alpha.29. The snapshot is generated from that file, so the asset snapshot test will fail on CI — the two are guaranteed to diverge until this file is regenerated. (Couldn't post this inline because the snapshot file itself isn't in the diff.)
This matches the unchecked "If I modified src/assets/, I ran npm run test:update-snapshots and committed the updated snapshots" item in the PR description. Run npm run test:update-snapshots and commit the result, once the version you end up pinning (see (1)) actually exists on npm.
Non-blocking, still open from round 2 (fine as a follow-up — flagging since it'll bite the first user who tries it): toOnlineEvalConfigSpec in src/cli/commands/import/import-online-eval.ts and GetOnlineEvalConfigResult in src/cli/aws/agentcore-control.ts don't round-trip sessionTimeoutMinutes or filters, so agentcore import on an existing online eval config will silently drop both locally and revert them to defaults on the next deploy.
| }, | ||
| "dependencies": { | ||
| "@aws/agentcore-cdk": "^0.1.0-alpha.19", | ||
| "@aws/agentcore-cdk": "^0.1.0-alpha.29", |
There was a problem hiding this comment.
Blocking: this version isn't published yet.
Latest published version of @aws/agentcore-cdk on npm is 0.1.0-alpha.28 (per https://registry.npmjs.org/@aws/agentcore-cdk — alpha.29 does not exist in the versions list). With ^0.1.0-alpha.29 pinned here, every agentcore create will fail at npm install in the generated CDK app with No matching version found for @aws/agentcore-cdk@^0.1.0-alpha.29.
For reference, inspecting the alpha.28 tarball directly:
OnlineEvaluationConfigSchemaalready declaressessionTimeoutMinutesandAgentCoreOnlineEvaluationConfigalready forwardsconfig.sessionTimeoutMinutes ?? 5intorule.sessionConfig. So that half of this PR is deployable against alpha.28 today.filtersis still absent from both the schema andCfnOnlineEvaluationConfig.ruleon alpha.28 — that's the part that needs aws/agentcore-l3-cdk-constructs#211 to merge + release before this can ship as-is.
Options (see the main review body for the full list): wait for #211 + alpha.29 release and rebase; land the two PRs together; or split this PR so sessionTimeoutMinutes can ship against alpha.28 now and filters follows once the constructs work is out.
|
Closing: re-running with format fix in executor |
Description
Adds CLI and schema support for the optional
sessionTimeoutMinutesandfiltersparameters ofCreateOnlineEvaluationConfig, so users can configure both duringagentcore add → OnlineEvalConfiginstead of being limited tosamplingConfig.State of the constructs repo (
mainas of this PR)OnlineEvaluationConfigSchemaalready declaressessionTimeoutMinutes(1–1440, optional).AgentCoreOnlineEvaluationConfigalready forwardsconfig.sessionTimeoutMinutes ?? 5into therule.sessionConfig.filtersis not yet present in either the schema or the construct — added by #211.CLI changes in this PR
Schema (
src/schema/schemas/primitives/online-eval-config.ts)OnlineEvalConfigSchemanow accepts:sessionTimeoutMinutes?: number— int, 1–1440. Mirrors the field already present on the CDK schema so the value is preserved when persisted to / read back fromagentcore.json.filters?: Filter[]— each filter haskey,operator(one ofEquals,NotEquals,GreaterThan,LessThan,GreaterThanOrEqual,LessThanOrEqual,Contains,NotContains), andvalue(exactly one ofstringValue/doubleValue/booleanValue, enforced via.refine).CLI primitive (
OnlineEvalConfigPrimitive)AddOnlineEvalConfigOptionsandcreateOnlineEvalConfigaccept and persist both new optional fields using the existing spread-conditional pattern.TUI wizard (
agentcore add→ online-eval)samplingRateandenableOnCreate:TextInputaccepting blank (use service default of 5) or an integer 1–1440.TextInputaccepting blank (no filters) or a;-separated list of<key> <operator> <value>triples. Value typing rules are explained inline above the input: baretrue/false→ boolean, bare numeric → double, double-quoted ("...") → string (escape hatch for"true","false","12345", etc.), otherwise string.useCreateOnlineEvalhook threadsfiltersthrough to the primitive (sessionTimeoutMinuteswas already plumbed).CDK app pin (
src/assets/cdk/package.json)@aws/agentcore-cdkbumped from^0.1.0-alpha.19to^0.1.0-alpha.28(latest published on npm). This release already includessessionTimeoutMinutes.filterswill follow in the next alpha after #211 merges; the pin here will be bumped again at that point.src/assets/__tests__/__snapshots__/assets.snapshot.test.ts.snapwas regenerated vianpm run test:update-snapshotsand committed.Related Issue
Closes #906
Documentation PR
N/A — no doc changes required; the new optional fields are surfaced through the existing interactive wizard and project schema, with inline help describing the filter input format.
Type of Change
Testing
How have you tested the change?
npm run test:unitandnpm run test:integnpm run typechecknpm run lintsrc/assets/, I rannpm run test:update-snapshotsand committed the updated snapshotsTargeted test runs (all green):
src/cli/primitives/__tests__/OnlineEvalConfigPrimitive.test.ts— 18/18 (16 existing + 2 new round-trippingsessionTimeoutMinutesandfiltersthrough the project spec).src/schema/schemas/primitives/__tests__/online-eval-config.test.ts— 31/31 (14 existing + 17 new coveringsessionTimeoutMinutesboundary + non-integer + omission, andfiltersoperator enum, value-XOR via.refine, empty key, empty array, omission).src/cli/tui/screens/online-eval/__tests__/filter-parser.test.ts— 22/22 (new) covering value typing for bare/quoted/numeric/boolean, every operator, segment splitting, and failure modes forparseFiltersInput/formatFilter.agentcore-l3-cdk-constructs/test/constructs/online-eval-config.test.ts(in #211) — 4/4 (2 existing + 2 new assertingRule.FiltersandRule.SessionConfig.SessionTimeoutMinutesare emitted correctly, and thatFiltersis omitted when not provided).Code review: rounds 1–3 — all findings addressed.
Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the
terms of your choice.