Skip to content

fix(templates): Make each HTTP framework template reuse per-session conversa (#808, #809)#33

Draft
aidandaly24 wants to merge 3 commits into
mainfrom
fix/808-809
Draft

fix(templates): Make each HTTP framework template reuse per-session conversa (#808, #809)#33
aidandaly24 wants to merge 3 commits into
mainfrom
fix/808-809

Conversation

@aidandaly24

Copy link
Copy Markdown
Owner

Refs aws#808
Refs aws#809

Issues

  • OpenAI Agents, Google ADK, and LangGraph HTTP templates are stateless — no conversation history across invocations aws/agentcore-cli#808 — The OpenAI Agents, Google ADK, and LangGraph HTTP starter templates rebuild agent/session state on every invocation, so a multi-turn conversation within one session has no memory — turn 2 cannot recall anything from turn 1. This directly contradicts the CLI's own agentcore invoke --session-id flag, which is advertised as "Use specific session ID for conversation continuity" and prints "To resume: agentcore invoke --session-id " after each call. AutoGen (a 5th, unlisted HTTP template) has the identical defect.
  • Strands HTTP template leaks conversation history across sessions and users aws/agentcore-cli#809 — The no-memory Strands HTTP template shares one global _agent across every in-process invocation, so when a single process serves multiple sessions, conversation history (and any "secret" content) bleeds between sessions/users. In practice this only manifests in agentcore dev and in non-Runtime deployments (Lambda/ECS/FastAPI): deployed AgentCore Runtime isolates each session_id in its own microVM, so the cross-user data leak cannot happen in the production runtime. The flip side is that the template also has no per-session multi-turn recall in the no-memory path.

Root cause

All three templates fail to thread context.session_id into any per-session history store, verified at v0.20.2 (= HEAD e92c79a). OpenAI (src/assets/python/http/openaiagents/base/main.py): invoke() -> main(prompt) builds a fresh Agent and calls Runner.run(agent, query) with no session= arg; the file never references context.session_id at all (only references are filesystem mount-path template strings). LangGraph (src/assets/python/http/langchain_langgraph/base/main.py:136-180): invoke() calls create_react_agent(...) fresh each call with no checkpointer and ainvoke({"messages":[HumanMessage(...)]}) with no thread_id/config — no checkpoint/thread_id/session anywhere. Google ADK (src/assets/python/http/googleadk/base/main.py:125-148): although context.session_id IS threaded (line 157), setup_session_and_runner() instantiates a brand-new InMemorySessionService() (line 127) and create_session() (line 128) on EVERY invoke, so prior turns are discarded. The Python SDK (bedrock-agentcore v1.9.1 runtime/context.py, app.py) only surfaces session_id to the entrypoint and logging; it does not — and is not expected to — auto-wire conversation history, so this is purely template content owned by the CLI.

Verified by reading the actual template at HEAD (lines 441-468 global _agent; 386-424 per-session hasMemory branch; 381 NullConversationManager; 581 dispatch), the SDK session_id plumbing (context.py:36, a2a.py:123), and the CLI local invoke session hardcoding (dev/invoke.ts:101). Owner=cli verified: defect is wholly in the CLI-vended asset; SDK delivers session_id correctly and Strands' message accumulation is intended upstream behavior, so neither sdk nor upstream owns it.

The fix

Make each template reuse per-session conversation state keyed by context.session_id, in-memory (best-effort, resets on cold start) — the approach in the OPEN closing PR aws#794 (branch fix/template-conversation-history). OpenAI: wrap get_session(session_id) returning openai-agents SQLiteSession in @lru_cache(maxsize=128) and pass session= to Runner.run. Google ADK: hoist _session_service/_runner to module scope, add get_or_create_session() that calls get_session() first and only create_session() when absent (avoids AlreadyExistsError, preserves history). LangGraph: module-level _checkpointer = InMemorySaver() passed to create_react_agent, and config={"configurable": {"thread_id": context.session_id}} on ainvoke. Regenerate the asset snapshot. TWO sharpenings beyond the brief: (1) extend the same fix to the AutoGen template (src/assets/python/http/autogen/base/main.py builds a fresh AssistantAgent every invoke and is equally stateless) — the issue text and PR aws#794 both omit it, leaving the bug half-fixed across HTTP frameworks; (2) the durable alternative is the AgentCore Memory primitive, but note that only Strands currently has a hasMemory variant — these three frameworks ship a base variant only, so in-process best-effort memory is the sole option until/unless a hasMemory path is added.

In src/assets/python/http/strands/base/main.py replace the global _agent singleton (lines 441-468) with a per-session-keyed factory so each context.session_id gets its own Agent, mirroring the hasMemory branch. Open PR aws#794's approach is the right design: @lru_cache(maxsize=128) get_or_create_agent(session_id) returning a fresh Agent per session_id, with the entrypoint passing context.session_id; the 128-entry bound caps process memory and evicted sessions restart fresh (best-effort in-memory recall, documented with a pointer to durable session stores for production). Apply the same to the no-memory config-bundle branch (create_agent() at lines 428-438, used at line 579). Design decision required: PR aws#794 keys by session_id and reuses the per-session instance, giving BOTH isolation and in-process multi-turn recall (correct default for a starter template); the closed PR aws#1314 used a per-invocation create_agent() that isolates but drops in-process recall — prefer aws#794. Regenerate the asset snapshot at src/assets/tests/snapshots/assets.snapshot.test.ts.snap. The other three HTTP templates' statelessness gap is the same PR's aws#808 scope, not this issue.

Files touched: src/assets/python/http/openaiagents/base/main.py (invoke + Runner.run path); src/assets/python/http/googleadk/base/main.py (setup_session_and_runner / call_agent_async); src/assets/python/http/langchain_langgraph/base/main.py (invoke); src/assets/python/http/autogen/base/main.py (invoke — additional, omitted by the brief and PR aws#794); and the regenerated snapshot src/assets/tests/snapshots/assets.snapshot.test.ts.snap. All in the CLI repo (aws/agentcore-cli); tracked by OPEN PR aws#794, which currently covers the 3 named templates + Strands but NOT AutoGen.

src/assets/python/http/strands/base/main.py: global _agent/get_or_create_agent no-memory no-config-bundle no-payment branch (lines 441-468) and create_agent no-memory config-bundle branch (lines 428-438); entrypoint dispatch at lines 578-582. Snapshot: src/assets/tests/snapshots/assets.snapshot.test.ts.snap. (Open fix: PR aws#794, unmerged; alternative PR aws#1314 closed 2026-05-20.)

Validation evidence

The fix was verified by reproducing the original symptom and re-running after the change:

BUILD: OK -> /local/home/aidandal/workplace/issues/bugfix-run/clusters/808-809/repo/dist/cli/index.mjs (exit 0).

REPRODUCE (pre-fix base = git HEAD, before working-tree changes):

  • strands no-memory (Strands HTTP template leaks conversation history across sessions and users aws/agentcore-cli#809): src/assets/python/http/strands/base/main.py had _agent = None (line 441) + global _agent (line 444) -> one shared Agent singleton across all session_ids => cross-session history bleed.
  • openaiagents: Runner.run(agent, query) with NO session= (lines 131/141/154/164), async def main(query) -> stateless every invoke.
  • googleadk: setup_session_and_runner() constructs a FRESH InMemorySessionService() + create_session() on every invoke (lines 127-128) -> history discarded each turn.
  • langchain_langgraph: graph.ainvoke({...}) with no checkpointer and no thread_id (line 174) -> stateless.
  • autogen: AssistantAgent(...) constructed INSIDE invoke() every call (line 102) -> stateless.
    So turn-2 "what is my name?" could not recall "Alice", matching the reported symptom.

AFTER (fix/808-809 working tree):

  • strands no-memory: global singleton GONE (grep for _agent = None/global _agent = 0 hits). Replaced with agent_factory() returning a bounded module-level cache keyed by session_id (cap 128, LRU-style eviction); invoke() does session_id = getattr(context,'session_id',...) then get_or_create_agent(session_id, ...). Mirrors the hasMemory branch (key f"{session_id}/{actor}") => per-session isolation + in-process multi-turn recall.
  • openaiagents: adds @lru_cache-backed per-session SQLiteSession and passes session= into Runner.run.
  • googleadk: hoists session service/runner to module scope and reuses get_or_create_session() so prior turns persist.
  • langchain_langgraph: module-level InMemorySaver checkpointer + config thread_id=session_id on ainvoke.
  • autogen: per-session AssistantAgent cache keyed by session_id.
  • assets snapshot regenerated to match the five edited templates.

RE-VERIFY: build clean; targeted greps confirm session_id threading present in all five base/main.py and the strands hasMemory branch + every other framework file are byte-for-byte untouched.

Test suite: green.


Staged on the fork as a draft for human review. Promote to aws/agentcore-cli after vetting.

…oss all HTTP starter base templates (aws#808, aws#809)

All five HTTP starter base templates (strands, openaiagents, googleadk, langchain_langgraph, autogen) plus the regenerated assets snapshot. Only base/main.py template content changed; the strands hasMemory branch and all other framework files are untouched.

Refs aws#808
Refs aws#809
@github-actions github-actions Bot added size/m PR size: M agentcore-harness-reviewing AgentCore Harness review in progress and removed agentcore-harness-reviewing AgentCore Harness review in progress labels Jun 25, 2026
@github-actions

github-actions Bot commented Jun 25, 2026

Copy link
Copy Markdown

Coverage Report

Status Category Percentage Covered / Total
🔵 Lines 37.16% 13593 / 36577
🔵 Statements 36.43% 14452 / 39667
🔵 Functions 31.8% 2333 / 7336
🔵 Branches 31.1% 9000 / 28932
Generated in workflow #130 for commit f3dabe9 by the Vitest Coverage Report Action

@aidandaly24 aidandaly24 changed the title fix(All five HTTP starter base templates (strands, openaiagents, googleadk, langchain_langgraph, autogen) plus the regenerated assets snapshot. Only base/main.py template content changed; the strands hasMemory branch and all other framework files are untouched.): Make each HTTP framework template reuse per-session conve... (#808, #809) fix(templates): Make each HTTP framework template reuse per-session conversa (#808, #809) Jun 25, 2026
… caches true LRU (aws#808, aws#809)

The googleadk (InMemorySessionService) and langchain_langgraph (InMemorySaver)
templates kept per-session state at module scope with no eviction, trading the
cross-session leak for an unbounded memory leak in long-running processes. Bound
both to 128 active sessions with LRU eviction (delete_session / delete_thread on
the least-recently-used key), matching the other three templates.

The autogen and strands no-memory caches were plain dicts evicting by insertion
order (FIFO) despite docstrings claiming LRU. Switch them to OrderedDict with
move_to_end on hit so the eviction policy is genuinely LRU and the docstrings are
accurate. openaiagents already uses functools.lru_cache (true LRU) and is
unchanged.

Regenerated the assets snapshot.

Refs aws#808
Refs aws#809
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Jun 25, 2026
aws#808)

OrderedDict.popitem() returns (key, value) where key is the (user_id,
session_id) tuple and value is True. The previous unpack assigned the key
tuple to old_user_id and True to old_session_id, so delete_session() was
called with garbage and the real session was never deleted -- the
InMemorySessionService grew unbounded past the 128-session cap, defeating
the aws#808 bound for GoogleADK (LangChain unpacked correctly).

Destructure the key tuple and discard the value so the real ids reach
delete_session. Add an asset test asserting the store caps at 128 with
correct eviction ids, and regenerate the asset snapshot.
@github-actions github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Jun 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/m PR size: M

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant