Problem
AgentCoreMemorySessionManager makes synchronous boto3 calls on the asyncio event loop hot path when used with Agent.stream_async() inside an async WebSocket server (e.g. AgentCore Runtime). This adds 200–800ms of blocked event loop time per agent turn, directly degrading time-to-first-token (TTFT) for streaming responses.
The four blocking call sites are:
- initialize() → read_session(), read_agent(), list_messages() (sync gmdp client calls)
- append_message() → create_event() (sync, once per message when batch_size=1)
- retrieve_customer_context() → uses ThreadPoolExecutor + as_completed(), but as_completed() still blocks the calling coroutine
- sync_agent() → update_agent() → create_event() (sync, per turn)
Because all four run inside the Strands hook lifecycle (BeforeInvocationEvent, MessageAddedEvent, AfterInvocationEvent), callers cannot wrap them without subclassing or forking the SDK.
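The blocking effect described above can be reproduced without AWS. The following is a minimal sketch, not the SDK's code: fake_boto3_call is a hypothetical stand-in that sleeps instead of doing network I/O, and count_ticks and measure are illustrative helpers. It counts how often the event loop gets to run another coroutine while one call is in flight, first called directly and then through asyncio.to_thread() (Python 3.9+).

```python
import asyncio
import time


def fake_boto3_call(delay: float = 0.2) -> str:
    """Hypothetical stand-in for a synchronous boto3 request (sleeps instead of doing I/O)."""
    time.sleep(delay)
    return "ok"


async def count_ticks(stop: asyncio.Event, interval: float = 0.01) -> int:
    """Count how many times the event loop schedules this coroutine."""
    ticks = 0
    while not stop.is_set():
        ticks += 1
        await asyncio.sleep(interval)
    return ticks


async def measure(offload: bool) -> int:
    """Run one fake call and return how many ticks the loop managed meanwhile."""
    stop = asyncio.Event()
    ticker = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # yield once so the ticker starts
    if offload:
        # Runs in a worker thread; the event loop keeps scheduling the ticker.
        await asyncio.to_thread(fake_boto3_call)
    else:
        # Runs on the event loop thread; nothing else is scheduled until it returns.
        fake_boto3_call()
    stop.set()
    return await ticker


if __name__ == "__main__":
    print("ticks while blocked:  ", asyncio.run(measure(offload=False)))
    print("ticks while offloaded:", asyncio.run(measure(offload=True)))
```

In the direct case the ticker barely advances, which is the same starvation a token-streaming coroutine would experience during a hook that performs a sync call.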
Proposed Solution
Add an async_mode configuration setting to AgentCoreMemorySessionManager (or its config class):
class MemorySessionManagerConfig:
    async_mode: bool = False  # default: sync (backwards-compatible)
When async_mode=True, the session manager wraps all 4 blocking call sites with asyncio.to_thread() to offload boto3 calls to a thread pool, keeping the event loop unblocked:
# Pseudocode
if self.config.async_mode:
    await asyncio.to_thread(self._blocking_call, ...)
else:
    self._blocking_call(...)
This is a non-breaking change: existing users keep the current synchronous behavior by default, and async users opt in by setting async_mode=True.
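One possible shape for the opt-in dispatch is a single helper that every blocking call site goes through. This is a sketch under stated assumptions: SketchConfig, SketchSessionManager, _run and _create_event are hypothetical names, not the SDK's classes or methods, and it assumes the call site can be awaited. Only async_mode and asyncio.to_thread() come from the proposal. The stand-in call returns the thread id so the offloading is observable.

```python
import asyncio
import threading
from dataclasses import dataclass
from typing import Any, Callable, Tuple


@dataclass
class SketchConfig:
    """Hypothetical config; only the async_mode field comes from the proposal."""
    async_mode: bool = False


class SketchSessionManager:
    """Hypothetical stand-in for the session manager, not the real class."""

    def __init__(self, config: SketchConfig) -> None:
        self.config = config

    def _create_event(self, payload: str) -> Tuple[str, int]:
        # Stand-in for a synchronous boto3 call; reports the thread that ran it.
        return payload, threading.get_ident()

    async def _run(self, fn: Callable[..., Any], *args: Any) -> Any:
        # Single dispatch point: offload only when the caller opted in.
        if self.config.async_mode:
            return await asyncio.to_thread(fn, *args)
        return fn(*args)

    async def append_message(self, payload: str) -> Tuple[str, int]:
        return await self._run(self._create_event, payload)


async def ran_on_loop_thread(async_mode: bool) -> bool:
    """Return True if the stand-in call executed on the event loop's own thread."""
    manager = SketchSessionManager(SketchConfig(async_mode=async_mode))
    _, thread_id = await manager.append_message("hello")
    return thread_id == threading.get_ident()
```

Routing all four call sites through one helper keeps the sync path byte-for-byte the same call and confines the async_mode branch to a single place.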
Impact
- Without fix: 200–800ms per turn of event loop blocking → degraded TTFT for streaming agents
- With fix: Boto3 calls offloaded to thread pool → event loop stays free for I/O
Affected File
src/bedrock_agentcore/memory/integrations/strands/session_manager.py
Acceptance Criteria
- async_mode: bool = False added to config (backwards-compatible default)
- Blocking call sites wrapped with asyncio.to_thread when async_mode=True
- retrieve_customer_context uses asyncio.gather instead of blocking as_completed when async_mode=True
- Tests cover both async_mode=False (existing behavior) and async_mode=True
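The asyncio.gather criterion for retrieve_customer_context could look roughly like the sketch below. It is an illustration only: retrieve_namespace is a hypothetical stand-in for one synchronous retrieval call, and retrieve_all is an assumed helper name, not the method's actual body.

```python
import asyncio
import time
from typing import List


def retrieve_namespace(namespace: str) -> str:
    """Hypothetical stand-in for one synchronous memory retrieval call."""
    time.sleep(0.05)
    return f"memories:{namespace}"


async def retrieve_all(namespaces: List[str]) -> List[str]:
    """Run each blocking retrieval in a worker thread and await them together."""
    calls = [asyncio.to_thread(retrieve_namespace, ns) for ns in namespaces]
    # gather suspends this coroutine (the loop stays free) and keeps input order.
    return await asyncio.gather(*calls)


if __name__ == "__main__":
    print(asyncio.run(retrieve_all(["preferences", "facts", "summaries"])))
```

Unlike as_completed(), gather returns results in input order rather than completion order. If one failed namespace should not abort the others, return_exceptions=True can be passed to gather and the exceptions filtered afterwards.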