awslabs · smoell · Jun 18, 2026 · Jun 19, 2026 · Jun 19, 2026 · Jun 23, 2026
diff --git a/...host-your-agent/01-runtime/04-coding-agents/05-autonomous-coding-agent-durable/.gitignore b/...host-your-agent/01-runtime/04-coding-agents/05-autonomous-coding-agent-durable/.gitignore
@@ -0,0 +1,21 @@
+# Python
+__pycache__/
+*.pyc
+.venv/
+*.egg-info/
+
+# CDK
+cdk.out/
+.cdk.staging/
+
+# Config (generated at deploy time)
+deploy/config.env
+envvars.config
+bootstrap.config
+
+# IDE
+.idea/
+.vscode/
+
+# OS
+.DS_Store
diff --git a/...-agent/01-runtime/04-coding-agents/05-autonomous-coding-agent-durable/README.md b/...-agent/01-runtime/04-coding-agents/05-autonomous-coding-agent-durable/README.md
@@ -0,0 +1,163 @@
+# Autonomous Coding Agent — Durable Orchestration on AgentCore Runtime
+
+An event-driven, headless coding backend that receives a ticket, clones a repo, writes code, runs tests in an isolated sandbox, gets a review from an evaluator agent, and retries until tests pass — all orchestrated by a **Lambda Durable Function** with zero-cost suspension.
+
+Uses **4 AgentCore Runtimes**, Cedar-based authorization policies, and cross-ticket memory.
+
+## Architecture
+
+```
+EventBridge ──► Durable Orchestrator (Lambda Durable Function)
+   {ticketId}     ┌─────────────────────────────────────────────────────────┐
+                  │ 1. ADMISSION   validate ticket schema                   │
+                  │ 2. HYDRATE     git clone repo + recall memory lessons    │
+                  │ 3. CODE_LOOP   wait_for_callback (SUSPEND, $0 compute)  │
+                  │ 4. REVIEW      evaluator agent (read-only, Haiku)        │
+                  │ 5. FINALIZE    write memory lessons + SNS notify         │
+                  └───────┬─────────────────────────────────────────────────┘
+                          │
+            ┌─────────────┼──────────────────────────────────┐
+            ▼             ▼                                  ▼
+   Coding Agent     Swift Sandbox            Evaluator Agent
+   (Opus 4)         (non-LLM executor)       (Haiku, read-only)
+   writes code      runs `swift test`        structured verdict
+   ──► Sandbox      .build persisted to      request_changes → retry
+       via MCP      /mnt/workspace
+```
+
+| Runtime | Role | Model | Network |
+|---------|------|-------|---------|
+| Coding Agent | Plan + write code, drive sandbox via MCP | Claude Opus 4 | VPC (private) |
+| Sandbox | Execute commands, path-confined to ticket dir | None (plain executor) | VPC (private) |
+| Swift Sandbox | Swift-specific sandbox with `.build` persistence | None | VPC (private) |
+| Evaluator Agent | Read-only code review, structured verdict | Claude Haiku | VPC (private) |
+
+> **Note:** The included sandbox images cover **Python** (pytest) and **Swift** (SwiftPM). The architecture is framework-agnostic — to support additional languages or frameworks (e.g. Java/Gradle, TypeScript/Jest, Go), add a new Dockerfile with the required toolchain and register it as an additional sandbox runtime in the CDK stack.
+
+## Key features
+
+- **Zero-cost suspension** — Durable Function suspends at `wait_for_callback`; no compute charges while the coding agent works asynchronously
+- **Retry loop** — if `swift test` (or `pytest`) fails, orchestrator retries with feedback (up to MAX_ATTEMPTS)
+- **Cedar policies** — Gateway authorization via Cedar; sandbox enforces path confinement
+- **Cross-ticket memory** — AgentCore Memory stores per-repo lessons; recalled at hydrate, written at finalize
+- **Control/data separation** — coding agent cannot execute locally (Bash, WebFetch disallowed); all execution delegated to sandbox
+- **Session isolation** — different tickets = different microVMs (state doesn't leak)
+
+## Repository layout
+
+```
+cdk/                   CDK app (8 stacks, production deployment)
+coding-agent/          Control plane — Claude Agent SDK + sandbox MCP tools
+demo/                  Demo UI (index.html) for submitting tickets and viewing results
+sandbox/               Data plane — command executor + Cedar policy engine
+evaluator-agent/       Read-only review agent
+orchestrator/          Lambda Durable Function handler
+gateway-policies/      Cedar policies for AgentCore Gateway
+shared/                Shared libraries (memory, audit, validation, logging)
+scripts/               Helper scripts (fire_ticket, build_images)
+tests/                 Unit + integration tests (pytest)
+```
+
+## Prerequisites
+
+- AWS account with Bedrock AgentCore access (us-east-1)
+- Python 3.12+, AWS CLI v2, AWS CDK CLI (`npm install -g aws-cdk`)
+- Bedrock model access: Claude Opus 4, Claude Haiku
+- Docker with buildx (for local builds) or CodeBuild (recommended)
+
+## Deployment
+
+```bash
+# 1. Configure AWS credentials
+export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
+export AWS_REGION=us-east-1
+
+# 2. Install CDK dependencies
+cd cdk
+pip install -r requirements.txt
+
+# 3. Bootstrap CDK (once per account/region)
+cdk bootstrap aws://$AWS_ACCOUNT_ID/$AWS_REGION -c account=$AWS_ACCOUNT_ID
+
+# 4. Deploy all stacks
+cdk deploy --all --require-approval broadening -c account=$AWS_ACCOUNT_ID
+
+# 5. Build and push ARM64 container images (via CodeBuild)
+cd ..
+bash scripts/build_images.sh all
+```
+
+CDK stacks (deployed in dependency order):
+
+| Stack | Resources |
+|-------|-----------|
+| `cagent-network` | VPC, 2 AZ, NAT, SG, VPC endpoints |
+| `cagent-storage` | S3 bucket (versioned), shared filesystem, mount targets |
+| `cagent-build` | CodeBuild projects for native ARM64 builds |
+| `cagent-runtime` | IAM exec roles + 4 AgentCore runtimes |
+| `cagent-gateway-policy` | Cedar authorization policies |
+| `cagent-memory` | AgentCore Memory store (per-repo lessons) |
+| `cagent-orchestrator` | Lambda Durable Function + EventBridge + SNS |
+| `cagent-monitoring` | CloudWatch alarms + dashboard |
+
+## Running a ticket
+
+```bash
+# Fire a ticket via EventBridge
+bash scripts/fire_ticket.sh TICKET-1
+
+# Or invoke directly
+printf '{"ticketId":"MY-TICKET"}' > /tmp/payload.json
+aws lambda invoke --function-name cagent-orchestrator \
+  --payload fileb:///tmp/payload.json /tmp/result.json
+cat /tmp/result.json | python3 -m json.tool
+```
+
+### Ticket format
+
+Upload to `s3://<bucket>/tickets-source/<ticketId>.json`:
+
+```json
+{
+  "id": "TICKET-101",
+  "title": "Add sorting to user list API",
+  "description": "Implement sortable user list endpoint. Clone repo, add sort parameter, write tests. Done when swift test passes.",
+  "repo_url": "https://github.com/example/my-swift-api.git"
+}
+```
+
+### What to expect
+
+1. Orchestrator validates ticket, clones repo, recalls memory lessons
+2. Coding agent plans and writes code (~30-120s)
+3. Sandbox runs tests — on failure, orchestrator retries with feedback
+4. Evaluator reviews final code, may request changes (→ another loop)
+5. Memory lessons saved, SNS notification sent (PASS/FAIL + summary)
+
+## Security model
+
+| Layer | Mechanism |
+|-------|-----------|
+| Cedar gateway policy | Authorizes which agents can invoke which runtimes |
+| Path confinement | Sandbox validates all paths via `realpath` + prefix check |
+| Env denylist | Blocks LD_PRELOAD, PATH, AWS_* from override |
+| S3 access point boundary | `rootDirectory=/work` prevents bucket escape |
+| Session isolation | Different sessions = different microVMs |
+| Control/data separation | Coding agent cannot execute locally |
+| Evaluator read-only | Review agent has no write tools |
+
+## Running tests
+
+```bash
+pip install -r requirements-dev.txt
+pytest
+```
+
+## Cleaning up
+
+```bash
+cd cdk
+cdk destroy --all
+```
+
+> **Note:** The S3 bucket is retained by default (contains artifacts). Delete manually if no longer needed.
diff --git a/...nt/01-runtime/04-coding-agents/05-autonomous-coding-agent-durable/cdk/README.md b/...nt/01-runtime/04-coding-agents/05-autonomous-coding-agent-durable/cdk/README.md
@@ -0,0 +1,159 @@
+# CDK deployment — autonomous AgentCore coding agent
+
+This `cdk/` directory stands up the **entire** system on a fresh AWS account with
+`cdk deploy --all`. It is the CDK equivalent of the `deploy/*.sh` scripts, kept in
+lock-step with them (notably the `poc/dh-gaps-swift-durable` branch: 4 runtimes,
+AgentCore Memory, SSM runtime-ARN params, and the Lambda **Durable Function**
+orchestrator). The 4th runtime, `cagent_evaluator`, is a **standalone review/evaluator
+agent** — its OWN container image and its OWN least-privilege, read-only IAM role
+(separate logs / cost / IAM from the coder), running Opus 4.8. It is no longer the
+coding-agent image repurposed via a `REVIEW_MODE` flag. There are now **4 images**
+(coding-agent, sandbox, sandbox-swift, evaluator), not 3. The only step CDK cannot do
+inline is **building the ARM64 container images** — a one-time out-of-band step (see
+"Build the images" below).
+
+## Stacks (dependency order)
+
+| Stack | Resources | Mirrors |
+|---|---|---|
+| `cagent-network` | VPC (2 AgentCore-supported AZs), 2 public + 2 private subnets, NAT, SG (self-ref NFS 2049 + HTTPS 443), 5 interface VPC endpoints (bedrock-agentcore, bedrock-runtime, ecr.api, ecr.dkr, logs) + S3 gateway endpoint | `deploy/10_vpc.sh` |
+| `cagent-storage` | Versioned S3 bucket, S3 Files sync role, **native** `AWS::S3Files::FileSystem` + 2 `MountTarget`s + broad `AccessPoint` (rootDir `/work`, uid/gid 1000), demo ticket seed | `deploy/05_s3files.sh` |
+| `cagent-build` | 4 ECR repos (coding-agent, sandbox, sandbox-swift, evaluator) + 4 CodeBuild projects (native ARM64) | `deploy/00_bootstrap.sh` (repos) + `deploy/20_build_push.sh` |
+| `cagent-memory` | `AWS::BedrockAgentCore::Memory` (semantic strategy, namespace `lessons/{actorId}`) | `deploy/06_memory.sh` |
+| `cagent-runtime` | Shared coder/sandbox exec role + **evaluator's own least-privilege read-only role** + **4** `AWS::BedrockAgentCore::Runtime` (coding_agent, sandbox, sandbox_swift, **evaluator**) + 4 SSM params `/<project>/runtime/<key>` | `deploy/00_bootstrap.sh` + `deploy/30_create_base_runtimes.sh` + `deploy/31_create_poc_runtimes.sh` |
+| `cagent-orchestrator` | Lambda **Durable Function** (python3.13, `DurableConfig` at creation) + published version + EventBridge rule → the version + SNS topic + role (durable managed policy + app inline) | `deploy/41_durable_orchestrator.sh` |
+| `cagent-monitoring` | CloudWatch alarms (errors, throttles) + dashboard | `deploy/redeploy_instrumented.sh` instrumentation |
+
+`gateway_policy_stack.py` (AgentCore Gateway + Cedar) is **intentionally not wired**
+into `app.py` — it was never part of the live shell-script deployment and is out of
+PoC scope. Left in the tree for reference only.
+
+## CFN-support findings (all native — no custom resources needed)
+
+Every piece the old shell scripts drove via preview/boto3 shims is now a **native
+CloudFormation resource type** (verified against the current CFN Template Reference,
+June 2026):
+
+- **S3 Files**: `AWS::S3Files::FileSystem | MountTarget | AccessPoint | FileSystemPolicy`
+  were added to CloudFormation **2026-04-14** — *after* the live shell deploy, which is
+  why `deploy/05_s3files.sh` had to use the `deploy/s3files_boto.py` boto3 shim (the API
+  was not in the installed CLI). CDK uses raw `CfnResource` against the verified PascalCase
+  schemas. No custom resource required.
+- **AgentCore Runtime**: `AWS::BedrockAgentCore::Runtime` (native; the repo already learned
+  this). `FilesystemConfigurations` accepts `S3FilesAccessPoint{AccessPointArn,MountPath}`
+  and `SessionStorage{MountPath}`; `NetworkConfiguration{NetworkMode,NetworkModeConfig{Subnets,SecurityGroups}}`.
+- **AgentCore Memory**: `AWS::BedrockAgentCore::Memory` (native), `EventExpiryDuration` is an
+  Integer, `MemoryStrategies[].SemanticMemoryStrategy{Name,Namespaces}`.
+- **Lambda Durable Functions**: native CDK L2 support — `aws_lambda.Function(durable_config=
+  DurableConfig(execution_timeout=..., retention_period=...))` (synthesizes the `DurableConfig`
+  property; requires aws-cdk-lib ≥ 2.258). The role attaches the AWS managed policy
+  `service-role/AWSLambdaBasicDurableExecutionRolePolicy`. EventBridge targets the **published
+  version** (durable functions must be invoked via a qualified ARN), not `$LATEST`.
+
+There is **no residual script-only step** for infrastructure. The only out-of-band step is
+image builds (which `cdk deploy` cannot run inline regardless).
+
+## AZ constraint (important)
+
+AgentCore Runtime VPC mode rejects subnets in unsupported Availability **Zone-IDs** (the
+constraint is by zone-id, not zone-name, and the name→id mapping differs per account). On
+the target account **123456789012** the live deployment runs in `us-east-1a` (use1-az2) +
+`us-east-1b` (use1-az4) — both supported here (verified via the live subnets in
+`deploy/config.env` and `aws ec2 describe-subnets … AvailabilityZoneId`). `network_stack.py`
+pins those two AZ names. On a different account, override:
+
+```bash
+cdk deploy cagent-network -c agentcore_azs="us-east-1a,us-east-1d"
+```
+
+Pick AZ names that resolve to AgentCore-supported zone-ids on *your* account.
+
+## Prerequisites
+
+- Python 3.13 venv with `aws-cdk-lib>=2.258`, `boto3>=1.43`, `jsii`, and the
+  `aws-durable-execution-sdk-python` package available to pip (the orchestrator Lambda is
+  bundled locally — no Docker needed at synth time; falls back to the PYTHON_3_13 bundling
+  image if local pip is unavailable).
+- CDK CLI **≥ 2.1126** (the cloud-assembly schema for aws-cdk-lib 2.258 is v54).
+- AWS credentials for the target account/region (us-east-1).
+
+```bash
+cd cdk
+pip install -r requirements.txt
+```
+
+## Deploy sequence
+
+```bash
+export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
+
+# 1) Bootstrap the CDK environment (once per account/region)
+cdk bootstrap aws://$AWS_ACCOUNT_ID/us-east-1 -c account=$AWS_ACCOUNT_ID
+
+# 2) Stand up network + storage + build (ECR repos must exist before images push) + memory
+cdk deploy cagent-network cagent-storage cagent-build cagent-memory \
+  -c account=$AWS_ACCOUNT_ID --require-approval broadening
+
+# 3) Build + push the 4 ARM64 images (the ONLY out-of-band step) — see below
+
+# 4) Deploy the runtimes (need images in ECR), orchestrator, monitoring
+cdk deploy cagent-runtime cagent-orchestrator cagent-monitoring \
+  -c account=$AWS_ACCOUNT_ID --require-approval broadening
+```
+
+`cdk deploy --all -c account=$AWS_ACCOUNT_ID` works in one shot too, **provided the four
+images are already in ECR** when the runtime stack creates the runtimes (the
+`AWS::BedrockAgentCore::Runtime` resource pins the image at creation). On a truly fresh
+account, deploy `cagent-build` first, push images, then `--all`.
+
+### Build the images (out-of-band, ARM64 only)
+
+The `cagent-build` stack creates a CodeBuild project per image. After it exists, upload the
+build contexts and start the builds (e.g. via `scripts/build_images.sh`, which zips the
+`coding-agent/`, `sandbox/`, and `evaluator-agent/` contexts to
+`s3://<bucket>/build-artifacts/*.zip` and runs `aws codebuild start-build`). The swift
+sandbox reuses the `sandbox/` context but builds `Dockerfile.swift`. Alternatively
+build/push locally with `deploy/20_build_push.sh all`.
+
+The evaluator runtime is a **standalone agent with its own image** (`evaluator-agent/`,
+its own `Dockerfile`) running Opus 4.8 under a least-privilege read-only role — it no
+longer reuses the coding-agent image and there is no `REVIEW_MODE` flag. **4 images total**
+(coding-agent, sandbox, sandbox-swift, evaluator).
+
+To point a runtime at a specific image URI/digest instead of `:latest`:
+
+```bash
+cdk deploy cagent-runtime -c account=$AWS_ACCOUNT_ID \
+  -c coding_agent_image=<uri> -c sandbox_image=<uri> \
+  -c sandbox_swift_image=<uri> -c evaluator_image=<uri>
+```
+
+### Seed the Swift demo repo (optional)
+
+The storage stack seeds the demo ticket JSONs (`tickets-source/TICKET-1.json`,
+`RAINBOW-1.json`). The sample source repo is **not** vendored or pre-seeded — each ticket
+carries a `repo_url`, and the hydrate step `git clone`s it into the work dir inside the
+sandbox on demand (real public repo). Nothing third-party lives in this repo or in S3.
+
+## Fire a ticket
+
+```bash
+bash scripts/fire_ticket.sh RAINBOW-1     # EventBridge cagent.tickets / TicketCreated -> durable orchestrator
+```
+
+## Context / config
+
+- `-c account=<id>` (or `CDK_DEFAULT_ACCOUNT`) — required.
+- `-c region=<region>` (default `us-east-1`), `-c project=<prefix>` (default `cagent`).
+- `-c agentcore_azs="az1,az2"` — override the two VPC AZs.
+- `-c notification_email=<addr>` — subscribe an email to the SNS results topic.
+- `-c coding_agent_image=… -c sandbox_image=… -c sandbox_swift_image=… -c evaluator_image=…` — pin image URIs.
+
+## Verify it synthesizes
+
+```bash
+cdk synth -c account=$AWS_ACCOUNT_ID        # synthesizes all 7 stacks
+```
+
+(`cdk synth` does an `availability-zones` context lookup the first time — it needs read-only
+AWS creds, or a pre-populated `cdk.context.json`.)