Draft
24 commits
085964a
feat: add dynamic assume role support via scope-configurations provider
Jun 1, 2026
d9ff61b
feat: add requirements module with IAM policies for Lambda scope oper…
Jun 1, 2026
52cba87
chore: set ASSUME_ROLE_ARN_DEFAULT for testing
Jun 1, 2026
d49d2b9
fix: correct nullplatform provider version constraint in specs/tofu
Jun 1, 2026
33842ec
fix: surface sts:AssumeRole errors to stdout for visibility in NP logs
Jun 1, 2026
14d3ad5
fix: use exact PLACEHOLDER_IMAGE_URI when explicitly set, skip arch s…
Jun 1, 2026
fc8bc76
fix: remove automatic arch suffix from placeholder image URI
Jun 1, 2026
3f89288
fix: read TOFU_STATE_BUCKET from .provider.aws_state_bucket as fallback
Jun 1, 2026
97121e4
fix(iam): prefix lambda execution role with np-lambda- to match requi…
Jun 2, 2026
684d9f7
fix(tofu): surface tofu apply stderr to stdout for visibility in NP logs
Jun 2, 2026
b9e41d3
fix(iam): add modern CloudWatch Logs tagging actions to lambda requir…
Jun 2, 2026
bd26af4
feat(placeholder): make placeholder image configurable via PLACEHOLDE…
Jun 2, 2026
2dc0a3e
fix(deploy): ensure Lambda pull policy on the image's ECR repo before…
Jun 2, 2026
c04e9cc
fix(deploy): add missing diagnose.yaml workflow for diagnose-deployme…
Jun 2, 2026
bbd31a0
chore: remove account-specific defaults from values.yaml
Jun 2, 2026
e186fa2
docs: explain placeholder image config and restore PLACEHOLDER_IMAGE_…
Jun 3, 2026
41ed944
refactor(setup): consolidate install tofu under lambda/setup and merg…
Jun 4, 2026
779eef9
feat(assume-role): resolve role ARN from nullplatform IAM provider by…
Jun 8, 2026
4ff57c4
chore(values): set PLACEHOLDER_IMAGE_URI_DEFAULT for this installation
Jun 8, 2026
5109c0e
feat(workflows): assume IAM role via dedicated first step in every wo…
Jun 9, 2026
23e2515
feat(iam): make Lambda execution-role prefix configurable
Jun 9, 2026
0b68476
chore(do_tofu): remove the stderr-redirect explanation comment
Jun 9, 2026
51836a5
refactor: move install tofu module from lambda/setup to lambda/specs/…
Jun 9, 2026
6752bee
fix(specs/tofu): bump aws provider constraint to ~> 6.47.0
Jun 10, 2026
62 changes: 62 additions & 0 deletions README.md
@@ -288,6 +288,67 @@ LOG_RETENTION_DAYS: 30
PARAMETERS_STRATEGY: "env" # or "secretsmanager"
```

### Placeholder Image (Scope Bootstrap)

When a scope is created, the Lambda function and its IAM role must exist **before**
the first real deployment — otherwise aliases, networking, and IAM have nothing to
attach to. To bootstrap this, `create-scope` provisions a throwaway **placeholder**
function that the first deployment then overwrites with the real code.

How the placeholder is sourced depends on the scope's **package type**:

- **Zip** — fully self-contained. A minimal handler ships pre-built and
base64-encoded in the repo (`scope/placeholder/placeholder_lambda.zip.b64`) and is
used automatically. **No configuration needed.**
- **Image** — the placeholder must be a container image, and this is where
`PLACEHOLDER_IMAGE_URI_DEFAULT` comes in.

#### Why `PLACEHOLDER_IMAGE_URI_DEFAULT` is needed for Image scopes

A Lambda function with `PackageType=Image` can only pull from a **private ECR
repository in the same region** (in the same account, or in another account whose
repository policy grants Lambda access). Lambda rejects `public.ecr.aws`
images at function-creation time. The built-in default in
`scope/scripts/resolve_placeholder_image` points at a public image
(`public.ecr.aws/nullplatform/aws-lambda/nullplatform-lambda-placeholder:latest`),
which is fine to *validate* but cannot actually back a real Lambda function.

So for Image-based scopes you **must** mirror a placeholder into your own private
ECR and point the scope at it. The image must also be **single-arch matching the
scope architecture** (`-amd64` for `x86_64`, `-arm64` for `arm64`) — Lambda does
not accept multi-arch manifest lists.
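
As a rough pre-flight check for the rule above, the sketch below tests whether a URI has the shape of a private ECR image reference. `is_private_ecr_uri` is a hypothetical helper, not something shipped in this repo, and the pattern is an assumption based on the usual `<account>.dkr.ecr.<region>.amazonaws.com/<repo>` registry format. It says nothing about whether the image exists or is single-arch.

```bash
# Hypothetical helper (assumption, not part of this repo).
# Succeeds when the URI has the shape of a private ECR image reference.
is_private_ecr_uri() {
  local uri="$1"
  # <12-digit account>.dkr.ecr.<region>.amazonaws.com[.cn]/<repository...>
  local pattern='^[0-9]{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com(\.cn)?/.+$'
  printf '%s\n' "$uri" | grep -Eq "$pattern"
}

if ! is_private_ecr_uri "public.ecr.aws/nullplatform/aws-lambda/nullplatform-lambda-placeholder:latest"; then
  echo "public image: mirror it into a private ECR repository first"
fi
```

A check like this could run before `create-scope` to fail early with a readable message instead of waiting for Lambda to reject the image.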

#### Resolution precedence

The placeholder image URI is resolved in this order (first match wins):

1. scope-configurations provider key `deployment.placeholder_image_uri` — per-scope,
managed without code
2. `PLACEHOLDER_IMAGE_URI_DEFAULT` env var — the **account-wide** knob, set in
`values.yaml` or via the agent's `extra_envs` (Helm)
3. the public default in `scope/scripts/resolve_placeholder_image` (validation-only
fallback; not usable for real Image functions)

Because the URI is account-specific, `values.yaml` ships it commented out — set it
once per installation and every Image scope in that account uses it, unless a
specific scope overrides it via the provider key.
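
The precedence above can be sketched as a small shell function. This is an illustration only: `resolve_placeholder_uri` is a hypothetical name, and passing the provider value in as an argument is an assumption; the repo's own resolution script may be structured differently.

```bash
# Hypothetical sketch of the documented precedence (not the repo's actual code).
PUBLIC_DEFAULT="public.ecr.aws/nullplatform/aws-lambda/nullplatform-lambda-placeholder:latest"

resolve_placeholder_uri() {
  # $1: value of deployment.placeholder_image_uri from the provider (may be empty)
  local provider_value="$1"
  local env_default="${PLACEHOLDER_IMAGE_URI_DEFAULT:-}"

  if [ -n "$provider_value" ]; then
    echo "$provider_value"      # 1. per-scope override
  elif [ -n "$env_default" ]; then
    echo "$env_default"         # 2. account-wide default
  else
    echo "$PUBLIC_DEFAULT"      # 3. validation-only fallback
  fi
}
```

Calling `resolve_placeholder_uri ""` with `PLACEHOLDER_IMAGE_URI_DEFAULT` unset yields the public default, which is the case the troubleshooting entry for Image scopes is meant to catch.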

#### Publishing a placeholder image

Use the helper script to build and push the single-arch placeholders to your private
ECR (it creates the repository if it does not exist):

```bash
export PLACEHOLDER_IMAGE_REPO=123456789012.dkr.ecr.us-east-1.amazonaws.com/aws-lambda/nullplatform-lambda-placeholder
lambda/scope/placeholder/publish # pushes <repo>:latest-arm64 and <repo>:latest-amd64
```

Then set the URI (matching your scope architecture) in `values.yaml` or the agent's
`extra_envs`:

```yaml
PLACEHOLDER_IMAGE_URI_DEFAULT: "123456789012.dkr.ecr.us-east-1.amazonaws.com/aws-lambda/nullplatform-lambda-placeholder:latest-arm64"
```
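
To avoid hand-typing the tag, the URI can be derived from the repository and the scope architecture. A minimal sketch, assuming the `latest-amd64` and `latest-arm64` tags that the publish script pushes; `placeholder_uri_for_arch` is a hypothetical helper name.

```bash
# Hypothetical helper (assumption): build the placeholder URI for a Lambda architecture.
placeholder_uri_for_arch() {
  local repo="$1" architecture="$2" suffix
  case "$architecture" in
    x86_64) suffix="amd64" ;;   # Lambda says x86_64, image tags use amd64
    arm64)  suffix="arm64" ;;
    *) echo "unsupported architecture: $architecture" >&2; return 1 ;;
  esac
  echo "${repo}:latest-${suffix}"
}

placeholder_uri_for_arch \
  "123456789012.dkr.ecr.us-east-1.amazonaws.com/aws-lambda/nullplatform-lambda-placeholder" \
  "x86_64"
```

The printed value is what would go into `PLACEHOLDER_IMAGE_URI_DEFAULT` for an `x86_64` installation.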

### Resource Naming

| Resource | Format | Example |
@@ -507,6 +568,7 @@ export TOFU_LOCK_TABLE=my-lock-table
| Issue | Cause | Solution |
|-------|-------|----------|
| "Function name too long" | Name exceeds 64 chars | Shorten namespace/application/scope slugs |
| "Placeholder image not found" | Image scope with no private placeholder published | Run `lambda/scope/placeholder/publish` and set `PLACEHOLDER_IMAGE_URI_DEFAULT` (see [Placeholder Image](#placeholder-image-scope-bootstrap)) |
| "Provisioned concurrency timeout" | Warmup taking too long | Increase `PROVISIONED_CONCURRENCY_MAX_WAIT_SECONDS` |
| "ALB listener rule capacity" | Too many rules on ALB | Increase `ALB_LISTENER_RULE_CAPACITY` in values.yaml |
| "Module not composed" | `MODULES_TO_USE` not updated | Verify setup script appends to `MODULES_TO_USE` |
16 changes: 16 additions & 0 deletions lambda/deployment/scripts/update_function_code
@@ -34,6 +34,22 @@ if [ "$package_type" = "Image" ]; then
fi
log debug " ✅ image_uri=$IMAGE_URI"

# Ensure the image's ECR repo lets the Lambda service pull it. Container-image
# Lambdas require a repository policy granting lambda.amazonaws.com; without it
# update-function-code fails with "Lambda does not have permission to access
# the ECR image". Idempotent and best-effort (cross-account repos may not be
# writable from here — Lambda would then need the policy set on the source side).
if [[ "$IMAGE_URI" == *.dkr.ecr.*.amazonaws.com/* ]]; then

Review comment (Contributor):
I don't understand why we need this. What happens if the URI is not an amazonaws.com one?

Reply (Author):
Lambda with Docker only works with images from ECR: https://docs.aws.amazon.com/es_es/lambda/latest/dg/images-create.html

ecr_region=$(echo "${IMAGE_URI%%/*}" | cut -d. -f4)
ecr_repo="${IMAGE_URI#*/}"; ecr_repo="${ecr_repo%%:*}"; ecr_repo="${ecr_repo%%@*}"
lambda_pull_policy='{"Version":"2008-10-17","Statement":[{"Sid":"LambdaECRImageRetrievalPolicy","Effect":"Allow","Principal":{"Service":"lambda.amazonaws.com"},"Action":["ecr:BatchGetImage","ecr:GetDownloadUrlForLayer"]}]}'
if aws ecr set-repository-policy --repository-name "$ecr_repo" --region "$ecr_region" --policy-text "$lambda_pull_policy" >/dev/null 2>&1; then
log debug " ✅ ensured Lambda pull policy on ECR repo $ecr_repo"
else
log warn " ⚠️ could not set Lambda pull policy on ECR repo $ecr_repo (continuing; pull may fail if not already allowed)"

Review comment (Contributor):
Is it confirmed that this can actually work? If it is certain to fail afterwards, I would throw an error.

Reply (Author):
It is a warning for this reason. The case it protects is cross-account: if the image lives in an ECR in another account, the assumed role may not have permission to write the policy there, but if that policy is already set on the repo owner's side, Lambda can still pull and the deployment works. Today, set-repository-policy fails → warning → update-function-code still succeeds. If I make it fail hard, that cross-account deployment (which would otherwise work) would break unnecessarily.

fi
fi

update_output=$(aws lambda update-function-code \
--function-name "$LAMBDA_FUNCTION_NAME" \
--image-uri "$IMAGE_URI" \
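
The parameter expansions in the hunk above can be exercised in isolation. The sketch below wraps the same steps in a function so the parsing can be checked against sample URIs; `parse_ecr_image_uri` is a hypothetical name introduced here, and the sample repository is made up.

```bash
# Hypothetical wrapper (assumption) around the same expansions used in the hunk.
# Prints "<region> <repository>" for a private ECR image URI.
parse_ecr_image_uri() {
  local image_uri="$1"
  local registry="${image_uri%%/*}"          # 123456789012.dkr.ecr.us-east-1.amazonaws.com
  local region
  region=$(echo "$registry" | cut -d. -f4)   # fourth dot-separated field is the region
  local repo="${image_uri#*/}"               # drop the registry host
  repo="${repo%%:*}"                         # drop a :tag, if present
  repo="${repo%%@*}"                         # drop an @sha256 digest, if present
  echo "$region $repo"
}

parse_ecr_image_uri "123456789012.dkr.ecr.us-east-1.amazonaws.com/my-org/my-app:v1.2.3"
# prints: us-east-1 my-org/my-app
```

Stripping at the first `:` before stripping at `@` matters for digest references, since `repo@sha256:abc` contains both separators.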
10 changes: 10 additions & 0 deletions lambda/deployment/workflows/blue_green.yaml
@@ -3,6 +3,16 @@ include:
configuration:
  DEPLOYMENT_STRATEGY: "blue_green"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/deployment/build_context"
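
The contents of `utils/assume_role_step` are not part of this diff. As a rough idea of what a step that outputs these three variables could do, here is a sketch that calls `aws sts assume-role` and exports the temporary credentials. The function name, the default session name, and the assumption that the workflow engine captures exported variables are all hypothetical.

```bash
# Hypothetical sketch (assumption): not the repo's actual assume_role_step.
assume_role_exports() {
  local role_arn="$1" session_name="${2:-np-lambda-scope}"
  local creds key secret token

  # --query with --output text returns the three fields tab-separated on one line.
  if ! creds=$(aws sts assume-role \
      --role-arn "$role_arn" \
      --role-session-name "$session_name" \
      --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
      --output text); then
    echo "sts:AssumeRole failed for $role_arn" >&2
    return 1
  fi

  # Split the single line on whitespace into the three credential parts.
  read -r key secret token <<EOF
$creds
EOF
  export AWS_ACCESS_KEY_ID="$key" AWS_SECRET_ACCESS_KEY="$secret" AWS_SESSION_TOKEN="$token"
}

# Usage (requires real AWS credentials, so it is left commented out):
# assume_role_exports "arn:aws:iam::123456789012:role/example-role"
```

Running the step first means every later step in the workflow inherits the same session instead of each script assuming the role on its own.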
10 changes: 10 additions & 0 deletions lambda/deployment/workflows/delete.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/deployment/build_context"
41 changes: 41 additions & 0 deletions lambda/deployment/workflows/diagnose.yaml
@@ -0,0 +1,41 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/diagnose/build_context"
    output:
      - name: SCOPE_ID
        type: environment
      - name: SCOPE_NRN
        type: environment
      - name: LAMBDA_FUNCTION_NAME
        type: environment
      - name: LAMBDA_FUNCTION_ARN
        type: environment
      - name: LAMBDA_ROLE_ARN
        type: environment
      - name: SCOPE_DOMAIN
        type: environment
  - name: diagnose
    type: executor
    before_each:
      name: notify_check_running
      type: script
      file: "$SERVICE_PATH/diagnose/notify_check_running"
    after_each:
      name: notify_check_results
      type: script
      file: "$SERVICE_PATH/diagnose/notify_results"
    folders:
      - "$SERVICE_PATH/diagnose/checks"
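
The `executor` step above appears to run every check under `folders`, wrapping each one with the `before_each` and `after_each` scripts. The engine's real behaviour is not shown in this diff, so the loop below is only a guess at the control flow; `run_checks` and its hook arguments are hypothetical.

```bash
# Hypothetical sketch (assumption) of an executor loop with before/after hooks.
# $1: directory of check scripts, $2: before hook command, $3: after hook command.
# Returns non-zero if at least one check failed.
run_checks() {
  local checks_dir="$1" before_hook="$2" after_hook="$3"
  local check failed=0

  for check in "$checks_dir"/*; do
    [ -f "$check" ] || continue      # skip subdirectories and empty globs
    "$before_hook" "$check"          # e.g. notify that the check is running
    if ! sh "$check"; then
      failed=$((failed + 1))         # keep going so every check still reports
    fi
    "$after_hook" "$check"           # e.g. publish the check's result
  done

  [ "$failed" -eq 0 ]
}
```

In this sketch a failing check does not stop the loop, on the assumption that a diagnosis is more useful when every check gets to report.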
10 changes: 10 additions & 0 deletions lambda/deployment/workflows/finalize.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/deployment/build_context"
10 changes: 10 additions & 0 deletions lambda/deployment/workflows/initial.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/deployment/build_context"
10 changes: 10 additions & 0 deletions lambda/deployment/workflows/rollback.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/deployment/build_context"
10 changes: 10 additions & 0 deletions lambda/deployment/workflows/switch_traffic.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/deployment/build_context"
4 changes: 4 additions & 0 deletions lambda/diagnose/build_context
@@ -16,6 +16,10 @@ fi

source "$SERVICE_PATH/utils/lambda_function_name"

# NOTE: The IAM role is assumed by the dedicated `assume_role` step that runs
# first in the workflow (see utils/assume_role_step); credentials are already in
# the environment here.

lambda_info=$(aws lambda get-function --function-name "$LAMBDA_FUNCTION_NAME" --output json 2>/dev/null || echo "{}")
LAMBDA_FUNCTION_ARN=$(echo "$lambda_info" | jq -r '.Configuration.FunctionArn // ""')
LAMBDA_ROLE_ARN=$(echo "$lambda_info" | jq -r '.Configuration.Role // ""')
11 changes: 9 additions & 2 deletions lambda/installation.md
@@ -29,17 +29,24 @@ git clone https://github.com/nullplatform/tofu-modules /root/.np/nullplatform/to
### 2. Configure variables

```bash
-cd lambda/tofu
+cd lambda/specs/tofu
cp terraform.tfvars.example terraform.tfvars
```

-Edit `terraform.tfvars` with your values:
+This module registers the scope type **and** provisions the IAM policies the
+agent needs to operate Lambda scopes (formerly the separate `requirements`
+module, now consolidated here). Edit `terraform.tfvars` with your values:

| Variable | Required | Description |
|---|---|---|
| `nrn` | ✅ | Nullplatform Resource Name (`organization:account`) |
| `np_api_key` | ✅ | Nullplatform API key |
| `tags_selectors` | ✅ | Tags to select the agent (e.g. `{ environment = "production" }`) |
| `name` | ✅ | Unique identifier for IAM policy naming (account-global, e.g. `prod-us-east-1`) |
| `aws_region` | — | AWS provider region. IAM is global; leave unset to resolve from the environment |
| `create_role` | — | `true` to create a new IAM role and attach the Lambda policies to it |
| `trusted_arns` | — | Principal ARNs allowed to assume the created role (with `create_role = true`) |
| `role_name` | — | Existing IAM role to attach the Lambda policies to (instead of `create_role`) |
| `github_branch` | — | Branch to fetch specs from (default: `main`) |
| `repo_path` | — | Path where scopes-lambda is cloned on the agent |
| `overrides_enabled` | — | Set `true` to enable config overrides from scopes-networking |
10 changes: 10 additions & 0 deletions lambda/instance/workflows/list.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/instance/build_context"
10 changes: 10 additions & 0 deletions lambda/log/workflows/log.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/log/build_context"
10 changes: 10 additions & 0 deletions lambda/metric/workflows/list.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: list_metrics
    type: script
    file: "$SERVICE_PATH/metric/list_metrics"
10 changes: 10 additions & 0 deletions lambda/metric/workflows/metric.yaml
@@ -1,6 +1,16 @@
include:
  - "$SERVICE_PATH/values.yaml"
steps:
  - name: assume_role
    type: script
    file: "$SERVICE_PATH/utils/assume_role_step"
    output:
      - name: AWS_ACCESS_KEY_ID
        type: environment
      - name: AWS_SECRET_ACCESS_KEY
        type: environment
      - name: AWS_SESSION_TOKEN
        type: environment
  - name: build_context
    type: script
    file: "$SERVICE_PATH/metric/build_context"
4 changes: 2 additions & 2 deletions lambda/prerequisites.md
@@ -229,7 +229,7 @@ Agents run in a Kubernetes pod and authenticate to AWS via a **Service Account**
The IAM policies above let the agent CREATE Lambda functions and target
groups, but the `create-scope` workflow ALSO depends on three runtime
artifacts that must exist BEFORE the first scope is created. None are
-auto-created by the bundled `install/tofu/main.tf` today; the operator
+auto-created by the bundled `specs/tofu/main.tf` today; the operator
must provision them.

### 1. Placeholder image (private ECR)
@@ -383,7 +383,7 @@ This applies to **every** ECR repository that ever stores a Lambda
image:

1. The placeholder ECR (created during installation, addressed by
-`lambda/tofu/main.tf` if you use the bundled module; the policy is
+`lambda/specs/tofu/main.tf` if you use the bundled module; the policy is
already applied there).
2. **The per-application ECR repositories** that `np asset push`
creates dynamically when each app does its first build, named
2 changes: 1 addition & 1 deletion lambda/scope/placeholder/publish
@@ -51,7 +51,7 @@ if ! docker buildx version &>/dev/null; then
fi

# Extract registry host and region from IMAGE_REPO
-ECR_REGISTRY=$(echo "$IMAGE_REPO" | cut -d/ -f1) # 688720756067.dkr.ecr.us-east-1.amazonaws.com
+ECR_REGISTRY=$(echo "$IMAGE_REPO" | cut -d/ -f1) # 123456789012.dkr.ecr.us-east-1.amazonaws.com
ECR_REGION=$(echo "$ECR_REGISTRY" | cut -d. -f4) # us-east-1
ECR_REPO_NAME=$(echo "$IMAGE_REPO" | cut -d/ -f2-) # aws-lambda/nullplatform-lambda-placeholder

11 changes: 6 additions & 5 deletions lambda/scope/scripts/resolve_placeholder_image
@@ -38,14 +38,15 @@ log info "🔍 Resolving placeholder image URI..."
placeholder_image_base="${PLACEHOLDER_IMAGE_URI:-public.ecr.aws/nullplatform/aws-lambda/nullplatform-lambda-placeholder:latest}"
architecture="${ARCHITECTURE:-arm64}"

-# Lambda uses "x86_64" but images are tagged with Docker convention "amd64"
-arch_tag="${architecture}"
-[ "$architecture" = "x86_64" ] && arch_tag="amd64"
log debug " 📋 architecture=$architecture"

# Use the image URI as-is. If PLACEHOLDER_IMAGE_URI is not set, the default
# :latest tag is used without any architecture suffix — publish arch-specific
# tags and set PLACEHOLDER_IMAGE_URI explicitly if needed.
Review comment (Contributor), on lines +43 to +45:
Let's remove this comment.

if [[ "$placeholder_image_base" == *":"* ]]; then
-placeholder_image_uri="${placeholder_image_base}-${arch_tag}"
+placeholder_image_uri="$placeholder_image_base"
else
-placeholder_image_uri="${placeholder_image_base}:latest-${arch_tag}"
+placeholder_image_uri="${placeholder_image_base}:latest"
fi

log debug " 📋 architecture=$architecture"
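
One detail of the tag check in this hunk: `*":"*` matches a colon anywhere in the URI, so a registry with a port (for example `localhost:5000/repo`) would be treated as already tagged. ECR and `public.ecr.aws` hosts carry no port, so the script's inputs are probably unaffected. The sketch below shows a stricter variant that only inspects the last path segment; `ensure_image_tag` is a hypothetical name and not part of this change.

```bash
# Hypothetical variant (assumption): look for a tag or digest only after the last "/".
ensure_image_tag() {
  local uri="$1"
  local last_segment="${uri##*/}"    # repository name plus optional tag or digest
  case "$last_segment" in
    *:*|*@*) echo "$uri" ;;          # already tagged or pinned by digest: use as-is
    *)       echo "${uri}:latest" ;; # no tag: fall back to :latest
  esac
}

ensure_image_tag "localhost:5000/placeholder"
# prints: localhost:5000/placeholder:latest
```

For the URIs documented in the README both approaches give the same result; the difference only shows up with registries that include a port.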