feat: dynamic assume-role support for the AWS S3 service (+ IAM fixes & cleanup) #11

Draft
davidf-null wants to merge 15 commits into main from feature/dynamic-assume-role

davidf-null (Contributor) commented May 29, 2026

Summary

Adds dynamic assume-role support to the AWS S3 dependency service so the agent
can run AWS operations under a per-service / cross-account IAM role instead of only
its base IRSA identity, plus the IAM-permission and link-workflow fixes found while
testing it, and a final security/docs hardening pass.

Changes

Dynamic assume role

  • New assume_role_arn in values.yaml. When set, the agent calls sts:AssumeRole
    with its IRSA identity and uses the temporary credentials for all subsequent AWS
    calls (CLI + Terraform). Empty → IRSA is used directly.
  • Hard-fail (no silent fallback) if the assume-role call fails.
  • Credential isolation: link actions (build_permissions_context) unset any
    inherited AWS_* credentials before sourcing assume_role, so the call always
    starts from the IRSA identity and never tries to re-assume an already-assumed role
    (self-assume failure).
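
A minimal sketch of the flow described above, assuming a bash helper that wraps the AWS CLI. The function names `clear_inherited_aws_credentials` and `assume_role_or_fail` and the session name are hypothetical and are not taken from this PR's code. The sketch clears inherited temporary credentials, calls `aws sts assume-role`, and returns non-zero instead of falling back to IRSA when the call fails.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and the session name are illustrative only.

# Drop inherited temporary credentials so the next STS call starts from the
# base identity. The IRSA variables (AWS_ROLE_ARN, AWS_WEB_IDENTITY_TOKEN_FILE)
# are deliberately left in place.
clear_inherited_aws_credentials() {
  unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_SESSION_TOKEN
}

# Assume the given role and export the temporary credentials.
# An empty ARN is a no-op, so the base identity is used directly.
assume_role_or_fail() {
  local role_arn="${1:-}"
  [ -n "$role_arn" ] || return 0

  clear_inherited_aws_credentials

  local creds
  if ! creds="$(aws sts assume-role \
      --role-arn "$role_arn" \
      --role-session-name "s3-agent-session" \
      --query 'Credentials.[AccessKeyId,SecretAccessKey,SessionToken]' \
      --output text)"; then
    echo "ERROR: sts:AssumeRole failed for ${role_arn}; aborting (no IRSA fallback)"
    return 1
  fi

  # With --output text the three values arrive tab-separated on one line.
  local key secret token
  read -r key secret token <<<"$creds"
  if [ -z "$key" ] || [ -z "$secret" ] || [ -z "$token" ]; then
    echo "ERROR: sts:AssumeRole returned incomplete credentials"
    return 1
  fi

  export AWS_ACCESS_KEY_ID="$key"
  export AWS_SECRET_ACCESS_KEY="$secret"
  export AWS_SESSION_TOKEN="$token"
}
```

A caller would typically run `assume_role_or_fail "$ASSUME_ROLE_ARN" || exit 1` before any other AWS call, where `ASSUME_ROLE_ARN` stands in for however the configured value reaches the script.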

Requirements module — optional role creation

  • create_role (bool) to optionally create the IAM role inside the module.
  • trusted_arns (list) for the principals allowed to sts:AssumeRole.
  • Exposes role_arn / role_name outputs. Backwards compatible: existing
    role_name behavior unchanged.

    module "s3_requirements" {
      source       = "./requirements"
      name         = "prod-us-east-1"
      create_role  = true
      trusted_arns = ["arn:aws:iam::123456789012:role/my-other-role"]
    }
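
For orientation, the trust policy that `trusted_arns` implies has the standard IAM shape: each listed principal is allowed `sts:AssumeRole`. The sketch below renders such a document from a list of ARNs. `render_trust_policy` is a hypothetical helper for illustration only; the module itself presumably builds the policy in Terraform, and its exact statement layout is not shown in this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print an IAM trust policy that lets each given
# principal ARN call sts:AssumeRole on the role.
render_trust_policy() {
  if [ "$#" -eq 0 ]; then
    echo "ERROR: at least one trusted ARN is required"
    return 1
  fi

  local principals="" arn
  for arn in "$@"; do
    # Accumulate a comma-separated list of JSON strings.
    principals="${principals:+${principals}, }\"${arn}\""
  done

  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "AWS": [${principals}] },
      "Action": "sts:AssumeRole"
    }
  ]
}
EOF
}
```

Calling `render_trust_policy "arn:aws:iam::123456789012:role/my-other-role"` prints a document whose `Principal.AWS` list contains that single ARN.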

IAM permission fixes (required by the AWS provider)

The provider refreshes the full bucket configuration on every plan, so the
bucket-management policy now grants the complete s3:Get* read set:

  • Added missing bucket read permissions.
  • Corrected action names: s3:GetReplicationConfiguration,
    s3:GetEncryptionConfiguration / s3:PutEncryptionConfiguration.

Link / state fixes

  • build_context: always export the link context vars, and print errors to stdout
    (previously stderr) so they are visible in NP logs.
  • tofu init -reconfigure to handle stale backend state in OUTPUT_DIR.
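
A minimal sketch of the "always export" behavior, assuming the context is a JSON document parsed with `jq`. The function name `export_link_context` and the JSON path `.link.id` are hypothetical, since the real layout of `$CONTEXT` is not shown in this PR. The `// ""` alternative operator is what yields an empty string for non-link actions instead of a missing variable.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: the JSON path .link.id is an assumption about $CONTEXT.
export_link_context() {
  local context="${1:-}"
  [ -n "$context" ] || context='{}'

  local link_id
  # `// ""` falls back to an empty string when the field is absent or null,
  # so LINK_ID is always defined, even for non-link actions.
  if ! link_id="$(jq -r '.link.id // ""' <<<"$context")"; then
    # Printed on stdout so the message shows up in the workflow logs.
    echo "ERROR: could not parse CONTEXT as JSON"
    return 1
  fi

  export LINK_ID="$link_id"
}
```

With this shape, a step that declares `LINK_ID` as an output receives an empty value on create or update actions rather than failing on an unset variable.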

Security & docs hardening

  • Removed the hardcoded testing ARN from values.yaml; assume_role_arn now
    defaults to "" so the published service stays account-agnostic. The
    account-specific ARN is provided per installation via --overrides-path.
  • Deleted local np-api-skill.{key,key.admin,token} credential files and added
    np-api-skill.*, *.key, *.token, *.pem, .claude/ to .gitignore so
    credentials can never be committed.
  • README: documented credential isolation before assume role, expanded the agent
    IAM permissions section (three policies + why the full read set is needed), and
    clarified the account-agnostic override note.

Note: the real account ID 235494813897 was introduced for testing in
445c5c8 and remains in the branch history; it is removed from the working tree
but not scrubbed from history (account IDs are low-sensitivity).

Test plan

  • create_role = false + role_name set → existing behavior unchanged
  • create_role = true + trusted_arns populated → role created with correct
    trust policy and all 3 policies attached; role_arn output correct
  • assume_role_arn empty → IRSA used directly end-to-end
  • assume_role_arn set (cross-account) → create/update/link/unlink all run
    under the assumed role; link actions do not hit a self-assume failure
  • Bad assume_role_arn → workflow aborts immediately (no IRSA fallback)
  • Plan/apply succeeds with the expanded s3:Get* read permissions

🤖 Generated with Claude Code

David Fernandez and others added 3 commits May 28, 2026 12:07
When assume_role_arn is set, the agent's base credentials (IRSA or static)
are used only to call sts:AssumeRole; all subsequent AWS calls (CLI +
Terraform) run under the target role. When empty, IRSA is used directly,
preserving backward compatibility for single-account setups.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add create_role and trusted_arns variables to optionally create an IAM
role with a configurable trust policy, allowing other roles to assume it
via sts:AssumeRole. Outputs role_arn and role_name expose the created
role. Existing role_name behavior is unchanged.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
davidf-null marked this pull request as draft May 29, 2026 13:30
David Fernandez and others added 7 commits May 29, 2026 16:27
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…UTPUT_DIR

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…PATH

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…uration)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
davidf-null force-pushed the feature/dynamic-assume-role branch from 1bb29fe to a98e2da May 29, 2026 20:16
David Fernandez and others added 3 commits May 29, 2026 17:24
…revent self-assume failure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…build_context

Removed the ACTION_SOURCE guard around link variable extraction — the check
was never true because ACTION_SOURCE is never set, causing LINK_ID and related
outputs declared in link.yaml to be missing, which fails the workflow step.
Variables are now always extracted from $CONTEXT using jq defaults (empty string
for non-link actions), preserving backward compatibility.

Also redirected error messages from stderr to stdout so they appear in NP
workflow logs, and added a DEBUG dump of the providers response to aid
future diagnosis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…hanges

Security/hygiene pass to keep the published service account-agnostic:

- Delete local np-api-skill.{key,key.admin,token} files and ignore
  np-api-skill.*, *.key, *.token, *.pem and .claude/ so credentials can
  never be committed.
- Clear the hardcoded testing ARN in aws-s3-bucket/values.yaml; assume_role_arn
  now defaults to "" (IRSA), with the account-specific ARN provided per
  installation via --overrides-path.
- README: document credential isolation before assume role (self-assume
  prevention), expand the agent IAM permissions section (three policies + why
  the full s3:Get* read set is required), and clarify the account-agnostic note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
davidf-null changed the title from "feat(requirements): add assume role support to requirements module" to "feat: dynamic assume-role support for the AWS S3 service (+ IAM fixes & cleanup)" Jun 3, 2026
David Fernandez and others added 2 commits June 5, 2026 16:46
The agent now resolves the IAM role ARN to assume from the "AWS IAM" provider
(category Identity & Access Control, spec aws-iam-configuration) declared in
nullplatform, matching its arns list by selector. Precedence: env var ->
provider -> values.yaml -> IRSA.
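
The precedence chain can be sketched as a first-non-empty lookup. `resolve_assume_role_arn` is a hypothetical name, and the three arguments stand in for however the env var, provider, and values.yaml candidates are actually obtained; an empty result means "use IRSA directly".

```shell
#!/usr/bin/env bash
# Hypothetical sketch of: env var -> provider -> values.yaml -> IRSA.
# Prints the first non-empty candidate. Prints nothing when all three are
# empty, which the caller treats as "no role to assume, use IRSA".
resolve_assume_role_arn() {
  local env_arn="${1:-}" provider_arn="${2:-}" values_arn="${3:-}"
  local candidate
  for candidate in "$env_arn" "$provider_arn" "$values_arn"; do
    if [ -n "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 0
}
```

For example, `resolve_assume_role_arn "" "arn:aws:iam::123456789012:role/from-provider" "arn:aws:iam::123456789012:role/from-values"` prints the provider ARN, because the env var slot is empty and the provider outranks values.yaml.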

- assume_role_lib (new): pure arn_for_selector_from_json + provider_arn_for_selector
  (list->read, since provider list omits deep attributes).
- assume_role: rewritten with the precedence chain above.
- build_context: derive ACCOUNT_NRN and selector BEFORE sourcing assume_role
  (previously sourced before NRN was available); export ASSUME_ROLE_NRN/SELECTOR.
- build_permissions_context: same selector resolution for link actions, after the
  inherited-credentials unset.
- values.yaml: new assume_role_selector (default service slug); assume_role_arn
  is now a fallback for local testing / back-compat.
- tests: bash unit tests for both lib functions, using a fake np on PATH as a
  test double; fixtures use placeholder account/ARN values only.
- README: auth section rewritten to document the provider and precedence.
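
The "fake np on PATH" approach mentioned in the tests item can be sketched as follows. `make_fake_np` is a hypothetical helper, not the PR's actual test code: it writes a stand-in `np` executable that ignores its arguments and prints a canned response, so code under test that shells out to `np` never reaches the real CLI.

```shell
#!/usr/bin/env bash
# Hypothetical test-double helper: creates a temp dir containing a fake `np`
# that always prints the canned response, and prints the dir path.
make_fake_np() {
  local response="$1"
  local dir
  dir="$(mktemp -d)" || return 1

  # Store the canned response next to the fake binary.
  printf '%s\n' "$response" >"${dir}/response.json"

  # The fake ignores all arguments and replays the stored response.
  cat >"${dir}/np" <<EOF
#!/usr/bin/env bash
cat "${dir}/response.json"
EOF
  chmod +x "${dir}/np"

  printf '%s\n' "$dir"
}
```

A test would prepend the returned directory to `PATH` inside a subshell, run the function under test, and remove the directory afterwards. Fixtures passed as the response should contain placeholder account and ARN values only.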

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extract the assume-role logic out of build_context and
build_permissions_context into a new standalone assume_role_step that
runs as the first step of every aws workflow. The role is now assumed
exactly once, up front, and the temporary credentials are exported and
inherited (then re-exported) by all subsequent steps.

Assuming once eliminates the double-assume that required unsetting and
re-assuming credentials in build_permissions_context (the self-assume
failure addressed in 5671ac6); that workaround is removed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
davidf-null force-pushed the feature/dynamic-assume-role branch from a625b98 to e39cba4 June 10, 2026 13:17