feat(providers): AWS STS AssumeRole refresh strategy and aws-s3 profile#1782
Draft
russellb wants to merge 12 commits into
…nnels
Add proxy-side AWS SigV4 re-signing so sandbox clients can reach AWS
services (Bedrock) through the CONNECT tunnel using placeholder
credentials. The proxy strips the invalid signature, resolves real
credentials from the SecretResolver, re-signs with the aws-sigv4 crate,
and forwards. Configuration is policy-driven via two new fields
(credential_signing, signing_service).
Policy YAML example:
credential_signing: sigv4
signing_service: bedrock
Implementation:
- sigv4.rs: strip_aws_headers removes old auth headers before the
fail-closed placeholder scan; apply_sigv4_to_request re-signs using
the aws-sigv4 SDK with PayloadChecksumKind::XAmzSha256 enabled.
Returns Result instead of panicking. Non-signed headers (Accept,
User-Agent, etc.) are preserved in the output.
- rest.rs: SigV4 path buffers body (capped at MAX_REWRITE_BODY_BYTES)
for signing, then forwards the re-signed request upstream.
- Proto: credential_signing (field 19), signing_service (field 20)
on NetworkEndpoint.
- Policy/OPA: plumbed through serde, proto conversion, and Rego data.
- Supports AWS session tokens (STS temporary credentials).
- Integration test against real Bedrock (ignored, requires AWS creds).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reject policies where credential_signing is set but signing_service is empty during validate_sandbox_policy() instead of failing at connection time. The runtime check in rest.rs is kept as defense-in-depth.
Extend the SigV4 proxy re-signing to auto-detect the correct payload signing mode from the client SDK's x-amz-content-sha256 header:
- Hex hash → buffer body and include hash in signature (Bedrock)
- STREAMING-UNSIGNED-PAYLOAD-TRAILER → sign headers only, stream body through for aws-chunked uploads (S3 PutObject, upload_fileobj)
- UNSIGNED-PAYLOAD → sign headers only, no body buffering (S3 over HTTPS)
- Absent → fall back to Content-Length heuristic
This eliminates the need for body buffering on S3 uploads and adds support for chunked transfer encoding that the previous implementation could not handle.
New credential_signing policy values:
- sigv4: auto-detect from client headers (recommended)
- sigv4:body: always buffer and hash the body
- sigv4:no_body: always use UNSIGNED-PAYLOAD
Also adds Expect: 100-continue handling in the REST L7 relay so clients like boto3's S3 PutObject receive the interim 100 response before sending the body.
Validated end-to-end from inside a Podman sandbox against real AWS: Bedrock InvokeModel, S3 PUT/GET/DELETE, and streaming upload_fileobj.
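To make the auto-detection rules in this commit easier to review, here is a minimal sketch of the decision table. It is not the code in sigv4.rs: the function signature, the `UnsignedPayload` variant name, the error type, and in particular the Content-Length fallback rule (length known → buffer and sign, otherwise unsigned) are assumptions made for illustration.

```rust
/// Payload signing modes named in this PR. `UnsignedPayload` is a
/// hypothetical variant name chosen for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum PayloadMode {
    SignBody,
    StreamingUnsignedTrailer,
    UnsignedPayload,
}

/// Pick a signing mode from the client's x-amz-content-sha256 value.
/// `content_length` is only consulted when the header is absent.
fn detect_payload_mode(
    content_sha256: Option<&str>,
    content_length: Option<u64>,
) -> Result<PayloadMode, String> {
    match content_sha256 {
        Some("STREAMING-UNSIGNED-PAYLOAD-TRAILER") => Ok(PayloadMode::StreamingUnsignedTrailer),
        Some("UNSIGNED-PAYLOAD") => Ok(PayloadMode::UnsignedPayload),
        // A SHA-256 digest is 64 hex characters: the client hashed the body,
        // so the proxy buffers the body and signs over its hash.
        Some(v) if v.len() == 64 && v.bytes().all(|b| b.is_ascii_hexdigit()) => {
            Ok(PayloadMode::SignBody)
        }
        Some(other) => Err(format!("unsupported x-amz-content-sha256 value: {other}")),
        // Assumed heuristic: a known length means the body can be buffered.
        None => match content_length {
            Some(_) => Ok(PayloadMode::SignBody),
            None => Ok(PayloadMode::UnsignedPayload),
        },
    }
}
```

Returning an error for unrecognized values keeps the sketch fail-closed; how the real implementation treats other values is decided elsewhere in this PR.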
Critical:
- Scope Expect: 100-continue handling to SigV4 paths only. Previously it fired for all L7-proxied requests, violating RFC 7231 §5.1.1 and risking double 100 responses on non-SigV4 traffic.
Warnings:
- Reject unknown credential_signing values at policy validation time. A typo like "sigv4_typo" now produces a clear PolicyViolation instead of silently falling back to no signing.
- Support dualstack, FIPS, virtual-hosted, and China partition hostnames in extract_aws_region (e.g. s3.dualstack.us-west-2.amazonaws.com, s3.cn-north-1.amazonaws.com.cn).
- Emit OCSF NetworkActivity event for SigV4 re-signing decisions instead of debug! tracing, per AGENTS.md structured logging guidelines.
- Update architecture/sandbox.md to document all three signing modes (signed body, streaming unsigned trailer, unsigned payload) and the auto-detection mechanism.
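For reviewers, the validation behaviour described so far (three accepted credential_signing values, unknown values rejected, SigV4 modes requiring a non-empty signing_service) could be sketched roughly as below. The `validate_signing` function, the `SigV4Auto`/`SigV4Body`/`SigV4NoBody` variant names, and the `MissingSigningService` variant are hypothetical; only `CredentialSigning::None` and `UnknownCredentialSigning` are names taken from these commits.

```rust
#[derive(Debug, PartialEq, Eq)]
enum CredentialSigning {
    None,
    SigV4Auto,   // "sigv4"
    SigV4Body,   // "sigv4:body"
    SigV4NoBody, // "sigv4:no_body"
}

#[derive(Debug, PartialEq, Eq)]
enum PolicyViolation {
    UnknownCredentialSigning(String),
    MissingSigningService, // hypothetical name for the empty-service case
}

/// Validate the two policy fields together at policy-load time,
/// so a bad combination never reaches the request path.
fn validate_signing(
    credential_signing: &str,
    signing_service: &str,
) -> Result<CredentialSigning, PolicyViolation> {
    let mode = match credential_signing {
        // Field not set: signing is simply disabled.
        "" => return Ok(CredentialSigning::None),
        "sigv4" => CredentialSigning::SigV4Auto,
        "sigv4:body" => CredentialSigning::SigV4Body,
        "sigv4:no_body" => CredentialSigning::SigV4NoBody,
        // Typos fail loudly instead of silently disabling signing.
        other => return Err(PolicyViolation::UnknownCredentialSigning(other.to_string())),
    };
    if signing_service.trim().is_empty() {
        return Err(PolicyViolation::MissingSigningService);
    }
    Ok(mode)
}
```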
…OCSF nit
- Fix extract_aws_region for FIPS+dualstack combo hostnames like s3-fips.dualstack.us-west-2.amazonaws.com (scans past all "dualstack" labels instead of just one).
- Add tests for FIPS+dualstack and GovCloud region extraction.
- Add unit test for UnknownCredentialSigning policy validation (e.g. "sigv4_typo" produces the expected violation).
- Use ActivityId::Traffic instead of ActivityId::Other for the SigV4 OCSF event; it is more descriptive for a signing operation on an existing traffic flow.
…TO, startup validation
Critical:
- Reject STREAMING-AWS4-HMAC-SHA256-PAYLOAD in detect_payload_mode() instead of silently treating it as SignBody (per-chunk signing is not supported). Returns a clear error directing the user to sigv4:no_body.
- Add defense-in-depth guard in the SignBody path: fail closed if the request uses chunked transfer encoding, preventing body-less forwards.
Warnings:
- Wire credential_signing and signing_service through EndpointProfile DTO in openshell-providers. Both endpoint_to_proto() and endpoint_from_proto() now preserve the fields during round-trip.
- Reject unknown credential_signing values at sandbox L7 config parse time (returns None, disabling L7 for the endpoint) instead of silently downgrading to CredentialSigning::None. Also reject SigV4 modes with empty signing_service at startup rather than deferring the error to request time.
Add gateway-owned AWS STS AssumeRole as a new credential refresh strategy. The gateway calls sts:AssumeRole and writes three coupled credentials (AccessKeyId, SecretAccessKey, SessionToken) atomically to the provider record via an extended multi-key MintedCredential. Scoped to provider v2 only — ConfigureProviderRefresh rejects aws_sts_assume_role when providers_v2_enabled is false. Ships two built-in profiles: generic 'aws' (credentials only) and 'aws-s3' (with S3 endpoint policy rules). Multi-key collision validation runs at both configure-time and mint-time. Refs NVIDIA#1576
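The "three coupled credentials written atomically, with collision validation" idea can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the shape of `MintedCredential`, the `apply_minted` helper, and the reading of "collision" as "a minted key name is already owned by something other than the refresh strategy" (modelled by a hypothetical `reserved` set). The real provider record and storage layer are not shown in this PR description.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical multi-key credential: one entry per value returned by sts:AssumeRole.
#[derive(Debug)]
struct MintedCredential {
    values: BTreeMap<String, String>,
}

impl MintedCredential {
    fn from_assume_role(access_key_id: &str, secret_access_key: &str, session_token: &str) -> Self {
        let mut values = BTreeMap::new();
        values.insert("AccessKeyId".to_string(), access_key_id.to_string());
        values.insert("SecretAccessKey".to_string(), secret_access_key.to_string());
        values.insert("SessionToken".to_string(), session_token.to_string());
        Self { values }
    }
}

/// Write all minted keys or none. `reserved` holds key names the refresh
/// strategy must not overwrite (an assumed interpretation of "collision").
fn apply_minted(
    record: &mut BTreeMap<String, String>,
    reserved: &BTreeSet<String>,
    minted: &MintedCredential,
) -> Result<(), Vec<String>> {
    // Check every key first so a failure leaves the record untouched.
    let collisions: Vec<String> = minted
        .values
        .keys()
        .filter(|k| reserved.contains(*k))
        .cloned()
        .collect();
    if !collisions.is_empty() {
        return Err(collisions);
    }
    for (k, v) in &minted.values {
        record.insert(k.clone(), v.clone());
    }
    Ok(())
}
```

Validating before writing is what keeps the three values coupled: a sandbox never observes a new AccessKeyId paired with a stale SessionToken.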
Wire credential_signing and signing_service through EndpointProfile serde and proto conversion so provider profiles can enable SigV4 signing. Configure aws-s3 profile with **.amazonaws.com host glob, tls: terminate, credential_signing: sigv4, signing_service: s3, and a binary allowlist for common S3 clients. Add chunked transfer limitation to manage-providers.mdx. Add examples/aws-s3-sts.md manual E2E test guide. Refs NVIDIA#1576
Boto3 connects to global S3 endpoints like bucket.s3.amazonaws.com (no region in the hostname). The previous extract_aws_region returned "s3" for this pattern because it took the label at parts[len-3] without checking if it was actually a region. Add looks_like_region() which requires a hyphen followed by a digit (e.g., us-east-1). Service names like "s3" or "bedrock-runtime" are rejected, causing the fallback to us-east-1. Refs NVIDIA#1576
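A minimal sketch of the region check described in this commit, assuming the region (when present) is the label immediately before `amazonaws`. The real extract_aws_region handles more hostname shapes; the signature and the lookup strategy here are illustrative assumptions, while `looks_like_region` follows the "hyphen followed by a digit" rule stated above.

```rust
/// A label looks like a region if it contains a hyphen directly followed
/// by a digit, e.g. "us-east-1" or "us-gov-west-1". Service labels such as
/// "s3", "s3-fips", or "bedrock-runtime" do not match.
fn looks_like_region(label: &str) -> bool {
    label
        .as_bytes()
        .windows(2)
        .any(|w| w[0] == b'-' && w[1].is_ascii_digit())
}

/// Return the region encoded in an AWS hostname, falling back to
/// "us-east-1" for global endpoints such as bucket.s3.amazonaws.com.
fn extract_aws_region(host: &str) -> String {
    let labels: Vec<&str> = host.split('.').collect();
    labels
        .iter()
        .position(|l| *l == "amazonaws")
        // The candidate is the label just left of "amazonaws", if any.
        .and_then(|i| i.checked_sub(1))
        .map(|i| labels[i])
        .filter(|l| looks_like_region(l))
        .unwrap_or("us-east-1")
        .to_string()
}
```

Anchoring on the `amazonaws` label rather than a fixed index lets the same lookup cover `.amazonaws.com` and `.amazonaws.com.cn` hosts in this sketch.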
Boto3 put_object sends x-amz-content-sha256 with the value STREAMING-AWS4-HMAC-SHA256-PAYLOAD, which was rejected by detect_payload_mode() because per-chunk signing is not supported. Treat all streaming- variants as StreamingUnsignedTrailer: re-sign headers only and stream the body through. The proxy cannot reproduce per-chunk signatures, but AWS accepts unsigned streaming payloads over HTTPS. Refs NVIDIA#1576
Handle the new ProviderCredentialRefreshStrategy::AwsStsAssumeRole variant added by the SigV4 feature branch after rebase onto main introduced the exhaustive match in openshell-tui.
russellb
commented
Jun 5, 2026
…tighten aws-s3 endpoints
Allow `*` as an entire middle DNS label in host wildcard patterns (e.g. `*.s3.*.amazonaws.com`) while rejecting partial middle-label wildcards (`us-*`) and middle-label `**`. This enables S3-specific regional endpoint shapes without the overly broad `**.amazonaws.com`.
- Add `InvalidHostWildcard` policy violation and `host_wildcard_shape_invalid` validator in openshell-policy
- Update L7 `validate_host_wildcard` to accept whole-label `*` in middle positions while still rejecting `**` and partial wildcards
- Add OPA runtime tests confirming `*` matches exactly one DNS label and does not cross label boundaries (dualstack, missing bucket prefix)
- Tighten aws-s3.yaml from `**.amazonaws.com` to four S3-specific shapes: `*.s3.*.amazonaws.com`, `s3.*.amazonaws.com`, `*.s3.dualstack.*.amazonaws.com`, `s3.dualstack.*.amazonaws.com`
- Pin profile test to assert the old broad endpoint is gone
- Update architecture/security-policy.md wildcard table
Signed-off-by: Russell Bryant <rbryant@nvidia.com>
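The wildcard semantics in this commit can be sketched in a few lines. This is an illustration only, not the Rego or the `validate_host_wildcard` code: `middle_labels_valid` and `host_matches` are hypothetical helpers, the shape check covers middle labels only (first and last labels are deliberately left unchecked), and the matcher handles single-label `*` but not `**`.

```rust
/// Middle labels must be either a literal or exactly "*".
/// Partial wildcards ("us-*") and "**" in a middle position are rejected.
fn middle_labels_valid(pattern: &str) -> bool {
    let labels: Vec<&str> = pattern.split('.').collect();
    if labels.len() < 3 {
        return true; // no middle labels to check
    }
    labels[1..labels.len() - 1]
        .iter()
        .all(|l| *l == "*" || !l.contains('*'))
}

/// Match a host against a pattern where "*" stands for exactly one
/// non-empty DNS label and never crosses a label boundary.
fn host_matches(pattern: &str, host: &str) -> bool {
    let p: Vec<&str> = pattern.split('.').collect();
    let h: Vec<&str> = host.split('.').collect();
    // Same label count is required because "*" consumes exactly one label.
    p.len() == h.len()
        && p.iter()
            .zip(h.iter())
            .all(|(pl, hl)| (*pl == "*" && !hl.is_empty()) || pl.eq_ignore_ascii_case(hl))
}
```

Because the label counts must agree, `*.s3.*.amazonaws.com` cannot match a dualstack host or a host without a bucket prefix, which is why the profile lists four separate shapes.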
p5
reviewed
Jun 5, 2026
Question (non-blocking): What's the use-case for a dedicated S3 provider over just using the AWS provider?
Contributor
Author
the pre-defined endpoints list, mainly. and it's just what I was focused on as a test case.
Summary
Add gateway-owned AWS STS AssumeRole as a credential refresh strategy, and ship an `aws-s3` provider profile with SigV4 signing pre-configured.
Prerequisite: #1638 must merge first; this branch includes those commits as a base.
Related Issue
Refs #1576
Changes
STS credential refresh (`openshell-server`):
- `aws_sts_assume_role` refresh strategy (v2 provider API only)
- The gateway calls `sts:AssumeRole` and writes three coupled credentials (`AccessKeyId`, `SecretAccessKey`, `SessionToken`) atomically via an extended multi-key `MintedCredential`
- `ConfigureProviderRefresh` rejects `aws_sts_assume_role` when `providers_v2_enabled` is false
Provider profiles (`openshell-providers`):
- `aws.yaml`: generic AWS profile (credentials only)
- `aws-s3.yaml`: S3 profile with `**.amazonaws.com` host glob, `tls: terminate`, `credential_signing: sigv4`, `signing_service: s3`, and binary allowlist for common S3 clients
- Wire `credential_signing` and `signing_service` through `EndpointProfile` serde and proto conversion
TUI (`openshell-tui`):
- Handle `AwsStsAssumeRole` in refresh strategy label (rebase fix)
Documentation:
- `docs/sandboxes/manage-providers.mdx`: chunked transfer limitation note
- `examples/aws-s3-sts.md`: end-to-end manual test guide for S3 access with STS
Testing
- `mise run pre-commit` passes (except pre-existing `python:proto` failure on main)
Checklist