Conversation
…repend remediation hint (#226) When AWS SSO sessions expire or GCP ADC refresh tokens are revoked, the underlying SDK error currently surfaces buried inside an S3OperationError or GcsOperationError stack, with the remediation hint (`aws sso login` / `gcloud auth application-default login`) hard to find. Users initially attribute the failure to the lock layer or backend. Both wrappers now classify auth-class failures and prepend a swamp-flavoured summary line: - "Datastore session expired: ... Run 'aws sso login --profile <P>' ..." - "Datastore credentials rejected by AWS: verify '<P>' ..." - "Datastore session expired: your GCP ADC ... 'gcloud auth application-default login' ..." - "Datastore credentials rejected by GCS: verify GOOGLE_APPLICATION_CREDENTIALS ..." Detection is via pure exported helpers (classifyAwsCredentialError, deriveAwsErrorCode, classifyGcpCredentialError, formatAwsCredentialHint, formatGcpCredentialHint, tokenRefreshError) so each branch can be unit-tested without constructing real SDK errors. The wrappers stay private; only the helpers are exported. S3 detection covers CredentialsProviderError + ExpiredTokenException (session-expired) and InvalidAccessKeyId + SignatureDoesNotMatch + 403 AccessDenied (credentials-rejected). The cause-chain walk handles minified SDK builds where the real class name appears as _CredentialsProviderError on .cause. GCS refactors tokenFromUserCredentials and tokenFromServiceAccount to throw a typed GcsOperationError with name "CredentialsProviderError" on invalid_grant responses, so the same classifier path fires for ADC refresh failures. The .name field on both wrapper classes is preserved as the underlying SDK error name so existing `error.name === "..."` checks at call sites keep working. Filed swamp-club#228 to apply the same treatment to @swamp/aws-sm-vault. Refs: swamp-club#226 Manifests: 2026.05.04.1 → 2026.05.04.2 for both extensions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
Blocking Issues
None.
Suggestions
-
Double docblock on
tokenRefreshError(gcs_client.tslines ~278–291): Two consecutive JSDoc blocks appear before the function — likely a merge artifact. TypeScript associates only the last one with the function; the first is dead documentation. The first block should be removed. -
classifyGcpCredentialError'scauseNameparameter is alwaysundefinedin productionsend()calls:send()always callsclassifyGcpCredentialError(undefined, response.status), so thecauseName === "CredentialsProviderError"branch in that function is never reached fromsend(). The session-expired path for GCS flows throughtokenRefreshError()directly (which embeds the hint before throwing). The function is correctly exercised in the composition unit test (err.nameis passed in), but a brief note in the JSDoc thatcauseNameis intended for callers re-classifying a caught error — not forsend()itself — would clarify the design intent. -
Stale section comment in
gcs_client_test.ts(line ~467): The block comment preceding the wrapper integration tests says "writing a temporaryauthorized_usercredentials file, pointing the OAuth token endpoint at our mock server, and intercepting the refresh request" but the actual tests calltokenRefreshErrordirectly as pure unit tests. The description matches the test design, not what was implemented.
The implementation is well-structured: pure helper functions for testability, proper cause-chain walking for minified SDK builds, regression guards on non-credential 403s, env vars restored in finally blocks throughout, and sanitizeResources: false following the existing file-level comment pattern. Test coverage is thorough across both providers.
There was a problem hiding this comment.
Adversarial Review
Critical / High
None found.
Medium
-
GCS error type change in token-refresh functions (
gcs_client.ts:310–319,gcs_client.ts:336–340).
tokenFromServiceAccountandtokenFromUserCredentialspreviously threw plainError; they now throwGcsOperationErrorviatokenRefreshError(). This is a behavioral change: any catch block that dispatches oninstanceof GcsOperationErrorand inspectshttpStatusCodewill now match these token-endpoint errors when it didn't before.Breaking example: A caller does
catch (err) { if (err instanceof GcsOperationError && err.httpStatusCode === 400) { /* "GCS said bad request" */ } }. Before this PR, a 400 from the OAuth2 token endpoint (e.g.invalid_grant) would be a plainErrorand skip this block. Now it enters the block with a status code whose semantic origin is the token endpoint, not GCS.Verified safe in existing code:
isRetryableErroringcs_cache_sync.ts:182–195handles the new type correctly — 400 → not retried (correct), 5xx → retried (arguably an improvement over the old behavior where 5xx token-endpoint failures were non-retryable plainErrors). Thenamefield ("CredentialsProviderError"/"TokenRefreshError") disambiguates if needed. No action required unless there are call sites outside this repo that matchinstanceof GcsOperationError.
Low
-
Duplicate JSDoc on
tokenRefreshError(gcs_client.ts:278–293). Two consecutive/** */blocks precede the function. The first (lines 278–284) is a stale draft; only the second (lines 285–293) is attached to the function by the parser. Harmless, but messy — delete the first block. -
Unquoted
AWS_PROFILEin remediation hint (s3_client.ts:192).formatAwsCredentialHintinterpolates the profile name bare:`aws sso login --profile ${awsProfile}`. IfAWS_PROFILEcontains spaces (uncommon but not prohibited by AWS config format), the copy-paste command would be broken. Consider wrapping:`aws sso login --profile '${awsProfile}'`. -
tokenFromMetadataServerstill throws plainError(gcs_client.ts:376). The other two token functions now usetokenRefreshError()→GcsOperationError, but the metadata-server path still throwsnew Error(...). This means metadata-server token failures won't get the swamp-flavoured hint. The remediation for metadata-server failures is different (check instance SA, not rungcloud auth), so this may be intentional — but it's worth a comment to make that explicit.
Verdict
PASS. The new helpers are pure, well-tested, and correctly integrated into both providers' error paths. Classification logic handles all documented error codes with appropriate regression guards. The medium finding is a real behavioral change but is handled correctly by the existing retry classifier. Solid work.
…121) Five small follow-ups from the post-merge review of #120: 1. gcs_client.ts:tokenRefreshError — collapse the duplicate JSDoc block left over from the test-export commit (parser only attached the second one; the first was dead documentation). 2. gcs_client_test.ts — rewrite the stale "Wrapper integration tests" section comment that described an OAuth-mocking design that was never implemented (we export tokenRefreshError and unit-test it directly instead). 3. classifyGcpCredentialError — add a JSDoc note that the causeName parameter is for callers re-classifying a caught error. send() never reaches the cause-name branch because tokenRefreshError short- circuits the session-expired path at construction time. 4. s3_client.ts:formatAwsCredentialHint — wrap the profile name in double quotes inside the single-quoted command, so the copy-pasted `aws sso login --profile "..."` stays valid for profiles that contain spaces (uncommon but legal in AWS config). Updated unit and integration tests to assert the new shape; added a multi-word- profile unit test. 5. tokenFromMetadataServer — add a comment explaining why this path intentionally throws plain Error rather than going through tokenRefreshError. The metadata-server failure remediation differs from the user-credential / service-account paths (check instance SA and IAM scopes, not run gcloud auth), so stamping `name = "CredentialsProviderError"` here would mislead the classifier into prepending the wrong hint. Manifest bumps: 2026.05.04.2 → 2026.05.04.3 for both extensions. Refs: swamp-club#226, #120 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ion hint (#228) (#126) Follow-up to swamp-club#226 (PR #120/#121 covered s3-datastore and gcs-datastore). The aws-sm vault wraps the AWS SDK and surfaced the same CredentialsProviderError discoverability issue when SSO sessions expire mid-secret-fetch. Adds vault/aws-sm/extensions/vaults/aws_sm_errors.ts containing: - AwsSmOperationError: error wrapper that preserves the SDK error's `name` (so error.name === "ResourceNotFoundException" still works) and exposes httpStatusCode, code, and requestId. - classifyAwsCredentialError, deriveAwsErrorCode, formatAwsCredentialHint: pure helpers mirroring the s3 pattern. - wrapAwsSmError: front-loads "Vault session expired:" or "Vault credentials rejected by AWS:" with `aws sso login --profile <P>` pointing at the next action. aws_sm.ts wraps every .send() call and routes errors through wrapAwsSmError. The put() create-on-missing fallback migrates from `error instanceof ResourceNotFoundException` to `error.name === "ResourceNotFoundException"`, which is functionally equivalent and removes the SDK class import. Empirical grounding (probe at @aws-sdk/client-secrets-manager@3.1024.0, 2026-05-06): - Pre-flight credential failure surfaces as `name === "CredentialsProviderError"` directly on the outer error with no cause chain. The cause-walk + leading-underscore-strip in deriveAwsErrorCode remains as defensive coverage for minified SDK builds and the older #226 stack-trace pattern. - HTTP 400 with a body lacking `__type` produces err.name="Unknown" and err.message="UnknownError". wrapAwsSmError mirrors the s3 wrapper's `code !== "Unknown"` and `message !== "UnknownError"` filters so the wrapped output stays clean. Both shapes are covered by direct wrapAwsSmError tests in aws_sm_errors_test.ts (24 tests). aws_sm_test.ts adds 6 behavioural tests covering all four operations, the put create-on-missing regression guard for the instanceof → name migration, and the malformed-body defense. Bumps the inline SDK pin in aws_sm.ts from @3.1010.0 → @3.1024.0 so the new aws_sm_errors.ts and aws_sm.ts agree (matches deno.json's import map). Manifest: 2026.04.22.2 → 2026.05.06.1. Refs: swamp-club#228, swamp-club#226 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
aws sso login/gcloud auth application-default login). The SDK detail and cause chain remain underneath;error.nameis unchanged so existing call-site checks keep working.@swamp/aws-sm-vault.Why
When AWS SSO sessions expire or GCP ADC refresh tokens are revoked mid-execution, the resulting
S3OperationError [CredentialsProviderError]orGcsOperationErrorsurfaces with the remediation hint (aws sso login) buried inside the SDK stack — users initially attribute the failure to lock contention or a backend issue. This change front-loads the swamp-flavoured summary so the cause and next action are visible first.What
S3 (
datastore/s3/extensions/datastores/_lib/s3_client.ts):classifyAwsCredentialError,deriveAwsErrorCode,formatAwsCredentialHint,AwsCredentialErrorKind.wrapErrorcalls them; cause-chain walk handles minified SDK builds where the real class name appears as_CredentialsProviderErroron.cause(per the issue's stack trace).CredentialsProviderError+ExpiredTokenException(session-expired) andInvalidAccessKeyId+SignatureDoesNotMatch+ 403AccessDenied(credentials-rejected).AWS_PROFILEis read for the hint; falls back to generic phrasing when unset.BucketRegionMismatch).GCS (
datastore/gcs/extensions/datastores/_lib/gcs_client.ts):classifyGcpCredentialError,formatGcpCredentialHint,tokenRefreshError,GcpCredentialErrorKind.tokenFromUserCredentialsandtokenFromServiceAccountnow throw a typedGcsOperationErrorwithname === "CredentialsProviderError"when the response body indicatesinvalid_grant— the GCP equivalent of expired SSO.send()prepends the credentials-rejected hint on 401/403; suppresses the old generic hint when the credential-specific hint fires.Manifests: Both
datastore/s3anddatastore/gcsbumped from2026.05.04.1to2026.05.04.2(CalVer same-day .2).Test plan
deriveAwsErrorCode— 7 cases coveringe.Codeprecedence,e.namefallback, genericErrorignored, single + multiple leading underscore strip on cause, non-Error cause ignored, no-signal undefined.classifyAwsCredentialError— 8 cases covering each error code branch and status guards.formatAwsCredentialHint— 5 cases covering profile set/unset variants for both kinds +otherreturns undefined.withMockServer— 4 cases for server-side errors (InvalidAccessKeyId,AccessDenied,SignatureDoesNotMatch,BucketRegionMismatchregression) + 1 case for 500 regression.CredentialsProviderErrorintegration test — setsAWS_PROFILEto a non-existent profile, unsets all access-key env vars, points config files at empty temp file, disables IMDS, callsheadBucket(). Asserts the wrapped message starts with"Datastore session expired", includes the configured profile in theaws sso login --profilehint, andS3OperationError.namereflects the underlying SDK error name. This is the issue's primary failure path, exercised end-to-end against a real SDK chain failure.classifyGcpCredentialError/formatGcpCredentialHint— 8 cases.tokenRefreshError— 4 cases coveringinvalid_grant(session-expired framing + name =CredentialsProviderError), non-invalid_grant4xx (generic name), 5xx (no credential framing), and a composition test that proves the resulting error flows throughclassifyGcpCredentialErrortosession-expired.Deno.serve({ port: 0 })— 401, 500 regression, 404/412 narrow-type regression guards. Existing 403 andgcs_cache_sync_test.ts403 tests updated to assert the new hint.assertDatastoreExportConformancecontinues to pass for both extensions.deno check / lint / fmt --check / test / install --frozenclean for both extensions.Out of scope
@swamp/aws-sm-vaulthas the same problem but nowrapErrorpattern to extend — introducing one is a meaningful separate lift. Filed as swamp-club#228.summarizeSyncErrorreference path lives outside this repo.🤖 Generated with Claude Code