feat: add AWS Bedrock provider for Claude via SigV4 by rcha0s · Pull Request #84 · NVIDIA/SkillSpector

rcha0s · 2026-06-16T20:34:24Z

Summary

Closes #82.

Adds a bedrock provider so SkillSpector can run Claude models on AWS Bedrock alongside the existing OpenAI / Anthropic / NVIDIA paths. Bedrock authenticates via SigV4 (not API keys), so this also introduces an optional ChatModelFactory protocol — providers that don't speak the OpenAI wire protocol implement it, and llm_utils.get_chat_model delegates client construction to them. Existing providers are unchanged.

Changes

New providers/bedrock/ package — BedrockProvider builds ChatBedrockConverse from a boto3 session.
- Reads AWS_PROFILE and AWS_REGION from env (region default: us-west-2).
- Default model is overridable via SKILLSPECTOR_MODEL; supports both Bedrock model IDs and application-inference-profile ARNs (ARNs auto-pin provider="anthropic").
- Bundled model_registry.yaml with token-budget metadata.
New ChatModelFactory Protocol in providers/base.py — runtime-checkable; lets providers supply their own chat client when the (api_key, base_url) model doesn't fit.
llm_utils.get_chat_model now returns BaseChatModel (was ChatOpenAI). When the active provider implements ChatModelFactory, construction is delegated to it; otherwise the existing ChatOpenAI path runs unchanged.
is_llm_available() short-circuits for factory-style providers — credential errors surface at invoke time via boto3 instead of at startup.
CLI / README updated to document SKILLSPECTOR_PROVIDER=bedrock, AWS_PROFILE, and AWS_REGION.
Deps: langchain-aws>=0.2.0, boto3>=1.34.0 (added to pyproject.toml; uv.lock left for the maintainer to regenerate).

Usage

export SKILLSPECTOR_PROVIDER=bedrock
export AWS_PROFILE=your-aws-profile
export AWS_REGION=us-west-2
# Optional: override the default model with a Bedrock model ID or ARN
# export SKILLSPECTOR_MODEL=us.anthropic.claude-sonnet-4-6-20250915-v1:0
skillspector scan ./my-skill/

Test plan

New unit tests: tests/unit/test_bedrock_provider.py — 16 tests covering credential short-circuit, model-registry lookup, resolve_model precedence, ARN-vs-model-id provider pinning, env-driven AWS_PROFILE / AWS_REGION, and SKILLSPECTOR_PROVIDER=bedrock selection. boto3 + ChatBedrockConverse are mocked, so no AWS calls are made.
Full unit suite: pytest -m 'not integration' → 617 passed, 11 skipped, 23 deselected (up from 601 by the 16 new tests).
ruff check src/ tests/ clean.
ruff format --check src/ tests/ clean.
End-to-end smoke test against AWS: chat_completion("Reply with just the word: OK") returned OK via SigV4 + the application-inference-profile ARN.
End-to-end scan: scanning the instruction-override skill from apurin/cursed-skills (intentionally malicious) produced CRITICAL findings for prompt-injection / /etc/passwd exfil — confirms the analyzers actually run end-to-end through Bedrock, not just the credential path.
DCO sign-off present on the commit.

Known follow-up (out of scope for this PR)

Issues #66 and #76 report that the Anthropic provider's structured-output schema is rejected because Pydantic emits minimum/maximum JSON-Schema keywords. Bedrock's Anthropic-backed Converse API has the same constraint surface. End-to-end semantic-analyzer scans through Bedrock should work for endpoints that tolerate the current schema; if the same 400s appear on Bedrock, the fix belongs alongside the upstream fix in LLMAnalyzerBase / meta_analyzer.py rather than in this provider. Happy to fold that in or follow up in a separate PR per the maintainer's preference.

rng1995

Thanks for adding a Bedrock provider — the ChatModelFactory Protocol is a clean way to support providers that don't use the OpenAI (api_key, base_url) model, the delegation in get_chat_model/is_llm_available is tidy, and the unit tests (mocked boto3.Session + ChatBedrockConverse, ARN-vs-model-ID provider pinning, env overrides, provider selection) are thorough.

Requesting changes on one blocking issue before this merges into a public repository.

1. Hardcoded environment-specific defaults (blocking). The default model is a fixed application-inference-profile ARN that embeds a specific private AWS account ID, and the default AWS_PROFILE is an organization-specific profile name. Both are committed as defaults in provider.py, model_registry.yaml, the README, and the CLI help. In a public/OSS project this is problematic for two reasons:

It bakes account/environment-specific infrastructure identifiers into published source.
The default is unusable for anyone outside that account — an external user who sets SKILLSPECTOR_PROVIDER=bedrock without overrides will hit an opaque AccessDenied against a profile/ARN they can't access.

Please make the defaults generic and environment-agnostic: don't hardcode a profile (let AWS_PROFILE / the standard boto3 credential chain resolve), and require the user to supply SKILLSPECTOR_MODEL (a Bedrock model ID or their own inference-profile ARN) rather than shipping a private ARN as the default. The bundled model_registry.yaml should key on public Bedrock model IDs / cross-region inference-profile IDs only.

Non-blocking:

2. timeout is accepted but never applied. build_chat_model(..., timeout=...) is invoked with timeout=120 from get_chat_model, but timeout is never forwarded — it's absent from the kwargs passed to ChatBedrockConverse, and the boto3 client is created without a read/connect timeout. Consider applying it via the client config (e.g. botocore.config.Config(read_timeout=timeout, connect_timeout=...)) so long Bedrock calls actually time out, or drop the parameter if it's intentionally unused.

3. Heads-up: the branch is currently in a conflicted state (not mergeable) and will need a rebase on main.

Happy to re-review once the defaults are genericized.

Closes NVIDIA#82. Adds a `bedrock` provider so SkillSpector can run Claude models on AWS Bedrock alongside the existing OpenAI / Anthropic / Anthropic-proxy / NVIDIA paths. Bedrock uses SigV4 (not API keys); the provider implements the unified ChatModelProvider protocol introduced in main and returns its own ChatBedrockConverse from create_chat_model(). - New providers/bedrock/ package: BedrockProvider implements the full LLMProvider surface (ModelMetadataProvider + CredentialsProvider + ChatModelProvider). resolve_credentials() returns None since Bedrock doesn't fit the (api_key, base_url) shape; create_chat_model() probes the boto3 credential chain, returns None when no credentials resolve so the orchestrator can fall through, and otherwise constructs ChatBedrockConverse from a boto3 session. - Generic defaults (no environment-specific identifiers in OSS source): AWS_PROFILE is honored when set, otherwise the standard boto3 credential chain resolves. AWS_REGION defaults to us-west-2. Default model is the public cross-region inference profile us.anthropic.claude-sonnet-4-6-20250915-v1:0; users override via SKILLSPECTOR_MODEL with any Bedrock model ID, cross-region inference profile ID, or their own application-inference-profile ARN. - ARN model strings auto-pin ChatBedrockConverse's provider="anthropic" (required by LiteLLM/langchain-aws for inference-profile ARNs). - Per-call timeout is wired through to the boto3 client via botocore.config.Config(read_timeout=..., connect_timeout=10) so long Bedrock calls actually time out instead of hanging. - Wire selector: SKILLSPECTOR_PROVIDER=bedrock. - Deps: langchain-aws>=0.2.0, boto3>=1.34.0. - README and CLI help updated to document AWS_PROFILE / AWS_REGION behavior and the default model. - New tests/unit/test_bedrock_provider.py (16 tests) covers credential short-circuit, model registry lookup, resolve_model precedence, default-model-is-public-ID assertion, AWS_PROFILE omitted from boto3.Session when env is unset, env overrides, timeout wired through to BotocoreConfig, and ARN-vs-model-id provider pinning. - uv.lock not modified — let maintainer regenerate after merge. Full unit suite: 736 passed, 12 skipped. Signed-off-by: Rohan Isawe <rohan.isawe@smartsheet.com>

rcha0s · 2026-06-23T00:31:04Z

Thanks for the thorough review. Force-pushed dfdbf12 addressing all three points.

1. Generic defaults (blocking) — fixed

Apologies for the sandbox-specific defaults — they were carryover from the internal testing config and should never have shipped in the PR. Reworked so the OSS defaults are fully generic:

BEDROCK_DEFAULT_PROFILE removed. AWS_PROFILE is honored when set, otherwise the standard boto3 credential chain (env vars, instance metadata, SSO, etc.) resolves. The provider passes profile_name to boto3.Session() only when AWS_PROFILE is non-empty — verified by test_omits_profile_when_aws_profile_unset.
BEDROCK_DEFAULT_MODEL is now us.anthropic.claude-sonnet-4-6-20250915-v1:0 (public cross-region inference profile, available to any account with Anthropic-on-Bedrock access). A regression test (test_default_model_is_public_cross_region_inference_profile) explicitly asserts the default isn't an ARN, so this can't drift back.
model_registry.yaml keys on public Bedrock cross-region inference-profile IDs only — the private ARN entry is gone.
README + CLI help rewritten to document the credential-chain behavior and the public default model.

2. `timeout` wired through (non-blocking) — fixed

The timeout argument now flows through to the boto3 client as the read timeout via botocore.config.Config(read_timeout=timeout, connect_timeout=10). Verified by test_timeout_applied_to_botocore_config — asserts both config.read_timeout and config.connect_timeout on the client kwargs. The 10s connect timeout is baked in for now; happy to expose it if you'd prefer a knob.

3. Rebase on main — done

The branch had drifted significantly while the PR was open — main introduced the unified ChatModelProvider protocol (every provider implements create_chat_model(...) → BaseChatModel | None), the anthropic_proxy provider, and reworked orchestration in providers/__init__.py / llm_utils.py. Rather than fight a 4-file conflict, I reset the branch to origin/main and reapplied the Bedrock work fitted to the new architecture:

BedrockProvider now implements the full LLMProvider surface; no separate ChatModelFactory Protocol introduced.
create_chat_model probes session.get_credentials() and returns None when nothing resolves in the chain, letting the orchestrator's existing fall-through logic in providers/__init__.py:create_chat_model handle it. No changes to base.py, llm_utils.py, or the orchestrator function — only providers/__init__.py gets a 3-line selector entry for bedrock.

Verification

Full unit suite: 736 passed, 12 skipped.
ruff check src/ tests/: clean.
ruff format --check: clean on all PR files.
uv.lock not touched — left for the maintainer to regenerate after merge (the local uv version was producing revision-noise diffs).
DCO sign-off included.

Ready for re-review.

rng1995

[Automated SkillSpector Review]

Re-review: blocker resolved — approving.

The environment-specific defaults are gone: BEDROCK_DEFAULT_MODEL is now the public cross-region inference profile us.anthropic.claude-sonnet-4-6-20250915-v1:0 (not a private ARN), the hardcoded org AWS_PROFILE is removed (the provider only passes profile_name when AWS_PROFILE is set, otherwise the standard boto3 chain resolves), and model_registry.yaml keys on public IDs only. test_default_model_is_public_cross_region_inference_profile even asserts the default is not an arn:, locking the fix in. The previously non-blocking timeout issue is also fixed (applied via BotocoreConfig(read_timeout=..., connect_timeout=...)). Thorough provider tests. Approving.

rcha0s force-pushed the feat/bedrock-provider branch from 60f38cf to 943d6c7 Compare June 16, 2026 20:46

rng1995 requested changes Jun 21, 2026

View reviewed changes

rcha0s force-pushed the feat/bedrock-provider branch from 943d6c7 to dfdbf12 Compare June 23, 2026 00:05

rng1995 approved these changes Jun 24, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat: add AWS Bedrock provider for Claude via SigV4#84

feat: add AWS Bedrock provider for Claude via SigV4#84
rcha0s wants to merge 1 commit into
NVIDIA:mainfrom
rcha0s:feat/bedrock-provider

rcha0s commented Jun 16, 2026 •

edited

Loading

Uh oh!

rng1995 left a comment

Uh oh!

rcha0s commented Jun 23, 2026

Uh oh!

rng1995 left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

rcha0s commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Usage

Test plan

Known follow-up (out of scope for this PR)

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

rcha0s commented Jun 23, 2026

1. Generic defaults (blocking) — fixed

2. timeout wired through (non-blocking) — fixed

3. Rebase on main — done

Verification

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

rcha0s commented Jun 16, 2026 •

edited

Loading

2. `timeout` wired through (non-blocking) — fixed