feat: add AWS Bedrock provider for Claude via SigV4#84
Conversation
60f38cf to
943d6c7
Compare
rng1995
left a comment
There was a problem hiding this comment.
Thanks for adding a Bedrock provider — the ChatModelFactory Protocol is a clean way to support providers that don't use the OpenAI (api_key, base_url) model, the delegation in get_chat_model/is_llm_available is tidy, and the unit tests (mocked boto3.Session + ChatBedrockConverse, ARN-vs-model-ID provider pinning, env overrides, provider selection) are thorough.
Requesting changes on one blocking issue before this merges into a public repository.
1. Hardcoded environment-specific defaults (blocking). The default model is a fixed application-inference-profile ARN that embeds a specific private AWS account ID, and the default AWS_PROFILE is an organization-specific profile name. Both are committed as defaults in provider.py, model_registry.yaml, the README, and the CLI help. In a public/OSS project this is problematic for two reasons:
- It bakes account/environment-specific infrastructure identifiers into published source.
- The default is unusable for anyone outside that account — an external user who sets
SKILLSPECTOR_PROVIDER=bedrockwithout overrides will hit an opaque AccessDenied against a profile/ARN they can't access.
Please make the defaults generic and environment-agnostic: don't hardcode a profile (let AWS_PROFILE / the standard boto3 credential chain resolve), and require the user to supply SKILLSPECTOR_MODEL (a Bedrock model ID or their own inference-profile ARN) rather than shipping a private ARN as the default. The bundled model_registry.yaml should key on public Bedrock model IDs / cross-region inference-profile IDs only.
Non-blocking:
2. timeout is accepted but never applied. build_chat_model(..., timeout=...) is invoked with timeout=120 from get_chat_model, but timeout is never forwarded — it's absent from the kwargs passed to ChatBedrockConverse, and the boto3 client is created without a read/connect timeout. Consider applying it via the client config (e.g. botocore.config.Config(read_timeout=timeout, connect_timeout=...)) so long Bedrock calls actually time out, or drop the parameter if it's intentionally unused.
3. Heads-up: the branch is currently in a conflicted state (not mergeable) and will need a rebase on main.
Happy to re-review once the defaults are genericized.
Closes NVIDIA#82. Adds a `bedrock` provider so SkillSpector can run Claude models on AWS Bedrock alongside the existing OpenAI / Anthropic / Anthropic-proxy / NVIDIA paths. Bedrock uses SigV4 (not API keys); the provider implements the unified ChatModelProvider protocol introduced in main and returns its own ChatBedrockConverse from create_chat_model(). - New providers/bedrock/ package: BedrockProvider implements the full LLMProvider surface (ModelMetadataProvider + CredentialsProvider + ChatModelProvider). resolve_credentials() returns None since Bedrock doesn't fit the (api_key, base_url) shape; create_chat_model() probes the boto3 credential chain, returns None when no credentials resolve so the orchestrator can fall through, and otherwise constructs ChatBedrockConverse from a boto3 session. - Generic defaults (no environment-specific identifiers in OSS source): AWS_PROFILE is honored when set, otherwise the standard boto3 credential chain resolves. AWS_REGION defaults to us-west-2. Default model is the public cross-region inference profile us.anthropic.claude-sonnet-4-6-20250915-v1:0; users override via SKILLSPECTOR_MODEL with any Bedrock model ID, cross-region inference profile ID, or their own application-inference-profile ARN. - ARN model strings auto-pin ChatBedrockConverse's provider="anthropic" (required by LiteLLM/langchain-aws for inference-profile ARNs). - Per-call timeout is wired through to the boto3 client via botocore.config.Config(read_timeout=..., connect_timeout=10) so long Bedrock calls actually time out instead of hanging. - Wire selector: SKILLSPECTOR_PROVIDER=bedrock. - Deps: langchain-aws>=0.2.0, boto3>=1.34.0. - README and CLI help updated to document AWS_PROFILE / AWS_REGION behavior and the default model. - New tests/unit/test_bedrock_provider.py (16 tests) covers credential short-circuit, model registry lookup, resolve_model precedence, default-model-is-public-ID assertion, AWS_PROFILE omitted from boto3.Session when env is unset, env overrides, timeout wired through to BotocoreConfig, and ARN-vs-model-id provider pinning. - uv.lock not modified — let maintainer regenerate after merge. Full unit suite: 736 passed, 12 skipped. Signed-off-by: Rohan Isawe <rohan.isawe@smartsheet.com>
943d6c7 to
dfdbf12
Compare
|
Thanks for the thorough review. Force-pushed 1. Generic defaults (blocking) — fixedApologies for the sandbox-specific defaults — they were carryover from the internal testing config and should never have shipped in the PR. Reworked so the OSS defaults are fully generic:
2.
|
rng1995
left a comment
There was a problem hiding this comment.
[Automated SkillSpector Review]
Re-review: blocker resolved — approving.
The environment-specific defaults are gone: BEDROCK_DEFAULT_MODEL is now the public cross-region inference profile us.anthropic.claude-sonnet-4-6-20250915-v1:0 (not a private ARN), the hardcoded org AWS_PROFILE is removed (the provider only passes profile_name when AWS_PROFILE is set, otherwise the standard boto3 chain resolves), and model_registry.yaml keys on public IDs only. test_default_model_is_public_cross_region_inference_profile even asserts the default is not an arn:, locking the fix in. The previously non-blocking timeout issue is also fixed (applied via BotocoreConfig(read_timeout=..., connect_timeout=...)). Thorough provider tests. Approving.
Summary
Closes #82.
Adds a
bedrockprovider so SkillSpector can run Claude models on AWS Bedrock alongside the existing OpenAI / Anthropic / NVIDIA paths. Bedrock authenticates via SigV4 (not API keys), so this also introduces an optionalChatModelFactoryprotocol — providers that don't speak the OpenAI wire protocol implement it, andllm_utils.get_chat_modeldelegates client construction to them. Existing providers are unchanged.Changes
providers/bedrock/package —BedrockProviderbuildsChatBedrockConversefrom a boto3 session.AWS_PROFILEandAWS_REGIONfrom env (region default:us-west-2).SKILLSPECTOR_MODEL; supports both Bedrock model IDs and application-inference-profile ARNs (ARNs auto-pinprovider="anthropic").model_registry.yamlwith token-budget metadata.ChatModelFactoryProtocol inproviders/base.py— runtime-checkable; lets providers supply their own chat client when the(api_key, base_url)model doesn't fit.llm_utils.get_chat_modelnow returnsBaseChatModel(wasChatOpenAI). When the active provider implementsChatModelFactory, construction is delegated to it; otherwise the existingChatOpenAIpath runs unchanged.is_llm_available()short-circuits for factory-style providers — credential errors surface at invoke time via boto3 instead of at startup.SKILLSPECTOR_PROVIDER=bedrock,AWS_PROFILE, andAWS_REGION.langchain-aws>=0.2.0,boto3>=1.34.0(added topyproject.toml;uv.lockleft for the maintainer to regenerate).Usage
Test plan
tests/unit/test_bedrock_provider.py— 16 tests covering credential short-circuit, model-registry lookup,resolve_modelprecedence, ARN-vs-model-id provider pinning, env-drivenAWS_PROFILE/AWS_REGION, andSKILLSPECTOR_PROVIDER=bedrockselection. boto3 +ChatBedrockConverseare mocked, so no AWS calls are made.pytest -m 'not integration'→ 617 passed, 11 skipped, 23 deselected (up from 601 by the 16 new tests).ruff check src/ tests/clean.ruff format --check src/ tests/clean.chat_completion("Reply with just the word: OK")returnedOKvia SigV4 + the application-inference-profile ARN.instruction-overrideskill fromapurin/cursed-skills(intentionally malicious) producedCRITICALfindings for prompt-injection //etc/passwdexfil — confirms the analyzers actually run end-to-end through Bedrock, not just the credential path.Known follow-up (out of scope for this PR)
Issues #66 and #76 report that the Anthropic provider's structured-output schema is rejected because Pydantic emits
minimum/maximumJSON-Schema keywords. Bedrock's Anthropic-backed Converse API has the same constraint surface. End-to-end semantic-analyzer scans through Bedrock should work for endpoints that tolerate the current schema; if the same 400s appear on Bedrock, the fix belongs alongside the upstream fix inLLMAnalyzerBase/meta_analyzer.pyrather than in this provider. Happy to fold that in or follow up in a separate PR per the maintainer's preference.