Codebase audit: SecurityScannerProvider and scanner plugin#279
Codebase audit: SecurityScannerProvider and scanner plugin#279
Conversation
Explores integrating goakt v4 actor framework into the workflow engine as a complementary paradigm alongside pipelines. Covers architecture, YAML config schemas, deployment model, documentation strategy, and the path toward a future actor-native engine. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
11 tasks, 29 tests, TDD throughout. Covers plugin skeleton, actor.system module, actor.pool module, bridge actor, step.actor_send/ask, actor workflow handler, schemas, config example, and integration tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add github.com/tochemey/goakt/v4 as a dependency. Fix VolumeResourceRequirements type change from k8s API v0.35 upgrade. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
BridgeActor is the core goakt<->pipeline integration: - Implements goakt v4 Actor interface (PreStart/Receive/PostStop) - Dispatches incoming ActorMessages to HandlerPipeline by message type - Builds PipelineContext with .message, .state, .actor template variables - Falls back to inline step.set creation when no step registry is available - Merges last step output back into actor state for persistence - Returns error map for unknown message types (not panics) - 3 tests pass: receive, unknown type, state persistence across messages Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Integration tests (Task 10): - TestIntegration_FullActorLifecycle: full lifecycle from module creation through message passing and state persistence verification - TestIntegration_MultipleActorsIndependentState: two actors maintain independent state — no shared-state bugs step.actor_send (Task 5): fire-and-forget Tell to actor pools, template resolution for identity/payload, pool lookup from metadata step.actor_ask (Task 6): request-response Ask with configurable timeout, template resolution, returns actor response as step output module_pool.go: GetOrSpawnActor helper for identity-based spawning, pids map with mutex for concurrent access, SetStepRegistry injection, handlers typed to map[string]*HandlerPipeline bridge_actor.go: NewBridgeActor constructor, State() inspector method All 25 tests pass with -race flag. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- step.actor_send: fire-and-forget Tell to identity-based or pool actors - step.actor_ask: request-response Ask with configurable timeout (default 10s) - Both resolve message/identity as template expressions via PipelineContext - Use GetOrSpawnActor for auto-managed pools, ActorOf for permanent pools - Registered in plugin.StepFactories via wrapStepFactory helper - 9 unit tests covering config validation and timeout parsing Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Shows actor.system, actor.pool, and step.actor_ask in a complete HTTP + actor workflow: stateful order processing where each order_id maps to its own auto-managed actor instance. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- actor.system and actor.pool module types in Actor Model section - step.actor_send and step.actor_ask in Pipeline Steps table - Actors workflow type in Workflow Types section Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…permanent pool spawning
- Fix step.actor_send and step.actor_ask to look up pools via
app.GetService("actor-pool:<name>") instead of unreachable
pc.Metadata["__actor_pools"] map
- Implement permanent pool actor spawning in ActorPoolModule.Start()
— spawns poolSize BridgeActor instances into the goakt system
- Remove double error response in BridgeActor.Receive() — use only
ctx.Err(err), not both ctx.Err and ctx.Response
- Remove unused ActorResponse dead code from messages.go
- Add BridgeGrain integration test (state persistence via grain API)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…k step test - Move nil check for pool.system before ActorSystem() call in both step.actor_send and step.actor_ask to prevent nil pointer dereference - Use sys variable consistently instead of redundant ActorSystem() calls - Convert bare type assertions to two-value form in integration and bridge actor tests to fail gracefully instead of panicking - Add TestActorAskStep_RequiresMessageType test for factory validation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…idation, step caching - Spawn primary actor under pool name so sys.ActorOf(ctx, poolName) succeeds for permanent pools (was spawning pool-0..N only) - Add CBOR struct tags to ActorMessage for cluster mode serialization - Validate identity is required for auto-managed pools at Execute time instead of silently falling through to a failing ActorOf path - Cache step instances in executePipeline to avoid rebuilding per message - Remove AI-generated plan doc (local paths, Claude instructions) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…-only fields Steps now use pool.SelectActor(msg) for permanent pools instead of sys.ActorOf(), enabling actual routing strategy execution. Round-robin, random, broadcast, and sticky routing all work at runtime. Recovery supervisors are applied via actor.WithSupervisor() during Spawn. Removed non-functional cluster-only schema fields (cluster, metrics, tracing, placement, targetRoles, failover) since single-node is the current scope. Added 10 new tests covering routing distribution, permanent pool spawning, broadcast delivery, and recovery. All 41 actor plugin tests pass. Example config updated with permanent pool. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Pre-build step instances during pool init (preBuildSteps) instead of caching in shared handler maps at runtime, eliminating a data race when multiple actors process messages concurrently - Convert if-else chain to switch statement (gocritic) - Guard negative int-to-uint32 conversion (gosec G115) - Remove unnecessary nil check around range (staticcheck S1031) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…vices, error handling - Build fresh step instances per execution in executePipeline() to avoid sharing mutable state across concurrent actors in the same pool - Remove BuiltSteps field from HandlerPipeline and preBuildSteps() from pool (no longer needed with per-execution step building) - Return real error for unknown message types instead of map with error key - Accumulate all step outputs into actor state, not just the last step's - Add RequiresServices() on ActorPoolModule to declare dependency on its actor.system module for correct init ordering - Document nil return convention in module factories (engine handles nil) - Update TestBridgeActor_UnknownMessageType for new error behavior Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Comprehensive audit found 3 stubbed scan steps in core engine, confirmed all external plugins are fully implemented, and identified 8 plugins with zero scenario coverage. Plan: SecurityScannerProvider interface, DockerSandbox module, security scanner plugin, 4 public scenarios, 5 private scenarios. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…steps - Create scan_provider.go with SecurityScannerProvider interface and SASTScanOpts/ContainerScanOpts/DepsScanOpts config structs; export SeverityRank as a public wrapper around the existing severityRank helper - Rewrite ScanSASTStep, ScanContainerStep, and ScanDepsStep Execute() methods to delegate to a SecurityScannerProvider looked up from the modular service registry under "security-scanner" - Steps return a clear error when no provider is configured instead of ErrNotImplemented; severity gate evaluation uses existing EvaluateGate() - Add scan_provider_test.go with mock provider covering success, gate failure, and provider error cases for all three scan steps - Update pipeline_step_scan_test.go to verify the no-provider error path Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds plugins/scanner package implementing SecurityScannerProvider. The security.scanner module registers as "security-scanner" service, enabling the existing scan_sast/scan_container/scan_deps steps. Supports mock mode with configurable findings for testing, and defaults to sensible scanner backends (semgrep, trivy, grype). 12 tests covering all scan types, gate evaluation, and config. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…curity.scanner
- Add security_scanner_adapter.go with SecurityScannerRemoteModule that wraps
RemoteModule and registers a remoteSecurityScannerProvider in the service
registry on Init(app), enabling core scan steps to find the provider via
app.GetService("security-scanner", &provider)
- Update adapter.go ModuleFactories() to wrap security.scanner remote modules
with SecurityScannerRemoteModule instead of the plain RemoteModule
- remoteSecurityScannerProvider implements module.SecurityScannerProvider by
delegating ScanSAST/ScanContainer/ScanDeps to InvokeService gRPC calls
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR introduces a provider-based security scanning architecture (core scan steps delegate to a SecurityScannerProvider resolved from the service registry), adds a built-in security.scanner plugin implementation (mock-focused with a CLI-mode placeholder), and also wires in a new built-in actors plugin (goakt-based actor system/pools + send/ask steps). It also updates dependencies (notably Kubernetes libs) and adjusts k8s PVC construction accordingly.
Changes:
- Add
SecurityScannerProviderinterface + scan step rewrites to call a provider viaapp.GetService("security-scanner", ...). - Add built-in
plugins/scannerwithsecurity.scannermodule (mock findings + gating) and external plugin adapter support. - Add built-in
plugins/actors(actor.system / actor.pool + step.actor_send / step.actor_ask) and register it in default plugin sets and docs/examples.
Reviewed changes
Copilot reviewed 38 out of 40 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| testhelpers_test.go | Includes the new actors plugin in the “all plugins” test helper list. |
| plugins/scanner/plugin.go | Registers security.scanner module type + capability + schema. |
| plugins/scanner/module.go | Implements a SecurityScannerProvider module (mock mode; CLI mode stubbed). |
| plugins/scanner/module_test.go | Unit tests for scanner module defaults, mock findings, gating, lifecycle, and plugin wiring. |
| plugins/all/all.go | Adds plugins/scanner and plugins/actors to the default built-in plugin list. |
| plugins/actors/plugin.go | Declares actors plugin manifest, module/step factories, schemas, workflow handler, and wiring hook. |
| plugins/actors/schemas.go | Adds schemas for actor.system, actor.pool, step.actor_send, step.actor_ask. |
| plugins/actors/messages.go | Defines the ActorMessage envelope + HandlerPipeline config representation. |
| plugins/actors/handler.go | Implements actors workflow handler parsing (pools/receive/steps). |
| plugins/actors/handler_test.go | Tests actor workflow config parsing + error cases. |
| plugins/actors/module_system.go | Implements actor.system module around goakt ActorSystem + recovery policy parsing. |
| plugins/actors/module_system_test.go | Tests actor.system defaults, validation, and local-mode start/stop. |
| plugins/actors/module_pool.go | Implements actor.pool module (auto-managed grains + permanent pools + routing + recovery). |
| plugins/actors/module_pool_test.go | Extensive tests for pool config validation, routing, spawning, and recovery behavior. |
| plugins/actors/bridge_actor.go | Bridge actor/grain executing step pipelines on message receive; merges outputs into actor state. |
| plugins/actors/bridge_actor_test.go | Tests bridge actor/grain execution, responses, unknown messages, and state persistence. |
| plugins/actors/step_actor_send.go | Implements step.actor_send (Tell) with templated identity/message resolution + pool lookup. |
| plugins/actors/step_actor_send_test.go | Tests send step config validation and basic construction. |
| plugins/actors/step_actor_ask.go | Implements step.actor_ask (Ask) with timeout parsing + routing constraints. |
| plugins/actors/step_actor_ask_test.go | Tests ask step config validation and timeout parsing. |
| plugins/actors/integration_test.go | Integration tests for basic goakt lifecycle + state isolation across actors. |
| plugin/external/adapter.go | Wraps external security.scanner modules with a scanner-specific remote adapter. |
| plugin/external/security_scanner_adapter.go | Adds SecurityScannerRemoteModule + remote provider that calls InvokeService. |
| pkg/k8s/resources/pvc.go | Updates PVC resource requirements type for newer k8s API versions. |
| module/scan_provider.go | Adds SecurityScannerProvider interface + scan option types + exported severity rank helper. |
| module/scan_provider_test.go | Tests scan steps delegating to a mock provider and severity ranking behavior. |
| module/pipeline_step_scan_test.go | Updates scan step tests to cover “no provider configured” error path. |
| module/pipeline_step_scan_sast.go | Rewrites SAST scan step to delegate to provider + gate evaluation. |
| module/pipeline_step_scan_container.go | Rewrites container scan step to delegate to provider + gate evaluation. |
| module/pipeline_step_scan_deps.go | Rewrites dependency scan step to delegate to provider + gate evaluation. |
| go.mod | Adds goakt + updates multiple deps (notably k8s libs) and bumps some module versions. |
| example_configs_test.go | Extends schema validation/build tests to recognize actors workflow/module types. |
| example/go.mod | Syncs example module Go version/deps with root module changes. |
| example/actor-system-config.yaml | Adds a full actor-model example config (HTTP routes calling actors). |
| docs/plans/2026-03-08-codebase-audit.md | Adds an implementation plan document for the audit/scanner work. |
| docs/plans/2026-03-08-codebase-audit-design.md | Adds design doc for scanner provider + sandbox/docker + scenarios. |
| docs/plans/2026-03-07-actor-model-design.md | Adds design doc for actor model integration. |
| DOCUMENTATION.md | Updates documented step/module/workflow lists to include actor functionality. |
| var provider SecurityScannerProvider | ||
| if err := s.app.GetService("security-scanner", &provider); err != nil { | ||
| return nil, fmt.Errorf("scan_sast step %q: no security scanner provider configured — load a scanner plugin", s.name) | ||
| } |
There was a problem hiding this comment.
s.app is used unconditionally; if the step is ever constructed with a nil modular.Application (as several other step tests/builders do), this will panic on GetService. Other steps typically return a clear "no application context" error when app is nil—consider adding the same guard here (and in the other scan steps).
| passed := result.EvaluateGate(s.severityThreshold) | ||
|
|
There was a problem hiding this comment.
EvaluateGate(s.severityThreshold) will pass the gate when severityThreshold is unknown (rank=0). Since the factory doesn’t validate severity_threshold, typos/misconfigurations (e.g. "hgh") would silently disable enforcement. Consider calling validateSeverity(severityThreshold) during step construction (as scan_deps does) and returning a config error on invalid values.
| var provider SecurityScannerProvider | ||
| if err := s.app.GetService("security-scanner", &provider); err != nil { | ||
| return nil, fmt.Errorf("scan_container step %q: no security scanner provider configured — load a scanner plugin", s.name) | ||
| } |
There was a problem hiding this comment.
s.app is dereferenced without a nil check. For consistency with other steps (which return "no application context"), consider returning an explicit error when app is nil rather than panicking on GetService.
| func (s *ScanDepsStep) Execute(ctx context.Context, _ *PipelineContext) (*StepResult, error) { | ||
| var provider SecurityScannerProvider | ||
| if err := s.app.GetService("security-scanner", &provider); err != nil { | ||
| return nil, fmt.Errorf("scan_deps step %q: no security scanner provider configured — load a scanner plugin", s.name) | ||
| } |
There was a problem hiding this comment.
s.app is used unconditionally; if the step is created with a nil modular.Application, this will panic. Consider adding a nil guard and returning a clear "no application context" error (consistent with other step implementations).
| pluginmarketplace.New(), | ||
| pluginscanner.New(), | ||
| pluginactors.New(), | ||
| } |
There was a problem hiding this comment.
The PR description focuses on the security scanner provider/plugin work, but this change also adds the built-in actors plugin (and introduces the goakt dependency) to the default plugin set. Please update the PR description to reflect this additional feature scope, or split actor-model changes into a separate PR if possible for easier review/rollback.
| if v, ok := cfg["mode"].(string); ok && v != "" { | ||
| m.mode = v | ||
| } |
There was a problem hiding this comment.
mode is accepted from config without validation. Any unexpected value will fall through to the "CLI mode not yet implemented" errors, which is confusing for a simple misconfiguration. Consider validating mode in NewScannerModule (reject anything other than "mock"/"cli").
| "security.scanner": func(name string, cfg map[string]any) modular.Module { | ||
| mod, err := NewScannerModule(name, cfg) | ||
| if err != nil { | ||
| return nil | ||
| } | ||
| return mod |
There was a problem hiding this comment.
If NewScannerModule returns an error, the module factory returns nil without logging the underlying cause. Since the engine error will just say the factory returned nil, it can be hard to debug misconfigurations. Consider logging the error similarly to plugins/actors (or storing the error for later surfacing) before returning nil.
| result, err := p.module.InvokeService("ScanSAST", args) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("remote ScanSAST: %w", err) | ||
| } | ||
| return decodeScanResult(result) | ||
| } |
There was a problem hiding this comment.
These provider methods accept a ctx, but the underlying call path (RemoteModule.InvokeService) uses context.Background() (no cancellation/deadline propagation). That means scan steps can’t reliably time out/cancel remote scans. Consider adding a context-aware InvokeService(ctx, ...) on RemoteModule and using it here so pipeline cancellations/timeouts are honored.
| passed := result.EvaluateGate(s.failOnSeverity) | ||
|
|
There was a problem hiding this comment.
EvaluateGate(s.failOnSeverity) treats unknown severities as rank=0 and passes the gate. Since the factory currently defaults fail_on_severity to "error" (not a recognized severity), the gate will silently always pass unless the user overrides it. Consider validating fail_on_severity in the factory (and defaulting to a valid value like "high"), or explicitly failing on unknown severities to avoid accidentally disabling enforcement.
- Add nil guards for app in all 3 scan steps - Validate severity threshold in scan step factories - Default fail_on_severity to "high" (was "error", not a valid severity) - Validate scanner module mode config - Log errors in scanner module factory - Update plugin description to not claim CLI mode support - Add TODO for context propagation in remote scanner adapter Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve conflict in plugins/all/all.go: keep both scanner and actors plugins. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
⏱ Benchmark Results✅ No significant performance regressions detected. benchstat comparison (baseline → PR)
|
| # Codebase Audit Implementation Plan | ||
|
|
||
| > **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task. | ||
|
|
||
| **Goal:** Fix all stubbed code in the workflow engine, implement DockerSandbox, and build test scenarios for all untested plugins. | ||
|
|
||
| **Architecture:** Provider interface for security scanning, Docker sandbox module in existing sandbox plugin, new security scanner plugin, 4 public scenarios, 5 private scenarios. | ||
|
|
||
| **Tech Stack:** Go, Docker Engine API, YAML configs, bash test scripts with jq | ||
|
|
||
| --- | ||
|
|
||
| ### Task 1: SecurityScannerProvider Interface (Core Engine) | ||
|
|
||
| **Files:** | ||
| - Create: `module/scan_provider.go` | ||
| - Modify: `module/pipeline_step_scan_sast.go` | ||
| - Modify: `module/pipeline_step_scan_container.go` | ||
| - Modify: `module/pipeline_step_scan_deps.go` | ||
|
|
||
| **Step 1: Create the provider interface** | ||
|
|
||
| Create `module/scan_provider.go` with: | ||
| - `SecurityScannerProvider` interface (ScanSAST, ScanContainer, ScanDeps methods) | ||
| - `ScanResult` struct (Passed, Findings, Summary, OutputFormat, RawOutput) | ||
| - `ScanFinding` struct (ID, Severity, Title, Description, Package, Version, FixVersion, Location) | ||
| - `SASTScanOpts` (Scanner, SourcePath, Rules, FailOnSeverity, OutputFormat) | ||
| - `ContainerScanOpts` (Scanner, TargetImage, SeverityThreshold, IgnoreUnfixed, OutputFormat) | ||
| - `DepsScanOpts` (Scanner, SourcePath, FailOnSeverity, OutputFormat) | ||
| - `severityRank()` helper mapping severity string to int for comparison | ||
| - `SeverityAtOrAbove()` exported helper for threshold checks | ||
|
|
||
| **Step 2: Rewrite scan_sast to delegate to provider** | ||
|
|
||
| Rewrite `ScanSASTStep.Execute()`: | ||
| 1. Look up `SecurityScannerProvider` from app service registry via `security-scanner:<module>` key | ||
| 2. Build `SASTScanOpts` from step config |
There was a problem hiding this comment.
This plan doc includes tooling-specific instructions ("For Claude") and local absolute filesystem paths under /Users/..., which aren’t portable or generally useful for repository documentation. It also describes provider lookup via security-scanner:<module>, but the implementation in this PR uses the single global security-scanner service key. Consider revising this doc to be tool-agnostic, use repo-relative paths, and reflect the current provider lookup behavior.
| // Fall back to "image" config key for the scan target (as in the YAML spec) | ||
| targetImage = image | ||
| targetImage, _ = config["image"].(string) | ||
| } |
There was a problem hiding this comment.
target_image can end up empty (if both target_image and fallback image are unset), but the step still constructs ContainerScanOpts and calls the provider. This will push invalid input downstream and makes failures harder to diagnose. Consider validating that targetImage is non-empty in the factory and returning a clear config error if it’s missing.
| } | |
| } | |
| targetImage = strings.TrimSpace(targetImage) | |
| if targetImage == "" { | |
| return nil, fmt.Errorf("scan_container step %q: target image is required; set 'target_image' or 'image' in config", name) | |
| } |
| if mockCfg, ok := cfg["mockFindings"].(map[string]any); ok { | ||
| for scanType, findingsRaw := range mockCfg { | ||
| findings, err := parseMockFindings(findingsRaw) | ||
| if err != nil { | ||
| return nil, fmt.Errorf("security.scanner %q: invalid mockFindings.%s: %w", name, scanType, err) | ||
| } | ||
| m.mockFindings[scanType] = findings | ||
| } |
There was a problem hiding this comment.
mockFindings accepts any map key as a scan type and stores it in m.mockFindings, but only "sast"/"container"/"deps" are ever read. A typo in config (e.g. "containers") will be silently ignored and the module will fall back to default findings, which is surprising. Consider validating scanType against the supported set and returning an error for unknown keys.
| f := module.Finding{ | ||
| RuleID: getString(m, "rule_id"), | ||
| Severity: strings.ToLower(getString(m, "severity")), | ||
| Message: getString(m, "message"), | ||
| Location: getString(m, "location"), | ||
| } | ||
| if line, ok := m["line"].(float64); ok { | ||
| f.Line = int(line) | ||
| } |
There was a problem hiding this comment.
line is only parsed when it is a float64. In this codebase config values often arrive as either int or float64 depending on the decoding path, so a YAML line: 55 may be an int and get dropped. Consider accepting both int and float64 (and/or using a small helper for numeric coercion).
- Validate target_image is non-empty in scan_container step factory - Validate mockFindings scan type keys (sast/container/deps only) - Error on malformed finding items instead of silently skipping - Handle both int and float64 types for line field in parseMockFindings - Remove tool-specific instructions and absolute paths from plan doc - Fix provider lookup reference to match implementation Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
SecurityScannerProviderinterface inmodule/scan_provider.gowithScanResult,Finding, and severity gatingstep.scan_sast,step.scan_container,step.scan_deps) to delegate to the provider via service registryplugins/scannerbuilt-in plugin withsecurity.scannermodule type (mock mode with configurable findings)SecurityScannerRemoteModule) so gRPC plugins can also provide scanningplugins/allDesign
See:
docs/plans/2026-03-08-codebase-audit-design.mdTest Plan
go build ./...cleango test ./module/... ./plugins/scanner/...pass🤖 Generated with Claude Code