Add support for custom collectors #3

Merged
phenixblue merged 2 commits into main from feature/custom-collector on Apr 5, 2026
@phenixblue
Owner

Summary

Adds a full custom collector framework that allows short-lived Kubernetes Jobs to gather node or cluster-scoped data and expose it to Rego policies via input.cluster.collectors.

New subcommand: kvirtbp collect

Deploys collector Jobs against a live cluster and writes the results to a collector-data JSON file.

kvirtbp collect \
  --bundle ./examples/collectors/node-info \
  --bundle ./examples/collectors/hugepages \
  --save-bundle ./saved-bundles \
  --output collector-data.json

Changes to kvirtbp scan

  • --collector-data (repeatable) — injects pre-collected data into input.cluster.collectors
  • --collector-bundle / --collector-config — run inline collectors during scan without a separate collect step
  • --policy-bundle is now auto-detected from _meta.bundlePaths in the collector-data file when not explicitly provided (pass --no-auto-bundle to opt out)
  • Multi-bundle support: --policy-bundle pointing to a directory of bundles evaluates each sub-bundle and merges findings
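
As a rough illustration of how several --collector-data inputs might combine into one input.cluster.collectors document, here is a minimal Go sketch. The CollectorData type, the mergeCollectorData function, the "cluster-version" collector, and the payload fields are all hypothetical; this is not the project's MergeAll, and the "later source wins" rule is an assumption. Only the layout (collector name, then node name or "_cluster") comes from this PR.

```go
package main

import "fmt"

// CollectorData maps collector name -> result key -> payload.
// The result key is a node name for per-node collectors, or "_cluster"
// for collectors with scope "once". (Hypothetical type for illustration.)
type CollectorData map[string]map[string]any

// mergeCollectorData combines several collector-data documents.
// Assumption: when two sources define the same collector and key,
// the later source wins.
func mergeCollectorData(sources ...CollectorData) CollectorData {
	merged := CollectorData{}
	for _, src := range sources {
		for name, byKey := range src {
			if merged[name] == nil {
				merged[name] = map[string]any{}
			}
			for key, payload := range byKey {
				merged[name][key] = payload
			}
		}
	}
	return merged
}

func main() {
	first := CollectorData{
		"node-info": {"worker-1": map[string]any{"arch": "amd64"}},
	}
	second := CollectorData{
		"node-info":       {"worker-2": map[string]any{"arch": "arm64"}},
		"cluster-version": {"_cluster": map[string]any{"version": "v1.30"}},
	}
	merged := mergeCollectorData(first, second)
	// Number of result keys per collector after the merge.
	fmt.Println(len(merged["node-info"]), len(merged["cluster-version"]))
}
```

Keeping the node name as the second-level key means a Rego policy can iterate over nodes without knowing which file a given node's data came from.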

Collector configuration

Collectors are declared in a bundle's metadata.json under a collectors array (CollectorConfig):

  • scope: once — single cluster-wide Job, output at input.cluster.collectors["name"]["_cluster"]
  • scope: per-node — one Job per node, output keyed by node name
  • tolerations — pod tolerations so collectors can schedule on tainted nodes (e.g. control-plane)
  • privileged, hostPID, hostNetwork, env for advanced collectors
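
To make the options above concrete, the following Go sketch parses a collectors array from a metadata.json document and validates the scope value. It is an assumption-laden illustration, not the actual CollectorConfig from internal/collector: the name and image fields, the struct layout, the env map type, and the sample values are hypothetical; only scope, tolerations, privileged, hostPID, hostNetwork, and env are named in this PR. docs/collectors.md remains the authoritative schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Toleration mirrors the common fields of a Kubernetes pod toleration.
type Toleration struct {
	Key      string `json:"key,omitempty"`
	Operator string `json:"operator,omitempty"`
	Effect   string `json:"effect,omitempty"`
}

// CollectorConfig is a hypothetical shape for one entry of the
// collectors array; the real struct may differ.
type CollectorConfig struct {
	Name        string            `json:"name"`  // assumed field
	Image       string            `json:"image"` // assumed field
	Scope       string            `json:"scope"` // "once" or "per-node"
	Tolerations []Toleration      `json:"tolerations,omitempty"`
	Privileged  bool              `json:"privileged,omitempty"`
	HostPID     bool              `json:"hostPID,omitempty"`
	HostNetwork bool              `json:"hostNetwork,omitempty"`
	Env         map[string]string `json:"env,omitempty"` // assumed to be a map
}

type bundleMetadata struct {
	Collectors []CollectorConfig `json:"collectors"`
}

// parseCollectors decodes metadata.json content and rejects unknown scopes.
func parseCollectors(raw []byte) ([]CollectorConfig, error) {
	var md bundleMetadata
	if err := json.Unmarshal(raw, &md); err != nil {
		return nil, fmt.Errorf("parse metadata.json: %w", err)
	}
	for _, c := range md.Collectors {
		if c.Scope != "once" && c.Scope != "per-node" {
			return nil, fmt.Errorf("collector %q: unknown scope %q", c.Name, c.Scope)
		}
	}
	return md.Collectors, nil
}

// sampleMetadata is an invented example document.
const sampleMetadata = `{
  "collectors": [
    {
      "name": "hugepages",
      "image": "example.invalid/collector:latest",
      "scope": "per-node",
      "tolerations": [
        {"key": "node-role.kubernetes.io/control-plane", "operator": "Exists", "effect": "NoSchedule"}
      ],
      "hostPID": true
    }
  ]
}`

func main() {
	collectors, err := parseCollectors([]byte(sampleMetadata))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, c := range collectors {
		fmt.Printf("%s scope=%s tolerations=%d hostPID=%t\n",
			c.Name, c.Scope, len(c.Tolerations), c.HostPID)
	}
}
```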

Bundle improvements

  • --bundle on collect is now repeatable; configs merged across bundles
  • --save-bundle persists resolved (possibly remote) bundles locally; multiple bundles saved to bundle-0/, bundle-1/, …
  • bundle.SubBundles() detects whether a directory is a single bundle or a container of multiple bundles
  • bundle.SaveDir() copies a resolved bundle for reuse without re-fetching

Example collectors

| Bundle | Collector | Scope | Checks |
| --- | --- | --- | --- |
| examples/collectors/node-info | node-info | per-node | Collector data present; CPU architecture consistency |
| examples/collectors/hugepages | hugepages | per-node | Collector data present; hugepages configured on all/partial/no nodes |

Both include OPA unit tests (13 tests total, all passing).

Internal packages

  • internal/collector: Collector interface, JobCollector, CollectorConfig, CollectorResult/CollectorMeta (with _meta envelope), MergeAll
  • internal/bundle: Resolve, SaveDir, SubBundles
  • internal/cli/collect.go: new collect subcommand
  • docs/collectors.md: full schema and usage guide

@phenixblue phenixblue merged commit 8cd781c into main Apr 5, 2026
4 checks passed
@phenixblue phenixblue deleted the feature/custom-collector branch April 5, 2026 07:00