Skip to content

feat: default mTLS to permissive and enable card discovery#408

Open
varshaprasad96 wants to merge 1 commit into
kagenti:mainfrom
varshaprasad96:feat/mtls-defaults
Open

feat: default mTLS to permissive and enable card discovery#408
varshaprasad96 wants to merge 1 commit into
kagenti:mainfrom
varshaprasad96:feat/mtls-defaults

Conversation

@varshaprasad96

Copy link
Copy Markdown
Contributor

Summary

  • Default mTLSMode to permissive so agents get mTLS automatically when SPIRE is available
  • Enable --enable-card-discovery and --enable-verified-fetch by default
  • Treat empty mTLSMode as permissive across all resolution paths (envoy template, pod mutator, resolved config) for consistency with existing CRs
  • Add deprecation warnings for legacy JWS signing flags
  • Add startup logs for upgrade visibility

Spec

Implements Phase 1 tasks (T001-T004) from spec: mTLS transport security (#401)

Changes

File Change
api/v1alpha1/agentruntime_types.go +kubebuilder:default=permissive on MTLSMode, ConditionTypeMTLSReady constant
cmd/main.go Flip flag defaults, add deprecation warnings, add startup logs
internal/webhook/injector/envoy_template.go Treat empty MTLSMode as permissive
internal/webhook/injector/pod_mutator.go Default fallback from disabled to permissive
internal/webhook/injector/resolved_config.go Update comment for new default
internal/webhook/injector/*_test.go Update tests for new default behavior
config/crd/bases/, charts/ Regenerated CRD manifests

Test plan

  • All 17 test packages pass (make test)
  • CRD schema shows default: permissive for mtlsMode
  • Code review (correctness + cleanup) completed — C1 and C2 findings fixed
  • E2E: deploy agents with SPIRE, verify mTLS is active by default
  • E2E: deploy agents without SPIRE, verify graceful fallback with warning logs

Jira: RHAIENG-4944

🤖 Generated with Claude Code

@varshaprasad96 varshaprasad96 requested a review from a team as a code owner June 5, 2026 22:48
@varshaprasad96

Copy link
Copy Markdown
Contributor Author

/hold

This PR introduces permissive mode for easier migration to use mTLS. Follow up after spec is merged: #401

cc: @r3v5 @akram

@cwiklik cwiklik left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well-executed default flip implementing Phase 1 of the mTLS spec (#401): permissive becomes the default mTLS posture (kubebuilder marker + consistent empty→permissive handling across the envoy template, pod mutator, and resolved config), card-discovery and verified-fetch default on (with documented kill switches), plus deprecation warnings for legacy JWS flags and upgrade-visibility startup logs. CRD manifests are regenerated in both config/crd/bases/ and charts/crds/ identically — good hygiene. Code logic is correct and the doc comments accurately describe the new-CR vs existing-CR resolution.

Security: net improvement (mTLS on by default). The main risk is operational (upgrade rollout + SPIRE dependency), not a vulnerability — see comment #1.

Tests: pass, but missing a positive assertion for the new empty→permissive default (comment #2).

CI: 15 green; E2E still running, and the two new E2E scenarios (with/without SPIRE) are unchecked — worth landing those before merge given the fleet-wide rollout.

Commits: signed-off. Nits: subject lacks the emoji prefix the convention favors; and the PR body uses "🤖 Generated with Claude Code" — Kagenti's convention is Assisted-By: Claude Code, never "Generated with."

Approving — the code is correct; the upgrade-impact verification (comment #1) is strongly recommended before merge.

// process start).
//
// +optional
// +kubebuilder:default=permissive

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: This default flip has a notable upgrade impact worth surfacing. Per the field doc, mtlsMode != disabled auto-enables SPIRE and changing the mode triggers a pod rollout. So on operator upgrade, every existing AgentRuntime with an unset mtlsMode flips empty→permissive → SPIRE auto-enabled + pod rollout, fleet-wide. permissive is designed to be safe (accepts/falls back to plaintext), but on clusters without SPIRE this relies entirely on graceful fallback — and the "deploy without SPIRE, verify graceful fallback" E2E box is still unchecked in the test plan. Recommend verifying the no-SPIRE path before merge and calling out the rollout + SPIRE auto-enable in release notes / a prominent startup log (the existing startup logs cover card-discovery/verified-fetch but not the mTLS default change).

// silently leak TLS config into pods that didn't ask for it.
for _, mode := range []string{"", MTLSModeDisabled} {
// Explicitly disabled mode — no TLS blocks should render. Empty
// string is now treated as permissive (mTLS on by default).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: The core new behavior — empty MTLSMode now rendering permissive TLS blocks — isn't positively asserted. This test only removes "" from the disabled set. Add a case asserting MTLSMode: "" produces the permissive shape (and/or that the mutator injects spiffe-helper for empty mode) so the new default is locked in against regressions.

"namespace", namespace, "crName", crName,
"unrecognized", mtlsMode, "source", mtlsSource)
mtlsMode = MTLSModeDisabled
mtlsMode = MTLSModePermissive

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: The defense-in-depth fallback for an unrecognized mtlsMode flips from disabled to permissive. Reasonable under the "mTLS by default" theme (fails secure rather than open), but note a garbage value now also pulls in SPIRE — fine given the CRD enum rejects unknowns at admission, just a conscious posture change worth a line in the commit/PR rationale.

@pdettori

pdettori commented Jun 9, 2026

Copy link
Copy Markdown
Member

let's set it to draft until ready to merge

@pdettori pdettori marked this pull request as draft June 9, 2026 01:17
Change mTLS default from disabled to permissive so agents get mTLS
automatically when SPIRE is available. Enable card discovery and
verified fetch by default. Add deprecation warnings for legacy JWS
signing flags.

Changes:
- Add +kubebuilder:default=permissive on MTLSMode field
- Add ConditionTypeMTLSReady and AnnotationMTLSMode constants
- Set kagenti.io/mtls-mode annotation on pod template in controller
- Set MTLS_MODE env var on authbridge containers in webhook
- Treat empty MTLSMode as permissive across all resolution paths
  (controller, pod mutator, envoy template, resolved config)
- Flip --enable-card-discovery and --enable-verified-fetch to true
- Add startup deprecation warnings for legacy signing flags
- Clean up mtls-mode annotation on AgentRuntime deletion
- Update tests for new default behavior

Jira: RHAIENG-4944

Signed-off-by: Varsha Prasad Narsing <varshaprasad96@gmail.com>
Assisted-By: Claude (Anthropic AI) <noreply@anthropic.com>
@varshaprasad96 varshaprasad96 marked this pull request as ready for review June 12, 2026 04:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants