Gate local-agent persistence behind Nix activation preflight #23

@mdheller

Context

Canonical spec: SourceOS-Linux/sourceos-spec specs/local-agent-runtime.md.

The node-commander incident showed that Nix activation can reproducibly install an operationally unsafe service if preflight is not part of activation. A Nix-generated wrapper was installed behind launchd before Podman machine/socket/auth conditions were safe.

Required boot/activation behavior

Nix activation for local agents must run in this order:

  1. Evaluate local-agent declaration.
  2. Generate wrapper.
  3. Lint wrapper.
  4. Generate launchd plist or systemd unit.
  5. Lint service definition.
  6. Run preflight.
  7. Install only if preflight passes.
  8. Enable only if install succeeds.
  9. Start only if enable succeeds.
  10. Emit status.

If preflight fails, activation must stage the service only; it must not install it or enable persistence unless that is explicitly requested.
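The ordering above can be sketched as a small gated pipeline. This is only an illustration under assumptions: the step names, the `activate` function, the `ActivationStatus` type, and the `force_persist` override (standing in for "explicitly requested") are hypothetical and do not come from the spec. Each step is modelled as a callable returning `True` on success, and the returned status stands in for step 10.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical step names mirroring steps 1-5 of the required order.
STAGING_STEPS = [
    "evaluate",
    "generate_wrapper",
    "lint_wrapper",
    "generate_service",
    "lint_service",
]
# Steps 7-9, each paired with the state reached when it succeeds.
PERSIST_STEPS = [("install", "installed"), ("enable", "enabled"), ("start", "running")]


@dataclass
class ActivationStatus:
    state: Optional[str] = None  # None, "staged", "installed", "enabled", "running"
    completed: List[str] = field(default_factory=list)
    failed_step: Optional[str] = None


def activate(
    steps: Dict[str, Callable[[], bool]], force_persist: bool = False
) -> ActivationStatus:
    """Run the steps in order and stop at the first failing gate."""
    status = ActivationStatus()

    # Steps 1-5: build and lint artifacts. Nothing is installed yet.
    for name in STAGING_STEPS:
        if not steps[name]():
            status.failed_step = name
            return status
        status.completed.append(name)
    status.state = "staged"

    # Step 6: preflight gates everything that creates persistence.
    if steps["preflight"]():
        status.completed.append("preflight")
    else:
        status.failed_step = "preflight"
        if not force_persist:  # assumption: explicit override flag
            return status  # stage-only outcome

    # Steps 7-9: each runs only if the previous one succeeded.
    for name, reached in PERSIST_STEPS:
        if not steps[name]():
            status.failed_step = name
            return status
        status.completed.append(name)
        status.state = reached

    return status  # step 10: the caller emits this status
```

With this shape, a failed preflight returns `state == "staged"` and never calls `install`, `enable`, or `start`, which is the property the acceptance criteria depend on.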

Deliverables

  • Add activation guard hooks for SourceOS local agents.
  • Add stage-only mode for failed preflight.
  • Add clear status output for staged, installed, enabled, and running states.
  • Add policy checks that reject root-owned user agents installed directly into /Library/LaunchAgents, logs written under /tmp, and unbounded KeepAlive=true.
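A minimal sketch of what the policy checks could look like for a generated launchd plist, using Python's standard `plistlib`. The function name `check_plist_policy` and the violation messages are hypothetical. The sketch assumes "unbounded KeepAlive=true" means the boolean form of `KeepAlive` (as opposed to a conditional dictionary), and it inspects only the install path, not file ownership.

```python
import plistlib
from pathlib import PurePosixPath
from typing import List


def check_plist_policy(plist_bytes: bytes, install_path: str) -> List[str]:
    """Return a list of policy violations; an empty list means the plist passes."""
    plist = plistlib.loads(plist_bytes)
    violations: List[str] = []

    # Reject agents placed directly in the system-wide LaunchAgents directory.
    # Ownership is not checked here; that would need a stat of the real file.
    if PurePosixPath(install_path).parent == PurePosixPath("/Library/LaunchAgents"):
        violations.append("agent installed directly into /Library/LaunchAgents")

    # Reject stdout/stderr logs under /tmp.
    for key in ("StandardOutPath", "StandardErrorPath"):
        path = plist.get(key)
        if isinstance(path, str) and PurePosixPath(path).parts[:2] == ("/", "tmp"):
            violations.append(f"{key} points into /tmp: {path}")

    # Assumption: boolean True is the "unbounded" form; a dict of
    # conditions (for example SuccessfulExit) is treated as bounded.
    if plist.get("KeepAlive") is True:
        violations.append("KeepAlive is unconditionally true")

    return violations
```

Returning every violation rather than stopping at the first keeps the remediation output complete in a single activation run.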

Acceptance criteria

  • A missing/stopped Podman machine prevents active service installation.
  • Noninteractive credential-helper failure prevents service enablement.
  • Failed preflight creates a staged artifact and clear remediation output.
  • Activation never leaves a half-enabled respawn loop.
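One way to make these criteria testable is to model preflight as a list of named probes, each with a remediation hint. The sketch below is hypothetical: `Check`, `run_preflight`, and the message format are illustrative names, and the real probes (Podman machine state, noninteractive credential-helper access) are injected as callables rather than implemented here.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    name: str
    probe: Callable[[], bool]  # returns True when the condition is safe
    remediation: str           # shown to the operator on failure


def run_preflight(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Run every check and collect remediation messages for the failures."""
    messages: List[str] = []
    for check in checks:
        try:
            ok = check.probe()
        except Exception:
            ok = False  # a probe that raises is treated as a failed check
        if not ok:
            messages.append(f"preflight failed: {check.name}: {check.remediation}")
    return (not messages, messages)


# Example wiring with stand-in probes; real probes would query the host.
example_checks = [
    Check(
        "podman-machine-running",
        lambda: False,
        "start the Podman machine, then re-run activation",
    ),
    Check(
        "credential-helper-noninteractive",
        lambda: True,
        "fix credential-helper access for noninteractive sessions",
    ),
]
passed, remediation = run_preflight(example_checks)
```

Running all checks before reporting means a single failed activation lists everything the operator has to fix, and the boolean result is what gates installation and enablement.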
