Define seccomp profile intersection semantics for multi-profile merging #1317

Description

@saschagrunert

Motivation

KEP-6061 introduces support for CRI runtimes (CRI-O, containerd) to merge OCI-pulled seccomp profiles with node baselines. The merge operation is an intersection: the resulting profile permits a syscall only if all input profiles permit it. Multiple runtimes will implement this, so the behavior should be standardized. See also the kubernetes-sigs donation discussion.

A reference implementation exists at security-profiles-merger.

Proposed intersection semantics

Action restrictiveness (most to least):
SCMP_ACT_KILL_PROCESS > SCMP_ACT_KILL_THREAD > SCMP_ACT_TRAP > SCMP_ACT_ERRNO > SCMP_ACT_NOTIFY > SCMP_ACT_TRACE > SCMP_ACT_LOG > SCMP_ACT_ALLOW

SCMP_ACT_KILL is equivalent to SCMP_ACT_KILL_THREAD. Validation rejects unknown actions; as defense in depth, the comparison functions nonetheless treat any unknown action as maximally restrictive.
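
The ordering above can be sketched as a small comparison helper. This is an illustrative sketch only, not the reference implementation: the names `rank` and `more_restrictive` are hypothetical, and the "first operand wins on tie" behavior is borrowed from the errnoRet rule in this proposal.

```python
# Actions listed from most to least restrictive, as in the proposal.
ACTIONS_BY_RESTRICTIVENESS = [
    "SCMP_ACT_KILL_PROCESS",
    "SCMP_ACT_KILL_THREAD",
    "SCMP_ACT_TRAP",
    "SCMP_ACT_ERRNO",
    "SCMP_ACT_NOTIFY",
    "SCMP_ACT_TRACE",
    "SCMP_ACT_LOG",
    "SCMP_ACT_ALLOW",
]

# SCMP_ACT_KILL is treated as an alias of SCMP_ACT_KILL_THREAD.
ALIASES = {"SCMP_ACT_KILL": "SCMP_ACT_KILL_THREAD"}


def rank(action):
    """Return a rank where a lower number means more restrictive.

    Unknown actions get rank -1, i.e. they compare as more restrictive
    than every known action (defense in depth; validation is expected
    to have rejected them earlier).
    """
    canonical = ALIASES.get(action, action)
    try:
        return ACTIONS_BY_RESTRICTIVENESS.index(canonical)
    except ValueError:
        return -1


def more_restrictive(first, second):
    """Return the more restrictive action; the first operand wins on a tie."""
    return first if rank(first) <= rank(second) else second
```

For example, `more_restrictive("SCMP_ACT_ALLOW", "SCMP_ACT_ERRNO")` selects `SCMP_ACT_ERRNO`, and `SCMP_ACT_KILL` ranks the same as `SCMP_ACT_KILL_THREAD`.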

Profile fields:

  • defaultAction: the more restrictive of the two.
  • Syscall entries are normalized to one name per entry, then for each name:
    • Present in both: more restrictive action wins.
    • Present in one: effective action is the more restrictive of the entry's action and the other profile's defaultAction.
    • The entry is elided if the resulting action matches the merged defaultAction and the entry has no argument filters.
  • architectures / flags: intersection. Empty lists are treated as unspecified and defer to the other profile.
  • defaultErrnoRet / per-syscall errnoRet: taken from the operand whose action was selected. If both operands have the same action, the first operand's value wins.
  • listenerPath / listenerMetadata: from the first operand.
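
A minimal sketch of the field rules above, using plain dicts shaped like the OCI JSON profile (`defaultAction`, `architectures`, `syscalls` with `names` and `action`). The function names are hypothetical. To stay short, the sketch assumes no argument filters and at most one entry per syscall name within a profile, and it omits flags, errnoRet, and listener fields.

```python
ORDER = [
    "SCMP_ACT_KILL_PROCESS", "SCMP_ACT_KILL_THREAD", "SCMP_ACT_TRAP",
    "SCMP_ACT_ERRNO", "SCMP_ACT_NOTIFY", "SCMP_ACT_TRACE",
    "SCMP_ACT_LOG", "SCMP_ACT_ALLOW",
]


def rank(action):
    """Lower rank means more restrictive; unknown actions rank as -1."""
    if action == "SCMP_ACT_KILL":
        action = "SCMP_ACT_KILL_THREAD"
    return ORDER.index(action) if action in ORDER else -1


def stricter(first, second):
    """Pick the more restrictive action; the first operand wins on a tie."""
    return first if rank(first) <= rank(second) else second


def by_name(profile):
    """Normalize syscall entries to one name per entry: {name: action}."""
    return {
        name: entry["action"]
        for entry in profile.get("syscalls", [])
        for name in entry["names"]
    }


def intersect_list(first, second):
    """Intersect two lists; an empty list is unspecified and defers to the other."""
    if not first:
        return list(second)
    if not second:
        return list(first)
    return [item for item in first if item in second]


def intersect(first, second):
    """Intersect two profiles (first = node baseline, second = pulled profile)."""
    default = stricter(first["defaultAction"], second["defaultAction"])
    a, b = by_name(first), by_name(second)
    syscalls = []
    for name in sorted(a.keys() | b.keys()):
        # A name missing from one profile falls back to that profile's default.
        action = stricter(
            a.get(name, first["defaultAction"]),
            b.get(name, second["defaultAction"]),
        )
        # Elide entries whose action equals the merged default.
        if rank(action) != rank(default):
            syscalls.append({"names": [name], "action": action})
    return {
        "defaultAction": default,
        "architectures": intersect_list(
            first.get("architectures", []), second.get("architectures", [])
        ),
        "syscalls": syscalls,
    }


baseline = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "architectures": ["SCMP_ARCH_X86_64", "SCMP_ARCH_AARCH64"],
    "syscalls": [{"names": ["read", "write", "mount"], "action": "SCMP_ACT_ALLOW"}],
}
pulled = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [{"names": ["read", "write", "openat"], "action": "SCMP_ACT_ALLOW"}],
}
merged = intersect(baseline, pulled)
```

Here `mount` and `openat` are each allowed by only one profile, so they resolve to `SCMP_ACT_ERRNO`, match the merged default, and are elided. Only `read` and `write` remain as explicit `SCMP_ACT_ALLOW` entries.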

Argument filters:

  • One or neither side has filters: the filters (if any) are kept.
  • Both sides identical: kept.
  • Both sides differ: compared per argument index.
    • Index present on only one side: its filters are kept.
    • Shared index with identical filters: kept.
    • Shared index with differing filters: the syscall is denied with SCMP_ACT_KILL_PROCESS (conservative, since intersecting arbitrary comparison predicates is not feasible in general).
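
The argument filter rules can be sketched as follows. This is an illustration under assumptions, not the reference implementation: filters are dicts shaped like OCI `args` entries (`index`, `op`, `value`, optional `valueTwo`), and `merge_args` is a hypothetical name. It returns the merged filters plus a conflict flag; on conflict a caller would set the syscall's action to `SCMP_ACT_KILL_PROCESS`.

```python
def _key(arg_filter):
    """Comparable identity of a filter; a missing valueTwo is treated as 0."""
    return (
        arg_filter["index"],
        arg_filter["op"],
        arg_filter["value"],
        arg_filter.get("valueTwo", 0),
    )


def _group(filters):
    """Group filters by argument index: {index: [filter, ...]}."""
    groups = {}
    for arg_filter in filters:
        groups.setdefault(arg_filter["index"], []).append(arg_filter)
    return groups


def merge_args(first, second):
    """Merge two filter lists; return (filters, conflict)."""
    # One or neither side has filters: keep whatever exists.
    if not first or not second:
        return list(first or second or []), False

    ga, gb = _group(first), _group(second)
    merged = []
    for index in sorted(ga.keys() | gb.keys()):
        if index in ga and index in gb:
            # Shared index: filters must be identical, otherwise give up.
            if sorted(map(_key, ga[index])) != sorted(map(_key, gb[index])):
                return [], True
            merged.extend(ga[index])
        else:
            # Index present on only one side: keep its filters.
            merged.extend(ga.get(index) or gb[index])
    return merged, False


eq0 = {"index": 0, "op": "SCMP_CMP_EQ", "value": 1}
ne1 = {"index": 1, "op": "SCMP_CMP_NE", "value": 2}
eq0_other = {"index": 0, "op": "SCMP_CMP_EQ", "value": 3}

kept, conflict = merge_args([eq0], [ne1])       # disjoint indices: both kept
_, clash = merge_args([eq0], [eq0_other])       # same index, different value
```

In this sketch, `kept` holds both filters and `conflict` is false, while `clash` is true because index 0 carries different predicates on each side.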

Operand ordering: The intersection is not commutative for metadata (errnoRet, listenerPath). For CRI runtimes, the node baseline is the first operand and the OCI-pulled profile is the second.

Known limitations

When profiles have different default actions and a syscall has argument filters in one profile but not the other, the intersection can be overly restrictive for argument values that do not match the filter. The result is safe (never more permissive than the precise intersection) but may deny syscalls that the precise intersection would permit.

Beyond seccomp

The reference implementation also supports AppArmor and Landlock merging. The runtime-spec does not currently define types for these, but Landlock in particular may be worth considering given its growing adoption as an unprivileged sandboxing mechanism. If there is interest, we can propose Landlock (and potentially AppArmor) types and merge semantics in a follow-up.

Open questions

  • Should this spec also cover union semantics (used by the Security Profiles Operator for combining recorded profiles)?
  • Where should this live: new section, or extension of the existing LinuxSeccomp definition?
  • Interest in defining Landlock profile types and merge semantics?
