Conversation
|
@TheBeeZee, @jrajahalme PTAL |
cilium/CFP-23380-ordered-policy.md
Outdated
|
|
||
| ### MapState conflict resolution | ||
|
|
||
| The key observation is this: **we cannot insert a lower-priority, higher-precedence entry in to the MapState**. Since MapState precedence mirrors datapath precedence, determining this is relatively straightforward: to insert a given key, we scan all ancestor keys, checking if the same identity is selected. If a lower-precedence, higher-priority entry is found, we cannot insert the higher-precedence, lower-priority entry. Conversely, if any descendants of the newly proposed entry are higher-precedence and lower-priority, they must be deleted. |
There was a problem hiding this comment.
We should define what we mean with priority and precedence. In ClusterNetworkPolicy lower numeric value "takes precedence" (is considered before any rules with higher numeric "priority"). So "high priority" rules have lower numeric priority value. This is in agreement with language like "this is priority 1 task" (i.e., the highest priority
task in general language).
Furthermore, "higher-precedence" reads as "taking precedence", i.e., being more important. So in written language "higher-priority" has the same meaning as "higher-precedence".
This is hard get right, but we should aim for clarity here by e.g., using the general sense that "higher-priority" == "higher-precedence", and refer to a "numerically low priority value" explicitly.
There was a problem hiding this comment.
+1
There were many discussions in the network-policy-sig about this and in the end it was decided that going with "priority" with the associated semantics (lower numeric value is higher priority) is the most intuitive approach (despite the inversion of numeric value and the real meaning of "priority").
It is of course possible to invert the k8s Cluster Network Policy priorities when translating into an internal representation, but it seems like keeping Cilium internals in sync with what the public APIs do is a net-win.
Explicitly defining what "precedence" and "priority" mean in this context would help the readability of the document (and code). Also, if the choice will be for priority to be "numerically higher means higher priority", then I would recommend against using the name "priority", as this will conflict with Cluster Network Policy semantics.
There was a problem hiding this comment.
Right, based on the existing PoC implementation, priority is an unsigned integer value, where a higher number overrides a lower number. This is unrelated to any particular API convention, where, generally, a lower number overrides.
While this inversion is a bit confusing, it does ultimately simplify the low-level core. I'll add some commentary about it.
There was a problem hiding this comment.
Clarified the wording, PTAL
There was a problem hiding this comment.
I don't see any changes, did you forget to push?
cilium/CFP-23380-ordered-policy.md
Outdated
|
|
||
| The key observation is this: **we cannot insert a lower-priority, higher-precedence entry in to the MapState**. Since MapState precedence mirrors datapath precedence, determining this is relatively straightforward: to insert a given key, we scan all ancestor keys, checking if the same identity is selected. If a lower-precedence, higher-priority entry is found, we cannot insert the higher-precedence, lower-priority entry. Conversely, if any descendants of the newly proposed entry are higher-precedence and lower-priority, they must be deleted. | ||
|
|
||
| Note: higher numerical priority overrides lower numerical priority. |
There was a problem hiding this comment.
ref above. In my PoC implementation I have adopted the k8c CNP view on the API side, where lower priority values override higher ones, but on the datapath I have expanded the notion that higher numerical values override lower numerical values from ProxyPortPriority when generalizing it to a single Precedence field. The reason for the latter is the clarity in code and comments that comes from the ability to have "higher precedence" mean "numerically greater" and using the greater than operator (>) for comparisons in straightforward manner.
There was a problem hiding this comment.
I added a distinction between user API ordering and datapath priority, and made it clear that an ordering API is not part of this CFP.
cilium/CFP-23380-ordered-policy.md
Outdated
|
|
||
| Incremental updates are aggregated by selectors and then inserted in to the MapState. The MapState is responsible for deconflicting overlapping selectors. This remains the same in the ordered version. | ||
|
|
||
| It is safe to insert incremental updates in this manner because identities and selectors are immutable. As a direct consequence, two selectors S1 and S2 might overlap, a new identity I1 can never change state from non-overlapping to overlapping. |
There was a problem hiding this comment.
S1 and S2 might be non-overlapping to begin with, e.g., by selecting nothing. Then I1 appears and is selected by both S1 and S2, making them overlapping. The key for correct incremental updates is that, for the purposes of mapstate updates, S1 and S2 both select any overlapping identities (e.g, I1). This would naturally be the case if we evaluated the selections from ground up each time, but with our use of selector cache we must make sure that both cached selectors are updated at the same transaction, so that the readers (the mapstate update procedure) observe the new, overlapping identity in both S1 and S2. Historically this was not the case which made the mapstate update much more complicated than it is now.
cilium/CFP-23380-ordered-policy.md
Outdated
|
|
||
| ### MapState conflict resolution | ||
|
|
||
| The key observation is this: **we cannot insert a lower-priority, higher-precedence entry in to the MapState**. Since MapState precedence mirrors datapath precedence, determining this is relatively straightforward: to insert a given key, we scan all ancestor keys, checking if the same identity is selected. If a lower-precedence, higher-priority entry is found, we cannot insert the higher-precedence, lower-priority entry. Conversely, if any descendants of the newly proposed entry are higher-precedence and lower-priority, they must be deleted. |
There was a problem hiding this comment.
+1
There were many discussions in the network-policy-sig about this and in the end it was decided that going with "priority" with the associated semantics (lower numeric value is higher priority) is the most intuitive approach (despite the inversion of numeric value and the real meaning of "priority").
It is of course possible to invert the k8s Cluster Network Policy priorities when translating into an internal representation, but it seems like keeping Cilium internals in sync with what the public APIs do is a net-win.
Explicitly defining what "precedence" and "priority" mean in this context would help the readability of the document (and code). Also, if the choice will be for priority to be "numerically higher means higher priority", then I would recommend against using the name "priority", as this will conflict with Cluster Network Policy semantics.
cilium/CFP-23380-ordered-policy.md
Outdated
|
|
||
| If a lower-priority non-wildcard PerSelectorPolicy would override an existing wildcard, higher-priority, lower-precedence selector, adding it is blocked. | ||
|
|
||
| If a lower-priority wildcard PerSelectorPolicy would override an existing non-wildcard, higher-priority, lower-precedence selector, then it must be inserted, but this will have to be separately resolved via the datapath. Specifically, the datapath must look up both wildcarded and non-wildcarded entries, and select the one with the highest priority, using precedence length as a tiebreaker. |
There was a problem hiding this comment.
This might be my lack of understanding of the Cilium datapath, but is this currently handled or is this CFP proposing to implement it?
Also, how about non-wildcard overlaps, for example with CIDR rules? Those are already handled by LPF matching in the dataplane? For example, low priority deny for 10.0.0.0/8 and high priority allow for 10.10.10.0/24?
There was a problem hiding this comment.
Cilium datapath does this currently for deny and proxy port precedence, the proposal is to expand this to a more generic notion of "precedence".
Cilium policy engine operates on "security identities", not on CIDRs directly. Each unique CIDR (or a group of them via CIliumCIDRGroup) will get mapped to a unique security identity, in the datapath before policy enforcement. In your example 10.0.0.0/8 would have a different security identity than 10.10.10.0/24, and if a packet would be covered by both of them, the ipcache will assign the security identity of 10.10.10.0/24 to the packet for policy enforcement.
On the policy computation side, the selector for 10.0.0.0/8 will select both of these security identities since it covers also the 10.10.10.0/24 CIDR, so the low priority deny entry is inserted for both of these identities. For the identity of 10.10.10.0/24 that low priority deny entry would be overridden by the high priority allow entry, so the bpf datapath policy map will only have the high priority allow entry for the 10.10.10.0/24 CIDR identity.
This also expands to port (range) policies due to the LPF policy map.
cilium/CFP-23380-ordered-policy.md
Outdated
| 1. `(priority 2, id 5, tcp, port *)` | ||
| 2. `(priority 1, id * tcp, port 80)` | ||
|
|
||
| Rule 2 has a longer match. For ID 5, rule 2 overrides rule 1, which is an error. However, rule 2's wildcard selector selects more identities, so it cannot be elided. Rather, at policy resolution time, the numeric priorities will be compared. |
There was a problem hiding this comment.
Why is this an error? Again, might be my lack of understanding of the current semantics. Seems like something a user might configure. Also, this says "at policy resolution time, the numeric priorities will be compared". Is this "policy resolution time" the control plane or dataplane?
There was a problem hiding this comment.
"policy resolution time" here is at the dataplane (cilium bpf datapath). Maybe it would be clearer to write "policy enforcement time" instead? "policy resolution" could be understood to mean "policy map computation time".
I think the "is an error" here should be maybe written as "would be an error if the numeric identities were not compared at the policy enforcement time" etc.
This demonstrates a possible implementation of ordered policy. Signed-off-by: Casey Callendrello <cdc@isovalent.com>
00aab61 to
e68322e
Compare
| title: "Example 1d: valid" | ||
| --- | ||
| graph TD | ||
| rule1[prio 1, id *, tcp, 0-1023] |
There was a problem hiding this comment.
I think the priorities are swapped here, rule1 should be prio2 and rule2 should be prio1, no?
|
Was this synced to the latest design? Are you still looking for feedback? |
|
Moved to draft due to lack of activity. |
This demonstrates a possible implementation of ordered policy.