WorkflowAccessPolicy#101
(this proposal has not been one-shot). Signed-off-by: joshvanl <me@joshvanl.dev>
Pull request overview
Adds a new design proposal for a standalone Dapr resource/CRD, WorkflowAccessPolicy, intended to provide ingress access control for starting workflows/activities (including upcoming cross-app invocation) based on caller app ID derived from mTLS/SPIFFE.
Changes:
- Introduces the WorkflowAccessPolicy CRD concept, schema examples, and Go-type sketches.
- Describes runtime enforcement integration at the Durable Task API service / daprinternal CallActor path.
- Defines intended matching semantics (glob patterns, "most specific match wins"), default-deny behavior, and lifecycle/acceptance criteria.
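The matching semantics summarized above (glob patterns, "most specific match wins", default deny) could be sketched roughly as follows. This is an illustration only, not the proposal's implementation: the `Rule` type, the `Decide` function, and the specificity ordering (count of literal, non-wildcard characters) are all assumptions made for the example, and globbing uses Go's standard `path.Match`.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Rule is a hypothetical, simplified rule: a glob over workflow names plus an action.
type Rule struct {
	Pattern string // e.g. "ProcessOrder", "Cancel*", "*"
	Action  string // "allow" or "deny"
}

// specificity is an assumed ordering: more literal characters means more specific.
func specificity(pattern string) int {
	return len(pattern) - strings.Count(pattern, "*") - strings.Count(pattern, "?")
}

// Decide returns the action of the most specific matching rule,
// or defaultAction when no rule matches.
func Decide(rules []Rule, name, defaultAction string) string {
	best, bestScore := defaultAction, -1
	for _, r := range rules {
		ok, err := path.Match(r.Pattern, name)
		if err != nil || !ok {
			continue // malformed or non-matching patterns are skipped in this sketch
		}
		if s := specificity(r.Pattern); s > bestScore {
			best, bestScore = r.Action, s
		}
	}
	return best
}

func main() {
	rules := []Rule{
		{Pattern: "*", Action: "deny"},
		{Pattern: "Cancel*", Action: "allow"},
	}
	fmt.Println(Decide(rules, "CancelOrder", "deny"))  // "Cancel*" is more specific than "*"
	fmt.Println(Decide(rules, "ProcessOrder", "deny")) // only "*" matches
	fmt.Println(Decide(nil, "ProcessOrder", "deny"))   // no rules: default action applies
}
```

How ties between equally specific patterns should be broken is left open here; the proposal text would need to define that.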
Signed-off-by: joshvanl <me@joshvanl.dev>
    # Default action when no rule matches. Defaults to "deny" if omitted.
    defaultAction: deny

    # Ingress rules defining which callers can start which workflows/activities.
I understand this initial model only covers start operations, and that future work will extend it to other operations, but there needs to be some consideration of how those other operations will be added to this CRD without creating confusion.
As it currently stands the start operation is implicit, and I think it should be made explicit in the CRD.
I've been very vocal about the need for Workflow ACLs, so I'm very happy to see this proposal. I generally support it, but I'm concerned that the CRD approach has a low ergonomic ceiling and will become unwieldy at scale, when ACLs span many apps, workflows, activities and operations. Having to intersect and aggregate many separate CRs to get a complete ACL picture is not a task for a human brain, so I can see this being somewhat frustrating. Personally, I think the alternative solution offered in 3. Annotation-based or in-code policies would be more ergonomic, and I reject the following criticism:
The developer persona architects the workflows and their interoperation with Activities, not Platform Operators. Workflow definitions and Activity definitions are not infrastructure IMO. I do agree with the criticism that complex rules may be difficult to express in annotations, but I think in-code policies can be very expressive here, particularly when taking a Builder / Fluent approach in code. This would also give the benefit that policies could be unit tested, which is increasingly important, particularly in agentic coding. I do accept that in-code policies would put more pressure on SDK contributors and maintainers, who are already time-challenged.
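To make the Builder / Fluent idea concrete, here is a minimal sketch of what an in-code policy might look like and how it could be unit tested. Everything in it is hypothetical: `Policy`, `PolicyBuilder`, `Workflow`, `AllowCallers`, and `CanStart` are names invented for this example and are not part of any existing Dapr SDK. Matching is by exact name only to keep the sketch short.

```go
package main

import "fmt"

// Policy is a hypothetical in-code policy: workflow name -> set of allowed caller app IDs.
type Policy struct {
	allowed map[string]map[string]bool
}

// CanStart reports whether caller may start workflow; anything not listed is denied.
func (p Policy) CanStart(caller, workflow string) bool {
	return p.allowed[workflow][caller]
}

// PolicyBuilder accumulates rules through chained calls.
type PolicyBuilder struct {
	policy  Policy
	current string // workflow currently being configured
}

func NewPolicy() *PolicyBuilder {
	return &PolicyBuilder{policy: Policy{allowed: map[string]map[string]bool{}}}
}

// Workflow selects the workflow that subsequent AllowCallers calls apply to.
func (b *PolicyBuilder) Workflow(name string) *PolicyBuilder {
	b.current = name
	if b.policy.allowed[name] == nil {
		b.policy.allowed[name] = map[string]bool{}
	}
	return b
}

// AllowCallers grants the listed app IDs permission to start the selected workflow.
func (b *PolicyBuilder) AllowCallers(appIDs ...string) *PolicyBuilder {
	if b.current == "" {
		return b // no workflow selected yet; ignored in this sketch
	}
	for _, id := range appIDs {
		b.policy.allowed[b.current][id] = true
	}
	return b
}

func (b *PolicyBuilder) Build() Policy { return b.policy }

func main() {
	p := NewPolicy().
		Workflow("ProcessOrder").AllowCallers("checkout-service", "inventory-service").
		Workflow("RefundOrder").AllowCallers("admin-service").
		Build()

	fmt.Println(p.CanStart("checkout-service", "ProcessOrder")) // true
	fmt.Println(p.CanStart("checkout-service", "RefundOrder"))  // false
}
```

Because the policy is an ordinary value, assertions like the two in `main` can live in a normal unit test next to the workflow code.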
Signed-off-by: joshvanl <me@joshvanl.dev>
Thanks @olitomlinson :) I've updated the proposal to add an operation field to each operation rule, which defaults to start if omitted. We can extend it to management operations like pause, resume, terminate, etc. later. I also added a note on how the intersection rules work. Re: CRD ergonomics vs in-code policies, I think you probably want both. Infrastructure people do want to manage RBAC across a namespace; see the MCP server proposal #100 as one example.
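As a rough illustration of the defaulting described in this comment, the sketch below treats an omitted operation as start. The type and field names (`OperationRule`, `Operation`, `EffectiveOperation`) are assumptions for the example and may not match the Go types in the proposal; the pause, resume, and terminate constants only stand in for the possible later extension mentioned above.

```go
package main

import "fmt"

// Operation names an action on a workflow or activity.
// Only "start" is covered initially; the others illustrate possible later additions.
type Operation string

const (
	OperationStart     Operation = "start"
	OperationPause     Operation = "pause"
	OperationResume    Operation = "resume"
	OperationTerminate Operation = "terminate"
)

// OperationRule is a hypothetical shape for one rule entry.
type OperationRule struct {
	Type      string    // "workflow" or "activity"
	Name      string    // exact name or glob
	Operation Operation // empty means "start"
	Action    string    // "allow" or "deny"
}

// EffectiveOperation applies the default: an omitted operation is treated as start.
func (r OperationRule) EffectiveOperation() Operation {
	if r.Operation == "" {
		return OperationStart
	}
	return r.Operation
}

func main() {
	implicit := OperationRule{Type: "workflow", Name: "ProcessOrder", Action: "allow"}
	explicit := OperationRule{Type: "workflow", Name: "ProcessOrder", Operation: OperationTerminate, Action: "deny"}

	fmt.Println(implicit.EffectiveOperation()) // start
	fmt.Println(explicit.EffectiveOperation()) // terminate
}
```

Keeping the default in one accessor means existing resources that omit the field keep their meaning if more operations are added later.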
@JoshVanL I think if protos could be added to this proposal so that SDK maintainers can deliver in-code policies, that would be a great outcome.
@olitomlinson regarding annotations vs CRDs: with CRDs we can set a namespace-wide policy and reuse policies across multiple applications, whereas with annotations we'd need to duplicate those policies in every annotation, making them harder to maintain.
Signed-off-by: joshvanl <me@joshvanl.dev>
cicoyle left a comment
Few things from me, I like the general idea, but think we could still tweak the design to offer a unified approach for access policy in Dapr. lmk what you think!
**Trade-offs:**
- Users must learn a new resource type.
- Policy configuration is spread across multiple resources rather than centralized.
Thinking out loud - if we go on to add an Actor Access Policy, would that also be its own resource? I almost feel like we should reuse Configuration and extend it with a per-API definition so it's extensible and we don't need one-off resources, making Access Policy a consistent experience for Dapr users across APIs.
I'm quite keen to keep this as a separate, focused resource to avoid repeating the mistake made with Configuration, which has become a monolith that covers completely different concerns. That makes it harder to maintain, test, and consume as a user. It gets odder still when you consider scopes etc.
    ```yaml
    apiVersion: dapr.io/v1alpha1
    kind: WorkflowAccessPolicy
Thinking more about my earlier comment about using the Configuration service invocation access policy - what if we met on a middle ground and had:
    apiVersion: dapr.io/v1alpha1
    kind: AccessPolicy
    ...
This way we could have a unified type that can be extended to all APIs without further bloating Configuration. I'm imagining it could look like the following:
    apiVersion: dapr.io/v1alpha1
    kind: AccessPolicy
    metadata:
      name: order-service-policy
      namespace: production
    scopes:
    - order-service
    spec:
      defaultAction: deny
      serviceInvocation:
        rules:
        - callers:
          - appID: checkout-service
            namespace: production
            trustDomain: cluster.local
          operations:
          - name: /orders
            httpVerb: [GET, POST]
            action: allow
          - name: /orders/*
            httpVerb: [GET]
            action: allow
      workflows:
        rules:
        - callers:
          - appID: checkout-service
          - appID: inventory-service
          operations:
          - type: workflow
            name: "ProcessOrder"
            action: allow
          - type: workflow
            name: "Cancel*"
            action: allow
          - type: activity
            name: "ChargePayment"
            action: allow
        - callers:
          - appID: admin-service
          operations:
          - type: workflow
            name: "*"
            action: allow
          - type: activity
            name: "*"
            action: allow
      actors:
        rules:
        - callers:
          - appID: checkout-service
          operations:
          - type: actor
            actorType: "OrderActor"
            method: "Process*"
            action: allow
        - callers:
          - appID: admin-service
          operations:
          - type: actor
            actorType: "*"
            method: "*"
            action: allow
Then we could slowly migrate service invocation over, eventually deprecate the existing Configuration-based access policy, and have a unified approach to access policy across Dapr APIs. This is also a nice way to answer how to do Workflows access policy vs Actors access policy and extend to more APIs in a unified manner :)
This example kinda shows the problem I want to avoid by not squashing everything into a single resource: an operator trying to understand "who can call my workflows" has to read this full resource to get to the relevant section.
Signed-off-by: joshvanl <me@joshvanl.dev>
|
qq, would it be desirable to use this access policy resource as another layer of authorization for https://github.com/dapr/proposals/blob/main/20260304-CR-mcpserver.md , and if so, would it make sense to overlay an abstraction on top of the WorkflowAccessPolicy resource to make authorization of MCP servers a first-class citizen?