Add standalone ResolverRuleAssociation CRD by Mais316 · Pull Request #65 · aws-controllers-k8s/route53resolver-controller

Mais316 · 2026-05-24T17:27:37Z

The existing inline ResolverRule.Spec.Associations field requires ownership of the ResolverRule CR to manage VPC associations. This does not work for shared rules (e.g. rules shared via AWS RAM or managed by a central platform team) where individual clusters only need to associate an existing rule with their VPC.

This adds a standalone ResolverRuleAssociation CRD that enables fleet-scale adoption: each cluster independently associates a shared rule with its own VPC without needing to manage the rule itself.

AWS API mapping:

Create: AssociateResolverRule
Read: GetResolverRuleAssociation
Delete: DisassociateResolverRule
Update: N/A (all fields are immutable)

Spec fields:

resolverRuleID (required, immutable, with ResolverRule reference support)
vpcID (required, immutable)
name (optional, immutable)

Status fields:

id (primary key, AWS-assigned)
status
statusMessage

Includes E2E tests for:

Basic create/delete lifecycle
Multiple VPCs associated with the same rule

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

The existing inline ResolverRule.Spec.Associations field requires ownership of the ResolverRule CR to manage VPC associations. This does not work for shared rules (e.g. rules shared via AWS RAM or managed by a central platform team) where individual clusters only need to associate an existing rule with their VPC. This adds a standalone ResolverRuleAssociation CRD that enables fleet-scale adoption: each cluster independently associates a shared rule with its own VPC without needing to manage the rule itself. AWS API mapping: - Create: AssociateResolverRule - Read: GetResolverRuleAssociation - Delete: DisassociateResolverRule - Update: N/A (all fields are immutable) Spec fields: - resolverRuleID (required, immutable, with ResolverRule reference support) - vpcID (required, immutable) - name (optional, immutable) Status fields: - id (primary key, AWS-assigned) - status - statusMessage Includes E2E tests for: - Basic create/delete lifecycle - Multiple VPCs associated with the same rule

ack-prow · 2026-05-24T17:27:48Z

Hi @Mais316. Thanks for your PR.

I'm waiting for a aws-controllers-k8s member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Mais316 · 2026-05-24T17:28:58Z

@michaelhtm appreciate your kind review when you have time please 🙏

Mais316 · 2026-05-26T10:50:20Z

@michaelhtm gentle reminder 🙏

Mais316 · 2026-05-28T10:22:20Z

@michaelhtm adding more details here as well as per slack

Thanks again :) for the pointer to ResolverRule. Let me explain our case so it's clearer why I think we still need a standalone association.

Our setup is hub-and-spoke. The resolver endpoints and rules live in the hub VPC (DMZ subnet, since that's the only place allowed to send DNS queries out). The rules are then shared to all spoke accounts via AWS RAM. So from a spoke's perspective the rule is already there, read-only. We can't create it locally because the endpoint has to sit in the DMZ.
What each spoke actually needs is just the association of its own VPC to that shared rule, nothing else.

The problem with the current ResolverRule CRD and its inline Associations array is that we have one build module that provisions VPC + EKS per spoke. Every spoke runs the same code against the same shared rule. If each cluster adopts the rule and writes into the Associations array, they end up fighting over it. Spoke A writes its VPC, spoke B's reconcile loop removes it and writes its own, and so on. Our e2e build can't converge because of this. On top of that the rule isn't really owned by the spoke, it's owned by the platform team in the hub, so spokes shouldn't be authoritative over its spec at all.

This is the same reason Terraform models it as a standalone resource (aws_route53_resolver_rule_association), separate from the rule itself. Each spoke manages only its own association, no shared mutable array. That worked for us before we moved to ACK and it's the shape this PR mirrors.

So what the PR adds is a separate ResolverRuleAssociation CRD where each spoke creates only its own association object, the ResolverRule CR doesn't need to exist in the spoke cluster, and fields are immutable to match the AWS API (AssociateResolverRule / DisassociateResolverRule, no update path).

One question back to you :) am I right that with the current CRD the only path for our case would be to re-adopt the same shared rule in every spoke and let them race on the Associations array? If there's a pattern I'm missing to handle cross-account shared-rule association with ResolverRule alone, I'd be glad to learn it. But from the API shape and our e2e results, a standalone association CRD looks like the natural fit, the same way the AWS API itself models it.
Happy to iterate on the design.

Mais316 · 2026-05-28T19:48:05Z

@michaelhtm

As per Trevor Feedback on slack

I've addressed this by adding an annotation:

metadata:
  annotations:
    route53resolver.services.k8s.aws/manage-associations: "false"

When set to "false" on a ResolverRule CR, the controller completely skips managing the spec.associations field — no reading, creating, updating, or deleting associations via that CR. This defers all association lifecycle to standalone ResolverRuleAssociation CRs.

Guarded paths:

sdkFind — skips ListResolverRuleAssociations (won't populate spec.associations)
sdkCreate — skips creating inline associations after rule creation
sdkDelete — skips disassociating VPCs before rule deletion (standalone CRs handle their own cleanup via DisassociateResolverRule)
customUpdate — skips syncAssociation on spec diff

When the annotation is absent or any value other than "false", the existing inline behavior is fully preserved (backward-compatible).

I also added an E2E test (test_manage_associations_annotation.py) that:

Creates a ResolverRule with manage-associations: "false"
Creates a standalone ResolverRuleAssociation targeting the same rule
Triggers a reconcile on the rule (via tag patch)
Verifies the standalone association is not removed

Mais316 · 2026-05-29T22:33:35Z

@michaelhtm annotation change has been reverted, kindly review when you have time 🙏

…Association

Mais316 · 2026-06-01T19:18:03Z

@michaelhtm please review when you have time :)

knottnt

@Mais316 Left a few comments. Just note, any changes to the generator.yaml should be applied to the generator.yaml file in the root of the repository.

- Change GetResolverRuleAssociation operation_type from Read to Read_One so controller uses Get API instead of List (fixes incorrect first-match) - Add VPC cross-resource reference on VPCId field (ec2 VPC) - Add synced.when condition for Status.Status == COMPLETE - Regenerated controller code

knottnt · 2026-06-01T21:27:11Z

/test all

… types The standalone ResolverRuleAssociation CRD introduced a naming conflict — ResolverRuleAssociation now refers to the full Kubernetes resource type. Update getAttachedVPC to use ResolverRuleAssociation_SDK, which is the inline struct used by ResolverRuleSpec.Associations.

Mais316 · 2026-06-01T22:30:30Z

/test all

ack-prow · 2026-06-01T22:30:47Z

@Mais316: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/test all

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

knottnt · 2026-06-01T22:32:09Z

/test all

Mais316 · 2026-06-01T23:08:52Z

/test all

Could u plz trigger new test, did dome changes

- Add sdk_create_post_set_output hook to requeue after 5s when association status is not COMPLETE, ensuring faster convergence instead of relying on the default 30s runtime requeue interval. - Add sdk_read_one_post_set_output hook for the same requeue behavior on subsequent reconcile loops. - Wire hooks in generator.yaml under ResolverRuleAssociation resource. - Add hooks.go with requeueWhileCreating variable to avoid time import in generated sdk.go. - Bump ASSOCIATION_TIMEOUT_SECONDS from 90s to 180s to give the controller adequate time with the requeue cadence. - Add AssociationTestVPC to bootstrap resources so test_multiple_vpcs_same_rule uses a pre-provisioned VPC instead of calling create_vpc at runtime (which hit VpcLimitExceeded in the shared CI account). - Remove unused boto3 import from test file.

knottnt · 2026-06-01T23:59:55Z

/test all

knottnt · 2026-06-02T00:10:07Z

@@ -0,0 +1,3 @@
+	if ko.Status.Status != nil && *ko.Status.Status != "COMPLETE" {


@Mais316 This hooks should be unnecessary when using synced:when. When using that config in generator.yaml code is generated to evaluate whether or not the resource is in a synced status and requeue if it is not. From the e2e test logs while the polling logic did time out we do see the test ResolverRuleAssociation reach a synced (COMPLETE) state in the controller logs.

ref = CustomResourceReference(group='route53resolver.services.k8s.aws', version='v1alpha1', plural='resolverruleassociations', name='rule-assoc-88rt14u19eq7eeexuls4k', namespace='default') timeout = 90 def wait_for_association_complete(ref, timeout=ASSOCIATION_TIMEOUT_SECONDS): """Poll the CR until status.status == COMPLETE or timeout.""" deadline = time.time() + timeout while time.time() < deadline: cr = k8s.get_resource(ref) if cr and cr.get('status', {}).get('status') == 'COMPLETE': return cr time.sleep(10) > pytest.fail(f"Association {ref.name} did not reach COMPLETE within {timeout}s") E Failed: Association rule-assoc-88rt14u19eq7eeexuls4k did not reach COMPLETE within 90s

2026-06-01T22:38:28.689Z DEBUG ackrt patched resource status {"kind": "ResolverRuleAssociation", "namespace": "default", "name": "rule-assoc-88rt14u19eq7eeexuls4k", "account": "", "role": "", "": "us-west-2", "is_adopted": false, "generation": 1, "json": "{\"metadata\":{\"resourceVersion\":\"991\"},\"status\":{\"conditions\":[{\"lastTransitionTime\":\"2026-06-01T22:38:28Z\",\"message\":\"Resource synced successfully\",\"reason\":\"\",\"status\":\"True\",\"type\":\"ACK.ResourceSynced\"},{\"lastTransitionTime\":\"2026-06-01T22:38:28Z\",\"message\":\"Resource synced successfully\",\"reason\":\"\",\"status\":\"True\",\"type\":\"Ready\"}],\"status\":\"COMPLETE\",\"statusMessage\":\"\"}}"}

@knottnt Removed unnecessary hooks synced:when already handles the requeue via IsSynced(). The e2e timeout was 90s in the previous run but the test code has 180s now, so it should pass on the next CI run 🤞

The sdk_create_post_set_output and sdk_read_one_post_set_output hooks manually requeue when Status != COMPLETE, but this is already handled by the synced:when configuration in generator.yaml which generates the IsSynced() method in the runtime. Addresses reviewer feedback in PR aws-controllers-k8s#65.

Mais316 · 2026-06-02T18:32:00Z

@knottnt please trigger another test, when you have time :)

michaelhtm · 2026-06-02T19:18:23Z

/test all

pytest-xdist runs test_create_delete and test_multiple_vpcs_same_rule concurrently. Both fixtures created a SYSTEM rule for the same hardcoded domain (test-assoc.example.com) and both tried to associate with ResolverEndpointVPC. AWS enforces one rule per domain per VPC, so the second association never transitions to COMPLETE. Fix: derive a unique domain from the random rule name so parallel tests never collide on the same domain+VPC pair.

Mais316 · 2026-06-02T20:21:05Z

@knottnt one passed the other one failed, pushed a fix, please trigger another test :)

[gw3]�[36m [ 80%] �[0m�[32mPASSED�[0m tests/test_resolver_rule_association.py::TestResolverRuleAssociation::test_create_delete 
[gw4]�[36m [100%] �[0m�[31mFAILED�[0m tests/test_resolver_rule_association.py::TestResolverRuleAssociation::test_multiple_vpcs_same_rule

michaelhtm · 2026-06-02T20:25:01Z

/test all

Mais316 · 2026-06-02T20:46:51Z

@michaelhtm all is passed now :)

michaelhtm · 2026-06-02T21:01:26Z

Thanks @Mais316
Can we add a documentation.yaml where we can put the Associations limitation?
It should have been introduced in the latest code-generator commit
aws-controllers-k8s/code-generator#710

michaelhtm · 2026-06-02T21:18:35Z

/label release/minor

…ation notes

Mais316 · 2026-06-02T21:37:15Z

Thanks @Mais316 Can we add a documentation.yaml where we can put the Associations limitation? It should have been introduced in the latest code-generator commit aws-controllers-k8s/code-generator#710

done :)

Mais316 · 2026-06-02T22:04:49Z

@michaelhtm please advise if it looks good.

michaelhtm

Nice!
/ok-to-test
/lgtm

michaelhtm · 2026-06-02T22:37:31Z

/test all

michaelhtm · 2026-06-02T22:41:41Z

/ok-to-test
/lgtm

ack-prow · 2026-06-02T22:41:47Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Mais316, michaelhtm

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [michaelhtm]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Releasing changes: * #65

ack-prow Bot requested review from gustavodiaz7722 and sapphirew May 24, 2026 17:27

ack-prow Bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 24, 2026

Mais316 force-pushed the main branch 2 times, most recently from f275965 to a5d56b9 Compare May 29, 2026 22:22

chore: fix code generator tag

9a65c3d

Mais316 added 3 commits May 30, 2026 13:36

e2e: improve resolver rule association test reliability

ad07a95

e2e: fix assert_synced bug, poll-based teardown, mark slow test

b9d788d

e2e: bump delete wait to 120s, mark multi-VPC as canary

9b2f2d1

michaelhtm reviewed May 30, 2026

View reviewed changes

Comment thread apis/v1alpha1/generator.yaml Outdated

Comment thread apis/v1alpha1/generator.yaml

Mais316 added 2 commits June 1, 2026 20:10

fix: comments

b427bb3

fix: use Read_One operation_type and add synced.when for ResolverRule…

232511d

…Association

knottnt reviewed Jun 1, 2026

View reviewed changes

Comment thread config/crd/bases/route53resolver.services.k8s.aws_resolverruleassociations.yaml

Comment thread pkg/resource/resolver_rule_association/sdk.go Outdated

michaelhtm reviewed Jun 1, 2026

View reviewed changes

Comment thread apis/v1alpha1/types.go

knottnt reviewed Jun 2, 2026

View reviewed changes

Mais316 added 2 commits June 2, 2026 12:53

chore: run code generator

2287f10

knottnt self-assigned this Jun 2, 2026

ack-prow Bot added the release/minor Indicates this PR should trigger a minor version release on merge. label Jun 2, 2026

docs: add documentation.yaml with ResolverRule and ResolverRuleAssoci…

31464c2

…ation notes

michaelhtm reviewed Jun 2, 2026

View reviewed changes

ack-prow Bot assigned michaelhtm Jun 2, 2026

ack-prow Bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. lgtm Indicates that a PR is ready to be merged. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jun 2, 2026

ack-prow Bot added the approved label Jun 2, 2026

ack-prow Bot merged commit 7f3dbec into aws-controllers-k8s:main Jun 2, 2026
9 checks passed

ack-bot mentioned this pull request Jun 2, 2026

Release artifacts for release v1.4.0 #66

Merged

ack-prow Bot pushed a commit that referenced this pull request Jun 3, 2026

Release artifacts for release v1.4.0 (#66)

71d624d

Releasing changes: * #65

michaelhtm mentioned this pull request Jun 4, 2026

feat: Add ResourceShareAccepter CRD for accepting RAM share invita… aws-controllers-k8s/ram-controller#43

Open

		@@ -0,0 +1,3 @@
		if ko.Status.Status != nil && *ko.Status.Status != "COMPLETE" {

Conversation

Mais316 commented May 24, 2026

Uh oh!

ack-prow Bot commented May 24, 2026

Uh oh!

Mais316 commented May 24, 2026

Uh oh!

Mais316 commented May 26, 2026

Uh oh!

Mais316 commented May 28, 2026

Uh oh!

Mais316 commented May 28, 2026

Uh oh!

Mais316 commented May 29, 2026

Uh oh!

Uh oh!

Uh oh!

Mais316 commented Jun 1, 2026

Uh oh!

knottnt left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

knottnt commented Jun 1, 2026

Uh oh!

Uh oh!

Mais316 commented Jun 1, 2026

Uh oh!

ack-prow Bot commented Jun 1, 2026

Uh oh!

knottnt commented Jun 1, 2026

Uh oh!

Mais316 commented Jun 1, 2026

Uh oh!

knottnt commented Jun 1, 2026

Uh oh!

knottnt Jun 2, 2026

Choose a reason for hiding this comment

Uh oh!

Mais316 Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Mais316 commented Jun 2, 2026

Uh oh!

michaelhtm commented Jun 2, 2026

Uh oh!

Mais316 commented Jun 2, 2026

Uh oh!

michaelhtm commented Jun 2, 2026

Uh oh!

Mais316 commented Jun 2, 2026

Uh oh!

michaelhtm commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

michaelhtm commented Jun 2, 2026

Uh oh!

Mais316 commented Jun 2, 2026

Uh oh!

Mais316 commented Jun 2, 2026

Uh oh!

michaelhtm left a comment

Choose a reason for hiding this comment

Uh oh!

michaelhtm commented Jun 2, 2026

Uh oh!

michaelhtm commented Jun 2, 2026

Uh oh!

ack-prow Bot commented Jun 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

Mais316 Jun 2, 2026 •

edited

Loading

michaelhtm commented Jun 2, 2026 •

edited

Loading