PackageRevision controller architecture docs (draft) #971
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # Development/preview environment overrides | ||
| # Enables draft content in deploy previews and branch deploys | ||
| buildDrafts = true |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| # Production environment overrides | ||
| # Drafts are excluded by default (Hugo's default behavior) | ||
| # This file exists so the Netlify build command doesn't warn about a missing config merge |
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,75 @@ | ||||||
| --- | ||||||
| title: "PackageRevision Controller" | ||||||
| type: docs | ||||||
| weight: 2 | ||||||
| draft: true | ||||||
| description: | | ||||||
| Kubernetes controller for package revision lifecycle management (v1alpha2). | ||||||
| --- | ||||||
|
|
||||||
| ## Overview | ||||||
|
|
||||||
| The PackageRevision Controller (PR controller) manages the full lifecycle of package revisions as native Kubernetes CRDs. In the v1alpha1 architecture, the Porch API Server and Engine handle all operations synchronously within the request path. The PR controller takes a different approach: it watches `PackageRevision` CRDs in etcd and reconciles their desired state against Git asynchronously, following standard Kubernetes controller patterns. | ||||||
|
Contributor
Suggested change
Contributor
should it be PR controller or PR Controller?
||||||
|
|
||||||
| This means users interact with package revisions the same way they interact with any other Kubernetes resource: create a CRD with the desired state, and the controller makes it so. | ||||||
|
|
||||||
| ## How It Works | ||||||
|
|
||||||
| ``` | ||||||
| ┌─────────────────────┐ ┌──────────────────────────────┐ ┌─────────────────┐ | ||||||
| │ PackageRevision CRD │ │ PR Controller │ │ Shared Cache │ | ||||||
| │ (etcd) │────>│ │────>│ (from Repo Ctr) │ | ||||||
| │ │ │ • Source execution │ │ │ | ||||||
| │ • spec.source │ │ • Render pipeline │ │ • Git read/write│ | ||||||
| │ • spec.lifecycle │ │ • Lifecycle transitions │ │ • Draft mgmt │ | ||||||
| │ • annotations │ │ • Status updates (SSA) │ │ • Content cache │ | ||||||
| └─────────────────────┘ └──────────────────────────────┘ └─────────────────┘ | ||||||
| │ | ||||||
| ▼ | ||||||
| ┌──────────────────────┐ | ||||||
| │ Function Runner │ | ||||||
| │ (gRPC) │ | ||||||
| └──────────────────────┘ | ||||||
| ``` | ||||||
|
|
||||||
| The controller does not manage repository connections or synchronization. That responsibility stays with the Repository Controller, which populates the shared cache. The PR controller reads from and writes to that cache; it never opens a Git connection directly. | ||||||
|
|
||||||
| ## Reconciliation Pipeline | ||||||
|
|
||||||
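| Each reconcile executes three phases in sequence. Asynchronous here means that reconciliation runs outside the API request path; within a single reconcile, the phases still run one after another. If any phase produces an error or requires a requeue, subsequent phases are skipped. | ||||||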
|
Contributor
likely my misunderstanding, but above we discuss how this new controller is asynchronous, then down here we mention doing 3 phases in sequence
||||||
|
|
||||||
| **Source execution** handles one-time package creation. When a user creates a PackageRevision with `spec.source` set (init, clone, copy, or upgrade), the controller executes that source operation to produce the initial package content in Git. Once `status.creationSource` is populated, this phase becomes a no-op on future reconciles. | ||||||
|
Contributor
is it not "in Git" only if it's not a draft? init by default on DB cache only does work in the cache, no?
Contributor
/ this shared cache
||||||
|
|
||||||
| **Rendering** runs the KRM function pipeline defined in the package's Kptfile. Two events trigger rendering: a content push via the PackageRevisionResources (PRR) handler (signalled by the `porch.kpt.dev/render-request` annotation), or the completion of source execution. The controller reads resources from the cache, invokes kpt render through the function runner, and writes the results back. | ||||||
|
Contributor
Suggested change
|
||||||
|
|
||||||
| **Lifecycle transition** compares the desired lifecycle in `spec.lifecycle` with the actual lifecycle in Git. If they differ, the controller transitions the package in Git. On publish, it assigns a revision number and updates the `latest-revision` label across all revisions of the same package. | ||||||
|
|
||||||
| ## Relationship to Other Components | ||||||
|
|
||||||
| The PR controller sits alongside the Repository Controller in the controllers deployment. It depends on the shared cache that the Repository Controller creates and populates. This dependency is enforced at startup by initializing the repo reconciler first and injecting its cache into the PR reconciler. | ||||||
|
|
||||||
| The Porch API Server and Engine continue to serve `PackageRevisionResources` for content access. When a user pushes content through PRR, the API Server writes to Git via the Engine and then patches the render-request annotation on the PackageRevision CRD. This annotation change triggers the PR controller to pick up the new content and render it. | ||||||
|
Contributor
again, has this changed? api server writes to git through the engine if it's not draft by default, right? or has this changed?
Contributor
Suggested change
|
||||||
|
|
||||||
| PackageVariant and PackageVariantSet controllers create PackageRevision CRDs as part of their automation. The PR controller reconciles these like any other PackageRevision, regardless of who created the CRD. | ||||||
|
|
||||||
| ## Enabling the Controller | ||||||
|
|
||||||
| The PR controller is enabled via the `--reconcilers` flag on the controllers deployment: | ||||||
|
|
||||||
| ``` | ||||||
| --reconcilers=packagerevisions | ||||||
| ``` | ||||||
|
|
||||||
| It requires the Repository Controller to be running (for the shared cache), the `PackageRevision` CRD to be installed, and the `FUNCTION_RUNNER_ADDRESS` environment variable to be set if external function evaluation is needed. | ||||||
|
Contributor
Suggested change
|
||||||
|
|
||||||
| ## Configuration | ||||||
|
|
||||||
| The controller exposes flags for tuning concurrency and retry behavior: | ||||||
|
|
||||||
| | Flag | Default | Description | | ||||||
| |------|---------|-------------| | ||||||
| | `packagerevisions.max-concurrent-reconciles` | 50 | Maximum parallel reconciles | | ||||||
| | `packagerevisions.max-concurrent-renders` | 20 | Maximum parallel render operations | | ||||||
| | `packagerevisions.render-requeue-delay` | 2s | Delay before requeue when render limit reached | | ||||||
| | `packagerevisions.repo-operation-retry-attempts` | 3 | Retry count for git operations | | ||||||
| | `packagerevisions.max-grpc-message-size` | 6MB | Max gRPC message size for fn-runner | | ||||||
|
Comment on lines +55 to +75
Contributor
this is "Configuration and Deployment" information and should be in its relevant section 6, no? can have a link to it from here though
||||||
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,101 @@ | ||||||
| --- | ||||||
| title: "Design" | ||||||
| type: docs | ||||||
| weight: 2 | ||||||
| draft: true | ||||||
| description: | | ||||||
| Internal design and architecture of the PackageRevision Controller. | ||||||
| --- | ||||||
|
|
||||||
| ## Controller Structure | ||||||
|
|
||||||
| The PR controller is a standard controller-runtime reconciler. Its internal structure mirrors the reconciliation pipeline: each concern is handled by a dedicated sub-reconciler that returns early if its work is not needed. | ||||||
|
Contributor
a link here to what a "standard controller-runtime reconciler" is would be good. likely k8s docs has it
||||||
|
|
||||||
| ``` | ||||||
| PackageRevisionReconciler | ||||||
| ├── reconcileFinalizer() — Finalizer + ownerReference management, deletion gating | ||||||
| ├── reconcileSource() — One-time package creation (init/clone/copy/upgrade) | ||||||
| ├── reconcileRender() — KRM function pipeline execution | ||||||
| └── reconcileLifecycle() — Git lifecycle transitions, revision numbering | ||||||
| ``` | ||||||
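The early-return chaining described above can be sketched as follows. This is a minimal illustration using only the standard library, not the controller's actual code: the `result`, `phase`, `runPhases`, and `noop` names are hypothetical, and `result` is a simplified stand-in for controller-runtime's reconcile result.

```go
package main

import (
	"fmt"
	"time"
)

// result is a simplified stand-in for controller-runtime's reconcile result.
// A non-zero requeueAfter asks for another reconcile after that delay.
type result struct {
	requeueAfter time.Duration
}

// phase pairs a sub-reconciler with a name used in error messages.
type phase struct {
	name string
	run  func() (result, error)
}

// runPhases calls each phase in order and stops at the first one that
// returns an error or requests a requeue. It also returns the names of
// the phases that actually ran.
func runPhases(phases []phase) ([]string, result, error) {
	var ran []string
	for _, p := range phases {
		res, err := p.run()
		ran = append(ran, p.name)
		if err != nil {
			return ran, result{}, fmt.Errorf("%s: %w", p.name, err)
		}
		if res.requeueAfter > 0 {
			return ran, res, nil
		}
	}
	return ran, result{}, nil
}

// noop models a sub-reconciler whose work is not needed.
func noop() (result, error) { return result{}, nil }

func main() {
	ran, res, err := runPhases([]phase{
		{"finalizer", noop},
		{"source", noop},
		{"render", func() (result, error) { return result{requeueAfter: 2 * time.Second}, nil }},
		{"lifecycle", noop},
	})
	// The lifecycle phase is skipped because render asked for a requeue.
	fmt.Println(ran, res.requeueAfter, err)
}
```

Because a requeue from one phase ends the current pass, later phases only ever see state that earlier phases have finished with.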
|
|
||||||
| ## CRD as Intent, Git as Content | ||||||
|
|
||||||
| The fundamental design decision is the separation of intent from content. The `PackageRevision` CRD in etcd is the source of truth for **what the user wants**: which lifecycle state the package should be in, how it was created, and whether rendering is requested. Git is the source of truth for **what the package contains**: the actual KRM resource files. | ||||||
|
|
||||||
| The controller bridges these two stores. A user sets `spec.lifecycle: Published` on the CRD; the controller transitions the package in Git to published state and updates `status` to reflect the result. This is standard Kubernetes controller semantics: spec is desired state, status is observed state. | ||||||
|
Contributor
Suggested change
|
||||||
|
|
||||||
| ## Shared Cache | ||||||
|
|
||||||
| The controller does not open Git repositories directly. All Git interaction goes through the `ContentCache` interface, which is backed by the Repository Controller's shared cache. This design centralizes repository connection management, credential handling, and cache invalidation in a single component. | ||||||
|
|
||||||
| The cache provides six operations that cover the controller's needs: | ||||||
|
|
||||||
| - **GetPackageContent**: read package state and files from the cache | ||||||
| - **CreateNewDraft**: open a new draft for writing initial content | ||||||
| - **CreateDraftFromExisting**: open an existing package for modification (used by render) | ||||||
| - **CloseDraft**: commit a draft to Git | ||||||
| - **UpdateLifecycle**: transition a package's lifecycle state in Git | ||||||
| - **DeletePackage**: remove Git refs (branches/tags) for a package | ||||||
|
|
||||||
| The controller never needs to know whether the underlying cache is CR-based or DB-based. It works identically with either implementation. | ||||||
|
|
||||||
| ## Server-Side Apply for Status | ||||||
|
|
||||||
| All status updates use Server-Side Apply with distinct field managers to avoid ownership conflicts. This is important because multiple actors write to the same PackageRevision: the Repository Controller sets initial values during discovery, and the PR controller takes over during reconciliation. | ||||||
|
|
||||||
| Three field managers partition the status fields: | ||||||
|
|
||||||
| **packagerev-controller** owns the core status: Ready condition, observedGeneration, revision number, publishedBy/At timestamps, upstream and self locks, and creationSource. | ||||||
|
Contributor
will likely need scrapping of all docs usage of packagerev CRD with this naming choice, otherwise we're gonna have confusion. (i know it's on the way out with this, but just a note)
||||||
|
|
||||||
| **packagerev-controller-render** owns the render tracking fields: Rendered condition, renderingPrrResourceVersion, and observedPrrResourceVersion. Separating these prevents a lifecycle status update from accidentally clearing render state. | ||||||
|
|
||||||
| **packagerev-controller-kptfile** owns fields synced from the Kptfile after rendering: readinessGates, packageMetadata, and packageConditions. These are written to the CRD spec and status so that external controllers can read Kptfile-derived data without parsing package content. | ||||||
|
|
||||||
| ## Concurrency-Limited Rendering | ||||||
|
|
||||||
| Rendering calls the function runner via gRPC, which is resource-intensive. Rather than allowing every concurrent reconcile (50 by default) to render simultaneously, the controller uses a channel-based semaphore to bound concurrent renders to a configurable limit (default 20). | ||||||
|
|
||||||
| When the semaphore is full, the reconcile doesn't block. Instead, it returns a `RequeueAfter` result and tries again after a short delay (`render-requeue-delay`, 2s by default). This keeps the controller responsive and prevents it from overwhelming the function runner or exhausting gRPC connections. | ||||||
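A channel-based semaphore with a non-blocking acquire can be sketched as below. This is an illustrative sketch, not the controller's code: `renderLimiter`, `tryAcquire`, and `reconcileRender` are hypothetical names, and the returned duration stands in for a `RequeueAfter` result.

```go
package main

import (
	"fmt"
	"time"
)

// renderLimiter bounds concurrent renders with a buffered channel:
// each in-flight render holds one slot.
type renderLimiter struct {
	slots chan struct{}
}

func newRenderLimiter(max int) *renderLimiter {
	return &renderLimiter{slots: make(chan struct{}, max)}
}

// tryAcquire takes a slot without blocking. It returns false when the
// limit is already reached.
func (l *renderLimiter) tryAcquire() bool {
	select {
	case l.slots <- struct{}{}:
		return true
	default:
		return false
	}
}

// release frees a slot previously taken by tryAcquire.
func (l *renderLimiter) release() { <-l.slots }

// reconcileRender returns requeueDelay without rendering when no slot is
// free. Otherwise it renders while holding a slot and returns a zero delay.
func reconcileRender(l *renderLimiter, requeueDelay time.Duration, render func() error) (time.Duration, error) {
	if !l.tryAcquire() {
		return requeueDelay, nil
	}
	defer l.release()
	return 0, render()
}

func main() {
	l := newRenderLimiter(1)
	l.tryAcquire() // simulate a render that is already in flight
	delay, err := reconcileRender(l, 2*time.Second, func() error { return nil })
	fmt.Println(delay, err)
}
```

The `select` with a `default` case is what makes the acquire non-blocking, so a worker that cannot render is returned to the queue instead of waiting on the channel.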
|
Comment on lines +55 to +59
Contributor
very nice and clear
||||||
|
|
||||||
| ## Stale Render Detection | ||||||
|
|
||||||
| A race exists between rendering and content pushes. While the controller is rendering (which may take seconds), the user might push new content through PRR, changing the render-request annotation. If the controller wrote back the now-stale render results, the user's latest content would be overwritten. | ||||||
|
|
||||||
| To handle this, after rendering completes the controller re-reads the PackageRevision directly from etcd (bypassing the informer cache) and compares the current annotation value with the one that triggered the render. If they differ, the render results are discarded and the reconcile requeues to pick up the newer content. | ||||||
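The check described above amounts to comparing the annotation value captured before rendering with a fresh read taken afterwards. A minimal sketch, assuming hypothetical callbacks for rendering, the uncached read, and the write-back (only the annotation key comes from the text):

```go
package main

import "fmt"

// renderRequestAnnotation is the annotation key named in the text above.
const renderRequestAnnotation = "porch.kpt.dev/render-request"

// renderAndCommit renders, then re-reads the live annotations and writes the
// output only if the render-request value still matches the one that
// triggered the render. It returns true when the result was discarded and
// the reconcile should requeue.
func renderAndCommit(
	trigger string,
	render func() string,
	readLive func() map[string]string,
	write func(string),
) bool {
	out := render()
	if readLive()[renderRequestAnnotation] != trigger {
		return true // stale: newer content arrived while rendering
	}
	write(out)
	return false
}

func main() {
	live := map[string]string{renderRequestAnnotation: "req-1"}
	written := ""
	requeue := renderAndCommit(
		"req-1",
		func() string {
			// Simulate a content push landing in the middle of the render.
			live[renderRequestAnnotation] = "req-2"
			return "rendered from req-1 content"
		},
		func() map[string]string { return live },
		func(s string) { written = s },
	)
	fmt.Println(requeue, written == "")
}
```

In the sketch, `readLive` plays the role of the direct read that bypasses the informer cache; reading from a possibly lagging cache at that point would defeat the comparison.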
|
|
||||||
| ## Deletion Gating | ||||||
|
|
||||||
| Published packages cannot be deleted directly. This is a safety mechanism, because deleting a published package from Git is destructive and irreversible. The controller enforces this through a finalizer. | ||||||
|
|
||||||
| When a user deletes the CRD, Kubernetes sets `deletionTimestamp` but the finalizer prevents actual removal. The controller checks the package's lifecycle: | ||||||
|
|
||||||
| - If the package is Published and its owner Repository still exists, the controller does nothing. The object stays in Terminating state until the user first transitions it to DeletionProposed. | ||||||
| - If the package is DeletionProposed (or any non-Published state), the controller cleans up Git refs and removes the finalizer, allowing Kubernetes to complete the deletion. | ||||||
| - If the owner Repository has been deleted (a Kubernetes garbage-collection cascade), the controller allows deletion regardless of lifecycle, since there is no point protecting packages whose repository is gone. | ||||||
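The three cases above reduce to a small decision function. The sketch below is illustrative only: the `lifecycle` type, its constants, and `canRemoveFinalizer` are hypothetical names, and the `Draft` value is included as an assumed example of a non-Published state.

```go
package main

import "fmt"

// lifecycle is a hypothetical type for the package lifecycle state.
type lifecycle string

const (
	draft            lifecycle = "Draft"
	published        lifecycle = "Published"
	deletionProposed lifecycle = "DeletionProposed"
)

// canRemoveFinalizer reports whether the controller may clean up and remove
// the finalizer for a PackageRevision that has a deletionTimestamp.
func canRemoveFinalizer(lc lifecycle, repositoryExists bool) bool {
	if !repositoryExists {
		// Owner Repository is gone: allow deletion regardless of lifecycle.
		return true
	}
	// Published packages stay in Terminating until moved to DeletionProposed.
	return lc != published
}

func main() {
	fmt.Println(canRemoveFinalizer(published, true))
	fmt.Println(canRemoveFinalizer(deletionProposed, true))
	fmt.Println(canRemoveFinalizer(draft, true))
	fmt.Println(canRemoveFinalizer(published, false))
}
```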
|
|
||||||
| ## OwnerReference to Repository | ||||||
|
|
||||||
| Each PackageRevision gets an ownerReference pointing to its Repository CRD. This serves two purposes: it enables Kubernetes garbage collection (GC), where deleting a Repository cascades to all its packages, and it allows the controller to detect a GC cascade during deletion gating. | ||||||
|
Contributor
silly question, but what is GC (or GC cascading)? i know you mention it partially in the point above, but could there be a bit of info or a link?
||||||
|
|
||||||
| The ownerReference is set on first reconcile if not already present, in the same patch that adds the finalizer. | ||||||
|
|
||||||
| ## Source Execution | ||||||
|
|
||||||
| Source execution runs at most once per PackageRevision, so repeated reconciles are safe. The guard is `status.creationSource`: if it is already set, the source phase is skipped entirely. | ||||||
|
|
||||||
| **Init** creates a brand new package by generating a Kptfile with the specified metadata (name, description, keywords). No external dependencies. | ||||||
|
|
||||||
| **Clone** copies content from an upstream package. Two modes are supported: cloning from a registered PackageRevision (by name reference) or from a raw Git URL. In both cases, the Kptfile's upstream and upstreamLock fields are set to track the source. | ||||||
|
|
||||||
| **Copy** creates a new revision from an existing published revision of the same package in the same repository. This is the mechanism for "edit an existing package": copy the latest published revision into a new draft workspace. | ||||||
|
|
||||||
| **Upgrade** performs a 3-way merge between the old upstream, new upstream, and current local package. It supports multiple merge strategies (resource-merge, fast-forward, force-delete-replace, copy-merge). After merging, the Kptfile upstream/upstreamLock are updated to point at the new upstream. | ||||||
|
|
||||||
| After any source execution, the controller creates a draft in the cache, writes the resources, closes the draft (committing to Git), and requeues to trigger rendering. | ||||||
|
|
||||||
| ## Latest-Revision Labels | ||||||
|
|
||||||
| The controller maintains a `porch.kpt.dev/latest-revision` label on all PackageRevisions. The published revision with the highest revision number gets `"true"`; all others get `"false"`. This label enables efficient queries like "give me the latest published version of package X" without listing and sorting all revisions. | ||||||
|
|
||||||
| Labels are updated on two events: when a package is published (the new revision becomes latest), and when a published package is deleted (the previous revision becomes latest again). | ||||||
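The labelling rule can be sketched as a pure function over the revisions of one package. This is a minimal sketch under assumed types: the `revision` struct and `latestRevisionLabels` are hypothetical, and only the `"true"`/`"false"` values and the "highest published revision number wins" rule come from the text.

```go
package main

import "fmt"

// revision is a hypothetical, trimmed-down view of one PackageRevision.
type revision struct {
	name      string
	number    int // revision number, assigned on publish
	published bool
}

// latestRevisionLabels returns the latest-revision label value for each
// revision of a single package: "true" for the published revision with the
// highest number, "false" for everything else.
func latestRevisionLabels(revs []revision) map[string]string {
	labels := make(map[string]string, len(revs))
	latest := -1
	for i, r := range revs {
		labels[r.name] = "false"
		if r.published && (latest < 0 || r.number > revs[latest].number) {
			latest = i
		}
	}
	if latest >= 0 {
		labels[revs[latest].name] = "true"
	}
	return labels
}

func main() {
	revs := []revision{
		{name: "pkg-v1", number: 1, published: true},
		{name: "pkg-v2", number: 2, published: true},
		{name: "pkg-draft", published: false},
	}
	fmt.Println(latestRevisionLabels(revs))
	// Deleting the newest published revision makes the previous one latest again.
	fmt.Println(latestRevisionLabels(revs[:1]))
}
```

Recomputing the labels from the full set of revisions handles both events named above, publish and delete, with the same code path.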
i think we should try to separate ourselves from the infamous AI em dash where we can. i know it's valid grammar, but in practice no one uses it often apart from AI. it gives the impression that it wasn't run by people at the end
just maybe ask it to reword those sentences without usage of that notorious character
it's one thing when it's just once or twice, but it's all over the section