config: explicit routing.mode enum replaces implicit full_cdc derivation#38
Merged
Conversation
viaduck used to flip between two completely different pipeline shapes
based on len(key_columns) > 0. An empty/typo'd key_columns silently
switched a destination from MERGE-upsert to plain-append; an
accidentally non-empty list did the inverse. The hazard was the
implicitness, not the modes themselves.
Replace with a required routing.mode enum: full_cdc | append_only.
RoutingConfig.__post_init__ validates the matrix at config load:
full_cdc + non-empty key_columns → OK
full_cdc + empty key_columns → ConfigError (need join keys)
append_only + empty key_columns → OK
append_only + non-empty key_columns → ConfigError (apply path
doesn't use them)
unset / unknown / wrong type / null → ConfigError with operator-
actionable hint
The validator orders the type-mismatch check (catches YAML 1.1 bool
coercion `mode: yes` → Python True) ahead of the empty-string check
(catches `mode: ''` and `mode:` distinct from the type-error path)
ahead of the enum check. The loader coerces `mode:` (null) → "" so
the most common operator omission shape hits the "is required, no
default" branch rather than the "quote the value" hint.
DeliveryManager.__init__ takes mode as a keyword-only kwarg with its
own ConfigError guard (belt-and-suspenders for callers bypassing
config). Positional callers fail loudly at the call site instead of
running an unrelated string past the enum check.
main.py threads mode (not the derived full_cdc bool) through
_poll_cycle; the status payload reads cfg.routing.mode directly
instead of reconstructing it. This was the last implicit derivation
the original PR carried.
Also rips append_at_least_once and supporting machinery
-------------------------------------------------------
The append_at_least_once flag (PR #36) added a per-batch fast path
inside apply_full_cdc that called tbl.append() when the batch was
100% inserts. While shipping #36 we discovered posthog had always
been on the OTHER branch (`append_only`, where key_columns: [] in
charts/argocd/viaduck/values/common.yaml routed through delivery.py
straight to apply.append_only) — the fast path could never have
fired in production because apply_full_cdc was never called.
Removed:
- DestinationConfig.append_at_least_once field
- _is_pure_insert_batch helper + per-batch fast-path branch
- used_append return field on counts dict
- viaduck_dest_apply_mode gauge (verified no Grafana/vmalert
references via grep across grafana-dashboards/, charts/,
posthog-cloud-infra/)
- viaduck_dest_apply_fast_path_batches_total counter (same)
- per-destination dict on DeliveryManager + startup log line
- _validate_bool helper (no remaining caller)
- integration test file (was the only one exercising the deleted
path against real pyducklake)
If a future destination genuinely needs full_cdc-with-append-fast-
path semantics, the cleaner re-introduction is on the source-read
side (e.g. ducklake_table_insertions for known-insert-only sources
even when key_columns is set for dedup purposes) rather than as a
post-Phase-2 apply optimization.
Tests
-----
398 unit + 70 integration tests passing locally. Lint + fmt clean.
Three review rounds (Lead-QE + Principal-SWE in parallel) before
merge; round 2 found 3 integration-test blockers that round 1 had
silently broken because I didn't run `just test-integration` —
fixed and re-verified. Round 3 caught one real loader bug
(YAML `mode:` null value misrouted to type-error message), one stale
seed_mode entry in README, and three smaller cleanups; all addressed.
Migration
---------
This is a hard-break config change. RoutingConfig.mode is required
with no default — pods restart and CrashLoop until the configmap
carries `routing.mode: append_only` (or full_cdc). The chart-side
update is a separate PR; that must merge first so ArgoCD has the
new configmap before pods restart on this image.
mode=append_only routes through source.read_cdc which calls ducklake_table_insertions — that macro structurally returns only inserts (no change_type column synthesized). If a future viaduck change accidentally routes a ducklake_table_changes batch (which DOES carry change_type) into append_only(), the destination would silently land deletes and update_postimages as plain inserts. Cheap O(1) check: refuse the batch if `change_type` is in the column names. Catches both classes of regression — a viaduck dispatch bug (wrong source-read function called for the mode) AND a hypothetical DuckLake contract change where table_insertions starts synthesizing change_type. The empty-batch fast-path runs before the boundary check, so an empty batch with change_type (degenerate but reachable via routing filter that masked everything) stays a no-op rather than a noisy error. Three new unit tests: - accepts the read_cdc shape (snapshot_id + rowid present, change_type absent) - rejects a read_cdc_changes-shaped batch (change_type present) - empty batch with change_type is still a no-op 401 unit + 70 integration tests passing.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Replaces an implicit pipeline-shape derivation (
full_cdc = len(key_columns) > 0) with a requiredrouting.modeenum:full_cdc | append_only. Validates the matrix inRoutingConfig.__post_init__so misconfiguration fails at startup with an operator-actionable error instead of silently selecting the wrong shape.Simultaneously rips out the
append_at_least_onceflag and its supporting machinery (PR #36) — that fast path lived insideapply_full_cdcand could never fire on posthog's production pipeline because posthog has always been onappend_only(wherekey_columns: []routes straight toapply.append_only, never touchingapply_full_cdc).Validation matrix
modekey_columnsfull_cdcfull_cdcConfigError("requires a non-empty key_columns list")append_onlyappend_onlyConfigError("forbids key_columns; the append path does not use them")ConfigError("must be one of [...], got <value>")mode: yes(YAML 1.1 bool coercion)ConfigError("must be a string ... quote the value if YAML coerced it")mode:(null value)ConfigError("is required, no default")(coerced viaor ""so it doesn't hit the type-error message)DeliveryManager.__init__takesmodeas a keyword-only kwarg with its ownConfigErrorguard (belt-and-suspenders for callers bypassing config validation). Positional callers fail loudly at the call site instead of running an unrelated string past the enum check.main.pythreadsmode(not the derivedfull_cdcbool) through_poll_cycle. The status payload readsmodedirectly instead of reconstructing it via"full_cdc" if full_cdc else "append_only"— the last implicit-derivation thread of the change.What got ripped out
DestinationConfig.append_at_least_oncefield_is_pure_insert_batchhelper + per-batch fast-path branch in_apply_changesused_appendreturn field on the counts dictviaduck_dest_apply_modegauge +viaduck_dest_apply_fast_path_batches_totalcounter (verified no Grafana/vmalert references via grep acrossgrafana-dashboards/,charts/,posthog-cloud-infra/)DeliveryManager+ startup log line_validate_boolconfig helper (no remaining caller)test_append_at_least_once_integration.py(was the only one exercising the deleted path against real pyducklake)Test plan
just lintcleanjust fmt-checkcleanjust test— 398 unit tests passing (+3 vs main: yaml-bool-coercion, empty-mode-distinct, null-mode-routes-to-required)DOCKER_HOST=unix:///Users/jakob/.orbstack/run/docker.sock just test-integration— 70 integration tests passingrouting.mode: append_onlybefore pods restart on this image, otherwise startup fails with the "is required, no default" error.Viaduck started: ... mode='append_only', ...shows up in pod logs andviaduck_dest_lag_snapshotscontinues to drain.Review history
Three rounds of Lead-QE + Principal-SWE parallel review.
Round 1 addressed: append_at_least_once removal scope, DeliveryManager signature shape, AGENT.md rewrite.
Round 2 caught 3 integration-test blockers I'd silently broken because I didn't run
just test-integration(9 staleDeliveryManager(...)call sites + 1used_appendleftover assert + 1 missingmode=in fixture), plus majors: positionalmodewas a footgun (made keyword-only), README still stale (rewrote "Two Modes"),_make_real_deliveryre-introduced the derivation (now keyword-only required),ValueErrorvsConfigErrorinconsistency (unified onConfigError).Round 3 caught one real loader bug: YAML
mode:(null value) routed to the type-error "quote the value" message instead of the empty-string "is required" message. Also: stalecdc_replayrow in README's seed_mode table, an unjustified in-methodConfigErrorimport indelivery.py(no actual cycle — moved to module top), a docstring referencing a deleted symbol, and the lastfull_cdcre-derivation in_poll_cycle/status. All addressed.Migration
Hard-break config change — pods restart and CrashLoop until the configmap carries
routing.mode. The chart-side update is the paired PR; chart PR must merge first.