config: dump every leaf config field to the startup log #37
Merged
Previously the only startup log line was "Viaduck started: source=…, routing_field=…, mode=…, destinations=N, instance=…". Operators investigating production behavior at 3am had no way to confirm what delivery.flush_interval_seconds the pod started with, whether append_at_least_once was actually enabled on team-2, what poll.cdc_chunk_snapshots was set to, etc., without re-reading the rendered yaml or execing into the pod.

ViaduckConfig.log_summary(log) emits one INFO line per leaf field across every section (source, routing, poll, delivery, server, web, instance, state, plus destinations[i].* per destination). Each value is independently greppable: "config: delivery.workers=8", "config: destinations[0].append_at_least_once=True", etc.

Resolved secrets are never logged. log_summary touches only the raw dataclass fields, so postgres_uri_env shows up (the env var NAME, safe) while the postgres_uri @property (which resolves credentials) does not. The same holds for properties dicts: they hold env var names by YAML convention, the resolved_properties() @property is what fetches the actual values, and only the raw dict is dumped.

Called from main.run() right after metrics.init, before any external connections. The structured "config: <path>=<value>" shape lets existing Loki / log-search queries lock onto specific keys.

Tests:
- Section coverage gate: every top-level section has at least one line. Catches a new field added to ViaduckConfig without log_summary being extended (the deploy log would silently miss it).
- One-line-per-field assertion: exactly one log line per spot-checked leaf field. Defends against a refactor that bundles multiple fields onto one log line.
- Per-destination indexing under destinations[i].* so multi-dest configs stay disambiguated.
- append_at_least_once: True flows through verbatim.
- Resolved postgres URI passwords (env-resolved) never appear in the log; env var NAMES (postgres_uri_env) do.
- Resolved S3 access key / secret (from properties.*_env) never appear; the env var names do.
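The raw-field-versus-property distinction described above can be illustrated with a minimal sketch. It is not the actual Viaduck code: `SourceConfig`, `raw_field_lines`, the `properties` key, and the sentinel value are hypothetical stand-ins. The point it relies on is that `dataclasses.fields()` lists only declared fields, so a `@property` that resolves an env var is never evaluated by a field walk.

```python
import dataclasses
import os
from dataclasses import dataclass, field


@dataclass
class SourceConfig:
    # Hypothetical stand-in for the real source section.
    postgres_uri_env: str = "SRC_PG"  # env var NAME, safe to log
    properties: dict = field(default_factory=dict)  # values are env var names

    @property
    def postgres_uri(self) -> str:
        # Resolves the credential. Not a dataclass field, so a walk over
        # dataclasses.fields() never touches it.
        return os.environ[self.postgres_uri_env]


def raw_field_lines(section: str, cfg) -> list:
    """Format only the declared dataclass fields of one config section."""
    return [
        f"config: {section}.{f.name}={getattr(cfg, f.name)}"
        for f in dataclasses.fields(cfg)
    ]


# Sentinel secret: it must never show up in the formatted lines.
os.environ["SRC_PG"] = "postgresql://user:hunter2-sentinel@db/app"
lines = raw_field_lines(
    "source",
    SourceConfig(properties={"s3.secret-access-key": "S3_SECRET_KEY"}),
)
assert not any("hunter2-sentinel" in line for line in lines)
```

The env var names (`SRC_PG`, `S3_SECRET_KEY`) appear in the output, while the resolved URI does not, which is the property the secret tests check.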
Summary
Previously the only startup log line was `Viaduck started: source=..., routing_field=..., mode=..., destinations=N, instance=...`. Investigating production behavior meant re-reading the rendered yaml or execing into the pod to confirm what `delivery.flush_interval_seconds` the process started with, whether `append_at_least_once` was actually enabled on team-2, what `poll.cdc_chunk_snapshots` was set to, etc.

This adds `ViaduckConfig.log_summary(log)`, which emits one INFO line per leaf field across every section, called from `main.run()` right after `metrics.init`.

What it looks like in the log
Each value is independently greppable from Loki / `kubectl logs`.

Secrets
`log_summary` touches only raw dataclass fields, never the `@property` accessors that resolve env vars:
- `source.postgres_uri_env` (env var NAME, safe) is logged
- `source.postgres_uri` (the resolved URI with the embedded password) is NOT
- `source.properties` (raw dict of env var names) is logged
- `source.resolved_properties()` (resolved S3 credentials) is NOT

Two dedicated tests assert this: a resolved password set into `SRC_PG` / `DEST_QW_PG` must not appear in the log, and resolved S3 access keys / secrets must not appear when their `*_env` references resolve to known sentinels.

Tests
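- Section coverage gate: every top-level section has at least one line. Catches a new field added to `ViaduckConfig` without `log_summary` being extended (would otherwise silently miss it in the deploy log).
- One-line-per-field assertion: exactly one log line per spot-checked leaf field.
- Per-destination indexing under `destinations[i].*`.
- `append_at_least_once: True` flows through verbatim.

419 unit tests pass (was 413, +6 new), lint + fmt clean.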
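The section coverage gate and the one-line-per-field assertion could look roughly like the sketch below. It is not the project's test code: `Config`, its two sections, the default values, and the `captured_lines` helper are hypothetical, and the real suite may capture logs differently (for example with pytest's `caplog`).

```python
import dataclasses
import logging
from dataclasses import dataclass, field


@dataclass
class PollConfig:
    cdc_chunk_snapshots: int = 4  # hypothetical default


@dataclass
class DeliveryConfig:
    workers: int = 8


@dataclass
class Config:
    # Hypothetical two-section stand-in for ViaduckConfig.
    poll: PollConfig = field(default_factory=PollConfig)
    delivery: DeliveryConfig = field(default_factory=DeliveryConfig)

    def log_summary(self, log):
        for section in dataclasses.fields(self):
            sub = getattr(self, section.name)
            for leaf in dataclasses.fields(sub):
                log.info("config: %s.%s=%s", section.name, leaf.name,
                         getattr(sub, leaf.name))


class _Capture(logging.Handler):
    """Collects formatted log messages in memory."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


def captured_lines(cfg):
    log = logging.getLogger("config-summary-test")
    log.setLevel(logging.INFO)
    log.propagate = False
    handler = _Capture()
    log.addHandler(handler)
    try:
        cfg.log_summary(log)
    finally:
        log.removeHandler(handler)
    return handler.lines


def test_every_section_has_a_line():
    # Coverage gate: each top-level section must contribute at least one line.
    lines = captured_lines(Config())
    for section in dataclasses.fields(Config):
        assert any(l.startswith(f"config: {section.name}.") for l in lines)


def test_one_line_per_field():
    # Exactly one line for a spot-checked leaf field.
    lines = captured_lines(Config())
    assert sum(l.startswith("config: delivery.workers=") for l in lines) == 1
```

Iterating over `dataclasses.fields(Config)` in the gate means a newly added section fails the test until `log_summary` covers it, which is the failure mode the coverage gate is meant to catch.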
Test plan
- `just lint`: clean
- `just fmt-check`: clean
- `just test`: 419 passed
- `viaduck_dest_apply_mode` (from PR "apply: opt-in append_at_least_once fast path for insert-only destinations" #36) and the new `config: destinations[0].append_at_least_once=True` line are both consistent with the rendered chart values