
config: dump every leaf config field to the startup log#37

Merged: jghoman merged 1 commit into main from jakob/viaduck-startup-config-log on Jun 18, 2026
@jghoman jghoman commented Jun 18, 2026

Collaborator

Summary

Previously the only startup log line was "Viaduck started: source=..., routing_field=..., mode=..., destinations=N, instance=...". Investigating production behavior meant re-reading the rendered YAML or exec'ing into the pod to confirm what delivery.flush_interval_seconds the process started with, whether append_at_least_once was actually enabled on team-2, what poll.cdc_chunk_snapshots was set to, and so on.

This adds ViaduckConfig.log_summary(log) which emits one INFO line per leaf field across every section, called from main.run() right after metrics.init.
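
For readers without the diff open, here is a minimal sketch of how a per-leaf dump like this could be written. It is not the merged implementation: the trimmed ViaduckConfig, PollConfig, DestinationConfig, and the _leaves helper are hypothetical stand-ins, with field names borrowed from the sample log in this description.

```python
import logging
from dataclasses import dataclass, field, fields, is_dataclass


@dataclass
class PollConfig:  # hypothetical section; field names come from the sample log
    interval_seconds: int = 5
    cdc_chunk_snapshots: int = 50


@dataclass
class DestinationConfig:  # hypothetical destination entry
    id: str = "team-2"
    routing_value: str = "2"
    append_at_least_once: bool = True


def _leaves(obj, prefix):
    """Yield (dotted_path, value) for every leaf dataclass field under obj."""
    for f in fields(obj):
        value = getattr(obj, f.name)
        path = f"{prefix}{f.name}"
        if is_dataclass(value):
            # Nested section: recurse with a dotted prefix.
            yield from _leaves(value, path + ".")
        elif isinstance(value, list) and value and all(is_dataclass(v) for v in value):
            # List of sections: emit a count, then index each entry.
            yield f"{path}.count", len(value)
            for i, item in enumerate(value):
                yield from _leaves(item, f"{path}[{i}].")
        else:
            # Plain value (including lists of scalars): one leaf, one line.
            yield path, value


@dataclass
class ViaduckConfig:  # trimmed stand-in, not the real class
    poll: PollConfig = field(default_factory=PollConfig)
    destinations: list = field(default_factory=lambda: [DestinationConfig()])

    def log_summary(self, log: logging.Logger) -> None:
        # One INFO line per leaf, in declaration order.
        for path, value in _leaves(self, ""):
            log.info("config: %s=%r", path, value)
```

Formatting with %r is what would produce the quoted strings and the bare True seen in the sample output, and walking dataclasses.fields() means only declared fields are visited.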

What it looks like in the log

config: source.name='megaduck-mw-prod-us'
config: source.postgres_uri_env='SOURCE_POSTGRES_URI'
config: source.data_path='s3://posthog-megaduck-mw-prod-us/'
config: source.table='events_nrt'
config: routing.field='team_id'
config: routing.key_columns=['uuid']
config: routing.seed_mode='scan'
config: poll.interval_seconds=5
config: poll.cdc_chunk_snapshots=50
config: delivery.workers=8
config: delivery.flush_interval_seconds=120
config: delivery.flush_max_rows=500000
...
config: destinations.count=1
config: destinations[0].id='team-2'
config: destinations[0].routing_value='2'
config: destinations[0].append_at_least_once=True
...

Each value is independently greppable from Loki / kubectl logs.
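
Because every line has the same "config: <path>=<value>" shape, it can also be parsed back mechanically, for example to diff the effective config of two pods. The helper below is a hypothetical sketch, not part of this PR. It assumes values are logged as Python reprs, as in the sample above, and falls back to the raw string for anything that is not a literal.

```python
import ast
import re

# Matches the "config: <path>=<value>" tail of a log line, whatever prefix
# (timestamp, level) the log pipeline puts in front of it.
CONFIG_LINE = re.compile(r"config: (?P<path>[\w.\[\]]+)=(?P<value>.+)$")


def parse_config_lines(lines):
    """Map each logged config path to its value, ignoring unrelated lines."""
    parsed = {}
    for line in lines:
        match = CONFIG_LINE.search(line)
        if not match:
            continue
        raw = match.group("value")
        try:
            parsed[match.group("path")] = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            # Not a Python literal: keep the text as logged.
            parsed[match.group("path")] = raw
    return parsed
```

Feeding it the output of kubectl logs from two pods and comparing the resulting dicts would show exactly which keys differ.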

Secrets

log_summary touches only raw dataclass fields, never the @property accessors that resolve env vars:

  • source.postgres_uri_env (env var NAME, safe) is logged
  • source.postgres_uri (the resolved URI with the embedded password) is NOT
  • source.properties (raw dict of env var names) is logged
  • source.resolved_properties() (resolved S3 credentials) is NOT

Two dedicated tests assert this: a resolved password set into SRC_PG / DEST_QW_PG must not appear in the log, and resolved S3 access keys / secrets must not appear when their *_env references resolve to known sentinels.
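
A sketch of the sentinel approach those tests take, under assumptions: SourceConfig, log_source, and ListHandler below are simplified stand-ins rather than the real classes, and the sentinel string is made up. It relies on dataclasses.fields() returning declared fields only, so a @property is never evaluated by the logging path.

```python
import logging
import os
from dataclasses import dataclass, fields


@dataclass
class SourceConfig:  # hypothetical stand-in for the real source section
    name: str
    postgres_uri_env: str  # NAME of the env var, safe to log

    @property
    def postgres_uri(self) -> str:
        # Resolves the secret; the logging path below never calls this.
        return os.environ[self.postgres_uri_env]


def log_source(cfg: SourceConfig, log: logging.Logger) -> None:
    # fields() lists declared dataclass fields only, so postgres_uri is skipped.
    for f in fields(cfg):
        log.info("config: source.%s=%r", f.name, getattr(cfg, f.name))


class ListHandler(logging.Handler):
    """Collects formatted messages so a test can inspect them."""

    def __init__(self):
        super().__init__()
        self.lines = []

    def emit(self, record):
        self.lines.append(record.getMessage())


def captured_lines(cfg: SourceConfig) -> list:
    handler = ListHandler()
    log = logging.getLogger("viaduck-secrets-sketch")
    log.setLevel(logging.INFO)
    log.addHandler(handler)
    try:
        log_source(cfg, log)
    finally:
        log.removeHandler(handler)
    return handler.lines


def test_resolved_password_never_logged():
    # Sentinel value: if it ever shows up in a log line, a secret leaked.
    os.environ["SRC_PG"] = "postgres://user:SENTINEL-PW@host/db"
    lines = captured_lines(SourceConfig(name="src", postgres_uri_env="SRC_PG"))
    assert any("postgres_uri_env='SRC_PG'" in line for line in lines)
    assert not any("SENTINEL-PW" in line for line in lines)
```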

Tests

  • Section coverage gate — every top-level section has at least one line. Catches a new field added to ViaduckConfig without log_summary being extended (would otherwise silently miss it in the deploy log).
  • One-line-per-field assertion on a spot-check of leaf fields — defends against a refactor that bundles multiple fields onto one log line.
  • Per-destination indexing under destinations[i].*.
  • append_at_least_once: True flows through verbatim.
  • Resolved Postgres URI passwords + resolved S3 credentials never appear in the log.

419 unit tests pass (was 413, +6 new), lint + fmt clean.
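
The first two checks in the list above can be expressed as small helpers over the captured log lines. This is an illustrative sketch with hypothetical helper names, not the test code from this PR.

```python
import re
from collections import Counter

# Captures the path of a line that starts with "config: <path>=".
PATH = re.compile(r"^config: ([^=]+)=")


def logged_paths(lines):
    """Return the config path of every 'config: <path>=<value>' line."""
    return [m.group(1) for m in map(PATH.match, lines) if m]


def missing_sections(lines, sections):
    """Section coverage gate: sections with no config line at all."""
    seen = {re.split(r"[.\[]", p, maxsplit=1)[0] for p in logged_paths(lines)}
    return [s for s in sections if s not in seen]


def not_exactly_once(lines, leaf_paths):
    """One-line-per-field check: spot-checked leaves not on exactly one line."""
    counts = Counter(logged_paths(lines))
    return [p for p in leaf_paths if counts[p] != 1]
```

A test would then assert that both helpers return an empty list. If the section names are derived from dataclasses.fields(ViaduckConfig) rather than hard-coded (an assumption about how the real test is written), a newly added section fails the gate without anyone having to update the test.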

Test plan

Previously the only startup log line was "Viaduck started: source=…,
routing_field=…, mode=…, destinations=N, instance=…". Operators
investigating production behavior at 3am had no way to confirm what
delivery.flush_interval_seconds the pod started with, whether
append_at_least_once was actually enabled on team-2, what
poll.cdc_chunk_snapshots was set to, etc., without re-reading the
rendered yaml or execing into the pod.

ViaduckConfig.log_summary(log) emits one INFO line per leaf field
across every section (source, routing, poll, delivery, server, web,
instance, state, plus destinations[i].* per destination). Each value
is independently greppable: "config: delivery.workers=8",
"config: destinations[0].append_at_least_once=True", etc.

Resolved secrets are never logged. log_summary touches only the raw
dataclass fields, so postgres_uri_env shows up (the env var NAME,
safe) while the postgres_uri @property (which resolves credentials)
does not. Same for properties dicts: they hold env var names by YAML
convention, the resolved_properties() @property is what fetches the
actual values, and only the raw dict is dumped.

Called from main.run() right after metrics.init, before any external
connections. The structured "config: <path>=<value>" shape lets
existing Loki / log-search queries lock onto specific keys.

Tests:
- Section coverage gate: every top-level section has at least one
  line. Catches a new field added to ViaduckConfig without
  log_summary being extended (the deploy log would silently miss it).
- One-line-per-field assertion: exactly one log line per spot-checked
  leaf field. Defends against a refactor that bundles multiple
  fields onto one log line.
- Per-destination indexing under destinations[i].* so multi-dest
  configs stay disambiguated.
- append_at_least_once: True flows through verbatim.
- Resolved postgres URI passwords (env-resolved) never appear in
  the log; env var NAMES (postgres_uri_env) do.
- Resolved S3 access key / secret (from properties.*_env) never
  appear; the env var names do.
@jghoman jghoman merged commit d0b7862 into main Jun 18, 2026
16 checks passed