Skip to content

fix(clickhouse): add opt-out for staging cleanup after QRep flow#4466

Merged
dtunikov merged 1 commit into
mainfrom
fix/clickhouse-staging-cleanup-optout
Jun 19, 2026
Merged

fix(clickhouse): add opt-out for staging cleanup after QRep flow#4466
dtunikov merged 1 commit into
mainfrom
fix/clickhouse-staging-cleanup-optout

Conversation

@dtunikov

Copy link
Copy Markdown
Collaborator

Summary

Restores the ability to retain ClickHouse staging avro files after a snapshot/QRep flow completes, addressing a silent behavior change introduced in #4200.

Background — the regression

#4200 (StagingStore abstraction) refactored the ClickHouse connector so avro upload and staging cleanup both key off the staging store's real bucket prefix (staging.KeyPrefix()). The PR was described as "zero behavior change" for the S3 path, but the cleanup path actually changed behavior:

  • Before: CleanupQRepFlow keyed off config.StagingPath (= SnapshotStagingPath), guarded by strings.HasPrefix(stagingPath, "s3://"). For ClickHouse mirrors, SnapshotStagingPath is typically empty (the bucket comes from PEERDB_CLICKHOUSE_AWS_S3_BUCKET_NAME), so the guard was false and cleanup was a no-op — staging avro files were retained in the bucket.
  • After: cleanup keys off the same real prefix that uploads use, so it now always deletes <staging-prefix>/<flowJobName> once the flow completes.

Deployments that upgraded from 0.36.18 → 0.36.24 and had downstream pipelines consuming those staging avro artifacts saw the files disappear as soon as the mirror's snapshot completed.

Change

Cleanup remains the default — it is the correct behavior and avoids unbounded staging growth. This PR adds an opt-out:

  • New dynconf setting PEERDB_CLICKHOUSE_SKIP_STAGING_CLEANUP (bool, default false).
  • When enabled, CleanupQRepFlow logs and returns early without deleting staging objects.

Notes

  • The staging bucket is PeerDB-internal scratch space; relying on retained files is fragile. The recommended long-term path for consuming avro artifacts is a dedicated PG→S3 QRep mirror. This flag is an escape hatch for affected deployments.
  • Separately, there is a latent mismatch when PEERDB_S3_UUID_PREFIX is enabled: upload keys become <prefix>/<uuid>/<flowJobName>/... while cleanup deletes <prefix>/<flowJobName>, so cleanup misses those objects. Out of scope here but worth tracking.

Test plan

  • go build ./connectors/clickhouse/... ./internal/...
  • go vet ./connectors/clickhouse/
  • Manual: run a CH snapshot with PEERDB_CLICKHOUSE_SKIP_STAGING_CLEANUP=true and confirm staging avro files remain after completion; with default, confirm they are deleted.

🤖 Generated with Claude Code

PR #4200 (StagingStore abstraction) unified the avro upload and staging
cleanup paths onto the staging store's real bucket prefix. As a side
effect, CleanupQRepFlow now always deletes staging avro files once a
ClickHouse snapshot/QRep flow completes.

Previously cleanup keyed off config.StagingPath, which is empty for
ClickHouse mirrors that don't set SnapshotStagingPath (they get their
bucket from PEERDB_CLICKHOUSE_AWS_S3_BUCKET_NAME), so the prefix guard
`strings.HasPrefix(stagingPath, "s3://")` was false and cleanup was a
no-op. Staging avro files were effectively retained. Some deployments
have downstream pipelines that consume those files, and the upgrade
silently broke them by deleting the files as soon as the mirror
completed.

Keep cleanup as the default (it's the correct behavior and avoids
unbounded staging growth), but add PEERDB_CLICKHOUSE_SKIP_STAGING_CLEANUP
to opt out and retain staging artifacts.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@dtunikov dtunikov requested a review from a team as a code owner June 19, 2026 15:11
@claude

claude Bot commented Jun 19, 2026

Copy link
Copy Markdown

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

@dtunikov dtunikov merged commit 2019f86 into main Jun 19, 2026
16 checks passed
@dtunikov dtunikov deleted the fix/clickhouse-staging-cleanup-optout branch June 19, 2026 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants