
perf(backup): cut Azure container/list ops in rclone sync #106

Merged
max-tet merged 1 commit into main from fix-backup-rclone-ops on Jun 13, 2026
@ClaydeCode

Contributor

Problem

On the shardbackupstorage Azure account, the "List and Create Container Operations" meter is the single largest cost line at ~€15/mo: roughly 3x the actual data-stored cost and ~70% of the account's spend. A nightly-per-shard cadence cannot reach that op volume from container creates alone, so the bulk must be List Blobs ops issued by rclone sync.

Root cause

shard_core/service/backup.py runs rclone sync once per backup directory (core, user_data) per shard, nightly (0 3 * * *):

  1. No --fast-list → rclone lists the destination hierarchically, one List Blobs op per subdirectory. user_data (app data) is a deep tree, so op count scales with folder-count × shards × 30 nights. This is the bulk of the ~3M ops/month.
  2. No --azureblob-no-check-container → rclone issues a create/check-container op on every invocation (2× per shard per night). The controller (get_backup_sas_url) already creates the container when minting the SAS — the check is pure waste.
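The scaling argument above can be sketched as back-of-the-envelope arithmetic. This is only an illustration: the directory, blob, and shard counts below are hypothetical and not measured from shardbackupstorage, and the model assumes one List Blobs op per directory for hierarchical listing and one op per 5000 blobs for a recursive listing.

```python
import math

# Blobs returned per List Blobs call, matching the PR's "~1 op per 5000 blobs".
LIST_PAGE_SIZE = 5000


def monthly_list_ops_hierarchical(dirs_per_tree, shards, nights=30):
    """Hierarchical listing: roughly one List Blobs op per directory per run."""
    return dirs_per_tree * shards * nights


def monthly_list_ops_fast_list(blobs_per_tree, shards, nights=30):
    """Recursive listing: one List Blobs op per page of blobs per run."""
    pages = math.ceil(blobs_per_tree / LIST_PAGE_SIZE)
    return pages * shards * nights


# Hypothetical inputs, chosen only to show the order-of-magnitude gap.
hierarchical = monthly_list_ops_hierarchical(dirs_per_tree=2000, shards=50)
fast = monthly_list_ops_fast_list(blobs_per_tree=20000, shards=50)
print(hierarchical, fast)
```

With these made-up inputs the hierarchical estimate lands in the millions per month while the recursive one stays in the thousands, which is the shape of the saving the PR is after.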

Fix

Add both flags to COMMAND_TEMPLATE and CLEARTEXT_COMMAND_TEMPLATE:

  • --fast-list — single recursive listing (~1 op per 5000 blobs) instead of one-per-directory. Modest extra RAM during listing; blob counts are small.
  • --azureblob-no-check-container — skip the redundant container check/create.
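A minimal sketch of what a template carrying both flags might look like. The real COMMAND_TEMPLATE in shard_core/service/backup.py is not shown in this PR description, so the template body, the placeholder names, the build_argv helper, and the example paths are all assumptions; only the two flag names come from the PR.

```python
# Hypothetical stand-in for COMMAND_TEMPLATE; the placeholders {source} and
# {destination} are illustrative, not the project's actual field names.
COMMAND_TEMPLATE = """
rclone sync
    --fast-list
    --azureblob-no-check-container
    {source}
    {destination}
"""


def build_argv(template, source, destination):
    """Fill the template and tokenize it on whitespace, as command.split() does."""
    return template.format(source=source, destination=destination).split()


# Example paths are made up for illustration.
argv = build_argv(COMMAND_TEMPLATE, "/data/user_data", ":azureblob:backup/user_data")
print(argv)
```

Because the command is tokenized with a plain whitespace split, this approach only works while neither the paths nor the flags contain spaces.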

Read-side / listing change only — sync semantics unchanged.

Expected impact

Should cut the dominant cost line by most of its value, taking the total subscription run rate from ~€41/mo to ~€26/mo.

Test notes

No tests assert the rclone command string. just cleanup (ruff/black) could not run because both tools are absent from the env; the change only adds text inside an existing triple-quoted string, so formatting is unaffected. Verified that the new flags tokenize as distinct argv entries under command.split().
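Since no test currently asserts the command string, a small check along these lines could guard the flags in future. It is a sketch only: missing_flags and the two sample templates are hypothetical, and a real test would import COMMAND_TEMPLATE and CLEARTEXT_COMMAND_TEMPLATE from shard_core/service/backup.py instead.

```python
REQUIRED_FLAGS = ("--fast-list", "--azureblob-no-check-container")


def missing_flags(template):
    """Return the required flags that do not appear as standalone argv tokens."""
    tokens = template.split()
    return [flag for flag in REQUIRED_FLAGS if flag not in tokens]


# Hypothetical templates: one well-formed, one with the flags fused together
# by a missing space, which a substring check would not catch.
good_template = """
rclone sync --fast-list
    --azureblob-no-check-container {src} {dst}
"""
bad_template = "rclone sync --fast-list--azureblob-no-check-container {src} {dst}"

print(missing_flags(good_template))
print(missing_flags(bad_template))
```

Comparing whole tokens rather than substrings mirrors the manual verification described above, where each flag had to show up as its own argv entry.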

🤖 Generated with Claude Code

Add --fast-list and --azureblob-no-check-container to both rclone
backup command templates.

Azure's "List and Create Container Operations" meter was the single
largest line on shardbackupstorage (~3x the actual data-stored cost,
~€15/mo). Two causes:

- Without --fast-list, rclone lists the destination hierarchically,
  issuing one List Blobs op per subdirectory. user_data trees are deep,
  so this scales with folder count x shards x nightly runs. --fast-list
  does a single recursive listing (~1 op per 5000 blobs) instead.
- Without --azureblob-no-check-container, rclone issues a create/check
  container op on every invocation (2x per shard per night). The
  controller already creates the container when minting the SAS, so the
  check is pure waste.

Read-side/listing change only; sync semantics unchanged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@max-tet max-tet merged commit bc9f3a0 into main Jun 13, 2026
4 checks passed
@max-tet max-tet deleted the fix-backup-rclone-ops branch June 13, 2026 19:24