Skip to content

feat: add commit input to control snapshot behavior#58

Merged
bruce-y merged 2 commits into
mainfrom
devin/1780326868-add-commit-input
Jun 3, 2026
Merged

feat: add commit input to control snapshot behavior#58
bruce-y merged 2 commits into
mainfrom
devin/1780326868-add-commit-input

Conversation

@devin-ai-integration

@devin-ai-integration devin-ai-integration Bot commented Jun 1, 2026

Copy link
Copy Markdown

Summary

Adds a commit input to avoid redundant Ceph snapshots when multiple parallel jobs mount the same sticky disk key.

Problem: When N parallel shards mount the same key (e.g. 16 test runners reading a shared build artifact), every shard commits an identical snapshot at job end, causing N× storage usage.

New commit input (default "true", backward compatible):

Value Behavior
true Always commit (existing behavior)
false Never commit (mount is read-only, post step calls cleanupStickyDiskWithoutCommit)
if-missing Only commit if the disk was freshly formatted at mount time (no prior snapshot existed). Skips commit on cache hits. First-writer-wins semantics.
on-change Commit only if df used-bytes changed by more than 4KB since mount

How it works: maybeFormatBlockDevice now returns wasFormatted: boolean. When blkid finds existing ext4, wasFormatted=false (cache hit); when the device needs mkfs.ext4, wasFormatted=true (cache miss). This is saved to state as STICKYDISK_WAS_FORMATTED and checked by the if-missing mode in the post step.

Usage for the parallel-shard case:

# Build job writes artifacts, commits on first run only
- uses: useblacksmith/stickydisk@v1
  with:
    key: test-backend-${{ github.sha }}-artifact
    path: ./artifacts
    commit: "if-missing"

# Test shards read-only, never commit
- uses: useblacksmith/stickydisk@v1
  with:
    key: test-backend-${{ github.sha }}-artifact
    path: ./artifacts
    commit: "false"

Limitation: If all runners start simultaneously before any snapshot exists, they all get fresh disks (wasFormatted=true) and all commit. True first-writer-wins atomicity for that race would require a server-side change to commitStickyDisk.

Link to Devin session: https://app.devin.ai/sessions/787e1e5841d4400e9d6d18ffc5e81015


View with Codesmith Autofix with Codesmith
Need help on this PR? Tag @codesmith with what you need. Autofix is disabled. (Staging)

Add a new 'commit' input with three modes:
- 'true' (default): always commit — preserves existing behavior
- 'false': never commit — for read-only consumers that mount a
  shared key without needing to snapshot
- 'on-change': only commit if filesystem usage changed since mount —
  auto-detects read-only usage and skips redundant snapshots

This addresses excessive storage usage when N parallel jobs mount
the same sticky disk key and all commit identical snapshots. Users
can now set commit: false on read-only shards, or use on-change to
automatically skip commits when the disk content is unchanged.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@devin-ai-integration

Copy link
Copy Markdown
Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Reviewed by Cursor Bugbot for commit 27e0cc7. Configure here.

Comment thread src/post.ts
core.info(
`Filesystem unchanged (initial: ${initialUsageBytes} bytes, current: ${fsDiskUsageBytes} bytes, delta: ${delta} bytes <= ${thresholdBytes} byte threshold). Skipping commit (on-change mode).`,
);
return false;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On-change skips real file changes

Medium Severity

on-change decides whether to snapshot by comparing df used-byte totals within a 4KB band, but many real updates (same-size overwrites, delete-plus-add with similar footprint, metadata-only churn) leave used space flat, so the post step skips commit while file contents differ from the mounted baseline.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 27e0cc7. Configure here.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Acknowledged — the df-based heuristic is intentionally coarse for this use case. The primary target is read-only consumers (e.g., 16 test shards that mount a shared build artifact and never write to it), where used-bytes will be exactly unchanged.

For the pathological edge case (same-size overwrite, balanced delete+add), users should use commit: "true" (the default) or commit: "false" for explicit control. The on-change mode is a best-effort optimization for the common "mount, read, done" pattern — not a content-addressable guarantee.

A block-device write I/O counter (/sys/block/*/stat field 5) would be more precise, but it's harder to reliably map mount→device at setup time (before findmnt shows it), and any I/O at all (even a stray atime update or journal replay) would trigger a commit, reducing the dedup benefit.

When commit is set to 'if-missing', the post step only commits if the
disk was freshly formatted at mount time (no prior snapshot existed).
If a snapshot already existed (cache hit), the commit is skipped.

This ensures only the first writer to a key snapshots, while all
subsequent consumers that get a cache hit skip the commit
automatically.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
@bruce-y bruce-y merged commit 2b7a836 into main Jun 3, 2026
12 checks passed
@linear-code

linear-code Bot commented Jun 3, 2026

Copy link
Copy Markdown

BLA-4031

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants