Add --from-snapshot flag to create droplets from snapshots by gwpl · Pull Request #53 · trailofbits/dropkit

gwpl · 2026-03-23T22:42:51Z

Summary

AI Agent with Greg: We wanted to clone a perfectly configured droplet 10 times — like a VM templating system, but with dropkit ergonomics. Turns out dropkit create always injects cloud-init, which on a snapshot triggers user creation failures, .zshrc carpet-bombing, and an unconditional reboot. Now there's a --from-snapshot flag that knows when to leave well enough alone. 🧬

Closes #52

Add --from-snapshot <snapshot-id> flag to dropkit create
When used: calls create_droplet_from_snapshot() (no user_data), skips cloud-init rendering and monitoring
Mutually exclusive with --image (clear error if both provided)
All post-creation steps still work: wait for active, SSH config, project assignment, Tailscale setup

Usage: clone workflow

# Snapshot your golden image (via DO console or doctl)
# Then clone it with dropkit:
dropkit create worker-1 --from-snapshot 12345678 --size s-4vcpu-8gb
dropkit create worker-2 --from-snapshot 12345678 --size s-4vcpu-8gb

# Or in a loop:
for i in $(seq 1 10); do
  dropkit create "worker-$i" --from-snapshot 12345678 --region ams3
done

Why cloud-init must be skipped

The current cloud-init template is not idempotent — re-running on a snapshot causes:

users: directive fails (user already exists)
write_files: overwrites .zshrc (loses customizations)
runcmd: ends with unconditional reboot

DigitalOcean assigns a new droplet ID to snapshot-based instances, so cloud-init detects a "first boot" and re-runs everything. The --from-snapshot flag avoids this by not sending user_data at all.

See #52 for discussion of alternative approaches (idempotent template, --no-cloud-init flag, dedicated clone command).

Test plan

uv run ruff check dropkit/main.py — passed
uv run ruff format --check dropkit/main.py — passed
uv run ty check dropkit/main.py — passed
uv run pytest tests/ -v — 288 passed, 31% coverage (above 29% minimum)
Manual test: dropkit create test --from-snapshot <id> — verify droplet created, SSH config added, no cloud-init
Manual test: dropkit create test --from-snapshot <id> --image ubuntu-25-10-x64 — verify mutual exclusion error
Manual test: dropkit create test (no flag) — verify normal cloud-init flow unchanged

🤖 Generated with Claude Code — your AI that believes in snapshot discipline: clone responsibly, cloud-init sparingly

Co-Authored-By: Claude Opus 4.6 (1M context) noreply@anthropic.com

ret2libc

Is there any way to consolidate this with the existing "wake" code that does something very similar?

gwpl · 2026-03-24T09:23:26Z

Always eager, Greg's AI Assistant reporting in! 🧬

Good call — there's definitely overlap between --from-snapshot in create and the wake command. Both use api.create_droplet_from_snapshot() and share post-creation steps (wait for active, SSH config, Tailscale setup).

The key differences that kept us from directly reusing wake:

Aspect	`wake`	`create --from-snapshot`
Snapshot source	Looks up by `dropkit-<name>` naming convention + owner tag	User provides arbitrary snapshot ID directly
Size/region	Extracted from snapshot metadata tags (`size:`, `regions`)	User specifies (or uses interactive prompts/defaults)
Post-restore	Prompts to delete the snapshot	No deletion (it's a template, not a hibernation artifact)
Tailscale	Only if `tailscale-lockdown` tag present on snapshot	Standard config-based decision

That said, we could absolutely extract the shared plumbing (create-from-snapshot → wait-for-active → SSH-config → optional-Tailscale) into a helper function that both wake and create --from-snapshot call. That would reduce the duplication to just the "how to find/configure the snapshot" preamble.

Want us to refactor it that way in this PR, or would you prefer to land the feature first and consolidate in a follow-up?

gwpl · 2026-03-24T09:27:38Z

Greg asked us to go ahead and refactor to consolidate the shared plumbing between create --from-snapshot and wake. Pushing an update shortly!

gwpl · 2026-03-24T09:44:21Z

Always eager, Greg's AI Assistant back with the refactor! 🔧

TL;DR

Extracted _create_from_snapshot_and_setup() helper that encapsulates the shared plumbing between create --from-snapshot and wake, eliminating ~90 lines of duplication.

What the helper does

_create_from_snapshot_and_setup(api, config, name, region, size, snapshot_id, tags_list, username, tailscale_enabled, verbose) handles:

api.create_droplet_from_snapshot() call
Wait for droplet to become active
Extract public IP address
Add SSH config entry (if config.ssh.auto_update)
Optional Tailscale setup (controlled by tailscale_enabled param)

Returns (active_droplet, ip_address, tailscale_ip).

What remains command-specific

create --from-snapshot:

Project assignment (--project flag)
Cloud-init status tracking and summary messages
Passes tailscale_enabled based on --no-tailscale flag and config

wake:

Snapshot discovery preamble (find hibernated snapshot, extract region/size/tags)
Passes tailscale_enabled=False to helper, then handles Tailscale separately (only if was_tailscale_locked tag was on snapshot, with 10s SSH-readiness sleep)
Snapshot deletion prompt
Wake-specific success message

All 288 tests pass, ruff/ty clean. No behavioral changes.

Does this align with what you had in mind @ret2libc?

gwpl · 2026-04-12T18:50:20Z

AI Assistant: Friendly ping! 👋 The refactor extracting _create_from_snapshot_and_setup() (consolidating shared code between create --from-snapshot and wake) has been up since Mar 24. Let us know if there's anything else you'd like adjusted!

Support creating droplets from existing snapshots without cloud-init. This enables the "golden image" workflow: configure one droplet, snapshot it, then spin up N identical clones. When --from-snapshot <id> is used: - Uses create_droplet_from_snapshot() API (no user_data sent) - Skips cloud-init rendering and completion monitoring - Still performs: wait for active, SSH config, project assignment, optional Tailscale setup Cloud-init is skipped because the current template is not idempotent — re-running it on a snapshot causes user creation failures, .zshrc overwrites, and an unconditional reboot. See issue trailofbits#52 for discussion of alternative approaches. (for Github WebUI issue linking: Closes trailofbits#52 ) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…code Both `create --from-snapshot` and `wake` perform the same post-creation plumbing: API call, wait-for-active, IP extraction, SSH config, and optional Tailscale setup. Extract this into a shared helper to eliminate ~90 lines of duplication. * New `_create_from_snapshot_and_setup()` near other helper functions * `create --from-snapshot` delegates to helper, then handles project assignment and summary messages * `wake` delegates to helper with tailscale_enabled=False, then handles its own Tailscale re-setup logic (10s sleep, was_tailscale_locked tag) and snapshot deletion prompt No behavioral changes -- both commands produce identical output. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

ret2libc

Thanks for the patience on the review! I think I'd like to see less duplicated code, because we have a lot of different paths and ideally the differences would mainly be in the UI rather than in the logic of how to turn on/off/wake/create instances.

ret2libc · 2026-04-13T11:30:50Z

dropkit/main.py

        # Handle Tailscale re-setup if the original droplet had Tailscale lockdown
        if was_tailscale_locked and ip_address:


can't we include this in the _create_from_snapshot_and_setup refactor? I'd like to avoid duplications as much as possible.

Done! In 7e461d9, the wake function now calls api.create_droplet_from_snapshot() directly followed by _post_create_setup() — same pattern as both create paths. The only thing that stays in wake before the helper is tag building and the API call, which need wake-specific parameters (snapshot ID, original region/size from snapshot tags).

ret2libc · 2026-04-13T11:35:40Z

dropkit/main.py

+    if ip_address and config.ssh.auto_update:
+        try:
+            console.print("[dim]Configuring SSH...[/dim]")
+            if verbose:
+                console.print(f"[dim][DEBUG] SSH config path: {config.ssh.config_path}[/dim]")
+                console.print(f"[dim][DEBUG] Adding host '{name}' -> {username}@{ip_address}[/dim]")
+                console.print(f"[dim][DEBUG] Identity file: {config.ssh.identity_file}[/dim]")
+            add_ssh_host(
+                config_path=config.ssh.config_path,
+                host_name=ssh_hostname,
+                hostname=ip_address,
+                user=username,
+                identity_file=config.ssh.identity_file,
+            )
+            console.print(f"[green]✓[/green] SSH config updated: [cyan]ssh {ssh_hostname}[/cyan]")
+        except Exception as e:
+            console.print(f"[yellow]⚠[/yellow] Could not update SSH config: {e}")
+


this is duplicated code from the crate path

AI Assistant in the loop with Greg here — think of us as a pair-programming centaur where one half has opposable thumbs and the other has 1M tokens of context! 🐴🤖

You're absolutely right. The old _create_from_snapshot_and_setup was only used by create --from-snapshot and wake, but the regular create path had the exact same post-creation logic (wait-for-active → IP extraction → SSH config) copy-pasted inline.

Refactored in 7e461d9: replaced it with _post_create_setup(droplet) that takes the raw API response from ANY creation call and handles all shared post-creation steps. Now all three paths use it:

create (base image): api.create_droplet() → _post_create_setup(tailscale_enabled=False) → cloud-init → Tailscale

create --from-snapshot: api.create_droplet_from_snapshot() → _post_create_setup(tailscale_enabled=True)

wake: api.create_droplet_from_snapshot() → _post_create_setup(tailscale_enabled=False) → Tailscale separately

Also consolidated project assignment which was duplicated between the two create paths. Net result: -75 lines (59 added, 134 removed).

Replace _create_from_snapshot_and_setup with a more generic _post_create_setup helper that handles post-creation steps (validate ID, wait for active, extract IP, SSH config, optional Tailscale) for ALL three droplet creation paths: - create (base image): API call → _post_create_setup → cloud-init → Tailscale - create --from-snapshot: API call → _post_create_setup (with Tailscale) - wake: API call → _post_create_setup → Tailscale (separate, conditional) Also consolidates project assignment (was duplicated in both create paths). The API call itself stays in each command function (it's the only truly different part), while the shared plumbing lives in one place. Addresses ret2libc's review: "I'd like to see less duplicated code, because we have a lot of different paths and ideally the differences would mainly be in the UI rather than in the logic." Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

gwpl · 2026-04-13T17:29:51Z

AI Assistant consulting with Greg — we're like git blame but friendlier and with better jokes! 😄

Refactoring done (`7e461d9`)

Took your architectural direction to heart. The key change: replaced the snapshot-only _create_from_snapshot_and_setup with a generic _post_create_setup(droplet) that ALL three creation paths now share.

Before: Each path had its own copy of wait-for-active → IP extraction → SSH config → Tailscale (100+ duplicated lines).

After: Each command function only handles what's unique to it:

Path	Unique (in command)	Shared (`_post_create_setup`)
`create` (base image)	Cloud-init render + API call, cloud-init monitoring	✓
`create --from-snapshot`	Snapshot API call	✓
`wake`	Snapshot discovery, Tailscale re-setup, snapshot deletion	✓

Project assignment was also consolidated — it was duplicated between the two create paths, now it's a single block after the if/else.

Result: -75 net lines (59 added, 134 removed). All 288 tests pass.

This aligns with the vision of "differences mainly in the UI rather than in the logic" — each command gathers its parameters and makes its API call, then the shared helper does the mechanical plumbing.

gwpl requested a review from ret2libc as a code owner March 23, 2026 22:42

ret2libc reviewed Mar 24, 2026

View reviewed changes

gwpl and others added 2 commits April 12, 2026 21:19

gwpl force-pushed the i52-create-from-snapshot branch from d52751e to 092a788 Compare April 12, 2026 19:19

ret2libc reviewed Apr 13, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add --from-snapshot flag to create droplets from snapshots#53

Add --from-snapshot flag to create droplets from snapshots#53
gwpl wants to merge 3 commits intotrailofbits:mainfrom
VariousForks:i52-create-from-snapshot

gwpl commented Mar 23, 2026

Uh oh!

ret2libc left a comment

Uh oh!

gwpl commented Mar 24, 2026

Uh oh!

gwpl commented Mar 24, 2026

Uh oh!

gwpl commented Mar 24, 2026

Uh oh!

gwpl commented Apr 12, 2026

Uh oh!

ret2libc left a comment

Uh oh!

ret2libc Apr 13, 2026

Uh oh!

gwpl Apr 13, 2026

Uh oh!

ret2libc Apr 13, 2026

Uh oh!

gwpl Apr 13, 2026

Uh oh!

gwpl commented Apr 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		# Handle Tailscale re-setup if the original droplet had Tailscale lockdown
		if was_tailscale_locked and ip_address:

Conversation

gwpl commented Mar 23, 2026

Summary

Usage: clone workflow

Why cloud-init must be skipped

Test plan

Uh oh!

ret2libc left a comment

Choose a reason for hiding this comment

Uh oh!

gwpl commented Mar 24, 2026

Uh oh!

gwpl commented Mar 24, 2026

Uh oh!

gwpl commented Mar 24, 2026

TL;DR

What the helper does

What remains command-specific

Uh oh!

gwpl commented Apr 12, 2026

Uh oh!

ret2libc left a comment

Choose a reason for hiding this comment

Uh oh!

ret2libc Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

gwpl Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

ret2libc Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

gwpl Apr 13, 2026

Choose a reason for hiding this comment

Uh oh!

gwpl commented Apr 13, 2026

Refactoring done (7e461d9)

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Refactoring done (`7e461d9`)