feat: auto-delete stopped workspaces after configurable TTL #916
Merged
simple-agent-manager[bot] merged 7 commits into main on May 6, 2026
Extend NodeLifecycle DO to schedule workspace deletion 5 minutes after a workspace stops. This prevents disk exhaustion on nodes caused by accumulated stopped workspaces (Docker volumes with git repos, node_modules).

- Add DEFAULT_WORKSPACE_STOPPED_TTL_MS (300000ms) to shared constants
- Add scheduleWorkspaceDeletion/cancelWorkspaceDeletion to NodeLifecycle DO
- Alarm handler processes expired deletions (VM agent DELETE + D1 update)
- recalculateAlarm picks earliest of warm timeout and pending deletions
- Schedule deletion from: stop route, cleanupTaskRun, cleanupOnFailure
- Cancel deletion from: restart route (before TTL expires)
- Cron safety-net in node-cleanup.ts for missed DO alarms
- Configurable via WORKSPACE_STOPPED_TTL_MS env var

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
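The scheduling and alarm logic described in this commit can be sketched roughly as follows. This is a minimal, in-memory illustration and not the code from `node-lifecycle.ts`: the `LifecycleState` shape, the `nextAlarmTime` and `takeExpiredDeletions` helper names, and the function signatures are all assumptions made for the example. Only the 300000 ms default and the "earliest of warm timeout and pending deletions" rule come from the commit message.

```typescript
// Illustrative sketch only: names, shapes, and signatures are assumptions.
const DEFAULT_WORKSPACE_STOPPED_TTL_MS = 300000; // 5 minutes, per the commit message

interface LifecycleState {
  warmTimeoutAt: number | null; // epoch ms, null when no warm timeout is set
  pendingDeletions: { [workspaceId: string]: number }; // epoch ms when each deletion is due
}

// Record that a stopped workspace should be deleted once the TTL has elapsed.
function scheduleWorkspaceDeletion(
  state: LifecycleState,
  workspaceId: string,
  now: number,
  ttlMs: number = DEFAULT_WORKSPACE_STOPPED_TTL_MS
): void {
  state.pendingDeletions[workspaceId] = now + ttlMs;
}

// Drop a pending deletion, e.g. when the workspace is restarted before the TTL expires.
function cancelWorkspaceDeletion(state: LifecycleState, workspaceId: string): void {
  delete state.pendingDeletions[workspaceId];
}

// Earliest deadline across the warm timeout and all pending deletions,
// or null when nothing is scheduled (no alarm needed).
function nextAlarmTime(state: LifecycleState): number | null {
  let earliest: number | null = state.warmTimeoutAt;
  for (const id of Object.keys(state.pendingDeletions)) {
    const dueAt = state.pendingDeletions[id];
    if (dueAt === undefined) continue;
    if (earliest === null || dueAt < earliest) earliest = dueAt;
  }
  return earliest;
}

// Remove and return the workspace ids whose deletion deadline has passed.
function takeExpiredDeletions(state: LifecycleState, now: number): string[] {
  const expired: string[] = [];
  for (const id of Object.keys(state.pendingDeletions)) {
    const dueAt = state.pendingDeletions[id];
    if (dueAt !== undefined && dueAt <= now) {
      expired.push(id);
      delete state.pendingDeletions[id];
    }
  }
  return expired;
}
```

In a real Durable Object the result of `nextAlarmTime` would presumably feed the alarm API (set the alarm when a timestamp is returned, delete it on `null`), and each id returned by `takeExpiredDeletions` would go through the VM agent DELETE and D1 update mentioned above.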
… counter
- warm-node-pooling integration test: setAlarm → recalculateAlarm, deleteAlarm → recalculateAlarm(null)
- node-cleanup unit test: add stoppedWorkspacesDeleted to expected result, mock node-agent and project-data

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- [HIGH] Use deleteWorkspaceOnNode shared helper with proper JWT auth instead of raw IP fetch with X-User-ID header
- [HIGH] Hoist getStoredState() before the deletion loop to avoid N redundant DO storage reads
- [MEDIUM] Add status='stopped' guard to cron sweep D1 update (TOCTOU)
- [MEDIUM] Use recalculateAlarm in destroying branch so pending workspace deletions are not delayed by retry window

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
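To illustrate why the `status='stopped'` guard matters, here is a small in-memory sketch of the cron sweep. It is not the code from `node-cleanup.ts`: the `WorkspaceRow` shape, the `stoppedAt` field, the helper names, and the SQL in the comment are hypothetical. The 2x grace multiplier on the TTL is taken from the PR description; the rest is an assumption about how such a sweep could look.

```typescript
// Illustrative sketch only: row shape, field names, and helpers are assumptions.
type WorkspaceStatus = "running" | "stopped" | "deleted";

interface WorkspaceRow {
  id: string;
  status: WorkspaceStatus;
  stoppedAt: number | null; // epoch ms when the workspace stopped (hypothetical column)
}

const CRON_GRACE_MULTIPLIER = 2; // the PR description mentions a 2x multiplier on the TTL

// Select workspaces that have been stopped for longer than the grace window,
// i.e. ones whose DO alarm apparently never fired.
function findStaleStopped(rows: WorkspaceRow[], now: number, ttlMs: number): string[] {
  const cutoff = now - CRON_GRACE_MULTIPLIER * ttlMs;
  return rows
    .filter((r) => r.status === "stopped" && r.stoppedAt !== null && r.stoppedAt <= cutoff)
    .map((r) => r.id);
}

// In-memory stand-in for a guarded update such as (hypothetical table and columns):
//   UPDATE workspaces SET status = 'deleted' WHERE id = ? AND status = 'stopped'
// Returns true only if the row was still stopped at update time.
function markDeletedIfStillStopped(rows: WorkspaceRow[], id: string): boolean {
  for (const row of rows) {
    if (row.id === id && row.status === "stopped") {
      row.status = "deleted";
      return true;
    }
  }
  return false;
}
```

Without the status check in the update, a workspace restarted between the select and the update would be marked deleted while running; with it, the update simply affects zero rows and the sweep moves on.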
The processExpiredCleanups idle timeout path stops workspaces but wasn't scheduling automatic deletion. Workspaces stopped via idle timeout now get deletion scheduled via the NodeLifecycle DO, matching the lifecycle route and task-runner paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


Summary
Changes
- `packages/shared`: Added `DEFAULT_WORKSPACE_STOPPED_TTL_MS` constant (5 min)
- `apps/api/src/env.ts`: Added `WORKSPACE_STOPPED_TTL_MS` env var
- `apps/api/src/durable-objects/node-lifecycle.ts`: Extended with `scheduleWorkspaceDeletion`, `cancelWorkspaceDeletion`, unified `recalculateAlarm` (warm + deletion), `deleteWorkspace` via shared `deleteWorkspaceOnNode` helper
- `apps/api/src/routes/workspaces/lifecycle.ts`: Stop → schedule deletion, Restart → cancel deletion
- `apps/api/src/services/task-runner.ts`: `cleanupTaskRun` schedules deletion
- `apps/api/src/durable-objects/task-runner/state-machine.ts`: `cleanupOnFailure` schedules deletion
- `apps/api/src/durable-objects/project-data/index.ts`: Idle cleanup schedules deletion
- `apps/api/src/scheduled/node-cleanup.ts`: Safety-net cron sweep for stale stopped workspaces (with TOCTOU-safe status guard)

Test plan
Specialist Review Evidence
Agent Preflight (Required)
Classification
External References
N/A: Pure internal feature using existing codebase patterns (NodeLifecycle DO alarm, deleteWorkspaceOnNode helper, cron sweep). No external APIs or new dependencies introduced.
Codebase Impact Analysis
- `packages/shared/src/constants/node-pooling.ts`: new `DEFAULT_WORKSPACE_STOPPED_TTL_MS` constant
- `apps/api/src/env.ts`: new `WORKSPACE_STOPPED_TTL_MS` optional env var
- `apps/api/src/durable-objects/node-lifecycle.ts`: workspace deletion scheduling, unified recalculateAlarm, deleteWorkspace method using shared helper
- `apps/api/src/routes/workspaces/lifecycle.ts`: stop/restart hooks into NodeLifecycle DO
- `apps/api/src/services/task-runner.ts`: cleanupTaskRun schedules deletion
- `apps/api/src/durable-objects/task-runner/state-machine.ts`: cleanupOnFailure schedules deletion
- `apps/api/src/durable-objects/project-data/index.ts`: idle cleanup schedules deletion after stop
- `apps/api/src/scheduled/node-cleanup.ts`: safety-net cron for stale stopped workspaces with TOCTOU guard

Documentation & Specs
N/A: No public-facing API surface or documentation changes. The feature is purely internal (DO alarm + cron) with no user-visible UI. CLAUDE.md already documents NodeLifecycle DO and warm pool concepts.
Constitution & Risk Check
Principle XI (No Hardcoded Values): TTL is configurable via `WORKSPACE_STOPPED_TTL_MS` env var with `DEFAULT_WORKSPACE_STOPPED_TTL_MS` constant fallback. No hardcoded timeouts or limits. The cron sweep uses a 2x multiplier on the configurable TTL for its grace buffer.

Staging Verification
🤖 Generated with Claude Code