Add deployment removal and drained deployment pruning to ServiceDeployer #87

Merged: pcholakov merged 3 commits into main from pavel/ovyvxqtnzuwl on Feb 3, 2026

@pcholakov pcholakov commented Feb 2, 2026

Add automatic pruning of drained deployments

Adds optional automatic cleanup of old deployment versions that accumulate as services are re-registered over time.

Changes

  • Pruning drained deployments: New pruneDrainedDeployments option automatically removes deployments with no associated services and no pinned invocations after each successful registration
  • Revision history control: revisionHistoryLimit parameter allows keeping N most recent drained deployments
  • Invocation filtering: allowPruningDeploymentsWithCompletedInvocations controls whether deployments with only completed invocations can be pruned (conservative by default)
  • Query endpoint integration: Added support for querying Restate's SQL API to identify prunable deployments (fixed Accept header for JSON format)

Usage Example

deployer.register(handler.currentVersion, environment, {
  pruneDrainedDeployments: true,
  revisionHistoryLimit: 3,  // retain 3 most recent deployments, even if drained
});
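
To illustrate how revisionHistoryLimit might interact with pruning, here is a minimal sketch that protects the N most recent drained deployments and returns the rest oldest first. The `DrainedDeployment` shape and the `selectPrunable` helper are hypothetical names for illustration only; the actual implementation in this PR may differ.

```typescript
// Hypothetical shape: only the fields this sketch needs.
interface DrainedDeployment {
  id: string;
  createdAt: number; // epoch millis; assumed to be known for each deployment
}

/**
 * Returns the drained deployments eligible for pruning, oldest first,
 * after protecting the `revisionHistoryLimit` most recent ones.
 */
function selectPrunable(
  drained: DrainedDeployment[],
  revisionHistoryLimit: number,
): DrainedDeployment[] {
  const keep = Math.max(0, Math.floor(revisionHistoryLimit));
  // Sort newest first so the protected revisions sit at the front.
  const newestFirst = [...drained].sort((a, b) => b.createdAt - a.createdAt);
  // Drop the protected ones, then flip so the oldest is pruned first.
  return newestFirst.slice(keep).reverse();
}
```

With five drained deployments and a limit of 3, only the two oldest would be returned as candidates.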

Testing

  • New E2E test (ec2-pruning-test.e2e.ts) validates pruning behavior with real AWS infrastructure:
    • Deploys Lambda service handler (v1) to EC2-hosted Restate
    • Verifies initial deployment exists
    • Re-deploys with new version (v2) triggering re-registration
    • Confirms old deployment was pruned (only new deployment remains)
    • Test passes with aggressive pruning enabled

Summary of All E2E Test Results

✅ Single Node EC2 Test (311.6s)

  • Deployed Restate on EC2
  • Successfully invoked Greeter service
  • Stack retained due to RETAIN_STACK

✅ Drained Deployment Pruning Test (373.8s)

  • Initial deployment verified
  • Re-deployed with new configuration version
  • Confirmed old drained deployment was pruned
  • Stack retained

✅ Restate Cloud Lambda Test (84.5s)

  • Deployed Lambda to Restate Cloud EU
  • Successfully invoked Greeter service
  • Stack retained due to RETAIN_STACK

Fixes #5, #82

@pcholakov pcholakov requested a review from jackkleeman February 2, 2026 11:41
@pcholakov pcholakov marked this pull request as ready for review February 2, 2026 11:41
@pcholakov pcholakov changed the base branch from pavel/tkywkxnrvqzl to main February 2, 2026 11:48
@pcholakov pcholakov force-pushed the pavel/ovyvxqtnzuwl branch 2 times, most recently from 8af2872 to 07f8a53 Compare February 2, 2026 13:17
? "AND i.status != 'completed'" // Only consider active invocations
: ""; // Consider all invocations (conservative)

const sql = `
Collaborator

given that invocation id formatting is oddly expensive, we could maybe avoid ever reading the invocation id:

const sql = `
  SELECT d.id, d.created_at
  FROM sys_deployment d
  WHERE NOT EXISTS (
    SELECT 1 FROM sys_service s WHERE s.deployment_id = d.id
  )
  AND NOT EXISTS (
    SELECT 1 FROM sys_invocation_status i 
    WHERE i.pinned_deployment_id = d.id ${invocationStatusFilter}
  )
  ORDER BY d.created_at DESC
  OFFSET ${safeOffset}
  LIMIT ${safeLimit}
`;

i may look to do similar things in the operator
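
The query above interpolates `safeOffset` and `safeLimit` directly into the SQL text, so those values need to be validated first. How the PR derives them is not shown here; the following is only a sketch of one way to do it, and `requireNonNegativeInt` and `paginationClause` are hypothetical helper names.

```typescript
/** Throws unless `value` is a non-negative integer; returns it typed as number. */
function requireNonNegativeInt(name: string, value: unknown): number {
  if (typeof value !== "number" || !Number.isInteger(value) || value < 0) {
    throw new RangeError(
      `${name} must be a non-negative integer, got ${String(value)}`,
    );
  }
  return value;
}

/** Builds the pagination tail of the query from validated inputs only. */
function paginationClause(offset: unknown, limit: unknown): string {
  const safeOffset = requireNonNegativeInt("offset", offset);
  const safeLimit = requireNonNegativeInt("limit", limit);
  // Same clause order as the query in the review comment.
  return `OFFSET ${safeOffset}\nLIMIT ${safeLimit}`;
}
```

Rejecting anything that is not already an integer means strings can never reach the SQL text, which is simpler than trying to escape them.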

Collaborator, Author

nice! 😁

LEFT JOIN sys_invocation_status i ON (d.id = i.pinned_deployment_id ${invocationStatusFilter})
WHERE s.name IS NULL
AND i.id IS NULL
ORDER BY d.created_at DESC
Collaborator

i feel like it should be ASC? so we get the oldest ones

Collaborator, Author

It should be, I was counting on OFFSET here to skip the first N "protected" deployments; came up with something way nicer along with your id trick above.

// By default (conservative), exclude deployments with ANY pinned invocations
// If allowPruningDeploymentsWithCompletedInvocations is true, only exclude deployments with active invocations
// Prune oldest first, skip the N most recent drained ones
const invocationStatusFilter = allowPruningDeploymentsWithCompletedInvocations
Collaborator

as discussed in DM, I don't think we need this

- Add removalPolicy option (RETAIN default, DESTROY force-deletes on removal)
- Add pruneDrainedDeployments option to clean up old drained deployments
- Add revisionHistoryLimit to keep N recent drained revisions
- Add maxPrunedPerRun (default 10) to limit cleanup per deployment
- Query deployments by endpoint ARN for deletion instead of synthetic ID
- Pruning finds all drained deployments (no services, no pinned invocations)
- Delete is best-effort to avoid blocking CloudFormation stack deletion
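
As a rough illustration of the last two points in the commit message, a bounded, best-effort pruning loop could look like the sketch below. The `DeleteFn` callback, the `PruneResult` shape, and `pruneBestEffort` are hypothetical names; only the default of 10 for maxPrunedPerRun comes from the commit message.

```typescript
// Hypothetical callback: deletes one deployment, rejects on failure.
type DeleteFn = (deploymentId: string) => Promise<void>;

interface PruneResult {
  deleted: string[];
  failed: string[];
}

/**
 * Deletes at most `maxPrunedPerRun` deployments. Failures are recorded
 * rather than rethrown, so one bad delete cannot fail the whole run.
 */
async function pruneBestEffort(
  ids: string[],
  deleteDeployment: DeleteFn,
  maxPrunedPerRun = 10,
): Promise<PruneResult> {
  const result: PruneResult = { deleted: [], failed: [] };
  for (const id of ids.slice(0, Math.max(0, maxPrunedPerRun))) {
    try {
      await deleteDeployment(id);
      result.deleted.push(id);
    } catch {
      // Best-effort: remember the failure and continue with the next one.
      result.failed.push(id);
    }
  }
  return result;
}
```

Returning the failed ids instead of throwing lets the caller log them while still reporting success to CloudFormation.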
authHeader: Record<string, string>,
rejectUnauthorized: boolean,
): Promise<ListDeploymentsResponse["deployments"]> {
const listUrl = new URL(`${adminUrl}/${DEPLOYMENTS_PATH}`);
Collaborator, Author

Ended up resorting to using the API, which gives us extra service name information that we can map back to our current deployment. Unfortunately the sys_deployment schema doesn't list services, which makes it impossible to query only the relevant fully drained deployments for the service(s) which the service deployer is managing. I've opened restatedev/restate#4316 which would make this much more straightforward.
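
A sketch of the safety filter this approach enables: given the deployments returned by the API and a set of ids already known to be drained, keep only those whose services are all managed by this deployer. The `DeploymentSummary` shape below is an assumption about the list response (an id plus a list of service names), and `selectManagedDrained` is a hypothetical helper, not the code in this PR.

```typescript
// Assumed shape: a subset of what a list-deployments response might carry.
interface DeploymentSummary {
  id: string;
  services: { name: string }[];
}

/**
 * Returns ids of drained deployments that belong exclusively to
 * services in `managedServices`.
 */
function selectManagedDrained(
  deployments: DeploymentSummary[],
  drainedIds: ReadonlySet<string>,
  managedServices: ReadonlySet<string>,
): string[] {
  return deployments
    .filter((d) => drainedIds.has(d.id))
    // Skip deployments with no service info: ownership cannot be proven.
    .filter((d) => d.services.length > 0)
    // Require every service to be managed, not just one of them.
    .filter((d) => d.services.every((s) => managedServices.has(s.name)))
    .map((d) => d.id);
}
```

Using `every` rather than `some` is the conservative choice here: a deployment shared with an unmanaged service is left alone.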

@pcholakov pcholakov requested a review from jackkleeman February 3, 2026 07:46
@pcholakov (Collaborator, Author)

I'm sorry @jackkleeman, I ended up completely changing course here: I couldn't implement the query I wanted to, so I resorted to calling the REST API + DF for drained status. Inelegant but much safer; I don't want to risk pruning deployments from services that aren't managed by the CDK service deployer.

@pcholakov pcholakov merged commit 2e049c6 into main Feb 3, 2026
2 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Feb 3, 2026

Development

Successfully merging this pull request may close these issues.

Implement deregistration of services from Restate

2 participants