Add deployment removal and drained deployment pruning to ServiceDeployer #87
? "AND i.status != 'completed'" // Only consider active invocations
: ""; // Consider all invocations (conservative)

const sql = `
given that invocation id formatting is oddly expensive, we could maybe avoid ever reading the inv id:
const sql = `
SELECT d.id, d.created_at
FROM sys_deployment d
WHERE NOT EXISTS (
SELECT 1 FROM sys_service s WHERE s.deployment_id = d.id
)
AND NOT EXISTS (
SELECT 1 FROM sys_invocation_status i
WHERE i.pinned_deployment_id = d.id ${invocationStatusFilter}
)
ORDER BY d.created_at DESC
OFFSET ${safeOffset}
LIMIT ${safeLimit}
`;
i may look to do similar things in the operator
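Since the query above interpolates `safeOffset` and `safeLimit` straight into the SQL string, here is a minimal sketch of how those two values might be derived. The helper names `toNonNegativeInt` and `buildDrainedDeploymentsQuery`, and the fallback values, are assumptions for illustration and not taken from the PR; the invocation status filter is left out for brevity.

```typescript
// Hypothetical helper: accept only non-negative safe integers, otherwise fall
// back to a default, so nothing but a plain number is ever interpolated.
function toNonNegativeInt(value: unknown, fallback: number): number {
  return typeof value === "number" && Number.isSafeInteger(value) && value >= 0
    ? value
    : fallback;
}

// Hypothetical builder around the NOT EXISTS query from the comment above.
// It never selects the invocation id, only checks for the existence of rows.
function buildDrainedDeploymentsQuery(offset: unknown, limit: unknown): string {
  const safeOffset = toNonNegativeInt(offset, 0); // assumed default
  const safeLimit = toNonNegativeInt(limit, 10); // assumed default
  return `
    SELECT d.id, d.created_at
    FROM sys_deployment d
    WHERE NOT EXISTS (
      SELECT 1 FROM sys_service s WHERE s.deployment_id = d.id
    )
    AND NOT EXISTS (
      SELECT 1 FROM sys_invocation_status i
      WHERE i.pinned_deployment_id = d.id
    )
    ORDER BY d.created_at DESC
    OFFSET ${safeOffset}
    LIMIT ${safeLimit}
  `;
}
```

Anything that is not a plain non-negative integer, including strings, falls back to the default rather than reaching the query text.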
LEFT JOIN sys_invocation_status i ON (d.id = i.pinned_deployment_id ${invocationStatusFilter})
WHERE s.name IS NULL
AND i.id IS NULL
ORDER BY d.created_at DESC
i feel like it should be ASC? so we get the oldest ones
It should be, I was counting on OFFSET here to skip the first N "protected" deployments; came up with something way nicer along with your id trick above.
// By default (conservative), exclude deployments with ANY pinned invocations
// If allowPruningDeploymentsWithCompletedInvocations is true, only exclude deployments with active invocations
// Prune oldest first, skip the N most recent drained ones
const invocationStatusFilter = allowPruningDeploymentsWithCompletedInvocations
as discussed in DM, i don't think we need this
- Add removalPolicy option (RETAIN default, DESTROY force-deletes on removal)
- Add pruneDrainedDeployments option to clean up old drained deployments
- Add revisionHistoryLimit to keep N recent drained revisions
- Add maxPrunedPerRun (default 10) to limit cleanup per deployment
- Query deployments by endpoint ARN for deletion instead of synthetic ID
- Pruning finds all drained deployments (no services, no pinned invocations)
- Delete is best-effort to avoid blocking CloudFormation stack deletion
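To make the interplay of revisionHistoryLimit and maxPrunedPerRun concrete, here is a rough sketch of the selection step: protect the N most recent drained deployments, then prune oldest first up to the per-run cap. The `DrainedDeployment` type and `selectPruneCandidates` function are hypothetical names for illustration, not the PR's actual implementation.

```typescript
// Hypothetical shape; only `id` and `createdAt` matter for this sketch.
interface DrainedDeployment {
  id: string;
  createdAt: string; // ISO-8601 timestamp (assumed format)
}

/**
 * Pick drained deployments to prune: keep the `revisionHistoryLimit` most
 * recent ones, then take the oldest first, capped at `maxPrunedPerRun`.
 */
function selectPruneCandidates(
  drained: DrainedDeployment[],
  revisionHistoryLimit: number,
  maxPrunedPerRun = 10,
): DrainedDeployment[] {
  const keep = Math.max(0, Math.floor(revisionHistoryLimit));
  const cap = Math.max(0, Math.floor(maxPrunedPerRun));

  // Oldest first; copy so the caller's array is not mutated.
  const oldestFirst = [...drained].sort(
    (a, b) => Date.parse(a.createdAt) - Date.parse(b.createdAt),
  );

  // Everything except the `keep` newest entries is eligible for pruning.
  const eligible = oldestFirst.slice(0, Math.max(0, oldestFirst.length - keep));
  return eligible.slice(0, cap);
}
```

Sorting ascending and trimming the tail avoids relying on OFFSET to skip the protected revisions, and the cap bounds how much work a single registration can trigger.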
… with completed invocations
authHeader: Record<string, string>,
rejectUnauthorized: boolean,
): Promise<ListDeploymentsResponse["deployments"]> {
const listUrl = new URL(`${adminUrl}/${DEPLOYMENTS_PATH}`);
Ended up resorting to using the API, which gives us extra service name information that we can map back to our current deployment. Unfortunately the sys_deployment schema doesn't list services, which makes it impossible to query only the relevant fully drained deployments for the service(s) which the service deployer is managing. I've opened restatedev/restate#4316 which would make this much more straightforward.
I'm sorry @jackkleeman, ended up completely changing course here - I couldn't implement the query I wanted to so ended up resorting to calling the REST API + DF for drained status. Inelegant but much safer, I don't want to risk pruning deployments from services that aren't managed by the CDK service deployer.
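As a rough illustration of that safety check, the sketch below keeps only deployments that are both drained and attributable entirely to managed services. The `ListedDeployment` shape, the function name, and the idea of passing the drained ids in as a set are assumptions for illustration; the real API response and the PR's code may differ.

```typescript
// Hypothetical, simplified view of one entry from the deployments listing;
// the real response type may carry more fields or different names.
interface ListedDeployment {
  id: string;
  services: { name: string }[];
}

/**
 * Keep only deployments that (a) were reported as drained by a separate
 * status query and (b) list nothing but services managed by this deployer.
 */
function managedDrainedDeployments(
  deployments: ListedDeployment[],
  managedServiceNames: ReadonlySet<string>,
  drainedIds: ReadonlySet<string>,
): ListedDeployment[] {
  return deployments.filter(
    (d) =>
      drainedIds.has(d.id) &&
      // A deployment with no listed services cannot be attributed, so skip it.
      d.services.length > 0 &&
      d.services.every((s) => managedServiceNames.has(s.name)),
  );
}
```

Requiring every listed service to be managed is the conservative choice: a deployment shared with an unmanaged service is never returned as a pruning candidate.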
Add automatic pruning of drained deployments
Adds optional automatic cleanup of old deployment versions that accumulate as services are re-registered over time.
Changes
- pruneDrainedDeployments option automatically removes deployments with no associated services and no pinned invocations after each successful registration
- revisionHistoryLimit parameter allows keeping N most recent drained deployments
- allowPruningDeploymentsWithCompletedInvocations controls whether deployments with only completed invocations can be pruned (conservative by default)
Usage Example
Testing
Summary of All E2E Test Results
✅ Single Node EC2 Test (311.6s)
✅ Drained Deployment Pruning Test (373.8s)
✅ Restate Cloud Lambda Test (84.5s)
Fixes #5, #82