feat(openworkflow, server, dashboard): add resume workflow run functionality and UI components #521

Open
vudc wants to merge 2 commits into openworkflowdev:main from vudc:feat/resume-workflow

vudc commented May 18, 2026

Closes #520

Summary

Adds a new client + backend method, resumeWorkflowRun(workflowRunId), that flips a failed workflow run back to pending so the worker picks it up on the next tick. Completed steps are served from the existing step-attempt cache (no re-execution); the failing step starts with a fresh retry budget. A "Resume Run" button is wired up on the run-detail page of the dashboard.
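
To make the "completed steps are served from the cache" behaviour concrete, here is a minimal replay sketch. It is not OpenWorkflow's implementation: `StepDef`, `replay`, and the step names are hypothetical, and retries are left out so the caching behaviour stays visible.

```typescript
// Illustrative only: StepDef and replay are hypothetical names, not OpenWorkflow APIs.
interface StepDef {
  name: string;
  run: () => string;
}

// Runs steps in order, skipping any step whose result is already cached.
// Returns the names of the steps that actually executed in this pass.
function replay(steps: StepDef[], cache: Map<string, string>): string[] {
  const executed: string[] = [];
  for (const step of steps) {
    if (cache.has(step.name)) continue; // completed earlier: served from cache
    const result = step.run(); // a throw here ends the pass, like a failed run
    cache.set(step.name, result);
    executed.push(step.name);
  }
  return executed;
}

let reserveCalls = 0;
let chargeIsBroken = true;
const steps: StepDef[] = [
  { name: "reserve", run: () => { reserveCalls += 1; return "reserved"; } },
  {
    name: "charge",
    run: () => {
      if (chargeIsBroken) throw new Error("charge failed");
      return "charged";
    },
  },
  { name: "ship", run: () => "shipped" },
];
const cache = new Map<string, string>();

// First pass: "reserve" completes and is cached, "charge" throws.
let firstPassFailed = false;
try {
  replay(steps, cache);
} catch {
  firstPassFailed = true;
}

// After the underlying problem is fixed, a second pass (the "resume")
// executes only the steps that never completed.
chargeIsBroken = false;
const secondPass = replay(steps, cache); // ["charge", "ship"]
```

In this sketch `reserveCalls` stays at 1 after the second pass, which mirrors the "no re-execution" claim for completed steps.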

Changes

Backend

  • Backend.resumeWorkflowRun({ workflowRunId }) added to the interface and implemented for Postgres and SQLite. In a single transaction it flips status from failed to pending, clears error / worker_id / finished_at, sets available_at = NOW(), and deletes the failed step_attempts rows for that run. Resuming a run that is not failed throws a descriptive error.
  • ow.resumeWorkflowRun(workflowRunId) client method.
  • New helper resolveResumeWorkflowRunConflict in core/workflow-run.ts.
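
The state transition described above can be sketched against an in-memory store. This is only a model of the described behaviour, not the Postgres or SQLite code: `RunRecord`, `StepAttemptRecord`, `Store`, and `resumeRun` are hypothetical names, the camelCase fields stand in for the columns listed in the PR description, and the set of statuses other than pending and failed is an assumption.

```typescript
// Hypothetical in-memory model; the real Backend types and statuses may differ.
type RunStatus = "pending" | "running" | "completed" | "failed";

interface RunRecord {
  id: string;
  status: RunStatus;
  error: string | null;
  workerId: string | null;
  finishedAt: Date | null;
  availableAt: Date;
}

interface StepAttemptRecord {
  workflowRunId: string;
  stepName: string;
  status: "completed" | "failed";
}

interface Store {
  runs: Map<string, RunRecord>;
  stepAttempts: StepAttemptRecord[];
}

// Flips a failed run back to pending and drops its failed step attempts.
// A real backend would do both writes inside one transaction.
function resumeRun(store: Store, workflowRunId: string, now: Date = new Date()): RunRecord {
  const run = store.runs.get(workflowRunId);
  if (!run) {
    throw new Error(`Workflow run ${workflowRunId} does not exist`);
  }
  if (run.status !== "failed") {
    throw new Error(
      `Cannot resume workflow run ${workflowRunId} with status ${run.status}; only failed runs can be resumed`,
    );
  }
  const resumed: RunRecord = {
    ...run,
    status: "pending",
    error: null,
    workerId: null,
    finishedAt: null,
    availableAt: now, // eligible for pickup on the next worker tick
  };
  store.runs.set(workflowRunId, resumed);
  // Completed attempts stay so replay can skip them; failed ones are removed
  // so the failing step starts with a fresh retry budget.
  store.stepAttempts = store.stepAttempts.filter(
    (attempt) => !(attempt.workflowRunId === workflowRunId && attempt.status === "failed"),
  );
  return resumed;
}
```

Calling `resumeRun` a second time on the same run throws in this model, because the run is already pending after the first call.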

Dashboard

  • New RunResumeAction component (apps/dashboard/src/components/run-resume-action.tsx) with a confirmation dialog.
  • resumeWorkflowRunServerFn in lib/api.ts.
  • isRunResumableStatus helper in lib/status.ts (currently "failed" only).
  • Button rendered next to the existing "Cancel Run" on the run-detail page.
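
For reference, a status helper of this kind could look like the sketch below. Only the name `isRunResumableStatus` and the "failed only" rule come from this PR; the signature and the `RESUMABLE_STATUSES` set are assumptions made for illustration.

```typescript
// Assumed shape: the PR states only the helper's name and that "failed" is
// currently the sole resumable status.
const RESUMABLE_STATUSES: ReadonlySet<string> = new Set(["failed"]);

function isRunResumableStatus(status: string): boolean {
  return RESUMABLE_STATUSES.has(status);
}
```

Keeping the rule in a set means adding another resumable status later would be a one-line change, which is one possible reason to prefer it over an inline comparison.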

Tests

  • Two tests in packages/openworkflow/worker/execution.test.ts:
    • Happy path: the middle step exhausts its retries and the run fails; after resume, the run completes. The test asserts that the completed step was not re-executed and that the failed attempts were removed.
    • Negative path: resume on a non-failed run throws.

Copilot AI review requested due to automatic review settings May 18, 2026 16:44

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds a "resume failed workflow run" capability across the backend (SQLite + Postgres), core API, client, and dashboard. Resuming flips a failed run back to pending, deletes failed step attempts so retry budgets are fresh, and preserves completed step cache.

Changes:

  • New resumeWorkflowRun method on the Backend interface, implemented in SQLite and Postgres backends; new client helper and conflict resolver.
  • Dashboard gains a "Resume Run" button (component, server function, status helper, route integration).
  • Example workflow + runner demonstrating the resume flow, plus tests for success and invalid-status cases.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 15 comments.

| File | Description |
| --- | --- |
| packages/openworkflow/core/backend.ts | Adds resumeWorkflowRun to Backend and ResumeWorkflowRunParams type. |
| packages/openworkflow/core/workflow-run.ts | Adds resolveResumeWorkflowRunConflict error helper. |
| packages/openworkflow/sqlite/backend.ts | Implements resumeWorkflowRun with transaction-guarded UPDATE + DELETE. |
| packages/openworkflow/postgres/backend.ts | Implements resumeWorkflowRun via pg.begin transaction. |
| packages/openworkflow/client/client.ts | Exposes resumeWorkflowRun on OpenWorkflow client. |
| packages/openworkflow/worker/execution.test.ts | Tests for resume success + invalid-status rejection. |
| apps/dashboard/src/lib/api.ts | Adds resumeWorkflowRunServerFn. |
| apps/dashboard/src/lib/status.ts | Adds isRunResumableStatus helper. |
| apps/dashboard/src/components/run-resume-action.tsx | New AlertDialog-based Resume Run button. |
| apps/dashboard/src/routes/runs/$runId.tsx | Wires Resume Run button into run detail page. |
| openworkflow/flaky-payment.ts | Demo workflow that fails then succeeds after resume. |
| openworkflow/flaky-payment.run.ts | Runner script for the demo workflow. |

Comments suppressed due to low confidence (1)

apps/dashboard/src/components/run-resume-action.tsx:1

  • AlertDialogAction closes the dialog on click by default. Because resumeRun is async and only calls setIsOpen(false) after success, the dialog will close immediately on click regardless of the in-flight request, and any error set in state won't be visible (the dialog is gone). Use event.preventDefault() in the onClick (Radix supports this on AlertDialogAction) to keep the dialog open while resuming.
import {

Comment on lines +780 to +786
if (updateResult.changes === 0) {
  this.db.exec("ROLLBACK");
  const existing = await this.getWorkflowRun({
    workflowRunId: params.workflowRunId,
  });
  resolveResumeWorkflowRunConflict(params.workflowRunId, existing);
}
Comment on lines +751 to +754
async resumeWorkflowRun(
  params: ResumeWorkflowRunParams,
): Promise<WorkflowRun> {
  const currentTime = now();
Comment on lines +734 to +735
const [updated] = await tx<WorkflowRun[]>`
UPDATE ${workflowRunsTable}
Comment on lines +746 to +747
RETURNING *
`;
DELETE FROM "step_attempts"
WHERE "namespace_id" = ?
AND "workflow_run_id" = ?
AND "status" = 'failed'
Comment on lines +72 to +75
>
<Button
type="button"
variant="default"
disabled={isResuming}
>
Resume Run
</Button>
* you restart the worker between the failure and the resume, the counter
* resets, so resume will fail again and you'll need to resume once more.
*/
export const flakyPayment = defineWorkflow<FlakyPaymentInput, FlakyPaymentOutput>(

// eslint-disable-next-line functional/no-throw-statements
throw new Error(
`Cannot resume workflow run ${workflowRunId} with status ${existing.status}; only failed runs can be resumed`,
SET
"status" = 'pending',
"worker_id" = NULL,
"error" = NULL,
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 18, 2026 16:50

Copilot AI left a comment

Pull request overview

Copilot reviewed 12 out of 12 changed files in this pull request and generated 12 comments.

Comments suppressed due to low confidence (3)

apps/dashboard/src/components/run-resume-action.tsx:1

  • AlertDialogAction from Radix closes the dialog by default on click. Because the click immediately closes the dialog, onOpenChange(false) fires, which sets error to null, so the error message rendered at line 94 will never be visible if the resume call rejects — the dialog is already gone. Either prevent default close (event.preventDefault()) before invoking resumeRun, or use a regular <Button> and manage close on success only.
import {
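
The first option in the comment above (prevent the default close, then close only on success) can be sketched without React. This is a framework-free model, not the component's code: `ClickLikeEvent`, `DialogState`, and `handleConfirmClick` are hypothetical names, and a real component would hold this state in React state rather than a mutable object.

```typescript
// Hypothetical, framework-free model of the confirm handler.
interface ClickLikeEvent {
  preventDefault(): void;
}

interface DialogState {
  isOpen: boolean;
  isResuming: boolean;
  error: string | null;
}

async function handleConfirmClick(
  event: ClickLikeEvent,
  state: DialogState,
  resume: () => Promise<void>,
): Promise<void> {
  // Stop the dialog's default close-on-click so errors stay visible.
  event.preventDefault();
  state.isResuming = true;
  state.error = null;
  try {
    await resume();
    state.isOpen = false; // close only after the resume call succeeded
  } catch (err) {
    // Keep the dialog open and surface the failure.
    state.error = err instanceof Error ? err.message : String(err);
  } finally {
    state.isResuming = false;
  }
}
```

With this shape, a rejected resume call leaves `isOpen` true and `error` populated, so the message has somewhere to render.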

openworkflow/flaky-payment.ts:1

  • With maximumAttempts: 3 and the condition reserveAttempt <= RESERVE_MAX_ATTEMPTS (i.e. fails on attempts 1, 2, 3), the first run throws on all 3 attempts and the workflow fails as intended. On resume, reserveAttempt continues from 3, increments to 4, and 4 ≤ 3 is false → succeeds. That works only because the counter persists. If the worker restarts, reserveAttempt resets to 0 and three resumes are needed (each consuming a full retry budget). The comment acknowledges this, but consider also: each resume gives the step a fresh retry budget, but the counter exceeds 3 only after 4 increments, so a worker restart between resumes means the second resume also fails outright. This makes the demo brittle in practice.
import { defineWorkflow } from "openworkflow";
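
The counter arithmetic in the comment above can be checked with a small simulation. This models only the in-memory counter and retry budget, using the values quoted in the comment (maximumAttempts: 3 and RESERVE_MAX_ATTEMPTS = 3); `FlakyWorker` and `runOnce` are hypothetical names, and each `runOnce` call stands for either the initial run or one resume.

```typescript
// Values taken from the review comment; everything else is a hypothetical model.
const RESERVE_MAX_ATTEMPTS = 3;
const MAXIMUM_ATTEMPTS = 3;

class FlakyWorker {
  // In-memory counter: a new FlakyWorker models a worker restart.
  private reserveAttempt = 0;

  // One run or resume: the step gets a fresh budget of MAXIMUM_ATTEMPTS tries.
  runOnce(): "completed" | "failed" {
    for (let attempt = 1; attempt <= MAXIMUM_ATTEMPTS; attempt++) {
      this.reserveAttempt += 1;
      // The step fails while reserveAttempt <= RESERVE_MAX_ATTEMPTS.
      if (this.reserveAttempt > RESERVE_MAX_ATTEMPTS) return "completed";
    }
    return "failed";
  }
}

// Same worker process: initial run fails on attempts 1-3, resume succeeds on 4.
const sameWorker = new FlakyWorker();
const withoutRestart = [sameWorker.runOnce(), sameWorker.runOnce()];
// ["failed", "completed"]

// Worker restarted after the initial failure: the counter starts from 0 again,
// so the first resume burns a full budget and only the next one succeeds.
const restartedWorker = new FlakyWorker();
const afterRestart = [restartedWorker.runOnce(), restartedWorker.runOnce()];
// ["failed", "completed"]
```

Under these assumptions, a restart before every resume would keep the counter from ever exceeding 3, which is the brittleness the comment points at.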

apps/dashboard/src/lib/api.ts:1

  • Like the sibling cancelWorkflowRunServerFn, there is no authentication/authorization check on this endpoint. Anyone able to call the server function can resume any workflow run. If this matches the project's existing authn/z model (e.g. dashboard is internal-only) it's fine; otherwise consider adding an auth guard. Flag for consistency with the rest of the dashboard.
import { getBackend } from "./backend";

Comment on lines +780 to +786
if (updateResult.changes === 0) {
  this.db.exec("ROLLBACK");
  const existing = await this.getWorkflowRun({
    workflowRunId: params.workflowRunId,
  });
  resolveResumeWorkflowRunConflict(params.workflowRunId, existing);
}
DELETE FROM "step_attempts"
WHERE "namespace_id" = ?
AND "workflow_run_id" = ?
AND "status" = 'failed'
Comment on lines +756 to +763
// Drop the prior failed attempts so the next worker pass starts the
// failed step with a fresh retry budget and the existing completed
// attempts remain in cache (replay skips re-execution).
await tx`
  DELETE FROM ${stepAttemptsTable}
  WHERE "namespace_id" = ${this.namespaceId}
    AND "workflow_run_id" = ${params.workflowRunId}
    AND "status" = 'failed'
Comment on lines +750 to +753
const existing = await this.getWorkflowRun({
  workflowRunId: params.workflowRunId,
});
resolveResumeWorkflowRunConflict(params.workflowRunId, existing);
Comment on lines +756 to +757
try {
  this.db.exec("BEGIN IMMEDIATE");
Comment thread apps/dashboard/src/components/run-resume-action.tsx
Comment thread apps/dashboard/src/routes/runs/$runId.tsx
Comment thread openworkflow/flaky-payment.ts
Comment thread packages/openworkflow/core/workflow-run.ts
Comment thread packages/openworkflow/sqlite/backend.ts

Development

Successfully merging this pull request may close these issues.

Support manual retry for failed workflows

2 participants