Confirm before destructive Galaxy history delete/purge (#338) by dannon · Pull Request #344 · galaxyproject/loom

dannon · 2026-06-19T18:39:06Z

Closes #338.

The problem

A weak model, asked to delete two datasets, reached for the history-level delete tool (galaxy_update_history(deleted=true)) and then raw curl -- and nothing gated either. The brain exec-guard's policy ends in a catch-all that allows every non-local tool (Galaxy MCP included), and the web-mode gate allows every galaxy_* tool by prefix. So a whole ~10GB history was marked deleted with no are-you-sure. There's also no per-dataset delete tool (#228), which is what pushed the model toward the history-level tool and curl in the first place.

The fix

A shared classifier, shared/galaxy-destructive.js, that both tool_call gates consume so "what counts as destructive" has one home:

Orbit/CLI (exec-guard/policy.ts + gate.ts): a whole-history delete/purge routes to a new galaxy:destructive decision that always confirms, regardless of model tier -- it deliberately bypasses the usual weak->deny downgrade, because the danger in Destructive Galaxy history delete/purge runs with no confirmation -- agent purged a whole history when asked to delete two datasets #338 was silent execution, not the model proposing the action. Non-interactive sessions deny (no one to approve). The confirm is a yes/no that is never cached and never offered "allow for this session", with honest wording that distinguishes a recoverable soft-delete from an irreversible purge.
Remote/web (web/extensions/web-mode-gate.ts): destructive ops are hard-blocked -- that gate has no confirmation UI. A remote-confirm flow is a follow-up.

The classifier reads two reliable, structured shapes exactly -- a direct tool call and the adapter's generic mcp({tool, args}) proxy -- and adds best-effort guardrails for the free-form paths: a raw curl/wget DELETE against /api/histories/, and code mode's run_galaxy_tool(code=...) Python script. (Code mode isn't wired into this stack yet -- galaxy-mcp ships it behind --discovery-mode code and loom's adoption is separate -- so that branch is anticipatory.) The curl and code matchers are guardrails, not security boundaries: they catch the obvious reach-for-the-nearest-tool case, not deliberate obfuscation or a different HTTP client.

Review

The diff went through an adversarial review (a cross-family pass plus several focused lenses). It confirmed end-to-end that the gate fires against the real emitted tool names and that the always-ask-even-weak / non-interactive-deny / uncacheable / honest-wording properties hold on the reliable paths. It also drove a round of hardening to the best-effort matchers, now in this PR: string-quoted booleans (deleted='True', {"purge":"true"}), the code-mode kwargs / galaxy_-prefixed call_tool forms, purge carried in the DELETE body (not just the query), shell-variable history IDs, a whitespace-padded proxied tool name, and curl-regex over/under-matching (requires a curl/wget verb, excludes dataset-level /contents/, handles wget --method=DELETE).

Scope / follow-ups

Whole-history delete/purge only. Dataset/collection-level deletes need the missing per-dataset tool (Orbit/Galaxy MCP: no way to delete/purge datasets from a history #228).
Other in-surface mutating tools (e.g. delete_user_tool) are not yet classified -- a small follow-up to extend the catalog.
Remote/web interactive-confirm UX (today: hard-block).
Upstream, a real delete_dataset op (Orbit/Galaxy MCP: no way to delete/purge datasets from a history #228) plus populating galaxy-ops destructiveHint metadata are the root-cause fixes the classifier can later defer to instead of a hand-maintained catalog.

Tests

TDD throughout. shared/galaxy-destructive unit tests (direct / mcp-proxy / code-mode / curl vectors, including the review findings); exec-guard-policy (asks even weak, denies non-interactive, proxy + code-mode, curl); exec-guard-gate (uncacheable confirm, purge vs delete wording); web-mode-gate (hard-block incl. code-mode). Full suite green; root + app typecheck clean.

…#338) A weak model asked to delete two datasets reached for the history-level delete tool and a raw curl instead, and nothing gated it: the exec-guard's catch-all allowed every Galaxy MCP call and the web-mode gate allowed every galaxy_ tool, so a whole ~10GB history got marked deleted with no are-you-sure. This adds a shared classifier (shared/galaxy-destructive.js) that both tool_call gates consume, so "what counts as destructive" lives in one place. A whole-history delete/purge now always prompts regardless of model tier -- it deliberately skips the usual weak->deny downgrade, since the danger was silent execution, not the model proposing it -- and a non-interactive session denies because there's no one to approve. Orbit/CLI get a yes/no confirm that's never cached and never offered "allow for this session", with honest wording that distinguishes a recoverable soft-delete from an irreversible purge. Remote/web mode hard-blocks these for now; its gate has no confirmation UI, so a remote-confirm flow is left as a follow-up. The classifier covers the direct tool call, the adapter's generic mcp({tool,args}) proxy, and code mode's run_galaxy_tool(code=...) script, plus a best-effort guardrail for a raw curl/wget DELETE against /api/histories/. The curl and code matchers are guardrails, not boundaries -- they catch the obvious reach-for-the-nearest-tool case, not deliberate obfuscation. This doesn't add a per-dataset delete tool (galaxyproject#228). That missing tool is what drove the dangerous workaround in the first place, so a proper delete_dataset op upstream, plus populating galaxy-ops destructive metadata the classifier can later defer to, are the complementary root-cause fixes.

A review pass over the gate turned up a few cheap gaps in the best-effort matchers worth closing: string-quoted booleans (deleted='True', {"purge":"true"}) slipped past the bare true checks; the code-mode matcher only caught a positional call_tool('update_history', ...), so the kwargs form and the galaxy_-prefixed name got through; and normalize() didn't trim, so a whitespace-padded tool name defeated the mcp-proxy unwrap. Also lists the new galaxy:destructive category in the PolicyResult comment. The reliable structured paths were already solid -- this just tightens the free-form curl/code guardrails against ordinary (non-adversarial) shapes.

dannon marked this pull request as ready for review June 20, 2026 01:08

dannon merged commit 9b88c18 into galaxyproject:main Jun 20, 2026
3 checks passed

dannon mentioned this pull request Jun 20, 2026

Gate dataset-level Galaxy delete/purge (follow-up to #338) #345

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Confirm before destructive Galaxy history delete/purge (#338)#344

Confirm before destructive Galaxy history delete/purge (#338)#344
dannon merged 2 commits into
galaxyproject:mainfrom
dannon:fix/338-destructive-delete-confirm

dannon commented Jun 19, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

dannon commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

The problem

The fix

Review

Scope / follow-ups

Tests

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

dannon commented Jun 19, 2026 •

edited

Loading