
Adopt galaxy-mcp code-mode (draft, needs live testing) #118

Draft
dannon wants to merge 8 commits into galaxyproject:main from dannon:feature/galaxy-mcp-code-mode

@dannon dannon commented May 22, 2026

Member

Draft -- not ready to merge. Needs live-Galaxy verification (see below).

Summary

Switches Loom's Galaxy MCP integration to galaxy-mcp's code-mode surface,
which is expected to cut ~10-15k input tokens per turn (not yet measured;
see "Why draft"). Instead of exposing the full catalog of ~40 named tools
(history/dataset/job/tool/workflow/IWC/user-tool ops), the server exposes
three meta-tools (search, get_schema, run_galaxy_tool) when
GALAXY_MCP_DISCOVERY_MODE=code is set.

Server side already shipped in galaxy-mcp 1.4.0; this PR is the Loom-side
wiring + prompt rewrite + opt-out knob.

What changed

  • bin/loom.js seeds mcp.json with galaxy-mcp[code-mode]>=1.4.0 and
    GALAXY_MCP_DISCOVERY_MODE=code when credentials are present.
  • extensions/loom/profiles.ts:syncMcpConfig mirrors the same shape on
    every profile save/switch and defensively rewrites args so old
    installs that pre-date discovery mode get the [code-mode] extra on
    next reconnect (otherwise the server throws RuntimeError at import).
  • shared/loom-config.{js,d.ts} gain a galaxy.discoveryMode field
    (default "code", opt back into "full" to restore the named-tool
    catalog). Malformed values fall through to "code".
  • extensions/loom/context.ts Galaxy guidance block rewritten to teach
    the meta-tool surface (search -> get_schema -> run_galaxy_tool),
    including UDT lifecycle and IWC search. Routing logic (heavy ->
    Galaxy, light -> local, UDT for glue) unchanged.
  • docs/agent/galaxy-routing.md + docs/agent/commands.md updated to
    match.
  • docs/agent/gotchas.md documents the "full" opt-out.
  • tests/discovery-mode.test.ts covers the defaulter.
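
A minimal sketch of the defaulting rule described above, for reviewers who want the behaviour in one place. The name `resolveDiscoveryMode` and the `LoomConfigLike` shape are hypothetical; the real helper reads ~/.loom/config.json itself, whereas this version assumes an already-parsed config object.

```typescript
type DiscoveryMode = "code" | "full";

// Hypothetical shape: only the field this PR adds is modelled here.
interface LoomConfigLike {
  galaxy?: { discoveryMode?: unknown };
}

// Only the exact string "full" opts out; anything else (missing, typo,
// wrong type) falls through to "code" so the token savings are kept.
function resolveDiscoveryMode(
  config: LoomConfigLike | null | undefined,
): DiscoveryMode {
  return config?.galaxy?.discoveryMode === "full" ? "full" : "code";
}
```

With this shape, a hand-edited value such as "Full" or 1 resolves to "code" rather than silently restoring the named-tool catalog.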

Why draft

Everything below still needs to happen on a live Galaxy server:

  • Boot Loom against a real Galaxy, confirm exactly three Galaxy
    tools appear in the MCP handshake (search, get_schema,
    run_galaxy_tool).
  • End-to-end smoke: ask "list my histories", confirm the agent calls
    search -> get_schema -> run_galaxy_tool({ name: "get_histories" })
    and gets back a sensible answer.
  • Measure per-turn input tokens before/after on a representative
    short question. Expectation: 10-15k drop. Numbers go in this PR
    description when collected.
  • Sanity-check the prompt rewrite -- BM25 search quality on a few
    realistic queries (alignment, variant calling, IWC RNA-seq).
  • Verify the "full" opt-out path still works for users who hit
    a search-quality miss.
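
The first checklist item could be scripted roughly as follows. This is a sketch only: `checkCodeModeSurface` is a hypothetical helper, it assumes the caller passes just the Galaxy server's tool names from the handshake, and it assumes those names may carry a `galaxy_` prefix (as in `galaxy_run_galaxy_tool`).

```typescript
const EXPECTED_META_TOOLS = ["search", "get_schema", "run_galaxy_tool"] as const;

interface SurfaceCheck {
  ok: boolean;
  missing: string[];    // meta-tools that did not show up
  unexpected: string[]; // anything else the Galaxy server exposed
}

// Assumption: the server prefix is "galaxy_"; pass "" if names arrive bare.
function checkCodeModeSurface(toolNames: string[], prefix = "galaxy_"): SurfaceCheck {
  const bare = toolNames.map((n) => (n.startsWith(prefix) ? n.slice(prefix.length) : n));
  const expected = new Set<string>(EXPECTED_META_TOOLS);
  const missing = EXPECTED_META_TOOLS.filter((t) => !bare.includes(t));
  const unexpected = bare.filter((t) => !expected.has(t));
  return { ok: missing.length === 0 && unexpected.length === 0, missing, unexpected };
}
```

A non-empty `unexpected` list would suggest the server is still running in "full" mode or that the env var did not reach it.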

Test plan (local, already passing)

  • npm run typecheck clean
  • npx tsc --noEmit in app/ clean
  • tests/discovery-mode.test.ts 5/5
  • tests/profiles.test.ts 13/13 (touched by syncMcpConfig)
  • Full root npm test: 181 pass; the 2 failures in
    agent-manager.test.ts are pre-existing on main (Electron mock
    issue, unrelated)

Out of scope

  • Loom's six in-process tools (gtn_*, skills_fetch,
    galaxy_invocation_*) -- registered via pi.registerTool, not MCP,
    so they don't ride this transform. ~1k tokens combined; not worth a
    separate code-mode pass.
  • Server-side galaxy-mcp changes (already shipped in 1.4.0).
  • BRC Analytics MCP config (unchanged).

dannon added 4 commits May 22, 2026 15:38
Introduces a getDiscoveryMode() helper that reads galaxy.discoveryMode
from ~/.loom/config.json -- "code" by default, "full" to opt back into
the legacy named-tool catalog. Malformed values fall through to "code"
so a hand-edited typo can't silently re-bloat the per-turn token budget.
No call sites yet; the actual mcp.json wiring lands in the next commit.

bin/loom.js seeds mcp.json with the galaxy-mcp[code-mode] extra and
GALAXY_MCP_DISCOVERY_MODE=code when credentials are present. The
profiles syncMcpConfig path mirrors the same shape on every profile
save/switch, defensively rewriting args so older installs that pre-date
discovery mode get the [code-mode] extra on next reconnect instead of
hitting the server's RuntimeError at import time. The discovery mode is
read from config so a "full" override flows through both paths.

The agent now sees three meta-tools (search / get_schema /
run_galaxy_tool) instead of the ~40 named ones, so the system-prompt
block and the topic docs need to teach that surface explicitly:
search to discover, get_schema before the first call so parameter
names aren't guessed, run_galaxy_tool to invoke. Routing logic
(heavy -> Galaxy, light -> local, UDT for glue) is unchanged --
only the tool-call surface flips. UDT lifecycle, IWC search, and
invocation-recording guidance now read via the meta-tool shape.

Adds a gotchas section explaining the discovery-mode trade-off (token
budget vs search quality) and the one-liner config edit
(galaxy.discoveryMode = "full") that flips Loom back to the legacy
named-tool surface. Standalone commit so it can be cherry-picked or
reverted without touching the wiring.
@dannon dannon marked this pull request as ready for review May 26, 2026 13:39
dannon added 3 commits May 26, 2026 09:43
Auto-merge across release/install pipeline, error humanizer, version
check, and the pi.dev rebrand. No code-mode logic affected.

bin/loom.js (initial mcp.json template) and profiles.ts (syncMcpConfig
rewrite) both build the uvx package spec from the discovery mode using
identical `galaxy-mcp[code-mode]>=1.4.0` literals. Lifting the spec
into a tiny helper next to getDiscoveryMode means the version
constraint lives in one place; if we bump it later, no risk of one
site drifting from the other.
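
A sketch of what such a shared helper might look like. The names `galaxyMcpPackageSpec` and `GALAXY_MCP_MIN_VERSION` are hypothetical, and the assumption that "full" mode drops the [code-mode] extra while keeping the same version floor is mine, not confirmed by the commit message.

```typescript
type DiscoveryMode = "code" | "full";

// Single source of truth for the version floor (hypothetical constant name).
const GALAXY_MCP_MIN_VERSION = "1.4.0";

// Builds the uvx package spec for the given discovery mode.
// Assumption: only "code" mode needs the [code-mode] extra.
function galaxyMcpPackageSpec(mode: DiscoveryMode): string {
  const extra = mode === "code" ? "[code-mode]" : "";
  return `galaxy-mcp${extra}>=${GALAXY_MCP_MIN_VERSION}`;
}
```

Both the initial template and the sync path would then call the helper instead of repeating the literal, so a later version bump touches one line.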
@dannon dannon marked this pull request as draft May 27, 2026 19:43
Code-mode routes every Galaxy call through one meta-tool,
galaxy_run_galaxy_tool({ name, args }), so the tool_* events all carry the
meta-tool name instead of the real tool. That silently broke the bits that key
off the tool name: the galaxy_create_history hook never set currentHistoryId
(so /status loses its history line and the init-gate nags), the galaxy_connect
hooks went dead, and -- worse -- the credential redaction stopped firing on
connect/set_profile, so an api_key could land in activity.jsonl in plaintext.

Added a small resolver that recovers the underlying tool from the meta-tool args
(present on tool_execution_start, remembered per toolCallId for the end/result
hooks). Named mode is unchanged. Redaction now resolves the same way before the
whole-object guard, and the status label shows the real tool name in code-mode.
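
The resolver idea can be sketched as below. This is illustrative only: the `ToolEvent` shape, the class name, and the `galaxy_` prefix added to the recovered name (so it matches hooks keyed on names like galaxy_create_history) are assumptions, not the code in this PR.

```typescript
const META_TOOL = "galaxy_run_galaxy_tool";

// Hypothetical event shape; args are assumed present on start events only.
interface ToolEvent {
  toolCallId: string;
  toolName: string;
  args?: unknown;
}

class ToolNameResolver {
  private readonly byCallId = new Map<string, string>();

  // Start hook: recover the real tool from the meta-tool args and remember it.
  onStart(ev: ToolEvent): string {
    const real = this.fromArgs(ev) ?? ev.toolName;
    this.byCallId.set(ev.toolCallId, real);
    return real;
  }

  // End/result hooks: look the name up by call id, then forget it.
  onEnd(ev: ToolEvent): string {
    const real = this.byCallId.get(ev.toolCallId) ?? ev.toolName;
    this.byCallId.delete(ev.toolCallId);
    return real;
  }

  // Named-mode events (anything other than the meta-tool) pass through untouched.
  private fromArgs(ev: ToolEvent): string | undefined {
    if (ev.toolName !== META_TOOL) return undefined;
    if (typeof ev.args !== "object" || ev.args === null) return undefined;
    const name = (ev.args as { name?: unknown }).name;
    return typeof name === "string" && name.length > 0 ? `galaxy_${name}` : undefined;
  }
}
```

Resolving before redaction matters most: a guard keyed on the resolved name can still match connect/set_profile calls even though every event carries the meta-tool name.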