Orbit as a Galaxy analysis workbench (standalone + Galaxy Interactive Tool)#330

Open
dannon wants to merge 19 commits into galaxyproject:main from dannon:feat/orbit-gxit-page-sync

@dannon dannon commented Jun 19, 2026

Member

What this does

Runs the existing LOOM_MODE=remote Orbit web shell as a per-user agentic analysis workbench, from one image, two ways: standalone (docker run) and as a Galaxy Interactive Tool -- Galaxy launches a per-user container, injects the server URL + a scoped API key, and proxies the user in.

The point is that durable artifacts live in Galaxy: tools/workflows run as Galaxy jobs (outputs are history datasets), and the analysis notebook persists as a per-history Galaxy Page that resumes on relaunch. Meant as a proving ground for ideas that can fold back into Galaxy's own assistant, not a replacement.

What's in here

Notebook -> Galaxy Page persistence (extensions/loom/galaxy-page-sync.ts)

  • env-gated (LOOM_GALAXY_PAGE_SYNC=auto), brain-side (pi's rpc dispatch is closed, so the shell just sets the env at spawn and drains the brain on shutdown)
  • per-history page identity (orbit-<historyId>), current history via most_recently_used
  • debounced push, self-trigger dedupe, fail-open; resume looks the page up by id (a slug 400s on GET /pages/{id})
  • wired into session-lifecycle.ts; enabled + SIGTERM-drained in web/server.ts
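The push behavior listed above (debounce, dedupe, fail-open) can be sketched roughly as follows. This is a minimal illustration, not the code in galaxy-page-sync.ts: the class name `DebouncedPageSync`, the `PushFn` type, and the 2000 ms default delay are all assumptions, and dedupe is approximated here by comparing against the last content that was pushed successfully.

```typescript
// Hypothetical sketch: names and the default delay are illustrative only.
type PushFn = (content: string) => Promise<void>;

class DebouncedPageSync {
  private timer: ReturnType<typeof setTimeout> | undefined;
  private pending: string | undefined;
  private lastPushed: string | undefined;

  constructor(
    private readonly push: PushFn,
    private readonly delayMs: number = 2000,
  ) {}

  /** Record a notebook change; the actual push happens after a quiet period. */
  notify(content: string): void {
    // Dedupe: content identical to the last successful push is ignored,
    // which also swallows change events caused by our own write.
    if (content === this.lastPushed) return;
    this.pending = content;
    if (this.timer !== undefined) clearTimeout(this.timer);
    this.timer = setTimeout(() => {
      void this.flush();
    }, this.delayMs);
  }

  /** Push immediately (used on shutdown). Never throws: sync is fail-open. */
  async flush(): Promise<void> {
    if (this.timer !== undefined) {
      clearTimeout(this.timer);
      this.timer = undefined;
    }
    const content = this.pending;
    this.pending = undefined;
    if (content === undefined || content === this.lastPushed) return;
    try {
      await this.push(content);
      this.lastPushed = content;
    } catch (err) {
      // Fail-open: a sync failure must not break the session. Keep the
      // content so a later flush can retry, unless something newer arrived.
      if (this.pending === undefined) this.pending = content;
      console.warn("page sync failed; continuing without it:", err);
    }
  }
}
```

A shutdown hook would call `flush()` once before exiting, so the last edit is not lost to a pending debounce timer.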

Galaxy Interactive Tool wrapper (gxit/) -- tool XML (scoped inject="api_key" + $__galaxy_url__, requires_domain entry point) + a deploy guide.

Container fixes (Dockerfile)

  • ship web/auth.ts + web/rpc-guard.ts in the runner -- the LOOM_MODE=remote image didn't actually boot without them (ERR_MODULE_NOT_FOUND)
  • force TMPDIR=/tmp (Galaxy injects the host's TMPDIR, which tsx can't use)
  • bundle uv + pre-warm galaxy-mcp so the Galaxy tool/workflow execution surface loads in-container (node:slim has no Python/uvx)

Remote-mode tool gate (web/extensions/web-mode-gate.ts) -- on a fresh container the galaxy_* MCP tools don't register as direct tools (registration needs a warm per-server cache, which the per-user key invalidates on each launch), so the only path to them is the mcp proxy. The gate now allows it, scoped to the curated servers (galaxy / brc-analytics): discovery passes, and an unscoped call is denied with a message the agent self-corrects on.

Bring-your-own LLM key (web/llm-credentials.ts, web/server.ts, web shim + renderer) -- when the remote shell starts with no provider key in its env, the renderer prompts for a provider + key, the server holds it in memory for the session (never persisted, logged, or echoed -- only a hasApiKey boolean crosses to the renderer) and spawns the brain with it. An admin-injected env key still auto-spawns exactly as before. So the workbench can spread without an admin-baked key. (Standard providers; custom/openai-compatible BYO is a follow-up.)
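One way to picture the in-memory handling described above is a small session-scoped store that only ever exposes a boolean to the renderer. This is a sketch under assumptions: `SessionCredentialStore`, `CredentialStatus`, and `toSpawnEnv` are hypothetical names, not the API of web/llm-credentials.ts, and the environment variable name is supplied by the caller rather than guessed here.

```typescript
// Hypothetical sketch: not the actual web/llm-credentials.ts API.
interface LlmCredentials {
  provider: string;
  apiKey: string;
}

interface CredentialStatus {
  provider: string | null;
  hasApiKey: boolean; // the only key-related fact the renderer ever sees
}

class SessionCredentialStore {
  private creds: LlmCredentials | undefined;

  /** An admin-injected env key can be passed in at startup. */
  constructor(fromEnv?: LlmCredentials) {
    this.creds = fromEnv;
  }

  /** Hold a user-supplied key in memory only; nothing is written to disk. */
  set(creds: LlmCredentials): void {
    if (creds.provider.trim() === "" || creds.apiKey.trim() === "") {
      throw new Error("provider and key are both required");
    }
    this.creds = { provider: creds.provider, apiKey: creds.apiKey };
  }

  /** Safe to serialize and send to the renderer: contains no key material. */
  status(): CredentialStatus {
    return {
      provider: this.creds ? this.creds.provider : null,
      hasApiKey: this.creds !== undefined,
    };
  }

  /** Env fragment for spawning the brain; the variable name is caller-chosen. */
  toSpawnEnv(envVarName: string): Record<string, string> {
    if (!this.creds) throw new Error("no LLM key available yet");
    return { [envVarName]: this.creds.apiKey };
  }

  /** Drop the key when the session ends. */
  clear(): void {
    this.creds = undefined;
  }
}
```

With this shape, the server can hold off spawning while `status().hasApiKey` is false, and spawn immediately when the store was seeded from an env key.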

Testing

New unit tests for the page-sync engine, the gate's proxy scoping, the history helper, and the LLM-credential helpers; full suite green, typechecks clean. Verified live as a real Galaxy Interactive Tool (Galaxy 26.1, interactive tools + gx-it-proxy + Docker): the notebook persisted to a per-history Page, and the agent -- running entirely in the container -- invoked Galaxy MCP through the scoped proxy and ran an upload_file_from_url job that landed a dataset. The BYO-key flow was verified at the WebSocket-protocol level: no env key -> the server reports hasApiKey:false and holds off spawning until the key is provided; an admin env key -> hasApiKey:true + auto-spawn; the key never reaches the logs or config.json.

Notes

  • Draft: builds on the shipped LOOM_MODE=remote shell; branch is based on an older main and will want a rebase. Follow-ups: wire BYO for custom/openai-compatible providers (baseUrl + key), and a browser-level eyeball of the BYO overlay.
  • Sizable but cohesive. The Dockerfile boot-fix is independently valuable and could be split out as a fast standalone fix if you'd prefer.

dannon added 17 commits June 18, 2026 23:17
Import initGalaxyPageSync and flushNotebookToGalaxy into the lifecycle, and
call them at the appropriate hooks: init resumes the per-history page on
launch, and flush guarantees one final push (including any session-summary
block) on shutdown.

Live testing against test.galaxyproject.org showed resume-on-launch silently
failed: a fresh container only knows the derived slug (orbit-<historyId>), but
Galaxy's GET /pages/{id} 400s on a slug. List the history's pages, match the
slug, and resume by the real page id. Without this a fresh container also can't
update the existing page -- the next push would try to create a duplicate slug.
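The slug-to-id lookup described above could look something like this. It is a sketch, not the code from the PR: `PageSummary`, `findPageId`, and `resolvePageId` are hypothetical names, and the page listing is injected as a function so that no particular Galaxy endpoint or response shape is assumed beyond each entry carrying an `id` and a `slug`.

```typescript
// Hypothetical sketch: type and function names are illustrative.
interface PageSummary {
  id: string;
  slug: string;
}

/** Per-history page identity, as described in the PR: orbit-<historyId>. */
function pageSlugFor(historyId: string): string {
  return `orbit-${historyId}`;
}

/** Match the derived slug against a page listing and return the real id. */
function findPageId(
  pages: readonly PageSummary[],
  historyId: string,
): string | undefined {
  const slug = pageSlugFor(historyId);
  const match = pages.find((page) => page.slug === slug);
  return match ? match.id : undefined;
}

/**
 * Resolve the page id for a history. `listPages` is supplied by the caller
 * (for example, a thin wrapper over the Galaxy API). A failed listing is
 * treated as "no page found", so a sync problem never blocks the session.
 */
async function resolvePageId(
  listPages: () => Promise<PageSummary[]>,
  historyId: string,
): Promise<string | undefined> {
  try {
    return findPageId(await listPages(), historyId);
  } catch {
    return undefined;
  }
}
```

An `undefined` result means "create a new page"; any other result is the id to use for both the resume fetch and later updates, which avoids the duplicate-slug failure.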

Tool XML (scoped api_key + $__galaxy_url__ injection, port 3000, ALLOW_INSECURE
trust model, explicit server start command) and a deploy README covering image
build, registration, the admin-key-via-job_conf path, a custom-provider option,
and the trust model.

Claude-Session: https://claude.ai/code/session_015LfwTStrqyxDTCv19PbTkT

The container never booted: web/server.ts imports ./auth.js and ./rpc-guard.js
(added by the auth-hardening commit), but the runner stage's COPY list was never
updated to include them -> ERR_MODULE_NOT_FOUND /app/web/auth.js at startup.
Caught by the Plan 3 container smoke test; verified the image now boots, serves
the prebuilt renderer, and the agent responds.

Live GxIT testing on a local Galaxy surfaced two things: Galaxy injects the
host TMPDIR into the container (a macOS /var/folders path that tsx can't use ->
EACCES), so the command forces TMPDIR=/tmp; and node:22-slim has no uvx, so
galaxy-mcp (tool/workflow execution) fails in-container while notebook->Page
persistence (direct API) still works. Documented both in the deploy guide.

The prior commit documented that node:22-slim has no Python or uvx, so the
agent's Galaxy tool/workflow surface (galaxy-mcp, launched via `uvx
galaxy-mcp>=1.8.0`) never started in the container and Galaxy tools silently
vanished. Fix it: copy uv from Astral's published image, point its cache,
python-install, and tool dirs at a node-owned /opt/uv, and pre-warm
`uv tool install galaxy-mcp>=1.8.0` at build so the package and a managed
Python are baked in. Verified the server now launches and answers an MCP
initialize fully offline (--network none), resolving entirely from the baked
cache -- so it works even in a network-locked job container. Notebook->Page
persistence was already fine (direct API); this restores the execution half.

Live GxIT testing surfaced a second blocker after uvx: the agent could
connect to galaxy-mcp but couldn't invoke any of its tools. On a fresh
container pi-mcp-adapter never registers the galaxy_*/brc_analytics_* direct
tools -- that needs a warm per-server metadata cache, and the per-user scoped
GALAXY_API_KEY changes the cache's config hash every launch, so it's always
cold. The only path to those servers is then the single `mcp` proxy tool,
which the gate denied by default ("mcp is not available in remote mode").

Allow the `mcp` proxy, but scope it: a tool call (or connect) must target one
of the curated servers (galaxy, brc-analytics), mirroring what ALLOWED_PREFIXES
already permits for direct tools; read-only discovery (search/describe/list/
status/ui-messages) passes through. A call with no server is unverifiable, so
it's denied with a message telling the agent to set it -- which the model then
does on its own. Verified live in the GxIT: the agent retrieved 42 histories
and ran an upload_file_from_url job that landed a dataset, both through the
scoped proxy.
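
The scoping rule described above can be sketched as a small decision function. This is an illustration only: the real argument shape of the `mcp` proxy tool is not shown in this PR text, so `McpProxyCall` with its `action` and `server` fields is an assumption, as are the function name and the wording of the denial messages.

```typescript
// Hypothetical sketch: the proxy call shape and names are assumptions.
const CURATED_SERVERS: ReadonlySet<string> = new Set(["galaxy", "brc-analytics"]);

// Read-only discovery actions named in the commit message.
const DISCOVERY_ACTIONS: ReadonlySet<string> = new Set([
  "search",
  "describe",
  "list",
  "status",
  "ui-messages",
]);

interface McpProxyCall {
  action: string;
  server?: string;
}

type GateDecision =
  | { allowed: true }
  | { allowed: false; reason: string };

function gateMcpProxy(call: McpProxyCall): GateDecision {
  // Discovery never executes anything, so it passes without a server.
  if (DISCOVERY_ACTIONS.has(call.action)) return { allowed: true };

  // A call or connect with no server cannot be verified; deny with a
  // message the agent can act on by retrying with `server` set.
  if (!call.server) {
    return {
      allowed: false,
      reason: "mcp calls in remote mode must set `server` (galaxy or brc-analytics)",
    };
  }

  if (!CURATED_SERVERS.has(call.server)) {
    return {
      allowed: false,
      reason: `server "${call.server}" is not available in remote mode`,
    };
  }
  return { allowed: true };
}
```

Returning the reason alongside the denial is what makes the self-correction possible: the model sees why the call failed and retries with the server filled in.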
@dannon dannon force-pushed the feat/orbit-gxit-page-sync branch from 8cb8a86 to cae0ebe Compare June 19, 2026 09:33
@dannon dannon marked this pull request as ready for review June 19, 2026 09:59
