diff --git a/docs/specs/agentic-workflow-designer.md b/docs/specs/agentic-workflow-designer.md
deleted file mode 100644
index 5b31884..0000000
--- a/docs/specs/agentic-workflow-designer.md
+++ /dev/null
@@ -1,460 +0,0 @@
-# Engineering Specification — Agentic Workflow Designer
-
-**Status:** Draft for review
-**Author:** Mason core
-**Target:** Mason v1.5.x (phased; see Rollout)
-**Last updated:** 2026-06-11
-
-----
-
-## 1. Summary
-
-Add a visual, node-based **Workflow Designer** to Mason — a drag-and-drop canvas (in the spirit of n8n / ComfyUI) where users compose multi-model agentic pipelines out of **Cells**. Each Cell selects a model, a subset of available tools (built-in + MCP + UC MCP), and a prompt. Cells are wired together with edges: an edge from Cell A to Cell B means *A runs first, and A's output is injected into B's context*. Edges can also express **feedback loops** (B sends results back to A for revision) and **review gates** (a cell decides whether the workflow ends or routes work back for another pass).
-
-The designer is opened from a new button in the sidebar, directly **above the Profile section**. It replaces the chat pane with a full-pane canvas view, following the same view-swapping pattern as Dashboards/Settings/Onboarding.
-
-Everything executes through the existing Databricks AI Gateway plumbing — per-model format routing, OAuth, streaming, MCP tool dispatch, Anthropic prompt caching — none of which changes. The workflow engine is a thin orchestrator that runs the existing per-turn agent loop once per cell, in graph order.
-
-### Motivating user story (acceptance scenario)
-
-> I click **Workflow Designer**. A designer pane opens in the chat window. I create a cell, select **Fable 5**, pick a couple of MCP tools, and write a prompt with the high-level goals and specs of the project. I create a second cell with **Opus 4.8**, a different toolset, and an additional prompt. I drag a line from the Fable cell to the Opus cell — meaning the Fable cell runs first and its output feeds the Opus cell. I create a third cell named **"unit tests"** with a Sonnet model. The Fable cell feeds its unit-test specs to the unit-tests cell via a second line, and the unit-tests cell *also* receives a line from the Opus cell (two inputs) so it can run Opus's work against the spec sheet. The unit-tests cell has a **feedback** line back to the Opus cell: it reports which tests passed and what gaps remain, and Opus iterates until they're closed. When the unit-tests cell deems the work complete, it hands off to the Fable cell for **final review**. Fable either ends the session or passes the work back to Opus for another round.
-
-Section 12 walks this scenario through the spec end-to-end as the primary acceptance test.
-
-----
-
-## 2. Current-state evaluation (what we're building on)
-
-A short audit of the parts of Mason this feature touches, and the constraints they impose.
-
-### 2.1 Architecture facts that shape this design
-
-| Fact | Where | Consequence for the designer |
-|---|---|---|
-| Renderer is **script-mode TypeScript** — no bundler, no imports; modules share one global scope and load via `