Robot data engineer skills (parent + transforms + semantic-layer + viz) by escherize · Pull Request #12 · metabase/metabase-cli

escherize · 2026-06-01T18:04:37Z

Builds in 4 new skills:

robot-data-engineer: the parent skill, which guides you through an end-to-end workflow
data-transformation: raw data -> clean tables
semantic-layer: clean tables -> reusable definitions
data-analysis: clean tables -> answers and reports

Adds a skill quality linter

Makes sure the skill files don't get too long; runs in CI.

Below is AI-generated:

Bundles the robot data scientist skill suite into the CLI: a non-technical domain user, driving mb through Claude, goes from raw data → analysis-ready tables → a shared semantic vocabulary → dashboards, without leaving the conversation.

Four skills, one parent + three children:

parent — orchestrator: sequences transforms → semantic-layer → viz, sets the autonomy slider, owns the final hard-stop gate. (how to launch TBD — mb robot-data-engineer?)
transforms — raw → clean, wide, analysis-ready tables (adapt Timothy Dean's data-transformation skill; partly covered by the existing bundled transform skill)
semantic-layer — clean tables → reusable segments / measures / metrics (done, see below)
viz / dashboards — analysis-ready tables → charts and dashboards

Building all four on this branch; merging once the suite is coherent.

semantic-layer (done)

Why

The CLI already ships a transform skill (raw → clean wide tables). The next step — turning those tables into a shared vocabulary the org reuses — had no skill.

The vocabulary maps to three Metabase features the CLI already has verbs for:

segment (mb segment create) — a saved filter ("active customers")
measure (mb measure create) — a saved aggregation ("net revenue")
metric (mb card create, type: metric) — an official, collection-living number ("MRR")

Approach

Modeled on the data-transformation skill: the same split between hard rules and prudential calls, the same flow (quiet investigate → propose in plain language → iterate → build, verify, hand back), and the same audience, a non-technical domain user.

Three decisions worth calling out:

Teach the Metabase words, skip the deep-internals jargon. segment/measure/metric are product vocabulary the user should learn (glossed once); common data words (table, column, foreign key, schema, join) are fine; only deep-internals (grain, cardinality, surrogate key, table_id) are avoided.
Reach constraints drive the architecture. Segments/measures only work on a question built directly on their table (no joins, no nesting); metrics are data-source-bound the same way. Hard rule: a definition that needs more than one table → widen the table first (a transform), never smuggle a join in. This is why semantic-layer runs after transforms.
Autonomy slider + invariant hard stop. User picks check-in frequency (check-everything / balanced / just-go); two things never bend — unsure → ask, and a final plain-language hard stop before anything is treated as published.
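The "widen the table first" hard rule can be sketched as a small planning check. This is an illustration only: the `DefinitionRequest` and `Plan` types and the `planDefinition` function are hypothetical names, not part of the CLI or the skill, and the table names in the usage note are made up.

```typescript
// Hypothetical types for illustration; not the CLI's actual data model.
type DefinitionKind = "segment" | "measure" | "metric";

interface DefinitionRequest {
  kind: DefinitionKind;
  name: string;
  tables: string[]; // tables the definition would need to read
}

type Plan =
  | { action: "create"; kind: DefinitionKind; table: string }
  | { action: "widen-first"; tables: string[] };

// A definition that reads exactly one table can be created directly.
// One that needs several tables is sent back to the transform stage
// instead of having a join smuggled into it.
function planDefinition(req: DefinitionRequest): Plan {
  const [first, ...rest] = [...new Set(req.tables)];
  if (first === undefined) {
    throw new Error(`"${req.name}" names no source table`);
  }
  if (rest.length > 0) {
    return { action: "widen-first", tables: [first, ...rest] };
  }
  return { action: "create", kind: req.kind, table: first };
}
```

For example, a segment over a single `customers` table would plan as `create`, while a measure that needs both `orders` and `refunds` would plan as `widen-first`.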

Validation

Verified the three create-verbs and their definition shapes against a live staging instance (segment = flat MBQL 5 query; measure = single aggregation; metric = card type: metric). Doc claims linked inline.

Changes

New skill-data/semantic-layer/SKILL.md — auto-discovered by the skill loader's dir scan; no registration code.
skill-data/core/SKILL.md — added to the specialized-skills list.
README.md — bundled-skills table.
tests/e2e/skills.e2e.test.ts — golden list (six → seven).

typecheck + format:check pass.

Add a semantic-layer skill that turns clean, analysis-ready tables into reusable Metabase segments (saved filters), measures (saved aggregations), and metrics (official numbers) for a non-technical domain user.

- New skill-data/semantic-layer/SKILL.md (auto-discovered by the skill loader)
- Cross-reference it from the core skill's specialized-skills list
- Document it in the README bundled-skills table
- Add it to the e2e bundled-skill golden list (now seven)

Add the higher-level data-transformation workflow skill: raw, normalized source database -> a small set of clean, wide, analysis-ready Metabase transforms, for a non-technical domain user. Wraps the mechanical transform skill with an investigate -> propose -> build -> verify flow.

- New skill-data/data-transformation/SKILL.md (auto-discovered)
- Cross-reference from the core skill's specialized-skills list
- README bundled-skills table
- e2e golden list (seven -> eight)

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>

escherize · 2026-06-01T18:13:36Z

TODO before merge: revisit naming. Parent skill working title is robot-data-engineer (placeholder) — decide the final name with Timothy. Candidates floated: robot-data-engineer, analyst / data-analyst (role-framed). Also reconcile the pre-existing viz vs visualization mismatch (core SKILL.md calls it viz; the bundled skill + e2e golden list use visualization).

Add the front-door router for the robot-data-scientist journey: a light wrapper that detects where the user is (raw data / clean tables / ready to chart), sets up auth + the autonomy slider once, then routes to the specialized child skill (data-transformation / semantic-layer / visualization) and hands off. Stays small by design: it dispatches, it doesn't do the work. Parent owns only the end-of-journey hard stop; children self-manage their in-stage gates. Name is a working title (robot-data-engineer), TBD before merge.

- New skill-data/robot-data-engineer/SKILL.md (auto-discovered)
- Cross-reference from the core skill's specialized-skills list
- README bundled-skills table
- e2e golden list (eight -> nine)

Sync Timothy's latest revision: two new hard rules (confirm non-obvious business rules in plain terms before baking them in; flag sensitive personal data rather than silently carrying it), a sensitive-data prudential call, and expanded guidance on decoding, soft-delete filtering, writing table/column descriptions back to Metabase, and one-pass encoding normalization.

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>

Sync Timothy's latest revision: a new hard rule against overwriting an existing table or another transform's output (check the target name is free first), table-name agreement in the iterate phase (propose + confirm free before building), and a new cleaning checklist section whose governing rule is surface-what-you-find rather than silently fix it.

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>

Add a strategy-vs-mechanics carve-out to the trigger clause of the two strategy skills so the model picks the right altitude:

- data-transformation: points single-transform work at the transform skill
- semantic-layer: points raw segment/measure command mechanics at core

Mirror transform's existing downward ref to core with an upward breadcrumb to data-transformation in its body.

New data-analysis sub-skill covers the fourth journey stage: answering real questions from clean tables and handing back a written report (distinct from charting, which stays in visualization). Wire it into the robot-data-engineer router's description, journey list, and route table. Also fixes a latent parse bug in the router frontmatter: an unquoted "light router: it works" made the YAML parser read the description as a mapping, so parseFrontmatter returned null and discoverSkills silently dropped the skill -- robot-data-engineer never appeared in `mb skills list`. Reworded the colon to an em-dash.
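The frontmatter bug above comes down to a colon followed by a space inside an unquoted YAML value. A minimal sketch of a guard for that pattern follows; `hasUnquotedColonSpace` is a hypothetical helper written for this illustration, not the repository's `parseFrontmatter`, and it only looks at single-line `key: value` entries.

```typescript
// Returns true when a single-line `key: value` entry has an unquoted value
// that itself contains ": ", which a YAML parser may read as the start of a
// nested mapping instead of plain text.
// Hypothetical helper: checks one line at a time, ignores multi-line scalars.
function hasUnquotedColonSpace(frontmatterLine: string): boolean {
  const sep = frontmatterLine.indexOf(": ");
  if (sep === -1) return false; // not a `key: value` line
  const value = frontmatterLine.slice(sep + 2).trim();
  const quoted = /^(".*"|'.*')$/.test(value);
  const blockScalar = value.startsWith("|") || value.startsWith(">");
  return !quoted && !blockScalar && value.includes(": ");
}
```

Under these assumptions, `description: a light router: it works` is flagged, while the same text wrapped in quotes, or reworded without the colon, passes.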

ignacio-mb

Very comprehensive

Hoist the cross-cutting rules every child skill must follow into a single Shared Contract section in robot-data-engineer: audience, jargon list (avoid normalize/grain; ERD/foreign key fine; explain wide/long on first use), PII handling (ask before showing rows; default to aggregates), capability limits (name what the CLI can't do instead of erroring into raw SQL), the autonomy slider, and the final hard stop. Each child (data-analysis, data-transformation, semantic-layer, visualization) gets a top-of-file up-pointer: a one-line summary plus an instruction to load the router's Shared Contract. The summary stands on its own so a directly-invoked child still gets the gist if the pointer is skipped. Drop the duplicated autonomy-slider prompt from semantic-layer, keeping only its stage-specific application of the modes.

escherize · 2026-06-02T18:29:48Z

+> - I found a mismatch in ...
+> - This matters because ...
+> - Here's what I was thinking, but I need to check ...


Add three cross-cutting rules to the router's Shared Contract, drawn from two live demo runs (Swoogo, Luma):

- Permission-denied discipline: on a denied query, stop -- never silently substitute a different readable table and pass its numbers off as the answer (the incident where an Account-table question got answered with Salesforce data). Diagnose the likely cause in plain terms, offer to search for a readable look-alike, surface any match as a confirm question, and hand control back -- no GRANT statements, no profile-switching Claude can't reliably execute.
- Scratch files go in ./.scratch, never /tmp (better perms, persists, user-reviewable). Swept the /tmp examples in core, transform, document, and mbql to match.
- Talking to the user: don't reference things they never saw, assume they read only the last ~30 lines, give questions full context, keep permission requests to one plain sentence.

Rework the router's discovery section to ask the user where the data lives before crawling (asymmetry: name a db -> ask the schema; name a table -> ask the db), give the efficient command ladder, and offer a sync when a table is missing.

De-duplicate auth: core's Auth & profiles section is the single source; the router keeps one line (it's the front door, may run before core loads) and data-transformation defers to core.

Wire skillsaw (uvx) as a deterministic linter for the skill collection and clear every warning it reported:

- Content quality: reword two weak-language hedges (ideally/correctly) to concrete behavior; flip the two negative-only "Don't" items (mbql, robot-data-engineer) to lead with the positive action.
- Descriptions: compress the four over-long ones (robot-data-engineer, mbql, semantic-layer, data-analysis) under the 1024-char / 200-token limits, keeping the distinctive trigger phrases and dropping only redundant ones. No unquoted colon-space (would break frontmatter parse).
- Bodies: a précis pass over the seven over-budget skills -- cut restated lead-ins, filler transitions, emphasis padding, and prose that merely restated an adjacent code block. Every rule, command, footgun, and worked example is kept; the dense skills were already mostly substance.

Add .skillsaw.yaml pinned to 0.11.4 with an honest token ceiling (skill.warn 5100 -- above the largest leaf skill's de-fluffed floor, still catching real future bloat) and skill-description.warn 200. Add a strict skillsaw job to the Lint workflow.

Timothy's "Robot Data Analysts should give more context" (964b272) added a "Questions must carry their own context" paragraph that overlapped a bullet I'd added in the same Shared Contract. Keep his fuller version (it carries the recap template) as canonical, drop the redundant bullet, and point to it from the "Talking to the user" list so the rule lives in one place.

The whole-journey router was buried at the bottom of core's specialized-skills list, ranked as a peer of git-sync/mbql, and the autoloaded discovery stub only pointed at core. An outcome-seeking user ("make sense of my data", "build a dashboard") had no direct path to the router that's meant to run first.

- Stub: add a journey-intent fast path straight to `mb skills get robot-data-engineer` before loading the dense core ref.
- Core: hoist robot-data-engineer to the top of the list with a "start here for anything bigger than one command" lead-in, add data-analysis to its routing targets, drop the "name TBD" marker.
- README: drop "name TBD" from the bundled-skills table.

- context-budget warn 5100 -> 6000 (data-transformation's honest floor grew to ~5,805 tokens).
- Reword 'flag it when appropriate' -> 'flag it on sight' to drop the hedging the content-weak-language rule flags.

Ships the robot-data-engineer entrypoint promotion. release.yml auto-publishes on push to main only when package.json's version is not yet on npm, so the bump is required for the skill changes to reach installed CLIs via mb skills get.
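The publish condition described above amounts to a membership test on the versions already on npm. A minimal sketch, assuming the list of published versions has already been fetched; `shouldPublish` and the version list in the usage note are hypothetical, and the actual release.yml logic is not shown in this PR.

```typescript
// Publish only when the local package.json version is not yet published.
// `publishedVersions` is assumed to be the list of versions already on npm.
function shouldPublish(
  localVersion: string,
  publishedVersions: readonly string[],
): boolean {
  return !publishedVersions.includes(localVersion);
}
```

With `0.1.10` as the newest published version, a push that still carries `0.1.10` would be skipped, and a push bumped to `0.1.11` would publish.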

An unaware user describes a goal ('make sense of my data', 'be my data analyst'), and Claude matches it against the plugin description to decide relevance. The old description was CRUD/CLI-only, so a journey-shaped request matched nothing. Lead with the journey trigger phrases (mirrored from robot-data-engineer); keep CRUD + git-sync as the second half.

'analyze X' over-triggers — it matches any analysis request (logs, code, a CSV, an image), not just Metabase data work. The remaining data-anchored phrases ('make sense of my data', 'answer questions about my data', 'report on who registered', 'set up analytics for X') already cover the intent without the false positives.

The data-analysis skill was added to skill-data/ but the e2e test's BUNDLED_VISIBLE_NAMES still listed nine skills, so list/path/get-all assertions and the unknown-skill 'available' message failed across all E2E matrix lanes.
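The failure above is drift between the skills on disk and the test's golden list. One way to report that drift by name is sketched below; `diffSkillLists` is a hypothetical helper, not code from tests/e2e/skills.e2e.test.ts, and the short lists in the usage note are illustrative rather than the full bundled set.

```typescript
// Compare the skill names found by a directory scan with a golden list and
// report the names that appear on only one side, sorted for stable output.
function diffSkillLists(
  discovered: readonly string[],
  golden: readonly string[],
): { missingFromGolden: string[]; missingOnDisk: string[] } {
  const goldenSet = new Set(golden);
  const discoveredSet = new Set(discovered);
  return {
    missingFromGolden: discovered.filter((name) => !goldenSet.has(name)).sort(),
    missingOnDisk: golden.filter((name) => !discoveredSet.has(name)).sort(),
  };
}
```

Given a scan that finds `core`, `semantic-layer`, and `data-analysis` against a golden list of `core` and `semantic-layer`, the result names `data-analysis` as missing from the golden list.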

v58-61 leak the app-DB constraint (NULL not allowed for column "DATABASE_ID"); head validates at the query layer first (missing or invalid Database ID). Accept either exact substring.
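The "accept either exact substring" check can be sketched as follows. The two substrings are the ones quoted in the commit message; the constant and function names are hypothetical, and the real assertion in the card e2e test may be shaped differently.

```typescript
// Error fragments quoted in the commit message: the first is what v58-61
// return, the second is what head returns.
const KNOWN_BAD_DATABASE_ID_ERRORS = [
  'NULL not allowed for column "DATABASE_ID"',
  "missing or invalid Database ID",
] as const;

// True when the command output contains either fragment verbatim
// (case-sensitive substring match, no regex).
function isKnownBadDatabaseIdError(output: string): boolean {
  return KNOWN_BAD_DATABASE_ID_ERRORS.some((fragment) => output.includes(fragment));
}
```

Matching on a fixed list keeps the test strict about wording while tolerating the version-dependent source of the error.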

head validates dataset_query at the query layer (exit 1, missing or invalid Database ID); v58-61 accept it as an opaque map (exit 0). Assert the pre-flight bypass instead of a fixed server outcome.

escherize and others added 2 commits June 1, 2026 12:02

escherize and others added 6 commits June 1, 2026 12:16

Tweaks to data-transformation skill

8cb6ef9

escherize changed the title from ~~Robot data scientist skills (parent + transforms + semantic-layer + viz)~~ to Robot data engineer skills (parent + transforms + semantic-layer + viz) on Jun 1, 2026

ignacio-mb approved these changes Jun 2, 2026

Comment thread skill-data/data-transformation/SKILL.md

Comment thread skill-data/data-transformation/SKILL.md Outdated

Comment thread skill-data/data-transformation/SKILL.md Outdated

ignacio-mb reviewed Jun 2, 2026

Comment thread skill-data/data-transformation/SKILL.md

ignacio-mb reviewed Jun 2, 2026

Comment thread skill-data/data-transformation/SKILL.md Outdated

ignacio-mb reviewed Jun 2, 2026

Comment thread skill-data/data-transformation/SKILL.md Outdated

escherize and others added 2 commits June 2, 2026 09:22

Robot Data Analysts should give more context

964b272

escherize commented Jun 2, 2026

escherize added 2 commits June 2, 2026 12:44

escherize marked this pull request as ready for review June 2, 2026 18:46

escherize and others added 10 commits June 2, 2026 12:48

Add lint:skills script (skillsaw via uvx)

7fc2c63

remove overfitting, eg fivetran mention

6bccd2f

Plan mode in data-transformation

2a56836

m.

5f2a391

More review feedback.

2af86a2

Last piece of feedback

2e1bc02

Manual shrinking of data-transformation skill

15edc3a

Possible pretty-print transform fix

83282c2

escherize and others added 9 commits June 2, 2026 16:47

Clear skillsaw warnings: bump skill token limit, drop hedge

e420a65

Release 0.1.11

9c37035

Fix skills e2e: add data-analysis to bundled skill list

dd75b07

Tighten robot-data-engineer scope

a181d3e

Fix card e2e: tolerate version-dependent bad-Database-ID error

b38bf25

Fix card update e2e: tolerate version-dependent PUT validation

4657be2

format test files

58fa27a

escherize merged commit b1e3d6d into main Jun 3, 2026
15 checks passed

escherize deleted the robot-data-scientist-skills branch June 3, 2026 16:46

escherize merged 31 commits into main from robot-data-scientist-skills

3 participants
