
Robot data engineer skills (parent + transforms + semantic-layer + viz) #12

Merged
escherize merged 31 commits into main from robot-data-scientist-skills
Jun 3, 2026

@escherize

@escherize escherize commented Jun 1, 2026

Contributor

Builds in 4 new skills:

  1. robot-data-engineer: the parent skill that guides you through an end-to-end workflow
  2. data-transformation: raw data -> clean tables
  3. semantic-layer: clean tables -> reusable definitions
  4. data-analysis: clean tables -> answers and reports

Adds a skill quality linter

Makes sure the skill files don't get too long; runs in CI.


Below is AI generated:


Bundles the robot data scientist skill suite into the CLI: a non-technical domain user, driving mb through Claude, goes from raw data → analysis-ready tables → a shared semantic vocabulary → dashboards, without leaving the conversation.

Four skills, one parent + three children:

  • parent — orchestrator: sequences transforms → semantic-layer → viz, sets the autonomy slider, owns the final hard-stop gate. (how to launch TBD — mb robot-data-engineer?)
  • transforms — raw → clean, wide, analysis-ready tables (adapt Timothy Dean's data-transformation skill; partly covered by the existing bundled transform skill)
  • semantic-layer — clean tables → reusable segments / measures / metrics (done, see below)
  • viz / dashboards — analysis-ready tables → charts and dashboards

Building all four on this branch; merging once the suite is coherent.


semantic-layer (done)

Why

The CLI already ships a transform skill (raw → clean wide tables). The next step — turning those tables into a shared vocabulary the org reuses — had no skill.

The vocabulary maps to three Metabase features the CLI already has verbs for:

  • segment (mb segment create) — a saved filter ("active customers")
  • measure (mb measure create) — a saved aggregation ("net revenue")
  • metric (mb card create, type: metric) — an official, collection-living number ("MRR")

Approach

Modeled on the data-transformation skill — hard-rules-vs-prudential-calls split, quiet-investigate → propose-in-plain-language → iterate → build-verify-handback, audience built for a non-technical domain user.

Three decisions worth calling out:

  1. Teach the Metabase words, skip the deep-internals jargon. segment/measure/metric are product vocabulary the user should learn (glossed once); common data words (table, column, foreign key, schema, join) are fine; only deep-internals (grain, cardinality, surrogate key, table_id) are avoided.
  2. Reach constraints drive the architecture. Segments/measures only work on a question built directly on their table (no joins, no nesting); metrics are data-source-bound the same way. Hard rule: a definition that needs more than one table → widen the table first (a transform), never smuggle a join in. This is why semantic-layer runs after transforms.
  3. Autonomy slider + invariant hard stop. User picks check-in frequency (check-everything / balanced / just-go); two things never bend — unsure → ask, and a final plain-language hard stop before anything is treated as published.
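
The reach constraint in decision 2 can be pictured as a check that runs before anything is built. The following is only a sketch under stated assumptions: `DefinitionRequest`, `Plan`, and `planDefinition` are hypothetical names invented for illustration and are not part of the mb CLI or the skill text.

```typescript
// Hypothetical shapes for illustration; not the mb CLI's real types.
type DefinitionKind = "segment" | "measure" | "metric";

interface DefinitionRequest {
  kind: DefinitionKind;
  name: string;
  // Every table the definition's filter or aggregation needs to read.
  tablesReferenced: string[];
}

type Plan =
  | { action: "create"; table: string }
  | { action: "widen-first"; tables: string[] };

// Hard rule from the description above: a definition that needs more than
// one table is never built with a join; a transform widens the table first.
function planDefinition(req: DefinitionRequest): Plan {
  const tables = Array.from(new Set(req.tablesReferenced));
  if (tables.length === 0) {
    throw new Error(`${req.kind} "${req.name}" references no table`);
  }
  if (tables.length === 1) {
    return { action: "create", table: tables[0] };
  }
  return { action: "widen-first", tables };
}
```

Returning a `widen-first` plan rather than throwing keeps the multi-table case a routing decision (back to the transforms stage) instead of an error, which matches the ordering of semantic-layer after transforms.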

Validation

Verified the three create-verbs and their definition shapes against a live staging instance (segment = flat MBQL 5 query; measure = single aggregation; metric = card type: metric). Doc claims linked inline.

Changes

  • New skill-data/semantic-layer/SKILL.md — auto-discovered by the skill loader's dir scan; no registration code.
  • skill-data/core/SKILL.md — added to the specialized-skills list.
  • README.md — bundled-skills table.
  • tests/e2e/skills.e2e.test.ts — golden list (six → seven).

typecheck + format:check pass.

escherize and others added 2 commits June 1, 2026 12:02
Add a semantic-layer skill that turns clean, analysis-ready tables into
reusable Metabase segments (saved filters), measures (saved aggregations),
and metrics (official numbers) for a non-technical domain user.

- New skill-data/semantic-layer/SKILL.md (auto-discovered by the skill loader)
- Cross-reference it from the core skill's specialized-skills list
- Document it in the README bundled-skills table
- Add it to the e2e bundled-skill golden list (now seven)

Add the higher-level data-transformation workflow skill: raw, normalized
source database -> a small set of clean, wide, analysis-ready Metabase
transforms, for a non-technical domain user. Wraps the mechanical transform
skill with an investigate -> propose -> build -> verify flow.

- New skill-data/data-transformation/SKILL.md (auto-discovered)
- Cross-reference from the core skill's specialized-skills list
- README bundled-skills table
- e2e golden list (seven -> eight)

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>
@escherize

Contributor Author

TODO before merge: revisit naming. Parent skill working title is robot-data-engineer (placeholder) — decide the final name with Timothy. Candidates floated: robot-data-engineer, analyst / data-analyst (role-framed). Also reconcile the pre-existing viz vs visualization mismatch (core SKILL.md calls it viz; the bundled skill + e2e golden list use visualization).

escherize and others added 6 commits June 1, 2026 12:16
Add the front-door router for the robot-data-scientist journey: a light
wrapper that detects where the user is (raw data / clean tables / ready to
chart), sets up auth + the autonomy slider once, then routes to the
specialized child skill (data-transformation / semantic-layer / visualization)
and hands off. Stays small by design — it dispatches, it doesn't do the work.

Parent owns only the end-of-journey hard stop; children self-manage their
in-stage gates. Name is a working title (robot-data-engineer), TBD before merge.

- New skill-data/robot-data-engineer/SKILL.md (auto-discovered)
- Cross-reference from the core skill's specialized-skills list
- README bundled-skills table
- e2e golden list (eight -> nine)

Sync Timothy's latest revision: two new hard rules (confirm non-obvious
business rules in plain terms before baking them in; flag sensitive personal
data rather than silently carrying it), a sensitive-data prudential call, and
expanded guidance on decoding, soft-delete filtering, writing table/column
descriptions back to Metabase, and one-pass encoding normalization.

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>

Sync Timothy's latest revision: a new hard rule against overwriting an
existing table or another transform's output (check the target name is free
first), table-name agreement in the iterate phase (propose + confirm free
before building), and a new cleaning checklist section whose governing rule
is surface-what-you-find rather than silently fix it.

Co-authored-by: Timothy Dean <7650347+galdre@users.noreply.github.com>

Add a strategy-vs-mechanics carve-out to the trigger clause of the two
strategy skills so the model picks the right altitude:

- data-transformation: points single-transform work at the transform skill
- semantic-layer: points raw segment/measure command mechanics at core

Mirror transform's existing downward ref to core with an upward
breadcrumb to data-transformation in its body.

New data-analysis sub-skill covers the fourth journey stage: answering
real questions from clean tables and handing back a written report
(distinct from charting, which stays in visualization). Wire it into the
robot-data-engineer router's description, journey list, and route table.

Also fixes a latent parse bug in the router frontmatter: an unquoted
"light router: it works" made the YAML parser read the description as a
mapping, so parseFrontmatter returned null and discoverSkills silently
dropped the skill -- robot-data-engineer never appeared in `mb skills
list`. Reworded the colon to an em-dash.
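
The frontmatter bug described in the commit message above is the kind of thing a small pre-parse check can catch. The sketch below is a simplified illustration, not the repo's `parseFrontmatter` or the linter's actual rule: `hasUnquotedColonSpace` and `riskyFrontmatterKeys` are hypothetical helpers, and they only look at single-line `key: value` entries.

```typescript
// Flags a frontmatter value that a YAML parser may not read as a plain
// string because it contains an unquoted ": " (colon followed by space).
function hasUnquotedColonSpace(rawValue: string): boolean {
  const value = rawValue.trim();
  const quoted =
    value.length >= 2 &&
    ((value.startsWith('"') && value.endsWith('"')) ||
      (value.startsWith("'") && value.endsWith("'")));
  // Block scalar indicators (| or >) put the content on following lines.
  const blockScalar = value.startsWith("|") || value.startsWith(">");
  return !quoted && !blockScalar && value.includes(": ");
}

// Returns the keys of single-line `key: value` entries whose value is risky.
function riskyFrontmatterKeys(frontmatter: string): string[] {
  const risky: string[] = [];
  for (const line of frontmatter.split("\n")) {
    const match = /^([A-Za-z_][\w-]*):\s+(.*)$/.exec(line);
    if (match && hasUnquotedColonSpace(match[2])) {
      risky.push(match[1]);
    }
  }
  return risky;
}
```

Running a check like this before parsing turns a silently dropped skill into a named, fixable warning; quoting the value or rewording the colon both clear it.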
@escherize escherize changed the title from "Robot data scientist skills (parent + transforms + semantic-layer + viz)" to "Robot data engineer skills (parent + transforms + semantic-layer + viz)" Jun 1, 2026

@ignacio-mb ignacio-mb left a comment


Very comprehensive

Comment thread skill-data/data-transformation/SKILL.md
Comment thread skill-data/data-transformation/SKILL.md Outdated
Comment thread skill-data/data-transformation/SKILL.md Outdated
Comment thread skill-data/data-transformation/SKILL.md
Comment thread skill-data/data-transformation/SKILL.md Outdated
Comment thread skill-data/data-transformation/SKILL.md Outdated
escherize and others added 2 commits June 2, 2026 09:22
Hoist the cross-cutting rules every child skill must follow into a single
Shared Contract section in robot-data-engineer: audience, jargon list
(avoid normalize/grain; ERD/foreign key fine; explain wide/long on first
use), PII handling (ask before showing rows; default to aggregates),
capability limits (name what the CLI can't do instead of erroring into raw
SQL), the autonomy slider, and the final hard stop.

Each child (data-analysis, data-transformation, semantic-layer,
visualization) gets a top-of-file up-pointer: a one-line summary plus an
instruction to load the router's Shared Contract. The summary stands on
its own so a directly-invoked child still gets the gist if the pointer is
skipped. Drop the duplicated autonomy-slider prompt from semantic-layer,
keeping only its stage-specific application of the modes.

Comment on lines +67 to +69
> - I found a mismatch in ...
> - This matters because ...
> - Here's what I was thinking, but I need to check ...

Contributor Author


lovely

escherize added 2 commits June 2, 2026 12:44
Add three cross-cutting rules to the router's Shared Contract, drawn from
two live demo runs (Swoogo, Luma):

- Permission-denied discipline: on a denied query, stop -- never silently
  substitute a different readable table and pass its numbers off as the
  answer (the incident where an Account-table question got answered with
  Salesforce data). Diagnose the likely cause in plain terms, offer to
  search for a readable look-alike, surface any match as a confirm
  question, and hand control back -- no GRANT statements, no
  profile-switching Claude can't reliably execute.
- Scratch files go in ./.scratch, never /tmp (better perms, persists,
  user-reviewable). Swept the /tmp examples in core, transform, document,
  and mbql to match.
- Talking to the user: don't reference things they never saw, assume they
  read only the last ~30 lines, give questions full context, keep
  permission requests to one plain sentence.
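
The permission-denied rule in the first bullet above can be sketched as a small decision function. Everything here is an assumption made for illustration: `QueryResult`, `NextStep`, and `handleQueryResult` are hypothetical names, and the real skill expresses this as prose guidance rather than code.

```typescript
// Hypothetical result shape for a query attempt; not the mb CLI's real output.
type QueryResult =
  | { status: "ok"; table: string }
  | { status: "denied"; table: string };

type NextStep =
  | { kind: "answer"; table: string }
  | { kind: "stop-and-ask"; message: string };

// On a denied query: stop, explain in plain terms, and surface any readable
// look-alike as a confirm question. The look-alike is never used to answer
// without the user's say-so.
function handleQueryResult(result: QueryResult, readableLookAlike?: string): NextStep {
  if (result.status === "ok") {
    return { kind: "answer", table: result.table };
  }
  const base = `I don't have permission to read "${result.table}".`;
  const followUp = readableLookAlike
    ? ` I can read "${readableLookAlike}", which looks similar. Should I use that instead?`
    : " Would you like me to search for a similar table I can read?";
  return { kind: "stop-and-ask", message: base + followUp };
}
```

The point of the shape is that the denied branch has no path to `answer`: control always returns to the user first.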

Rework the router's discovery section to ask the user where the data lives
before crawling (asymmetry: name a db -> ask the schema; name a table ->
ask the db), give the efficient command ladder, and offer a sync when a
table is missing.

De-duplicate auth: core's Auth & profiles section is the single source;
the router keeps one line (it's the front door, may run before core
loads) and data-transformation defers to core.

Wire skillsaw (uvx) as a deterministic linter for the skill collection and
clear every warning it reported:

- Content quality: reword two weak-language hedges (ideally/correctly) to
  concrete behavior; flip the two negative-only "Don't" items (mbql,
  robot-data-engineer) to lead with the positive action.
- Descriptions: compress the four over-long ones (robot-data-engineer,
  mbql, semantic-layer, data-analysis) under the 1024-char / 200-token
  limits, keeping the distinctive trigger phrases and dropping only
  redundant ones. No unquoted colon-space (would break frontmatter parse).
- Bodies: a précis pass over the seven over-budget skills -- cut restated
  lead-ins, filler transitions, emphasis padding, and prose that merely
  restated an adjacent code block. Every rule, command, footgun, and
  worked example is kept; the dense skills were already mostly substance.

Add .skillsaw.yaml pinned to 0.11.4 with an honest token ceiling
(skill.warn 5100 -- above the largest leaf skill's de-fluffed floor,
still catching real future bloat) and skill-description.warn 200. Add a
strict skillsaw job to the Lint workflow.
@escherize escherize marked this pull request as ready for review June 2, 2026 18:46
escherize and others added 10 commits June 2, 2026 12:48
Timothy's "Robot Data Analysts should give more context" (964b272) added a
"Questions must carry their own context" paragraph that overlapped a bullet
I'd added in the same Shared Contract. Keep his fuller version (it carries
the recap template) as canonical, drop the redundant bullet, and point to
it from the "Talking to the user" list so the rule lives in one place.

The whole-journey router was buried at the bottom of core's
specialized-skills list, ranked as a peer of git-sync/mbql, and the
autoloaded discovery stub only pointed at core. An outcome-seeking
user ("make sense of my data", "build a dashboard") had no direct path
to the router that's meant to run first.

- Stub: add a journey-intent fast path straight to
  `mb skills get robot-data-engineer` before loading the dense core ref.
- Core: hoist robot-data-engineer to the top of the list with a
  "start here for anything bigger than one command" lead-in, add
  data-analysis to its routing targets, drop the "name TBD" marker.
- README: drop "name TBD" from the bundled-skills table.

escherize and others added 9 commits June 2, 2026 16:47
- context-budget warn 5100 -> 6000 (data-transformation's honest floor
  grew to ~5,805 tokens).
- Reword 'flag it when appropriate' -> 'flag it on sight' to drop the
  hedging the content-weak-language rule flags.

Ships the robot-data-engineer entrypoint promotion. release.yml
auto-publishes on push to main only when package.json's version is
not yet on npm, so the bump is required for the skill changes to
reach installed CLIs via mb skills get.
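
The publish gate described in the commit message above reduces to one comparison. This is a sketch of the idea only, assuming the list of published versions has already been fetched; `shouldPublish` is a hypothetical name and does not reflect the actual contents of release.yml.

```typescript
// Publish only when the local package.json version is not already among
// the versions published on npm.
function shouldPublish(localVersion: string, publishedVersions: readonly string[]): boolean {
  return publishedVersions.indexOf(localVersion) === -1;
}
```

Under this gate a push to main with an unchanged version is a no-op, which is why the skill changes needed a version bump to ship.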
An unaware user describes a goal ('make sense of my data', 'be my
data analyst'), and Claude matches it against the plugin description
to decide relevance. The old description was CRUD/CLI-only, so a
journey-shaped request matched nothing. Lead with the journey trigger
phrases (mirrored from robot-data-engineer); keep CRUD + git-sync as
the second half.

'analyze X' over-triggers: it matches any analysis request (logs,
code, a CSV, an image), not just Metabase data work. The remaining
data-anchored phrases ('make sense of my data', 'answer questions
about my data', 'report on who registered', 'set up analytics for X')
already cover the intent without the false positives.

The data-analysis skill was added to skill-data/ but the e2e test's
BUNDLED_VISIBLE_NAMES still listed nine skills, so list/path/get-all
assertions and the unknown-skill 'available' message failed across all
E2E matrix lanes.

v58-61 leak the app-DB constraint (NULL not allowed for column
"DATABASE_ID"); head validates at the query layer first (missing or
invalid Database ID). Accept either exact substring.

head validates dataset_query at the query layer (exit 1, missing or
invalid Database ID); v58-61 accept it as an opaque map (exit 0). Assert
the pre-flight bypass instead of a fixed server outcome.
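
The "accept either exact substring" approach from the first of the two commit messages above can be sketched as follows. The two message strings are copied from that commit text, so their exact casing in real server output is an assumption; `matchesAnyExactSubstring` is a hypothetical helper, not the test's actual code.

```typescript
// Messages as quoted in the commit text above (casing is an assumption).
const OLD_SERVER_MESSAGE = 'NULL not allowed for column "DATABASE_ID"';
const HEAD_SERVER_MESSAGE = "missing or invalid Database ID";

// True when the output contains at least one of the candidate substrings.
function matchesAnyExactSubstring(output: string, candidates: readonly string[]): boolean {
  return candidates.some((candidate) => output.includes(candidate));
}

// One assertion that holds across every server version in the matrix.
function isExpectedDatabaseIdError(output: string): boolean {
  return matchesAnyExactSubstring(output, [OLD_SERVER_MESSAGE, HEAD_SERVER_MESSAGE]);
}
```

Matching a fixed list of exact substrings keeps the assertion strict (an unrelated error still fails) while tolerating the known difference between server versions.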
@escherize escherize merged commit b1e3d6d into main Jun 3, 2026
15 checks passed
@escherize escherize deleted the robot-data-scientist-skills branch June 3, 2026 16:46

3 participants