AgentToolkit · vinodmut · Jun 1, 2026
diff --git a/experiments/skill_from_trajectory.py b/experiments/skill_from_trajectory.py
diff --git a/platform-integrations/bob/evolve-lite/commands/evolve-lite-synthesize-skill.md b/platform-integrations/bob/evolve-lite/commands/evolve-lite-synthesize-skill.md
@@ -0,0 +1,4 @@
+---
+description: Convert a saved trajectory into a reusable agent skill (SKILL.md + supporting scripts) that future agents can invoke to skip rediscovered work. Use when a session captured a non-trivial workflow worth promoting from a free-text guideline to an executable skill.
+---
+Use the `evolve-lite-synthesize-skill` skill on the current conversation. Follow the skill's instructions exactly.
diff --git a/platform-integrations/bob/evolve-lite/skills/evolve-lite-synthesize-skill/SKILL.md b/platform-integrations/bob/evolve-lite/skills/evolve-lite-synthesize-skill/SKILL.md
@@ -0,0 +1,148 @@
+---
+name: evolve-lite:synthesize-skill
+description: Convert a saved trajectory into a reusable agent skill (SKILL.md + supporting scripts) that future agents can invoke to skip rediscovered work. Use when a session captured a non-trivial workflow worth promoting from a free-text guideline to an executable skill.
+---
+
+# Skill Synthesizer
+
+## Overview
+
+This skill reads a saved trajectory and produces a **reusable agent skill** — a `SKILL.md` plus any supporting scripts — that captures the *successful* workflow the session discovered. The output goes to `.evolve/skills/<skill-name>/` (canonical, evolve-managed). Future sessions on the same project can then invoke the skill directly instead of re-deriving the workflow.
+
+This is the **executable** counterpart to the `learn` skill's free-text guidelines: `learn` writes Markdown the next agent has to *read and decide what to do*; `synthesize-skill` writes a skill the next agent can simply *call*.
+
+## When To Use
+
+Use this skill when a trajectory captured:
+
+- A **non-trivial workflow** that succeeded after trial-and-error (the eventual happy path is worth promoting from free-text advice to an invocable artifact).
+- A **reusable script or command sequence** the model wrote during the session — particularly one the agent had to reconstruct over multiple attempts.
+- An environment-specific workaround (a missing system tool, a permissions wrinkle, a fallback pipeline) that future sessions in the same project will hit.
+
+Skip this skill — and let `learn` cover the case with a guideline alone — when:
+
+- The successful path was a single trivial command.
+- The workflow embeds secrets, tokens, or one-off user inputs that can't be safely generalized.
+- A skill with the same trigger already exists in `.evolve/skills/` (use `learn`'s guideline path to refine the existing skill instead of creating a duplicate).
+
+## Workflow
+
+### Step 0: Locate the Trajectory
+
+This skill runs in a forked context. **You cannot see the parent conversation directly** — read the trajectory the parent passed in via `args` or via the `Run evolve-lite:synthesize-skill on <path>` instruction.
+
+The trajectory path is either:
+
+- supplied directly as `args` to the skill invocation, or
+- stated in the parent's invocation message as `The saved trajectory path is: <path>` — take everything after the colon, strip surrounding whitespace and quotes.
+
+If neither is present, scan `.evolve/trajectories/` for the most recently modified `claude-transcript_<session-id>.jsonl` and use that. If `.evolve/trajectories/` does not exist or is empty, output zero artifacts and exit — do not invent a trajectory.
+
+**Read the trajectory with the `Read` tool — do NOT shell out.** The transcript is JSONL: one JSON object per line. Filter for `"type": "assistant"` and `"type": "human"` records and reconstruct the flow from `message.content`.
+
+### Step 1: Identify the Successful Workflow
+
+Walk the trajectory and locate the **final, working** tool sequence — the one that actually produced the answer. Distinguish it from the trial-and-error leading up to it.
+
+Capture:
+
+- **What the user asked** (the original prompt).
+- **What ultimately worked** — the exact tool calls, scripts, or command sequences that produced the answer. Quote them verbatim from the trajectory.
+- **What didn't work** (the dead-ends). You will use these to write the `## When To Use` triggers so the future agent knows when to reach for this skill *instead of* the failing approaches.
+- **Environment assumptions** — what was missing or had to be installed (e.g. "no exiftool, pip install Pillow needed").
+
+If no clearly successful workflow is in the trajectory (the session ended without reaching an answer, or the answer came from a single trivial call), output zero artifacts and exit.
+
+### Step 2: Decide a Skill Name and Trigger
+
+The skill **name** must be:
+
+- kebab-case, action-oriented (`extract-exif-metadata`, `parse-cloudwatch-logs`, `restart-stuck-deploy`)
+- specific enough that a future agent reading just the name can guess what it does
+- not a duplicate of any existing entry under `.evolve/skills/`
+
+The skill **description** (one line, in the SKILL.md frontmatter) should describe the *task* the skill solves, not the trajectory it came from. Bad: "Solves the focal-length question from session abc123." Good: "Extract EXIF metadata (focal length, GPS, lens, timestamps) from JPEG/HEIC images using Pillow when system EXIF tools are unavailable."
+
+The **trigger** (in the SKILL.md body, under `## When To Use`) should describe the broad task context, not the narrow original request — same rule as the `learn` skill's guidelines.
+
+Before continuing, list `.evolve/skills/` (use the `Glob` tool, not `find` / `ls`) and confirm your chosen name does not collide with an existing skill.
+
+### Step 3: Draft the SKILL.md
+
+Author a SKILL.md with this exact frontmatter shape — the validator in Step 5 will reject it otherwise:
+
+```markdown
+---
+name: <kebab-case-name>
+description: <one-line task description>
+---
+
+# <Title Case Name>
+
+## Overview
+<1–2 sentences: what the skill does and when to use it>
+
+## When To Use
+- <trigger 1>
+- <trigger 2>
+
+## Workflow
+<step-by-step instructions for the agent>
+```
+
+Notes:
+
+- `context: fork` is **omitted** for synthesized skills. They run in the parent context so they can write files into the workspace and report back.
+- Do NOT inline the full successful script into the SKILL.md if it's more than ~10 lines — put it in a sibling `scripts/` file (Step 4) and reference it from the SKILL.md.
+- The Workflow section should describe what to do *to solve the task*, not retell the original session. A future agent reading this should be able to act without ever seeing the trajectory.
+
+### Step 4: Emit Supporting Scripts
+
+If the successful workflow used a non-trivial script (more than a one-liner), write it as a sibling file under `scripts/` of your draft skill directory. Use the **already-validated code from the trajectory** — do not invent variations. Strip incidental one-off inputs (literal file names, IDs, hard-coded outputs) and replace with arguments or stdin where appropriate.
+
+Common shape:
+
+```text
+.evolve/skills/<name>/
+├── SKILL.md
+└── scripts/
+    └── <action>.py     # callable as `python3 scripts/<action>.py <args>`
+```
+
+If the workflow was a sequence of shell commands rather than a script, encode it as an executable shell script (`scripts/<action>.sh`) so future agents can invoke it as a single unit instead of replaying each command.
+
+If no non-trivial script is needed (the workflow is a sequence of standard tool calls), skip this step — the SKILL.md alone is the skill.
+
+### Step 5: Finalize
+
+Place your draft files (SKILL.md and any scripts) under a temporary directory, e.g. `/tmp/synthesized-<name>/`, then call:
+
+```bash
+python3 .bob/skills/evolve-lite-synthesize-skill/scripts/synthesize.py finalize --src /tmp/synthesized-<name>/ --name <kebab-case-name> --trajectory <saved_trajectory_path>
+```
+
+The script will:
+
+- Validate the draft (`--name` must be kebab-case; frontmatter `name` and `description` are required; `name` must match `--name`; the body must be at least 50 characters).
+- Reject the skill if a same-named skill already exists in `.evolve/skills/` (overwriting requires `--force`).
+- Copy the directory into `.evolve/skills/<name>/` (canonical).
+- Append a `synthesize_skill` event to `.evolve/audit.log` recording the new skill, the source trajectory, and the timestamp.
+- Print the destination path(s).
+
+If the validator rejects the draft, fix the SKILL.md and retry — do not edit files under `.evolve/skills/` directly.
+
+### Step 6: Confirm
+
+After the script returns, list the destination directories with the `Glob` tool to confirm the files landed. Output a short summary:
+
+- The skill name and description.
+- The destination paths.
+- A one-line note on what future sessions should now be able to do that they couldn't before.
+
+## Best Practices
+
+1. **One skill per workflow.** If the trajectory contains two unrelated successful workflows, run synthesis twice with different names — do not pack them into one skill.
+2. **Cite the trajectory.** Include the `--trajectory` flag so the audit log records provenance; future maintainers can trace the skill back to the session that produced it.
+3. **Don't promote one-shots.** A skill is worth synthesizing only if the trigger is plausibly recurring. If the trajectory looks like a one-off, prefer the `learn` skill's guideline path instead.
+4. **Don't paraphrase failure.** The skill describes what *worked*. If you find yourself writing "this skill avoids the problem where exiftool isn't installed," restate it as "uses Pillow to extract EXIF; works in environments without system EXIF tools." Triggers describe *when*, not *what failed*.
+5. **Keep scripts minimal.** Strip incidental log lines, debug prints, and validation that wasn't actually exercised in the trajectory. If a feature wasn't validated, leave it out.
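
As an illustration of the Step 0 lookup order in the SKILL.md above, the logic could be sketched in Python roughly as follows. This is only a sketch of the decision flow: the skill itself instructs the agent to use the `Read` tool rather than run code, and the helper names (`path_from_message`, `latest_transcript`, `conversation_records`) are hypothetical and do not appear anywhere in this PR. The marker string, the file-name pattern, and the record types are taken from the SKILL.md text.

```python
from __future__ import annotations

import json
from pathlib import Path

# Marker text quoted in Step 0 of the SKILL.md.
MARKER = "The saved trajectory path is:"


def path_from_message(message: str) -> str | None:
    """Return everything after the marker, stripped of whitespace and quotes."""
    _, found, rest = message.partition(MARKER)
    if not found:
        return None
    return rest.strip().strip("\"'").strip() or None


def latest_transcript(traj_dir: Path) -> Path | None:
    """Fallback: most recently modified transcript, or None if there is none."""
    if not traj_dir.is_dir():
        return None
    candidates = list(traj_dir.glob("claude-transcript_*.jsonl"))
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


def conversation_records(lines: list[str]) -> list[dict]:
    """Keep only the assistant/human records from JSONL lines."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines (an assumption of this sketch)
        record = json.loads(line)
        if record.get("type") in {"assistant", "human"}:
            records.append(record)
    return records
```

When both lookups return `None`, the SKILL.md says to emit zero artifacts and exit rather than guess at a trajectory.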
diff --git a/...rm-integrations/bob/evolve-lite/skills/evolve-lite-synthesize-skill/scripts/synthesize.py b/...rm-integrations/bob/evolve-lite/skills/evolve-lite-synthesize-skill/scripts/synthesize.py
@@ -0,0 +1,171 @@
+#!/usr/bin/env python3
+"""Synthesize-skill helper: validate and install a synthesized skill.
+
+The synthesize-skill skill (a subagent) is responsible for the *judgment* —
+reading the trajectory, identifying the successful workflow, and writing
+draft SKILL.md + supporting scripts into a temporary directory.
+
+This script is the *plumbing* — it validates the draft frontmatter, copies
+the directory into the canonical evolve-managed location (and into the
+platform-specific skills loader location, when one is configured), and writes an audit-log entry.
+
+Usage:
+    synthesize.py finalize --src <draft_dir> --name <kebab-case-name> \
+        [--trajectory <path>] [--workspace <path>] [--force]
+"""
+
+from __future__ import annotations
+
+import argparse
+import re
+import shutil
+import sys
+from pathlib import Path
+
+# Reuse the plugin's shared audit-log writer from the lib directory.
+_script = Path(__file__).resolve()
+_lib = None
+for _ancestor in _script.parents:
+    _candidate = _ancestor / "lib" / "evolve-lite"
+    if (_candidate / "audit.py").is_file():
+        _lib = _candidate
+        break
+if _lib is None:
+    raise ImportError(f"Cannot find plugin lib directory above {_script}")
+sys.path.insert(0, str(_lib))
+from audit import append as audit_append  # noqa: E402
+
+
+KEBAB_RE = re.compile(r"^[a-z][a-z0-9]*(?:-[a-z0-9]+)*$")
+FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*\n", re.DOTALL)
+
+
+def _parse_frontmatter(skill_md: Path) -> tuple[dict[str, str], str]:
+    """Minimal YAML-ish frontmatter parser. Supports `key: value` lines only.
+
+    Returns (frontmatter_dict, body_text).
+    """
+    text = skill_md.read_text(encoding="utf-8")
+    match = FRONTMATTER_RE.match(text)
+    if not match:
+        raise ValueError(f"{skill_md} has no frontmatter block")
+    fm: dict[str, str] = {}
+    for line in match.group(1).splitlines():
+        line = line.strip()
+        if not line or line.startswith("#"):
+            continue
+        if ":" not in line:
+            raise ValueError(f"{skill_md}: malformed frontmatter line: {line!r}")
+        key, _, value = line.partition(":")
+        fm[key.strip()] = value.strip().strip('"').strip("'")
+    return fm, text[match.end() :]
+
+
+def _validate_draft(src: Path, name: str) -> None:
+    if not KEBAB_RE.match(name):
+        raise SystemExit(f"--name {name!r} is not kebab-case")
+    if not src.is_dir():
+        raise SystemExit(f"--src {src} is not a directory")
+    skill_md = src / "SKILL.md"
+    if not skill_md.is_file():
+        raise SystemExit(f"missing SKILL.md in {src}")
+
+    try:
+        fm, body = _parse_frontmatter(skill_md)
+    except ValueError as exc:
+        raise SystemExit(f"SKILL.md frontmatter is malformed: {exc}") from exc
+    if "name" not in fm or "description" not in fm:
+        raise SystemExit(f"SKILL.md frontmatter must include `name` and `description` (got: {sorted(fm.keys())})")
+    if fm["name"] != name:
+        raise SystemExit(f"frontmatter name {fm['name']!r} does not match --name {name!r}")
+    if not fm["description"]:
+        raise SystemExit("SKILL.md description is empty")
+    if len(body.strip()) < 50:
+        raise SystemExit("SKILL.md body is suspiciously short — not enough instructions to be useful")
+
+
+def _resolve_workspace(arg: str | None) -> Path:
+    if arg:
+        return Path(arg).resolve()
+    # Fall back to the current working directory; at runtime this is the
+    # workspace mounted into the sandbox (`/workspace`) or the host repo root.
+    return Path.cwd().resolve()
+
+
+def _check_dest(dst: Path, force: bool) -> None:
+    """Reject the install if dst would block it; let _copy_into do the actual write."""
+    if dst.exists() and not force:
+        raise SystemExit(f"{dst} already exists (use --force to overwrite)")
+
+
+def _copy_into(src: Path, dst: Path) -> None:
+    if dst.exists():
+        shutil.rmtree(dst)
+    shutil.copytree(src, dst)
+
+
+# Per-platform runtime mirror — the loader-discoverable directory the
+# synthesized skill is copied into so the host agent picks it up
+# automatically. Set to None where the platform doesn't have a runtime
+# skills directory; only the canonical `.evolve/skills/<name>/` write
+# happens in that case.
+_RUNTIME_MIRROR_DIR: str | None = None
+
+
+def cmd_finalize(args: argparse.Namespace) -> int:
+    src = Path(args.src).resolve()
+    name = args.name
+    workspace = _resolve_workspace(args.workspace)
+
+    _validate_draft(src, name)
+
+    evolve_dst = workspace / ".evolve" / "skills" / name
+    runtime_dst: Path | None = workspace / _RUNTIME_MIRROR_DIR / name if _RUNTIME_MIRROR_DIR is not None else None
+
+    # Pre-check both destinations before any copy so a blocked second
+    # write doesn't leave the first half of the install on disk.
+    _check_dest(evolve_dst, args.force)
+    if runtime_dst is not None:
+        _check_dest(runtime_dst, args.force)
+
+    _copy_into(src, evolve_dst)
+    if runtime_dst is not None:
+        _copy_into(src, runtime_dst)
+
+    audit_append(
+        project_root=str(workspace),
+        event="synthesize_skill",
+        skill=name,
+        evolve_path=str(evolve_dst.relative_to(workspace)),
+        runtime_path=str(runtime_dst.relative_to(workspace)) if runtime_dst else "",
+        trajectory=args.trajectory or "",
+    )
+
+    print(f"Installed skill {name!r}:")
+    print(f"  evolve:  {evolve_dst}")
+    if runtime_dst is not None:
+        print(f"  runtime: {runtime_dst}")
+    return 0
+
+
+def main(argv: list[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
+    sub = parser.add_subparsers(dest="cmd", required=True)
+
+    p_finalize = sub.add_parser(
+        "finalize",
+        help="Validate a draft skill directory and install it under .evolve/skills/ (and the platform's runtime skills dir, if any).",
+    )
+    p_finalize.add_argument("--src", required=True, help="Draft directory containing SKILL.md and any scripts/")
+    p_finalize.add_argument("--name", required=True, help="Kebab-case skill name; must match SKILL.md frontmatter")
+    p_finalize.add_argument("--trajectory", default="", help="Source trajectory path (recorded in audit.log)")
+    p_finalize.add_argument("--workspace", default=None, help="Project root (defaults to CWD)")
+    p_finalize.add_argument("--force", action="store_true", help="Overwrite existing skill of the same name")
+    p_finalize.set_defaults(func=cmd_finalize)
+
+    args = parser.parse_args(argv)
+    return args.func(args)
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
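
To make the validation rules in `synthesize.py` concrete, the sketch below exercises the two regular expressions from the diff against a small draft. The patterns and the parsing loop are copied from the script; `parse_frontmatter_text` is a hypothetical string-based variant introduced only for this example (the real `_parse_frontmatter` reads from a `Path`), and the sample draft is made up.

```python
from __future__ import annotations

import re

# Patterns copied from synthesize.py in this diff.
KEBAB_RE = re.compile(r"^[a-z][a-z0-9]*(?:-[a-z0-9]+)*$")
FRONTMATTER_RE = re.compile(r"\A---\s*\n(.*?)\n---\s*\n", re.DOTALL)


def parse_frontmatter_text(text: str) -> tuple[dict[str, str], str]:
    """String-based variant of _parse_frontmatter (hypothetical, for illustration)."""
    match = FRONTMATTER_RE.match(text)
    if not match:
        raise ValueError("no frontmatter block")
    fm: dict[str, str] = {}
    for line in match.group(1).splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if ":" not in line:
            raise ValueError(f"malformed frontmatter line: {line!r}")
        # Split on the first colon only, so values may contain colons.
        key, _, value = line.partition(":")
        fm[key.strip()] = value.strip().strip('"').strip("'")
    return fm, text[match.end():]


# Made-up draft, written with escapes to keep it on plain string lines.
draft = (
    "---\n"
    "name: extract-exif-metadata\n"
    'description: "Extract EXIF metadata: focal length, GPS, lens"\n'
    "---\n"
    "\n"
    "# Extract Exif Metadata\n"
)

fm, body = parse_frontmatter_text(draft)
print(fm["name"], bool(KEBAB_RE.match(fm["name"])))
print(fm["description"])
```

Because the parser handles flat `key: value` lines only, nested YAML or multi-line values in a draft's frontmatter would either raise or be read as literal strings, so synthesized skills should keep the frontmatter to the two flat keys shown in Step 3.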