CraftKit is a cross-agent toolkit for creating, improving, and operationalizing prompts and skills for coding agents such as Claude Code and Codex.
Prompt assets and agent skills often become fragmented, provider-specific, and hard to reuse. CraftKit exists to keep them file-first, portable, reviewable, and easy to improve over time.
All seven skills install as Claude Code custom slash commands.
npx skills add sungjunlee/craftkitAdd -g -y for global install without prompts:
npx skills add sungjunlee/craftkit -g -y/plugin marketplace add https://github.com/sungjunlee/craftkit.git
/plugin install craftkit@craftkit
Install from a local clone
git clone https://github.com/sungjunlee/craftkit.git
cd craftkit
npx skills add . -g -yFor Codex or any other agent, see Use in other agents below.
| Skill | Use when |
|---|---|
craft-prompt |
a new prompt is needed from scratch for any LLM (Claude, GPT, Gemini, Perplexity, etc.) |
craft-scaffold |
a rough idea needs structure — goals, inputs, workflow, outputs — before implementation |
craft-critique |
a prompt or skill "feels off" and a diagnostic pass should come before any rewrite |
craft-tune |
an existing prompt is close but needs targeted, minimal-diff sharpening |
craft-survey |
a new skill should be grounded in prior art, extracting only patterns that carry their weight |
craft-autoresearch |
a prompt or skill works "sometimes" and needs eval-driven iteration |
craft-handoff |
a session is ending and the next session needs a copy-paste-ready continuation prompt (clipboard + optional auto-load on /clear) |
Each skill lives at skills/<skill-name>/SKILL.md — plain markdown with YAML frontmatter, loadable as a Claude Code skill or copy-pasteable into any other agent.
Six of the seven skills (craft-prompt, craft-scaffold, craft-critique, craft-tune, craft-survey, craft-autoresearch) have been optimized through craft-autoresearch passes against eval suites — including craft-autoresearch itself (reflexive meta-pass). craft-handoff is new and has not yet been through an autoresearch pass. Per-session baseline → kept-state scores and mutation rationale live in the commit bodies. Run artifacts are preserved at ~/.craftkit/autoresearch/<skill>/<date-slug>/ outside the repo.
- generating new prompts from scratch (task, research, session handoff, templates)
- prompt design and restructuring
- reusable skill design
- diagnostic critique and quality checks
- iterative improvement loops
- survey-backed best practices
- copy-pasteable outputs for agent workflows
- File-first and diff-friendly
- Small composable units
- Explicit inputs and outputs
- Cross-agent portability (core skill spines stay provider-neutral; platform-specific detail stays in sub-skills like
craft-prompt/guides/) - Eval-driven improvement when possible
- Copy-pasteable results over fancy abstractions
CraftKit skills are plain markdown with YAML frontmatter, so they port easily:
- Open the relevant
SKILL.md. - Paste the body (everything after the frontmatter) into the target agent's system prompt or instructions.
- Keep the frontmatter
descriptionline as context so the agent knows when to apply the skill.
See docs/examples/tune-a-prompt.md for a walk-through of critiquing an existing prompt, tuning it, and running a short improvement loop.
sungjunlee/prompt-builder— predecessor project. Its mature prompt-authoring asset (5-step process, 6 building blocks, platform guides, templates) was absorbed wholesale intocraft-prompt. Kept on GitHub for reference; new work happens here.karpathy/autoresearch— Andrej Karpathy's ML training-loop project that introduced the autoresearch methodology (give an agent a baseline, let it experiment overnight, keep what improves, discard what doesn't).craft-autoresearchadapts that loop discipline to prompt and skill artifacts instead of model training code.byungjunjang/jangpm-meta-skills— four-skill meta toolkit for Claude Code and Codex (blueprint,deep-dive,reflect,autoresearch). Itsautoresearchskill contributed implementation patterns — experiment contract shape, the three-eval-type taxonomy (binary / comparative / fidelity), deletion discipline — thatcraft-autoresearchbuilds on.
MIT — see LICENSE.