33 changes: 33 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,38 @@
# Changelog

## 0.21.8 — 2026-05-26

### `bstack skills audit` — skill registry audit (Phase 6c)

New subcommand `bstack skills audit` — crystallizes the "Skill Registry Audit" pattern (bstack-engine candidate ledger, 3/3 instances: Steipete's skill-cleaner + the 2026-05-25 manual inventory + P7 Freshness as a degenerate single-dimension case). Adapts Steipete's skill-cleaner (steipete/agent-scripts) algorithm for Claude Code + bstack.

Five reports:
1. **Budget** — total description token cost (`ceil(utf8_bytes / chars_per_token)`, identical to Steipete) vs a ceiling (default 20,000 = 2% of 1M). Flags over-budget.
2. **Duplicates** — same skill name across >1 distinct realpath (symlink-deduped, so the `.agents` ↔ workspace symlink case doesn't false-positive).
3. **Registry coherence** — `companion-skills.yaml` vs installed roots: registered-but-missing + installed-but-unregistered.
4. **Unused** — no invocation trace in recent Claude Code session logs (`~/.claude/projects/**/*.jsonl`, replacing Codex's `~/.codex/history.jsonl`); `--months` window; `--no-logs` to skip.
5. **Roots** — skill count per root.
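
The budget arithmetic in report 1 can be sketched as follows. This is a minimal illustration that assumes the default divisor of 4 and the default 20,000-token ceiling; `budget_ratio` and the sample strings are hypothetical names introduced here, not part of the auditor.

```python
import math
from typing import Iterable


def token_cost(text: str, chars_per_token: int = 4) -> int:
    """Estimate tokens as ceil(utf8_bytes / chars_per_token)."""
    if not text:
        return 0
    return math.ceil(len(text.encode("utf-8")) / chars_per_token)


def budget_ratio(descriptions: Iterable[str], ceiling: int = 20_000) -> float:
    """Hypothetical helper: fraction of the ceiling used by all descriptions."""
    total = sum(token_cost(d) for d in descriptions)
    return total / ceiling if ceiling else 0.0


# The estimate counts UTF-8 bytes, not characters, so non-ASCII text costs more:
# "cafe" is 4 bytes, "café" is 5 bytes.
print(token_cost("cafe"), token_cost("café"))
print(budget_ratio(["abcd"] * 4, ceiling=8))
```

Because the cost is rounded up per description, many short descriptions cost slightly more in total than one description of the same combined length.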

`--json` for machine output. Env-overridable (`BSTACK_AUDIT_ROOTS`, `BSTACK_DIR`, `BSTACK_AUDIT_LOG_GLOB`) for hermetic testing.
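
For scripting, the `--json` output could be consumed along these lines. This is a sketch: the key names follow the structure emitted by `scripts/skill-audit.py` in this change, but the payload values and the `summarize` helper are invented for illustration.

```python
import json

# Illustrative payload shaped like `bstack skills audit --json` output.
# The values are invented, not real audit results.
SAMPLE = """
{
  "total_skills": 3,
  "unique_names": 2,
  "budget": {"total_tokens": 30, "ceiling": 20, "used_ratio": 1.5},
  "duplicates": {"demo": ["/a/demo/SKILL.md", "/b/demo/SKILL.md"]},
  "registry": {"registered_missing": ["ghost"], "installed_unregistered": []},
  "unused": ["demo"],
  "roots": {"/a": 2, "/b": 1}
}
"""


def summarize(report: dict) -> list:
    """Hypothetical helper: turn an audit report into a list of findings."""
    problems = []
    if report["budget"]["used_ratio"] > 1.0:
        problems.append("over budget")
    if report["duplicates"]:
        problems.append(f"{len(report['duplicates'])} duplicate name(s)")
    if report["registry"]["registered_missing"]:
        problems.append("registered skills missing from disk")
    return problems


findings = summarize(json.loads(SAMPLE))
print(findings)
```

Note that with `--no-logs` the `unused` list is always empty, so a consumer cannot tell "nothing unused" from "scan skipped" using that key alone.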

### First real-run findings (2026-05-26, against the 3 broomva skill roots)

- 362 skills, 331 unique names across `~/.claude/skills` + `~/.agents/skills` + `~/broomva/skills`
- Description budget at **223% of the 2% ceiling** (44,681 tokens / 20,000) — corroborates the 2026-05-25 skill-cleaner reading (269% over the broader Codex+plugin set)
- 31 cross-root duplicates (mostly snapshot-in-.claude + source-in-workspace — expected, but worth surfacing)
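
The difference between a real cross-root duplicate (a snapshot copy) and a symlinked root can be sketched as below. This is a simplified illustration rather than the auditor itself: it keys on directory names instead of the frontmatter `name`, the `find_duplicates` helper is hypothetical, and it assumes a platform where creating symlinks is permitted.

```python
import os
import tempfile
from pathlib import Path


def find_duplicates(roots):
    """Hypothetical helper: map skill name -> sorted realpaths for names
    that resolve to more than one distinct SKILL.md file."""
    by_name = {}
    for root in roots:
        for child in sorted(Path(root).iterdir()):
            skill_md = child / "SKILL.md"
            if skill_md.is_file():
                by_name.setdefault(child.name, set()).add(os.path.realpath(skill_md))
    return {n: sorted(p) for n, p in by_name.items() if len(p) > 1}


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    source = base / "workspace"
    snapshot = base / "snapshot"
    for root in (source, snapshot):
        (root / "demo").mkdir(parents=True)
        (root / "demo" / "SKILL.md").write_text("---\nname: demo\n---\n")

    # A symlinked root exposes the same files, so it adds no new realpaths.
    linked = base / "linked"
    os.symlink(source, linked, target_is_directory=True)

    symlink_only = find_duplicates([source, linked])
    real_copy = find_duplicates([source, snapshot])

print(sorted(symlink_only))  # no duplicate: both paths resolve to one file
print(sorted(real_copy))     # duplicate: two distinct files share a name
```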

### Files

- **NEW** `scripts/skill-audit.py` (~250 lines) — the auditor; pyyaml-based; env-overridable for hermetic tests
- **NEW** `tests/skill-audit.test.sh` — 9 hermetic tests (fake roots + registry + logs; realpath-dedupe, duplicate detection, registry coherence, unused detection, budget flag). All pass.
- **CHANGED** `bin/bstack-skills` — `audit)` dispatch + usage
- **VERSION** `0.21.7` → `0.21.8`

This completes the skills-monorepo meta-tooling: `graduate` (migrate skills in) + `audit` (keep the registry healthy).

---


## 0.21.7 — 2026-05-26

### `bstack skills graduate` — crystallized Tier-2 migration (Phase 6b)
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-0.21.7
+0.21.8
5 changes: 5 additions & 0 deletions bin/bstack-skills
@@ -49,13 +49,17 @@ Subcommands:
  graduate <name> [options]    Migrate a standalone broomva/<name> skill repo
                               into the broomva/skills Tier-2 monorepo
                               (≥ 0.21.7). Run `bstack skills graduate --help`.
  audit [--json] [--no-logs]   Registry audit: budget, duplicates, registry
                               coherence, unused, roots (≥ 0.21.8).
  help | --help                This message

Examples:
  bstack skills install
  bstack skills install --required-only
  bstack skills install --dry-run
  bstack skills status --json
  bstack skills audit
  bstack skills audit --no-logs --json
  bstack skills graduate handoff --category lifecycle --dry-run
EOF
}
@@ -271,6 +275,7 @@ case "${1:-}" in
  status)   shift; cmd_status "$@" ;;
  list)     shift; cmd_list "$@" ;;
  graduate) shift; exec "$BSTACK_DIR/scripts/skill-graduate.sh" "$@" ;;
  audit)    shift; exec python3 "$BSTACK_DIR/scripts/skill-audit.py" "$@" ;;
  -h|--help|help|"") usage ;;
  *)
    echo "bstack-skills: unknown subcommand '$1'" >&2
253 changes: 253 additions & 0 deletions scripts/skill-audit.py
@@ -0,0 +1,253 @@
#!/usr/bin/env python3
"""skill-audit.py — skill registry audit (bstack v0.21.8).

Invoked as: `bstack skills audit [options]`

Crystallizes the "Skill Registry Audit" pattern (bstack-engine candidate ledger,
3/3 instances: Steipete's skill-cleaner + the 2026-05-25 manual inventory +
P7 Freshness as a degenerate single-dimension case). Adapts Steipete's
skill-cleaner (steipete/agent-scripts) algorithm for Claude Code + bstack:

- token math identical: ceil(utf8_bytes / chars_per_token)
- realpath-dedupe of symlinked roots (the .agents <-> workspace symlink case)
- usage-trace scanning of Claude Code logs (~/.claude/projects/**/*.jsonl)
rather than Codex's ~/.codex/history.jsonl

Five reports:
1. Budget — total description token cost vs ceiling (default 2% of 1M)
2. Duplicates — same skill name across >1 distinct realpath
3. Registry — coherence between companion-skills.yaml and installed roots
(registered-but-missing, installed-but-unregistered)
4. Unused — no invocation trace in recent session logs (--months window)
5. Roots — skill count per root

Env overrides (test fixtures):
BSTACK_DIR bstack root (for default companion-skills.yaml)
BSTACK_AUDIT_ROOTS colon-separated skill roots (overrides defaults)
BSTACK_AUDIT_LOG_GLOB glob for session logs (default ~/.claude/projects/**/*.jsonl)
"""
from __future__ import annotations

import argparse
import glob
import json
import math
import os
import re
import sys
from pathlib import Path

try:
    import yaml
except ImportError:
    print("skill-audit: python3 yaml module required (pip install pyyaml)", file=sys.stderr)
    sys.exit(2)

HOME = Path.home()
DEFAULT_ROOTS = [
    HOME / ".claude" / "skills",
    HOME / ".agents" / "skills",
    Path(os.environ.get("BROOMVA_ROOT", HOME / "broomva")) / "skills",
]


def parse_frontmatter(skill_md: Path) -> dict:
    """Extract YAML frontmatter (name, description) from a SKILL.md."""
    try:
        text = skill_md.read_text(encoding="utf-8", errors="replace")
    except OSError:
        return {}
    if not text.startswith("---"):
        return {}
    end = text.find("\n---", 3)
    if end == -1:
        return {}
    block = text[3:end]
    try:
        data = yaml.safe_load(block)
        return data if isinstance(data, dict) else {}
    except yaml.YAMLError:
        return {}


def token_cost(text: str, chars_per_token: int) -> int:
    """Codex-identical: ceil(utf8_bytes / chars_per_token)."""
    if not text:
        return 0
    return math.ceil(len(text.encode("utf-8")) / chars_per_token)


def discover_skills(roots: list[Path]) -> list[dict]:
    """Find <root>/<name>/SKILL.md one level below each root (the monorepo's
    skills/<name>/ layout is covered by passing its skills/ directory as a root).
    realpath-dedupe so a symlinked root doesn't double-count.
    """
    seen_realpaths: set[str] = set()
    skills: list[dict] = []
    for root in roots:
        if not root.is_dir():
            continue
        # Each immediate child dir with a SKILL.md is a skill.
        for child in sorted(root.iterdir()):
            skill_md = child / "SKILL.md"
            if not skill_md.is_file():
                continue
            rp = os.path.realpath(skill_md)
            if rp in seen_realpaths:
                continue
            seen_realpaths.add(rp)
            fm = parse_frontmatter(skill_md)
            # Fall back to the directory name when `name` is missing or empty.
            name = fm.get("name") or child.name
            desc = fm.get("description", "") or ""
            if isinstance(desc, list):
                desc = " ".join(str(d) for d in desc)
            skills.append({
                "name": str(name),
                "dir_name": child.name,
                "root": str(root),
                "path": str(skill_md),
                "realpath": rp,
                "desc_chars": len(str(desc)),
                "description": str(desc),
            })
    return skills


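def load_registry(yaml_path: Path) -> list[dict]:
    """Load the `skills` list from companion-skills.yaml; [] if absent or invalid."""
    if not yaml_path.is_file():
        return []
    try:
        data = yaml.safe_load(yaml_path.read_text(encoding="utf-8"))
    except (OSError, yaml.YAMLError):
        return []
    if not isinstance(data, dict):
        return []
    # An empty `skills:` key parses as None; treat it as an empty list.
    skills = data.get("skills")
    return skills if isinstance(skills, list) else []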


def scan_usage(skill_names: list[str], log_glob: str, months: int) -> set[str]:
    """Return the set of skill names with an invocation trace in recent logs.
    Heuristic (matches Steipete): a name appears as `$<name>`, `--skill <name>`,
    or `skills/<name>/SKILL.md` in a session JSONL within the window.
    """
    import time
    # A "month" is approximated as 31 days; the window is judged by file mtime.
    cutoff = time.time() - months * 31 * 24 * 3600
    used: set[str] = set()
    if not skill_names:
        return used
    # Compile one pattern per name (word-boundary-ish).
    patterns = {n: re.compile(
        r"(?:\$" + re.escape(n) + r"\b|--skill\s+" + re.escape(n) + r"\b|skills/" + re.escape(n) + r"/SKILL\.md)"
    ) for n in skill_names}
    for fpath in glob.glob(log_glob, recursive=True):
        try:
            if os.path.getmtime(fpath) < cutoff:
                continue
            with open(fpath, "r", encoding="utf-8", errors="replace") as fh:
                blob = fh.read()
        except OSError:
            continue
        for n, pat in patterns.items():
            if n in used:
                continue
            if pat.search(blob):
                used.add(n)
    return used


def main() -> int:
    ap = argparse.ArgumentParser(prog="bstack skills audit", description="Skill registry audit.")
    ap.add_argument("--roots", action="append", default=[], help="Additional skill root (repeatable).")
    ap.add_argument("--budget-tokens", type=int, default=20000, help="Token budget ceiling (default 20000 = 2%% of 1M).")
    ap.add_argument("--chars-per-token", type=int, default=4, help="Token-cost divisor (default 4).")
    ap.add_argument("--months", type=int, default=3, help="Usage-trace window for unused detection (default 3).")
    ap.add_argument("--no-logs", action="store_true", help="Skip usage-trace scanning.")
    ap.add_argument("--json", action="store_true", help="Machine-readable output.")
    args = ap.parse_args()

    # Resolve roots: BSTACK_AUDIT_ROOTS replaces the defaults; --roots always appends.
    if os.environ.get("BSTACK_AUDIT_ROOTS"):
        roots = [Path(p) for p in os.environ["BSTACK_AUDIT_ROOTS"].split(":") if p]
    else:
        roots = list(DEFAULT_ROOTS)
    roots += [Path(p) for p in args.roots]

    bstack_dir = Path(os.environ.get("BSTACK_DIR", Path(__file__).resolve().parent.parent))
    registry = load_registry(bstack_dir / "references" / "companion-skills.yaml")

    skills = discover_skills(roots)
    names = sorted({s["name"] for s in skills})

    # 1. Budget. Clamp chars_per_token to >=1 so a bad flag can't ZeroDivision.
    cpt = max(1, args.chars_per_token)
    total_tokens = sum(token_cost(s["description"], cpt) for s in skills)
    budget_used_ratio = (total_tokens / args.budget_tokens) if args.budget_tokens else 0.0

    # 2. Duplicates — same name across >1 distinct realpath
    by_name: dict[str, list[dict]] = {}
    for s in skills:
        by_name.setdefault(s["name"], []).append(s)
    duplicates = {n: v for n, v in by_name.items() if len({x["realpath"] for x in v}) > 1}

    # 3. Registry coherence. Skip entries that are not mappings with a name.
    reg_names = {r["name"] for r in registry if isinstance(r, dict) and "name" in r}
    installed_names = set(names)
    registered_missing = sorted(reg_names - installed_names)
    installed_unregistered = sorted(installed_names - reg_names)

    # 4. Unused
    log_glob = os.environ.get("BSTACK_AUDIT_LOG_GLOB", str(HOME / ".claude" / "projects" / "**" / "*.jsonl"))
    unused: list[str] = []
    if not args.no_logs:
        used = scan_usage(names, log_glob, args.months)
        unused = sorted(set(names) - used)

    # 5. Roots
    root_counts: dict[str, int] = {}
    for s in skills:
        root_counts[s["root"]] = root_counts.get(s["root"], 0) + 1

    if args.json:
        print(json.dumps({
            "total_skills": len(skills),
            "unique_names": len(names),
            "budget": {"total_tokens": total_tokens, "ceiling": args.budget_tokens, "used_ratio": round(budget_used_ratio, 3)},
            "duplicates": {n: [x["path"] for x in v] for n, v in duplicates.items()},
            "registry": {"registered_missing": registered_missing, "installed_unregistered": installed_unregistered},
            "unused": unused,
            "roots": root_counts,
        }, indent=2))
        return 0

    # Human report
    print("# Skill Audit Report\n")
    print(f"discovered: {len(skills)} skills ({len(names)} unique names) across {len([r for r in roots if r.is_dir()])} roots\n")
    print("## Budget")
    print(f" description tokens : {total_tokens:,} / {args.budget_tokens:,} ceiling ({budget_used_ratio*100:.1f}%)")
    if budget_used_ratio > 1.0:
        print(f" ⚠ OVER BUDGET by {(budget_used_ratio-1)*100:.1f}% — consider trimming descriptions or pruning unused skills")
    print()
    print(f"## Duplicates ({len(duplicates)})")
    if duplicates:
        for n, v in sorted(duplicates.items()):
            print(f" {n}:")
            for x in v:
                print(f" - {x['path']}")
    else:
        print(" (none)")
    print()
    print("## Registry coherence")
    print(f" registered but NOT installed ({len(registered_missing)}): {', '.join(registered_missing) or '(none)'}")
    print(f" installed but NOT registered ({len(installed_unregistered)}): {', '.join(installed_unregistered) or '(none)'}")
    print()
    if args.no_logs:
        print("## Unused\n (skipped — --no-logs)")
    else:
        print(f"## Unused (no trace in last {args.months}mo) [{len(unused)}]")
        print(f" {', '.join(unused) or '(none — all skills show recent usage)'}")
    print()
    print("## Roots")
    for r, c in sorted(root_counts.items()):
        print(f" {c:3d} {r}")
    return 0


if __name__ == "__main__":
    sys.exit(main())
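
The usage heuristic in `scan_usage` can be exercised in isolation. The sketch below applies the same three patterns to made-up log lines; the `usage_pattern` helper and the sample strings are hypothetical, and real Claude Code session logs may record skill use in other shapes that this heuristic would miss.

```python
import re


def usage_pattern(name):
    """Hypothetical helper: the three invocation shapes the auditor looks for."""
    n = re.escape(name)
    return re.compile(
        r"(?:\$" + n + r"\b|--skill\s+" + n + r"\b|skills/" + n + r"/SKILL\.md)"
    )


pat = usage_pattern("handoff")
samples = [
    "please run $handoff now",       # $<name>
    "agent --skill handoff",         # --skill <name>
    "read skills/handoff/SKILL.md",  # skills/<name>/SKILL.md
    "the handoff went fine",         # bare mention: not counted
    "run $handoffs",                 # longer word: \b rejects it
]
hits = [bool(pat.search(s)) for s in samples]
print(hits)

# \b treats "-" as a boundary, so a name that is a hyphen-delimited prefix
# of another skill's name also matches that skill's traces.
prefix_hit = bool(usage_pattern("skill").search("run $skill-cleaner"))
print(prefix_hit)
```

The last case suggests that the "Unused" report can under-report for skills whose names are prefixes of other hyphenated skill names, since a trace of the longer name also counts as a trace of the shorter one.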