Jar ecosystem hardening by AP3X-Dev · Pull Request #2 · AP3X-Dev/skill-jar

AP3X-Dev · 2026-06-13T04:42:39Z

Summary by CodeRabbit

New Features

- Added comprehensive ecosystem map documentation describing skill routing, dependencies, and usage patterns.

Documentation

- Enhanced skill guidance across 20+ skills with stricter process rules and pressure-resistance tables.
- Updated bug tracking and audit findings for ecosystem documentation accuracy.
- Documented completion of 23 skill-forge verification runs with COMPLY verdicts.
- Refined template naming across subagent references for consistency.

Chores

- Updated tracker state reflecting completed audit cycles and forged skill status.

…ch, add ecosystem map

Deep cross-skill audit of all 23 skills. Applied the highest-leverage fixes; filed the rest to agent-state for follow-up cycles. Structural gate stays green.

Fixes:
- Align bundled subagent-template names in the reaper/backfill/drift kits with the manifest/install role names (dead-code-reaper-*, test-backfill-*, arch-drift-watcher). The unprefixed names had drifted from the SKILL.md install lines and broke the agents/README prefixed-naming policy; they live inside fenced blocks so the audit gate never saw them.
- sprint-ticket-runner: add an explicit launch gate (offers launch, never auto-launches) plus a stop condition. It was the only loop skill that would auto-launch code-writing makers with no termination clause.
- Normalize the lone ../references/state-templates.md link to ./ to match its siblings (same resolved target).

Docs and state:
- Add docs/ecosystem-map.md: intent routing, the two pipeline backbones, the autonomy ladder and human gate, a bundled-vs-external dependency matrix, the 23-skill relationship table, the shared-state map, and the gates note.
- File remaining findings to triage-inbox.md (F-1..F-12), audit-policy proposals to decisions.md (HD-1..HD-5), and record completed work plus open tasks in completed.md and loop-state.md.

Gate: python scripts/audit-jar.py -> 208 checks, 0 failed. Verified by a separate checker (maker != checker).
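The fenced-block blind spot mentioned in the first fix can be illustrated with a small sketch. This is not the logic of `scripts/audit-jar.py`, which this page does not show; the function name, the regular expressions, and the sample template name `reaper-scout` are all assumptions made for illustration. The sketch collects `name:` values that appear inside fenced blocks and reports those that lack the expected skill prefix.

```python
import re

# A fence line starts with three or more backticks or tildes.
FENCE = re.compile(r"^\s*(`{3,}|~{3,})")
NAME = re.compile(r"^\s*name:\s*(\S+)\s*$")


def unprefixed_template_names(markdown: str, prefix: str) -> list[str]:
    """Return `name:` values found inside fenced blocks that lack `prefix`."""
    inside = False
    offenders = []
    for line in markdown.splitlines():
        if FENCE.match(line):
            inside = not inside  # simplification: every fence line toggles state
            continue
        if inside:
            match = NAME.match(line)
            if match and not match.group(1).startswith(prefix):
                offenders.append(match.group(1))
    return offenders


# Hypothetical kit excerpt; the fence is built at runtime to keep this block readable.
fence = "`" * 3
sample = "\n".join([
    "Install the kit:",
    fence + "yaml",
    "name: reaper-scout",
    fence,
    "name: outside-any-fence",
    fence + "yaml",
    "name: dead-code-reaper-scout",
    fence,
])

print(unprefixed_template_names(sample, "dead-code-reaper-"))  # ['reaper-scout']
```

A check that only inspects text outside fences would pass this sample, which is the kind of gap the commit describes. Nested fences of different lengths are not handled here.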

The new docs/ecosystem-map.md was unreachable from any entry point. Point the "For agents" paragraph at it so a programmatic reader finds cross-skill routing, the pipeline backbones, and the autonomy ladder next to skills.json.

…delete-safety)

Autonomous jar-audit cycle over the four filed Open Tasks; each fixed by a maker and verified by a separate checker (maker != checker).

- instrument-observability: add a "When NOT to use" boundary (diagnose-loop / optimization-loop / host bugfix) plus a description NOT-for clause, and a handoff noting its telemetry feeds production-readiness's launch gate.
- autonomous-advisor + clean-room: reframe MemBerry / memberry-setup as an optional persistence adapter (clean skip on absence) instead of a hard halt, matching optimization-loop; fix duplicate list numbering.
- improve-architecture + dead-code-reaper: name arch-drift-watch as the upstream detector; test-backfill-loop: name agent-state/BUG_TRACKER.md as the canonical suspected-bug sink.
- plan-prune: a planning doc may be deleted only once git already holds it; untracked or dirty docs are archived or blocked instead.

skills.json regenerated for the instrument-observability description change. Remaining findings F-4..F-11 stay in triage-inbox; completed.md and loop-state.md updated. A checker rejected one inaccurate cross-reference in clean-room mid-cycle; it was corrected and re-verified.

Gate: python scripts/audit-jar.py -> 208 checks, 0 failed.
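The "optional persistence adapter" posture can be sketched as follows. This is a minimal illustration and not MemBerry's actual interface: `load_adapter`, `persist_note`, the `save` attribute, and the module name `memberry_adapter` are hypothetical. The only behaviour taken from the commit message is that an absent adapter leads to a clean skip instead of a halt.

```python
import importlib
from typing import Callable, Optional


def load_adapter(module_name: str) -> Optional[Callable[[str], None]]:
    """Return the module's `save` callable, or None when it is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None  # adapter not installed: skip cleanly, do not halt
    save = getattr(module, "save", None)
    return save if callable(save) else None


def persist_note(note: str, module_name: str = "memberry_adapter") -> str:
    """Persist a note when an adapter exists; otherwise report the skip."""
    save = load_adapter(module_name)
    if save is None:
        return "skipped: persistence adapter not available"
    try:
        save(note)
    except Exception as exc:  # a failing adapter must not block the pipeline
        return f"skipped: adapter error ({exc})"
    return "persisted"


# The module name below is deliberately one that does not exist.
print(persist_note("cycle 1 complete", module_name="no_such_adapter_module_xyz"))
```

Returning a status string instead of raising lets the caller record the skip (for example in a design doc or state file) and continue with the rest of the pipeline.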

…ap MemBerry row)

Hunter swept the surfaces changed this effort plus cross-file consistency and filed one LOW defect: the docs/ecosystem-map.md §4 MemBerry dependency row still pointed at "open findings" (F-1/F-2/F-12) that were closed earlier this cycle. Fixer changed the cell to "optimization-loop, autonomous-advisor, clean-room (all optional)".

A separate Validator confirmed the symptom is gone, the new cell matches the implemented optional posture in both skills, no other stale pointer references a closed finding (the F-5 "external" note is correctly still open), scope is one table cell, and the gate is green. BUG-001 -> verified.

Gate: python scripts/audit-jar.py -> 208 checks, 0 failed.

…SF-023)

Batch 1 of the forge queue (forger != judge throughout).

- SF-005 clean-room: 3 independent judges ran the firewall/parity-mode pressure scenario against the patched skill and all returned COMPLY -- the 8 captured rationalizations are refused and the reclassify-to-Transparent escape is closed. SF-005 -> forged (3/3).
- SF-023 instrument-observability: a forger applied the GREEN patch closing the captured RED rationalizations (non-waivable investigation gate; high-cardinality identifier governance across tags/extra/context/span; logger-not-a-substitute for the sensitive-surface map; full smoke checklist; an 8-row pressure table) in a 45/+2- diff with the frontmatter description unchanged; 3 independent judges then returned COMPLY. SF-023 -> forged (3/3).

Tracker, run packages, completed.md and loop-state.md updated. Forge queue: 6 of 23 forged; SF-006..022 + SF-021 remain pending-red (multi-batch).

Gate: python scripts/audit-jar.py -> 208 checks, 0 failed.

Ran the rest of the pending-red queue (SF-006..022, SF-021) as one concurrent RED -> GREEN -> judge x3 pipeline per skill (forger != judge, disjoint files). Every skill surfaced a real shortcut under pressure, was patched to refuse the named dodges (a "Known pressure rationalizations" table plus hard-rule tightening), and passed 3 independent judges. Frontmatter descriptions were not touched, so skills.json stays in sync. Spot-checked the add-to-jar and production-readiness diffs -- sane, on-topic rule tightening, no scope creep.

- 17/17 forged 3/3; 0 loopholes, 0 needs-stronger-scenario.
- Per-skill run packages written under agent-state/skill-forge-runs/.
- Tracker rows set to forged and the queue table de-fragmented (SF-022/023 were orphaned below the rules prose).

Completes the forge queue (23/23 forged) and the authorized "all three loops until done" rotation (jar-audit + bug-pipeline + skill-forge).

Gate: python scripts/audit-jar.py -> 208 checks, 0 failed.
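The "forger != judge" and "3 independent judges" rules can be expressed as a small aggregation sketch. It is an illustration under assumptions, not the skill-forge implementation: `Verdict`, `forge_status`, the agent identifiers, the `LOOPHOLE` verdict string, and the "not forged" label are hypothetical. Only the `COMPLY` verdict and the "forged (3/3)" wording come from the commit messages.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    judge: str   # identifier of the judging agent (hypothetical)
    result: str  # "COMPLY" as in the commit messages; other values are assumptions


def forge_status(forger: str, verdicts: list[Verdict], required: int = 3) -> str:
    """Report 'forged' only when enough distinct non-forger judges all COMPLY."""
    judges = {v.judge for v in verdicts}
    if forger in judges:
        raise ValueError("forger must not judge its own patch")
    if len(judges) != len(verdicts):
        raise ValueError("each verdict must come from a different judge")
    complies = sum(1 for v in verdicts if v.result == "COMPLY")
    if len(verdicts) >= required and complies == len(verdicts):
        return f"forged ({complies}/{len(verdicts)})"
    return f"not forged ({complies}/{len(verdicts)})"


verdicts = [
    Verdict("judge-a", "COMPLY"),
    Verdict("judge-b", "COMPLY"),
    Verdict("judge-c", "COMPLY"),
]
print(forge_status("forger-1", verdicts))  # forged (3/3)
```

Raising on a forger who also judges, instead of silently discarding that verdict, keeps a violation of the separation rule visible in the run log.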

coderabbitai · 2026-06-13T04:42:50Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: df37da16-fcd0-49a5-8e4e-34d272fefb2c

📥 Commits

Reviewing files that changed from the base of the PR and between 8d9d273 and abba2d5.

📒 Files selected for processing (52)

README.md
agent-state/BUG_TRACKER.md
agent-state/SKILL_FORGE_TRACKER.md
agent-state/completed.md
agent-state/decisions.md
agent-state/loop-state.md
agent-state/skill-forge-runs/add-to-jar.md
agent-state/skill-forge-runs/api-design.md
agent-state/skill-forge-runs/clean-room.md
agent-state/skill-forge-runs/data-store-selection.md
agent-state/skill-forge-runs/dead-code-reaper.md
agent-state/skill-forge-runs/design-panel.md
agent-state/skill-forge-runs/design-system.md
agent-state/skill-forge-runs/diagnose-loop.md
agent-state/skill-forge-runs/improve-architecture.md
agent-state/skill-forge-runs/instrument-observability.md
agent-state/skill-forge-runs/loop-engineer.md
agent-state/skill-forge-runs/optimization-loop.md
agent-state/skill-forge-runs/plan-prune.md
agent-state/skill-forge-runs/production-readiness.md
agent-state/skill-forge-runs/review-panel.md
agent-state/skill-forge-runs/skill-forge.md
agent-state/skill-forge-runs/sprint-ticket-runner.md
agent-state/skill-forge-runs/test-backfill-loop.md
agent-state/skill-forge-runs/unit-test-quality.md
agent-state/triage-inbox.md
development/add-to-jar/SKILL.md
development/arch-drift-watch/references/drift-kit.md
development/autonomous-advisor/SKILL.md
development/clean-room/SKILL.md
development/dead-code-reaper/SKILL.md
development/dead-code-reaper/references/reaper-kit.md
development/design-panel/SKILL.md
development/diagnose-loop/SKILL.md
development/improve-architecture/SKILL.md
development/instrument-observability/SKILL.md
development/loop-engineer/SKILL.md
development/loop-engineer/references/loop-architecture.md
development/optimization-loop/SKILL.md
development/plan-prune/SKILL.md
development/review-panel/SKILL.md
development/skill-forge/SKILL.md
development/sprint-ticket-runner/SKILL.md
development/test-backfill-loop/SKILL.md
development/test-backfill-loop/references/backfill-kit.md
development/unit-test-quality/SKILL.md
docs/ecosystem-map.md
skills.json
systems-design/api-design/SKILL.md
systems-design/data-store-selection/SKILL.md
systems-design/design-system/SKILL.md
systems-design/production-readiness/SKILL.md

📝 Walkthrough

This PR formalizes the skill jar's ecosystem knowledge, completes a large skill forge cycle (SF-005..023 → 23/23 forged), and systematically hardens skill documentation across 20+ development and systems-design skills by adding pressure-rationalization tables, stricter gates, and MemBerry optionality clarifications.

Changes

Ecosystem Knowledge, Audit Closure, and Forge Tracking

| Layer / File(s) | Summary |
|---|---|
| Ecosystem map and audit framework `docs/ecosystem-map.md`, `agent-state/decisions.md`, `skills.json`, `README.md` | New authoritative ecosystem routing documentation defines skill selection, composition, autonomy posture, and the `audit-jar.py` verification gate; audit decisions document that the ecosystem map is the durable home for jar knowledge and that template `name:` values were aligned to manifest roles; instrument-observability description clarified to exclude debugging loops. |
| Audit findings and cycle completion `agent-state/BUG_TRACKER.md`, `agent-state/triage-inbox.md`, `agent-state/loop-state.md`, `agent-state/completed.md` | Closes ecosystem-audit-1 cycle with new Sweep 2 (BUG-001 documenting stale MemBerry "see open findings" pointer now fixed), adds triage findings F-4, F-5, F-6, F-7, F-9, F-11 with verification commands, and updates cycle status ledger and completion journal with 23/23 forge outcomes. |
| Skill forge queue completion (SF-005..023) `agent-state/SKILL_FORGE_TRACKER.md`, `agent-state/skill-forge-runs/*.md` | Updates tracker to mark all 19 skills as forged with 3/3 clean runs; documents 19 skill-specific forge run reports (add-to-jar, api-design, clean-room, data-store-selection, dead-code-reaper, design-panel, design-system, diagnose-loop, improve-architecture, instrument-observability, loop-engineer, optimization-loop, plan-prune, production-readiness, review-panel, skill-forge, sprint-ticket-runner, test-backfill-loop, unit-test-quality) each recording RED pressure scenarios, GREEN patches applied, 3/3 REFACTOR verdicts, and lint/audit evidence (208 checks, 0 failed). |

Subagent Template Name Standardization

| Layer / File(s) | Summary |
|---|---|
| Kit template name prefix alignment `development/arch-drift-watch/references/drift-kit.md`, `development/dead-code-reaper/references/reaper-kit.md`, `development/test-backfill-loop/references/backfill-kit.md` | Renamed three kit template identifiers to use skill-prefixed names: `drift-watcher` → `arch-drift-watcher`, `reaper-` → `dead-code-reaper-`, `backfill-` → `test-backfill-`, aligning manifest install role names with template identifiers. |

Skill Documentation Hardening and Gate Enforcement

| Layer / File(s) | Summary |
|---|---|
| Core autonomy and loop engineering skills `development/loop-engineer/SKILL.md`, `development/loop-engineer/references/loop-architecture.md`, `development/autonomous-advisor/SKILL.md` | Hardens autonomy ladder with explicit brand-new-loop level constraints and reviewed-cycle gates; reframes MemBerry as optional non-blocking persistence (skip when unavailable or if `berry_tools` fails) and clarifies pipeline continuation; updates optimization-loop termination conditions to explicit ordered list with >50-cycles safety cap. |
| Development workflow and jar management skills `development/add-to-jar/SKILL.md`, `development/skill-forge/SKILL.md`, `development/sprint-ticket-runner/SKILL.md` | Tightens add-to-jar by mandating `sync-jar.py` as sole generated-file writer and expanding drop-in prohibitions; adds skill-forge "no small-change exemption" rule; hardens sprint-ticket-runner with strict maker≠checker separation, hard launch-gate, and explicit stop conditions; all three add pressure-rationalization lookup tables with required responses. |
| Analysis, diagnosis, and correctness skills `development/design-panel/SKILL.md`, `development/diagnose-loop/SKILL.md`, `development/dead-code-reaper/SKILL.md`, `development/test-backfill-loop/SKILL.md`, `development/unit-test-quality/SKILL.md` | Design-panel adds four non-negotiable gates and structural judge independence; diagnose-loop adds stage-gate clarity, "iron law" (no fix before named root cause), and "one-change law"; dead-code-reaper clarifies arch-drift-watch upstream routing and expands safety constraints; test-backfill-loop hardens bug escalation and coverage ratchet; unit-test-quality requires independent expected-value derivation; all five add pressure-rationalization tables. |
| Architecture improvement and optimization skills `development/improve-architecture/SKILL.md`, `development/optimization-loop/SKILL.md`, `development/clean-room/SKILL.md` | Improve-architecture adds pressure-rationalization table and explicit "green-without-human-shape" failure rule; optimization-loop separates metric vector from boolean gate and mandates real Phase-2 baseline numbers plus immediate cycle-1 execution; clean-room reframes MemBerry bootstrap as optional and records skips in design doc; all three add pressure-dodge prevention tables. |
| Systems design, observability, and data layer skills `systems-design/api-design/SKILL.md`, `systems-design/data-store-selection/SKILL.md`, `systems-design/design-system/SKILL.md`, `systems-design/production-readiness/SKILL.md`, `development/instrument-observability/SKILL.md` | API-design hardens idempotency (dedup in v1), pagination (cursor-based), error envelope (consistent, correlation-aware), and auth (trust-boundary warnings); data-store-selection tightens shard-key, consistency, queue/topic/cache contracts with explicit ownership; design-system requires committed decision record (not menu) and measurable workload justification for exceptions; production-readiness replaces generic launch-gate items with symptom/SLO alerts, concrete runbooks, and failure drill requirements; instrument-observability adds "When NOT to use" section, non-waivable investigation gate, and high-cardinality identifier privacy rules; all five add pressure-rationalization tables. |
| Planning and review process skills `development/plan-prune/SKILL.md`, `development/review-panel/SKILL.md` | Plan-prune adds pressure-rationalization table, explicit delete precondition (git-clean only), and "fold before retire" language; review-panel adds hard verification gates, "Unverified hypothesis" severity tier, and pressure-rationalization table; both clarify that process gates are non-negotiable despite deadline pressure. |
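The plan-prune delete precondition ("git-clean only") can be sketched as a check against git state. How plan-prune itself performs the check is not shown on this page, so the function names and return labels below are assumptions. The git commands used, `git ls-files --error-unmatch` and `git status --porcelain`, are standard.

```python
import subprocess


def decide(tracked: bool, porcelain: str) -> str:
    """Pure decision: delete only a tracked file with no pending changes."""
    if not tracked:
        return "archive-or-block"  # untracked: git holds no copy to recover
    if porcelain.strip():
        return "archive-or-block"  # dirty or staged: uncommitted content would be lost
    return "delete-ok"


def check_doc(path: str) -> str:
    """Ask git about one path in the current repository and apply `decide`."""
    tracked = subprocess.run(
        ["git", "ls-files", "--error-unmatch", "--", path],
        capture_output=True,
    ).returncode == 0
    status = subprocess.run(
        ["git", "status", "--porcelain", "--", path],
        capture_output=True,
        text=True,
    ).stdout
    return decide(tracked, status)


# The decision logic can be exercised without a repository:
print(decide(True, ""))                    # delete-ok
print(decide(True, " M docs/plan.md\n"))   # archive-or-block
print(decide(False, "?? docs/plan.md\n"))  # archive-or-block
```

Keeping the decision separate from the subprocess calls makes the rule testable without a working tree; `check_doc("docs/old-plan.md")` (a hypothetical path) would be the call made from inside a repository.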

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 In the jar, a map now shines so clear,
Nineteen skills forged, the cycle's near,
Gates grown stronger, dodges denied,
MemBerry's soft, a choice, not tied—
Documentation hardens, pressure won't bend,
The ecosystem blooms; the forge ascends. 🌿✨


AP3X-Dev added 6 commits June 12, 2026 20:19

AP3X-Dev merged commit 971f22a into main Jun 13, 2026
1 of 2 checks passed

AP3X-Dev deleted the jar-ecosystem-hardening branch June 13, 2026 04:44
