Jar ecosystem hardening #2

Merged
AP3X-Dev merged 6 commits into main from jar-ecosystem-hardening on Jun 13, 2026

@AP3X-Dev AP3X-Dev commented Jun 13, 2026

Owner

Summary by CodeRabbit

  • New Features

    • Added comprehensive ecosystem map documentation describing skill routing, dependencies, and usage patterns.
  • Documentation

    • Enhanced skill guidance across 20+ skills with stricter process rules and pressure-resistance tables.
    • Updated bug tracking and audit findings for ecosystem documentation accuracy.
    • Documented completion of 23 skill-forge verification runs with COMPLY verdicts.
    • Refined template naming across subagent references for consistency.
  • Chores

    • Updated tracker state reflecting completed audit cycles and forged skill status.

AP3X-Dev added 6 commits June 12, 2026 20:19
…ch, add ecosystem map

Deep cross-skill audit of all 23 skills. Applied the highest-leverage fixes;
filed the rest to agent-state for follow-up cycles. Structural gate stays green.

Fixes:
- Align bundled subagent-template names in the reaper/backfill/drift kits with
  the manifest/install role names (dead-code-reaper-*, test-backfill-*,
  arch-drift-watcher). The unprefixed names had drifted from the SKILL.md
  install lines and broke the agents/README prefixed-naming policy; they live
  inside fenced blocks so the audit gate never saw them.
- sprint-ticket-runner: add an explicit launch gate (offers launch, never
  auto-launches) plus a stop condition. It was the only loop skill that would
  auto-launch code-writing makers with no termination clause.
- Normalize the lone ../references/state-templates.md link to ./ to match its
  siblings (same resolved target).
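
The launch gate and stop condition added to sprint-ticket-runner can be pictured with a minimal Python sketch. Everything named here (`run_ticket_loop`, `human_approved`, `work_one`, `max_cycles`, and the two specific stop conditions) is a hypothetical illustration; the commit message does not say what the skill's actual stop condition is.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class LoopResult:
    launched: bool
    cycles_run: int
    reason: str


def run_ticket_loop(
    tickets: List[str],
    human_approved: bool,
    work_one: Callable[[str], bool],
    max_cycles: int = 10,
) -> LoopResult:
    """Hypothetical loop driver: it offers a launch but never auto-launches."""
    # Launch gate: without explicit approval, no maker is started.
    if not human_approved:
        return LoopResult(False, 0, "launch offered, awaiting approval")

    cycles = 0
    for ticket in tickets:
        # Assumed stop condition 1: a hard cap on the number of cycles.
        if cycles >= max_cycles:
            return LoopResult(True, cycles, "cycle cap reached")
        cycles += 1
        # Assumed stop condition 2: a failed ticket halts the loop.
        if not work_one(ticket):
            return LoopResult(True, cycles, f"stopped on failed ticket {ticket}")
    return LoopResult(True, cycles, "queue empty")
```

The point of the shape is that every exit path returns a named reason, so the loop cannot run without a termination clause.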

Docs and state:
- Add docs/ecosystem-map.md: intent routing, the two pipeline backbones, the
  autonomy ladder and human gate, a bundled-vs-external dependency matrix, the
  23-skill relationship table, the shared-state map, and the gates note.
- File remaining findings to triage-inbox.md (F-1..F-12), audit-policy
  proposals to decisions.md (HD-1..HD-5), and record completed work plus open
  tasks in completed.md and loop-state.md.

Gate: python scripts/audit-jar.py -> 208 checks, 0 failed. Verified by a
separate checker (maker != checker).

The new docs/ecosystem-map.md was unreachable from any entry point. Point the
"For agents" paragraph at it so a programmatic reader finds cross-skill routing,
the pipeline backbones, and the autonomy ladder next to skills.json.
…delete-safety)

Autonomous jar-audit cycle over the four filed Open Tasks; each fixed by a maker
and verified by a separate checker (maker != checker).

- instrument-observability: add a "When NOT to use" boundary (diagnose-loop /
  optimization-loop / host bugfix) plus a description NOT-for clause, and a
  handoff noting its telemetry feeds production-readiness's launch gate.
- autonomous-advisor + clean-room: reframe MemBerry / memberry-setup as an
  optional persistence adapter (clean skip on absence) instead of a hard halt,
  matching optimization-loop; fix duplicate list numbering.
- improve-architecture + dead-code-reaper: name arch-drift-watch as the upstream
  detector; test-backfill-loop: name agent-state/BUG_TRACKER.md as the canonical
  suspected-bug sink.
- plan-prune: a planning doc may be deleted only once git already holds it;
  untracked or dirty docs are archived or blocked instead.
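
The plan-prune delete precondition can be sketched in Python, assuming "git already holds it" means the path is tracked and has no pending changes. `may_delete` and `git_holds` are hypothetical names, not part of the skill; the sketch only answers "delete allowed or not" and leaves the archive-versus-block choice to the skill's own rules.

```python
import subprocess


def may_delete(is_tracked: bool, porcelain_output: str) -> bool:
    """Pure decision: deletable only if tracked and `git status --porcelain` is empty."""
    return is_tracked and porcelain_output.strip() == ""


def git_holds(path: str, repo: str = ".") -> bool:
    """Ask git whether `path` is tracked and clean (assumed reading of the rule)."""
    # `git ls-files --error-unmatch` exits non-zero when the path is not tracked.
    tracked = subprocess.run(
        ["git", "ls-files", "--error-unmatch", "--", path],
        cwd=repo,
        capture_output=True,
    ).returncode == 0
    # Any porcelain line for the path means staged, modified, or untracked state.
    status = subprocess.run(
        ["git", "status", "--porcelain", "--", path],
        cwd=repo,
        capture_output=True,
        text=True,
    ).stdout
    return may_delete(tracked, status)
```

Checking both tracking and status matters because an empty status alone is also what git reports for an ignored or nonexistent path.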

skills.json regenerated for the instrument-observability description change.
Remaining findings F-4..F-11 stay in triage-inbox; completed.md and loop-state.md
updated. A checker rejected one inaccurate cross-reference in clean-room
mid-cycle; it was corrected and re-verified. Gate: python scripts/audit-jar.py
-> 208 checks, 0 failed.
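
The "optional persistence adapter (clean skip on absence)" posture described for MemBerry in this commit can be illustrated with a small Python sketch. `persist_optional`, the callable adapter, and the `skips` list are assumptions made for illustration; they are not MemBerry's API.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


def persist_optional(
    record: Record,
    adapter: Optional[Callable[[Record], None]],
    skips: List[str],
) -> bool:
    """Try to persist; on absence or failure, note the skip and carry on."""
    if adapter is None:
        # Absence is a clean skip, not a halt.
        skips.append("persistence adapter absent; skipped")
        return False
    try:
        adapter(record)
    except Exception as exc:  # a failing adapter must not stop the pipeline
        skips.append(f"persistence adapter failed ({exc}); skipped")
        return False
    return True
```

The caller keeps running in every case; the `skips` list stands in for wherever the skill records that persistence was skipped.
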
…ap MemBerry row)

Hunter swept the surfaces changed this effort plus cross-file consistency and
filed one LOW defect: the docs/ecosystem-map.md §4 MemBerry dependency row still
pointed at "open findings" (F-1/F-2/F-12) that were closed earlier this cycle.
Fixer changed the cell to "optimization-loop, autonomous-advisor, clean-room
(all optional)". A separate Validator confirmed the symptom is gone, the new cell
matches the implemented optional posture in both skills, no other stale pointer
references a closed finding (the F-5 "external" note is correctly still open),
scope is one table cell, and the gate is green. BUG-001 -> verified.

Gate: python scripts/audit-jar.py -> 208 checks, 0 failed.
…SF-023)

Batch 1 of the forge queue (forger != judge throughout).

- SF-005 clean-room: 3 independent judges ran the firewall/parity-mode pressure
  scenario against the patched skill and all returned COMPLY -- the 8 captured
  rationalizations are refused and the reclassify-to-Transparent escape is closed.
  SF-005 -> forged (3/3).
- SF-023 instrument-observability: a forger applied the GREEN patch closing the
  captured RED rationalizations (non-waivable investigation gate; high-cardinality
  identifier governance across tags/extra/context/span; logger-not-a-substitute
  for the sensitive-surface map; full smoke checklist; an 8-row pressure table)
  in a +45/-2 diff with the frontmatter description unchanged; 3 independent
  judges then returned COMPLY. SF-023 -> forged (3/3).
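
The "forger != judge" rule and the 3/3 tally can be expressed as a short Python sketch. `forge_status`, the judge identifiers, and the "not forged" label are hypothetical; only the COMPLY verdict and the "forged (3/3)" wording come from the commit message.

```python
from typing import Dict


def forge_status(forger: str, verdicts: Dict[str, str], required: int = 3) -> str:
    """Tally judge verdicts; `verdicts` maps a judge id to its verdict string."""
    # Separation of duties: the forger may not appear among the judges.
    if forger in verdicts:
        raise ValueError("forger must not judge its own patch")
    if len(verdicts) < required:
        raise ValueError(f"need {required} independent judges, got {len(verdicts)}")
    complies = sum(1 for verdict in verdicts.values() if verdict == "COMPLY")
    tally = f"({complies}/{len(verdicts)})"
    # A skill counts as forged only when every judge returned COMPLY.
    return f"forged {tally}" if complies == len(verdicts) else f"not forged {tally}"
```

Keying verdicts by judge id makes the independence check a dictionary lookup and prevents one judge from being counted twice.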

Tracker, run packages, completed.md and loop-state.md updated. Forge queue: 6 of
23 forged; SF-006..022 + SF-021 remain pending-red (multi-batch). Gate: python
scripts/audit-jar.py -> 208 checks, 0 failed.

Ran the rest of the pending-red queue (SF-006..022, SF-021) as one concurrent
RED -> GREEN -> judge x3 pipeline per skill (forger != judge, disjoint files).
Every skill surfaced a real shortcut under pressure, was patched to refuse the
named dodges (a "Known pressure rationalizations" table plus hard-rule
tightening), and passed 3 independent judges. Frontmatter descriptions were not
touched, so skills.json stays in sync. Spot-checked the add-to-jar and
production-readiness diffs -- sane, on-topic rule tightening, no scope creep.

- 17/17 forged 3/3; 0 loopholes, 0 needs-stronger-scenario.
- Per-skill run packages written under agent-state/skill-forge-runs/.
- Tracker rows set to forged and the queue table de-fragmented (SF-022/023 were
  orphaned below the rules prose).

Completes the forge queue (23/23 forged) and the authorized "all three loops
until done" rotation (jar-audit + bug-pipeline + skill-forge).
Gate: python scripts/audit-jar.py -> 208 checks, 0 failed.

coderabbitai Bot commented Jun 13, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: df37da16-fcd0-49a5-8e4e-34d272fefb2c

📥 Commits

Reviewing files that changed from the base of the PR and between 8d9d273 and abba2d5.

📒 Files selected for processing (52)
  • README.md
  • agent-state/BUG_TRACKER.md
  • agent-state/SKILL_FORGE_TRACKER.md
  • agent-state/completed.md
  • agent-state/decisions.md
  • agent-state/loop-state.md
  • agent-state/skill-forge-runs/add-to-jar.md
  • agent-state/skill-forge-runs/api-design.md
  • agent-state/skill-forge-runs/clean-room.md
  • agent-state/skill-forge-runs/data-store-selection.md
  • agent-state/skill-forge-runs/dead-code-reaper.md
  • agent-state/skill-forge-runs/design-panel.md
  • agent-state/skill-forge-runs/design-system.md
  • agent-state/skill-forge-runs/diagnose-loop.md
  • agent-state/skill-forge-runs/improve-architecture.md
  • agent-state/skill-forge-runs/instrument-observability.md
  • agent-state/skill-forge-runs/loop-engineer.md
  • agent-state/skill-forge-runs/optimization-loop.md
  • agent-state/skill-forge-runs/plan-prune.md
  • agent-state/skill-forge-runs/production-readiness.md
  • agent-state/skill-forge-runs/review-panel.md
  • agent-state/skill-forge-runs/skill-forge.md
  • agent-state/skill-forge-runs/sprint-ticket-runner.md
  • agent-state/skill-forge-runs/test-backfill-loop.md
  • agent-state/skill-forge-runs/unit-test-quality.md
  • agent-state/triage-inbox.md
  • development/add-to-jar/SKILL.md
  • development/arch-drift-watch/references/drift-kit.md
  • development/autonomous-advisor/SKILL.md
  • development/clean-room/SKILL.md
  • development/dead-code-reaper/SKILL.md
  • development/dead-code-reaper/references/reaper-kit.md
  • development/design-panel/SKILL.md
  • development/diagnose-loop/SKILL.md
  • development/improve-architecture/SKILL.md
  • development/instrument-observability/SKILL.md
  • development/loop-engineer/SKILL.md
  • development/loop-engineer/references/loop-architecture.md
  • development/optimization-loop/SKILL.md
  • development/plan-prune/SKILL.md
  • development/review-panel/SKILL.md
  • development/skill-forge/SKILL.md
  • development/sprint-ticket-runner/SKILL.md
  • development/test-backfill-loop/SKILL.md
  • development/test-backfill-loop/references/backfill-kit.md
  • development/unit-test-quality/SKILL.md
  • docs/ecosystem-map.md
  • skills.json
  • systems-design/api-design/SKILL.md
  • systems-design/data-store-selection/SKILL.md
  • systems-design/design-system/SKILL.md
  • systems-design/production-readiness/SKILL.md

📝 Walkthrough

This PR formalizes the skill jar's ecosystem knowledge, completes a large skill forge cycle (SF-005..023 → 23/23 forged), and systematically hardens skill documentation across 20+ development and systems-design skills by adding pressure-rationalization tables, stricter gates, and MemBerry optionality clarifications.

Changes

Ecosystem Knowledge, Audit Closure, and Forge Tracking

| Layer | File(s) | Summary |
| --- | --- | --- |
| Ecosystem map and audit framework | docs/ecosystem-map.md, agent-state/decisions.md, skills.json, README.md | New authoritative ecosystem routing documentation defines skill selection, composition, autonomy posture, and the audit-jar.py verification gate; audit decisions document that the ecosystem map is the durable home for jar knowledge and that template name: values were aligned to manifest roles; instrument-observability description clarified to exclude debugging loops. |
| Audit findings and cycle completion | agent-state/BUG_TRACKER.md, agent-state/triage-inbox.md, agent-state/loop-state.md, agent-state/completed.md | Closes ecosystem-audit-1 cycle with new Sweep 2 (BUG-001 documenting stale MemBerry "see open findings" pointer now fixed), adds triage findings F-4, F-5, F-6, F-7, F-9, F-11 with verification commands, and updates cycle status ledger and completion journal with 23/23 forge outcomes. |
| Skill forge queue completion (SF-005..023) | agent-state/SKILL_FORGE_TRACKER.md, agent-state/skill-forge-runs/*.md | Updates tracker to mark all 19 skills as forged with 3/3 clean runs; documents 19 skill-specific forge run reports (add-to-jar, api-design, clean-room, data-store-selection, dead-code-reaper, design-panel, design-system, diagnose-loop, improve-architecture, instrument-observability, loop-engineer, optimization-loop, plan-prune, production-readiness, review-panel, skill-forge, sprint-ticket-runner, test-backfill-loop, unit-test-quality) each recording RED pressure scenarios, GREEN patches applied, 3/3 REFACTOR verdicts, and lint/audit evidence (208 checks, 0 failed). |

Subagent Template Name Standardization

| Layer | File(s) | Summary |
| --- | --- | --- |
| Kit template name prefix alignment | development/arch-drift-watch/references/drift-kit.md, development/dead-code-reaper/references/reaper-kit.md, development/test-backfill-loop/references/backfill-kit.md | Renamed three kit template identifiers to use skill-prefixed names: `drift-watcher` → `arch-drift-watcher`, `reaper-*` → `dead-code-reaper-*`, `backfill-*` → `test-backfill-*`, aligning manifest install role names with template identifiers. |
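
Because the drifted template names lived inside fenced blocks where the audit gate never looked, a check for them has to read inside the fences. The sketch below is a hypothetical illustration in Python, not the logic of scripts/audit-jar.py; `names_in_fences`, `unaligned`, and the assumption that templates declare themselves with a `name:` line are all illustrative.

```python
import re
from typing import List, Set

FENCE = "`" * 3  # fence marker, built this way so the example nests cleanly


def names_in_fences(markdown: str) -> List[str]:
    """Collect `name:` values that appear inside fenced blocks only."""
    names: List[str] = []
    in_fence = False
    for line in markdown.splitlines():
        if line.strip().startswith(FENCE):
            in_fence = not in_fence  # toggle on every opening or closing fence
            continue
        match = re.match(r"\s*name:\s*(\S+)", line)
        if in_fence and match:
            names.append(match.group(1))
    return names


def unaligned(markdown: str, roles: Set[str]) -> List[str]:
    """Return fenced template names that are not known manifest role names."""
    return [name for name in names_in_fences(markdown) if name not in roles]
```

For a kit whose fenced template still says `name: drift-watcher`, `unaligned(kit, {"arch-drift-watcher"})` would report the stale name; after the rename it would return an empty list.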

Skill Documentation Hardening and Gate Enforcement

| Layer | File(s) | Summary |
| --- | --- | --- |
| Core autonomy and loop engineering skills | development/loop-engineer/SKILL.md, development/loop-engineer/references/loop-architecture.md, development/autonomous-advisor/SKILL.md | Hardens autonomy ladder with explicit brand-new-loop level constraints and reviewed-cycle gates; reframes MemBerry as optional non-blocking persistence (skip when unavailable or if berry_tools fails) and clarifies pipeline continuation; updates optimization-loop termination conditions to explicit ordered list with >50-cycles safety cap. |
| Development workflow and jar management skills | development/add-to-jar/SKILL.md, development/skill-forge/SKILL.md, development/sprint-ticket-runner/SKILL.md | Tightens add-to-jar by mandating sync-jar.py as sole generated-file writer and expanding drop-in prohibitions; adds skill-forge "no small-change exemption" rule; hardens sprint-ticket-runner with strict maker≠checker separation, hard launch-gate, and explicit stop conditions; all three add pressure-rationalization lookup tables with required responses. |
| Analysis, diagnosis, and correctness skills | development/design-panel/SKILL.md, development/diagnose-loop/SKILL.md, development/dead-code-reaper/SKILL.md, development/test-backfill-loop/SKILL.md, development/unit-test-quality/SKILL.md | Design-panel adds four non-negotiable gates and structural judge independence; diagnose-loop adds stage-gate clarity, "iron law" (no fix before named root cause), and "one-change law"; dead-code-reaper clarifies arch-drift-watch upstream routing and expands safety constraints; test-backfill-loop hardens bug escalation and coverage ratchet; unit-test-quality requires independent expected-value derivation; all five add pressure-rationalization tables. |
| Architecture improvement and optimization skills | development/improve-architecture/SKILL.md, development/optimization-loop/SKILL.md, development/clean-room/SKILL.md | Improve-architecture adds pressure-rationalization table and explicit "green-without-human-shape" failure rule; optimization-loop separates metric vector from boolean gate and mandates real Phase-2 baseline numbers plus immediate cycle-1 execution; clean-room reframes MemBerry bootstrap as optional and records skips in design doc; all three add pressure-dodge prevention tables. |
| Systems design, observability, and data layer skills | systems-design/api-design/SKILL.md, systems-design/data-store-selection/SKILL.md, systems-design/design-system/SKILL.md, systems-design/production-readiness/SKILL.md, development/instrument-observability/SKILL.md | API-design hardens idempotency (dedup in v1), pagination (cursor-based), error envelope (consistent, correlation-aware), and auth (trust-boundary warnings); data-store-selection tightens shard-key, consistency, queue/topic/cache contracts with explicit ownership; design-system requires committed decision record (not menu) and measurable workload justification for exceptions; production-readiness replaces generic launch-gate items with symptom/SLO alerts, concrete runbooks, and failure drill requirements; instrument-observability adds "When NOT to use" section, non-waivable investigation gate, and high-cardinality identifier privacy rules; all five add pressure-rationalization tables. |
| Planning and review process skills | development/plan-prune/SKILL.md, development/review-panel/SKILL.md | Plan-prune adds pressure-rationalization table, explicit delete precondition (git-clean only), and "fold before retire" language; review-panel adds hard verification gates, "Unverified hypothesis" severity tier, and pressure-rationalization table; both clarify that process gates are non-negotiable despite deadline pressure. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐰 In the jar, a map now shines so clear,
Nineteen skills forged, the cycle's near,
Gates grown stronger, dodges denied,
MemBerry's soft, a choice, not tied—
Documentation hardens, pressure won't bend,
The ecosystem blooms; the forge ascends. 🌿✨


@AP3X-Dev AP3X-Dev merged commit 971f22a into main Jun 13, 2026
1 of 2 checks passed
@AP3X-Dev AP3X-Dev deleted the jar-ecosystem-hardening branch June 13, 2026 04:44