Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 0 additions & 1 deletion .claude/scheduled_tasks.lock

This file was deleted.

8 changes: 7 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -18,4 +18,10 @@ coverage/
/_archive/
.grepai/

graphify-out/
graphify-out/

# AI coding agent local state (per-developer, not for git)
.claude/
.agents/
CLAUDE.md
skills-lock.json
12 changes: 9 additions & 3 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -222,13 +222,19 @@ When `ANTHROPIC_API_KEY` isn't set, the dashboard renders a read-only banner —

## Agent Skills

Designing an AgentForge workflow from scratch is a lot of YAML. The repo ships an [agent skill](https://skills.sh) that walks any Claude Code / Cursor / Codex session through the design — agents, phases, gates, loops, parallelism, wiring, nodes — and emits a working `.agentforge/` directory.
The repo ships [agent skills](https://skills.sh) that drop into any Claude Code / Cursor / Codex session — installable in one command:

```bash
npx skills add mandarnilange/agentforge/agentforge-workflow
npx skills add mandarnilange/agentforge
```

Then ask the agent something like *"help me design an AgentForge pipeline for PR triage"* and the skill kicks in. Catalog and authoring docs: [`skills/`](skills/).
| Skill | Audience | What it does |
|---|---|---|
| `agentforge-workflow` | End user | Walks through designing an AgentForge workflow and emits a complete `.agentforge/` directory. |
| `agentforge-template-author` | Contributor | Guides adding a new shipped template under `packages/{core,platform}/src/templates/`. |
| `agentforge-debug` | Operator | Triages stuck or failing pipeline runs from symptom to fix path. |

Trigger phrases live in each skill's frontmatter — e.g. *"help me design an AgentForge pipeline for PR triage"* fires `agentforge-workflow`; *"why is pipeline X stuck?"* fires `agentforge-debug`. Catalog, changelog, and authoring docs: [`skills/`](skills/).

---

Expand Down
56 changes: 56 additions & 0 deletions skills/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
# Skills changelog

Top-level history for every skill under [`skills/`](.). Each skill also
keeps a per-skill `CHANGELOG.md` with finer-grained notes — link to that
file from the entry below.

## Versioning policy

Skills are versioned independently via `metadata.version` in
`SKILL.md`'s frontmatter. Use semver:

| Bump | Trigger |
|---|---|
| **major** | Trigger conditions changed (the skill fires on different prompts now), required tool surface changed, file structure renamed, or an existing reference doc removed. Anything a user has muscle memory for. |
| **minor** | New reference docs, new sections in `SKILL.md`, additive support for new prompts. Existing behaviour preserved. |
| **patch** | Prose tightening, typo fixes, schema cheat-sheet updates that mirror upstream changes, broken-link repairs. |

**Bump on every PR that lands a skill change**, even a tiny one — the
version is what distinguishes one published skill from the next, and
users may pin to it. CI's frontmatter validator enforces presence; the
contributor enforces honesty.

## Release flow

1. Land your change locally and run `npm run skills:validate`.
2. Bump the affected skill's `metadata.version` in its `SKILL.md`.
3. Add an entry to that skill's `CHANGELOG.md` (create the file if it
does not exist).
4. Add a one-line entry here under the next "Unreleased" or current
version block, linking to the per-skill changelog.
5. Open a PR. The `Publish Skills` workflow validates frontmatter on
every push.

There is no separate "publish" command. Once the PR merges into `main`,
`npx skills add mandarnilange/agentforge` picks up the new version on
next install.

## History

### Unreleased

- _add entries here as they merge_

### 2026-05-02

- **`agentforge-template-author` 0.1.0** — initial release. Guides
contributors through adding a new shipped template under
`packages/{core,platform}/src/templates/`. See
[`agentforge-template-author/CHANGELOG.md`](agentforge-template-author/CHANGELOG.md).
- **`agentforge-debug` 0.1.0** — initial release. Triages stuck or
failing pipeline runs from symptom to fix path. See
[`agentforge-debug/CHANGELOG.md`](agentforge-debug/CHANGELOG.md).
- **`agentforge-workflow` 0.1.0** — initial release. Walks users through
designing an AgentForge workflow and emits a complete `.agentforge/`
directory. See
[`agentforge-workflow/CHANGELOG.md`](agentforge-workflow/CHANGELOG.md).
14 changes: 9 additions & 5 deletions skills/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,17 +11,21 @@ Install all skills from this repo into your agent runtime:
npx skills add mandarnilange/agentforge
```

Or install a single skill by path:
The Vercel skills CLI scans the repo's `skills/` directory and installs every skill it finds. Want a single skill instead? Use the full GitHub tree URL:

```bash
npx skills add mandarnilange/agentforge/agentforge-workflow
npx skills add https://github.com/mandarnilange/agentforge/tree/main/skills/agentforge-workflow
```

## Available skills

| Skill | What it does |
|---|---|
| [`agentforge-workflow`](./agentforge-workflow/SKILL.md) | Walks a user through designing an AgentForge workflow — agents, pipeline, gates, loops, parallelism, wiring, nodes — and emits a complete `.agentforge/` directory. |
| Skill | Audience | What it does |
|---|---|---|
| [`agentforge-workflow`](./agentforge-workflow/SKILL.md) | End user | Walks through designing an AgentForge workflow — agents, pipeline, gates, loops, parallelism, wiring, nodes — and emits a complete `.agentforge/` directory. |
| [`agentforge-template-author`](./agentforge-template-author/SKILL.md) | Contributor | Guides adding a new shipped template under `packages/{core,platform}/src/templates/`. Scope check → package selection → manifest → tests → PR checklist. |
| [`agentforge-debug`](./agentforge-debug/SKILL.md) | Operator | Triages a stuck or failing pipeline run from symptom → state inspection → root-cause classification → fix path. |

Versions, release notes, and the bump policy live in [`CHANGELOG.md`](./CHANGELOG.md).

## Layout

Expand Down
15 changes: 15 additions & 0 deletions skills/agentforge-debug/CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
# `agentforge-debug` changelog

## 0.1.0 — 2026-05-02

Initial release.

- Triages a stuck or failing AgentForge pipeline run: symptom →
state inspection → root-cause classification → fix path.
- Read-first discipline: never propose state-mutating actions until run
state has been read from the state store / dashboard.
- Reference docs: symptom tree (symptom → cause → first command),
state model (every status enum and transition), fix paths (concrete
recipes per root cause).
- Hard rules: no SQLite-file deletes, no gate bypass without
authorisation, no validation skip flags, one fix at a time.
167 changes: 167 additions & 0 deletions skills/agentforge-debug/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,167 @@
---
name: agentforge-debug
description: >
Triages a stuck or failing AgentForge pipeline run. Walks the user from
symptom → state inspection → root-cause classification → fix path. Trigger
when the user reports an AgentForge run that is "stuck", "failing",
"looping", "blowing its budget", "not picking up at the gate", "schema
invalid", "won't dispatch", "node not picking it up", or asks "why is
pipeline X not finishing?". Do NOT trigger for general AgentForge design
questions (use `agentforge-workflow`) or for debugging unrelated AI
pipelines.
license: MIT
metadata:
author: mandarnilange
version: "0.1.0"
---

# AgentForge Pipeline Debug

You are helping a user triage an AgentForge pipeline run that is not
behaving. Run state lives in the SQLite/Postgres state store; every agent
step, gate decision, and node heartbeat is recorded there. The dashboard
visualises the same data. This skill walks from the symptom the user
reports to a concrete fix path.

The bias is **read first, change last**. Never propose mutations (re-run,
abort, force-approve gate) until you have read the current run state and
classified the root cause.

## When to use this skill

Trigger on any of:
- "Pipeline X is stuck / not finishing / hung"
- "My run is looping / hitting max iterations / failing schema validation"
- "Gate is not picking up / no one can approve"
- "Agent blew its budget / cost ceiling hit"
- "Node not picking up the job / scheduler isn't dispatching"
- "Dashboard says paused but nothing is happening"
- "Why is pipeline `<id>` failing?"

Do **not** trigger for:
- Workflow design questions (use `agentforge-workflow`).
- General LLM-app debugging unrelated to AgentForge.

## Reference material

Read on demand:

- `references/symptom-tree.md` — symptom → likely cause → first command to run
- `references/state-model.md` — pipeline / agent run / gate status enums and
what each transition means
- `references/fix-paths.md` — concrete recipes for each root cause
(re-run agent, edit prompt, swap node, raise budget, force gate, etc.)

The authoritative status enums live at:
- `packages/core/src/domain/models/pipeline-run.model.ts` — `PipelineRunStatus`
- `packages/core/src/domain/models/agent-run.model.ts` — `AgentRunRecordStatus`, `AgentRunExitReason`
- `packages/core/src/domain/models/gate.model.ts` — `GateStatus`

Read the status enum from the source if you need the exact list — do not
rely on memory.

## The flow

### 1. Capture the symptom precisely

Restate what you heard in one or two lines. Resist jumping to fixes.

Ask only what you need:
- The pipeline run ID (or pipeline name + project name).
- What the user expected to happen.
- What they observe (dashboard state, error message, last log line).
- When it started — was it working before? What changed?

Skip questions whose answers you can infer from the dashboard / state.

### 2. Read the run state

Run these in order. Stop reading the moment you have enough to classify:

```bash
# High-level run state — status, current phase, last update
agentforge get pipeline <run-id>

# Per-agent run records — which one is stuck/failed
agentforge get pipeline <run-id> --agents # if available; otherwise:
# Inspect via dashboard at http://localhost:3001/runs/<run-id>

# Logs for the suspect agent run
agentforge logs <run-id> --agent <agent-name>

# Node pool health — relevant if the symptom is "not dispatching"
agentforge get nodes
```

If the user does not have CLI access (e.g. on a worker host), point them at
the dashboard pages: `/runs/<id>`, `/runs/<id>/agents/<agent-name>`,
`/nodes`.

### 3. Classify the root cause

Use `references/symptom-tree.md`. The classifications:

| Class | One-line tell |
|---|---|
| **Gate hung** | `pipeline_run.status = paused_at_gate`, no recent gate decision |
| **Agent failed (LLM error)** | `agent_run.status = failed`, `error` field non-empty, `exitReason = error` |
| **Agent failed (schema invalid)** | `failed` with `error` mentioning Zod or `validate` step |
| **Agent failed (script step)** | `failed` with last step type `script` and non-zero `exitCode` |
| **Loop spinning** | Agent `running` with high `retryCount` or `loop.iteration ≈ maxIterations` |
| **Budget hit** | `failed` with `exitReason = budget-tokens` or `budget-cost` |
| **Timeout** | `failed` with `exitReason = timeout` |
| **Stuck dispatching** | `agent_run.status = pending`/`scheduled` for long, no node claim |
| **Node down** | `nodes` table shows the targeted node `offline` or stale heartbeat |
| **Node affinity miss** | `pending` forever; no node satisfies `nodeAffinity.required` |
| **Cancelled** | `pipeline_run.status = cancelled` — was someone else's `agentforge cancel <run-id>`? |

Tell the user the class in one sentence and cite the evidence ("agent
`developer` is `failed` with `exitReason: budget-cost`, run cost was
$0.62 vs the $0.60 ceiling").

### 4. Propose the fix path

Use `references/fix-paths.md` for the matched class. Present **the smallest
reversible action first**, then escalating options.

For example, on a schema-invalid failure:
1. Inspect the agent's last LLM output via dashboard or logs.
2. If the LLM hallucinated a missing field, the prompt likely needs a
tightening — propose a one-line addition to `prompts/<agent>.system.md`.
3. If the schema itself is wrong, point at the schema file and propose the
minimum change.
4. Re-run only the failed agent: `agentforge run --continue <run-id>`.

Do not skip to "abort and re-start the pipeline" unless the run is
unrecoverable.

### 5. Confirm before mutating state

Any action that changes shared state — approving a gate, cancelling a run,
re-running an agent, force-clearing a stuck claim — requires explicit user
confirmation. State your understanding, the proposed action, and the
expected outcome. Wait.

Read-only investigation does not need confirmation.

## Hard rules

- **Read before write.** Always inspect run state before proposing a fix.
- **Never tell the user to delete the SQLite file** or `TRUNCATE pipeline_runs`
to "clear stuck state". That is data loss. Use the proper CLI.
- **Never bypass a gate** without explicit user authorisation. Gates are the
audit trail.
- **Do not propose `--no-verify` or skip-validation flags** to bypass
schema failures. Fix the schema or the prompt.
- **Do not assume the dashboard is wrong.** When the dashboard and CLI
disagree, the state store is the source of truth — read it directly.
- **Propose one fix at a time.** Avoid stacked changes that make it
impossible to know which one solved the problem.

## What success looks like

- The user knows exactly which agent / phase / gate is the failure point.
- The user knows the root-cause class with cited evidence.
- The user has a concrete next action (CLI command, file edit, gate
decision) and an idea of what will tell them whether it worked.
- No state has been mutated without their explicit go-ahead.
Loading