Releases · HenryLach/taskplane

13 May 02:32

github-actions

v0.30.1

4055cbe

v0.30.1 Latest


Enhanced

Dashboard segment-level progress indicators (TP-197, #464): Multi-segment
task rows now show a horizontal pill row of per-segment status badges —
one pill per segment with a status icon (✅ succeeded · ⏳ running · ⬚
pending · ❌ failed · ⏸ stalled · ↷ skipped) plus the segment’s repo
ID. The currently-executing segment is visually emphasized. This closes
the operator-visibility gap introduced by TP-145’s .DONE suppression for
non-final segments: previously, multi-segment lanes sat “running” with
no segment-level signal during the suppression window, which made wave 2+
batches where all tasks were mid-segment appear stuck. With the pill row
in place, operators can see at a glance which segments have finished,
which is running, and which remain. The progress bar itself is unchanged
— TP-174 already made it segment-scoped via the V2 lane snapshot’s
per-segment counts; the new pill row provides the missing context that
makes the existing bar legible as “current segment’s progress.”

Backwards-compatibility: single-segment tasks render an empty pill row
(auto-collapsed grid sub-row), so the DOM and visual layout for
non-segmented batches are identical to before. The pill row lives in a
new grid row 3 of .task-row (cols 3–7), mirroring the
task-title-subtitle pattern from TP-485, and is intentionally placed
outside the .task-step cell so the existing @media (max-width: 900px)
rule that hides .task-step does not hide segment context on narrow
viewports. No dashboard/server.cjs change was required — the existing
API response already exposed batch.segments[], task.segmentIds, and
runtimeLaneSnapshots[*].segmentId.
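
As a rough illustration of the pill row described above, the sketch below maps each segment status to its icon and emits one pill per segment. The `SegmentPill` shape, the `renderPillRow` name, and the `" | "` separator are hypothetical; only the status set and the icons come from the note.

```typescript
// Hypothetical shapes for illustration; the real dashboard types may differ.
type SegmentStatus = "succeeded" | "running" | "pending" | "failed" | "stalled" | "skipped";

interface SegmentPill {
  repoId: string;
  status: SegmentStatus;
}

// Icon per status, as listed in the release note.
const STATUS_ICON: Record<SegmentStatus, string> = {
  succeeded: "✅",
  running: "⏳",
  pending: "⬚",
  failed: "❌",
  stalled: "⏸",
  skipped: "↷",
};

// Single-segment tasks get an empty row, so the layout collapses as before.
function renderPillRow(segments: SegmentPill[]): string {
  if (segments.length <= 1) return "";
  return segments.map((s) => `${STATUS_ICON[s.status]} ${s.repoId}`).join(" | ");
}
```

Returning an empty string for single-segment tasks mirrors the backwards-compatibility rule: nothing is rendered, so non-segmented batches look unchanged.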

Fixed

Multi-segment engine hardening (TP-196, #462 + #502 + #503 + #508):
closes four follow-up issues from the multi-repo task execution rollout
with a single coherent hardening pass against the multi-segment engine.
- .DONE authority guards (#462) — three defense-in-depth checks now
  refuse to honor a stale or premature .DONE in multi-segment tasks:
  (a) resolveTaskMonitorState (execution.ts) accepts an optional
  multiSegmentContext: { isFinalSegment, segmentId } parameter; when
  isFinalSegment === false and .DONE is present, Priority 1 is
  skipped and a WARN is logged via execLog; monitorLanes populates
  this context from task.segmentIds + task.activeSegmentId. (b)
  collectDoneTaskIdsForResume (resume.ts) now refuses to add a
  taskId to the done set when persisted segment records exist AND any
  segment is not succeeded/skipped — the task re-reconciles instead
  of silently being marked complete. (c) A new exported
  checkDoneAuthoritySafeguard helper (discovery.ts) emits a
  doctor-style console.warn when .DONE coexists with unchecked
  STATUS.md checkboxes during area scans. The pre-existing TP-135
  "keeps .DONE authoritative even when segment frontier is incomplete"
  test was updated to assert the inverted (post-#462) contract.
- SegmentScopeMode unification (#502 + #503) — promotes the
  FULL_TASK / SEGMENT_SCOPED decision to a first-class
  SegmentScopeMode = "FULL_TASK" | "SEGMENT_SCOPED" type in types.ts
  plus a computeSegmentScopeMode(stepSegmentMap, repoStepNumbers, currentRepoId, currentStepNumber) helper in lane-runner.ts. The
  iteration loop now derives both the authoritative segmentScopeMode
  and the legacy isSegmentScoped boolean alias from one call, and
  the segment-prompt injection block is gated on isSegmentScoped
  instead of the previous scattered stepSegmentMap && currentRepoId && repoStepNumbers && remainingSteps.length > 0 composite. New
  behavioural regression suite
  (extensions/tests/segment-scope-mode-prompt.test.ts, 9 tests
  across 4 describe blocks) mocks spawnAgent to capture the worker
  prompt + env + system prompt and verifies the FULL_TASK,
  SEGMENT_SCOPED, polyrepo single-segment, and legacy/partial-marker
  contracts end-to-end.
- Wasted-iteration elimination (#508) — lane-runner now performs
  an explicit pre-spawn segment-completion check between the existing
  remainingSteps.length === 0 guard and the totalIterations++
  increment, delegating to a new pure helper
  shouldSkipSpawnForCompleteSegment(statusContent, repoStepNumbers, currentRepoId). When every segment-scoped step for the active repo
  is already complete, the loop logs "Pre-spawn segment-completion check" and breaks before incurring a worker spawn. Behavioural
  test (extensions/tests/early-exit-segment-spawn-skip.test.ts)
  mocks agent-host.spawnAgent via mock.module and asserts
  spawnAgentCallCount === 0 for a fixture worktree whose checkboxes
  are pre-checked.
- Validation: typecheck / lint / format:check all exit 0. Fast
  test suite passes at 3678 / 0 fail / 1 skip — net +51 new tests
  spread across 3 new test files plus targeted updates to
  segment-scoped-lane-runner.test.ts, resume-segment-frontier.test.ts,
  and engine-runtime-v2-routing.test.ts (slice-window widening for
  the longer resolveTaskMonitorState body).
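
The first two .DONE authority guards can be sketched as two small predicates. This is a minimal sketch under stated assumptions: `SegmentRecord`, `shouldHonorDoneMarker`, and `canTreatTaskAsDone` are hypothetical names, and the real functions in execution.ts and resume.ts carry more context than is modelled here.

```typescript
// Hypothetical record shape; the persisted schema is not shown in the notes.
type SegmentRecordStatus = "pending" | "running" | "succeeded" | "failed" | "stalled" | "skipped";

interface SegmentRecord {
  segmentId: string;
  status: SegmentRecordStatus;
}

// Guard (a): a .DONE marker is ignored while a non-final segment is active.
function shouldHonorDoneMarker(
  donePresent: boolean,
  ctx?: { isFinalSegment: boolean; segmentId: string },
): boolean {
  if (!donePresent) return false;
  if (ctx && ctx.isFinalSegment === false) return false; // premature .DONE
  return true;
}

// Guard (b): when persisted segment records exist, all must be succeeded or skipped.
function canTreatTaskAsDone(segments: SegmentRecord[]): boolean {
  if (segments.length === 0) return true; // no persisted records: the guard does not apply
  return segments.every((s) => s.status === "succeeded" || s.status === "skipped");
}
```

A task that fails `canTreatTaskAsDone` would be left out of the done set and re-reconciled, as the note describes.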


10 May 23:10

github-actions

v0.30.0

853b709

v0.30.0

Fixed

Preflight cleanup feature now actually runs (TP-195): runOrchBatch
in extensions/taskplane/engine.ts referenced sweepStaleArtifacts,
formatPreflightSweep, rotateSupervisorLogs, and formatLogRotation
inside the preflight-cleanup try-block, but those identifiers were
never imported from ./cleanup.ts. At runtime the first reference
threw a ReferenceError that the enclosing catch-all swallowed, so
Layers 2–5 of preflight cleanup (age-based artifact sweep, supervisor
log rotation, telemetry size cap, prior-batch artifact cleanup) had
been silently a no-op since TP-065 / #221 (2024-09). The missing
imports were uncovered by the TP-191 typecheck script; this fix adds
them so the advertised cleanup runs on every batch. Regression test:
tests/lane-runner-v2.test.ts 3.10 asserts the four helpers are
imported from ./cleanup.ts.

max_worker_minutes config field is honored (TP-195): Lane-runner
config in executeLaneV2 (extensions/taskplane/execution.ts) was
reading a non-existent config.failure?.maxWorkerMinutes camelCase
alias — always undefined — silently ignoring any operator-set value
on OrchestratorConfig.failure.max_worker_minutes and always falling
through to the hard-coded 120-minute default. Fixed to read the
canonical snake_case field. Operators with max_worker_minutes
configured in .pi/taskplane-config.json will now have their
configured limit honored; default of 120 preserved when the field is
unset. Regression test: tests/lane-runner-v2.test.ts 3.9 asserts
the corrected accessor and absence of the typo.

Resume’s failed-task supervisor-alert path no longer crashes
(TP-195): When /orch-resume encountered a failed task during a
wave, the supervisor-alert emission block in resume.ts called
batchState.tasks.find(…), but OrchBatchRuntimeState has no
tasks field (only PersistedBatchState does). The runtime call
would throw TypeError: undefined.find is not a function. The
failed-task path was never covered by tests, so the crash never
surfaced. Replaced with a lookup against
laneForTask?.tasks.find(…)?.task; the lane-allocated ParsedTask payload carries the
same segmentIds/activeSegmentId data the alert needs.
Regression test: tests/resume-bug-fixes.test.ts 4.1.
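
The max_worker_minutes fix comes down to reading the snake_case field and keeping the 120-minute default. A minimal sketch, assuming a cut-down config type (`OrchestratorConfigSlice` and `resolveMaxWorkerMinutes` are hypothetical names, not the project's):

```typescript
// Minimal slice of the config; only the field named in the note is modelled.
interface OrchestratorConfigSlice {
  failure?: { max_worker_minutes?: number };
}

const DEFAULT_MAX_WORKER_MINUTES = 120; // default stated in the release note

// Reads the canonical snake_case field and falls back to the default when unset.
function resolveMaxWorkerMinutes(config: OrchestratorConfigSlice): number {
  return config.failure?.max_worker_minutes ?? DEFAULT_MAX_WORKER_MINUTES;
}
```

Reading a camelCase alias that the type does not declare would always yield `undefined`, which is why the old code silently fell through to the default.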

Internal

Code-quality gates active (TP-194)

The final task packet implementing the code-quality-gates spec
(docs/specifications/taskplane/code-quality-gates.md,
section 6.4). Flips three static-analysis checks from advisory to
required CI gates: Typecheck (new — tsc --noEmit against
extensions/tsconfig.ci.json), Lint (Biome) (was already wired
but ran with continue-on-error: true until now), and
Format check (Biome) (new — biome format --no-errors-on-unmatched .).
.github/workflows/ci.yml runs the three steps in order before the
existing Run tests step inside the single ci job, so any failure
short-circuits the rest of the pipeline. The existing required ci
branch-protection context already covers the new gates because a
step failure fails the whole job.

Reviewer-agent activation: the TP-188 quality-check verification
section in templates/agents/task-reviewer.md is now fully active.
The temporary activation note added in TP-191 (which previously
surfaced quality-check failures as Issues Found without downgrading
the verdict) is removed; failing typecheck/lint/format:check now
unconditionally downgrades APPROVE → REVISE during code review.

Documentation updates: AGENTS.md adds the three commands to the
validation checklist; docs/maintainers/release-process.md adds
them to the pre-release checks and pre-release checklist;
docs/maintainers/development-setup.md gets a new
"Code-quality gates (required for every PR)" section. The
long-missing lint:fix npm script (referenced by these docs) is
added to package.json.

Operator handoff (verification-only): no branch-protection
changes are required. After this PR merges, verify via
gh api repos/HenryLach/taskplane/branches/main/protection
that required_status_checks.contexts still contains ci (it
does today). If at some future point per-gate visibility in
branch protection is desirable, the follow-up is to split the
gates into separate jobs in ci.yml — out of scope for TP-194
per the spec's Tier-1.5 follow-up list.

Code-quality typecheck cleanup (TP-195): Fourth of four sequenced
packets implementing the code-quality-gates spec
(docs/specifications/taskplane/code-quality-gates.md).
Cleaned up the 264 typecheck errors that TP-191 surfaced when it
first made npm run typecheck runnable, so TP-194’s gate flip can
promote typecheck from advisory to a CI gate. Final state:
npm run typecheck exits 0 against extensions/tsconfig.ci.json at
the current strictness (strict: false, noImplicitAny: false).

Per-category breakdown of fixes (top categories at task start):
TS2339 (63) — property-not-exist; TS2741 (52) — mock-object missing
required fields; TS2345 (30) — caller-shape mismatch; TS2554 (23) —
signature drift; TS2367 (21) — unintentional comparison; TS2322 (19)
— assignment mismatch; TS2739 (12) — type missing properties; plus
smaller TS2769/TS2353/TS2352/TS2559/TS2347/TS2578/TS2304/TS2871/
TS2694 counts.

Source-side highlights: 4 latent bugs uncovered
and fixed (preflight-cleanup-feature no-op, max_worker_minutes
typo, resume failed-task crash, plus an extension.ts dashboard
change-detection that was reading non-existent fields and only ever
refreshing on currentTaskId — dropped the dead comparisons,
observable behavior unchanged); widened execLog’s extra
parameter from Record<string, string\|number\|boolean> to
Record<string, unknown> (callers were already passing arrays/
objects; template-string stringification preserved); re-exported
RuntimeRegistry from process-registry.ts; documented optional
batchId? field on OrchestratorConfig.orchestrator; added
EXEC_MISSING_TASK_FOLDER to ExecutionErrorCode; fixed
discriminated-union narrowing under non-strict mode by adding
reason?: undefined / error?: undefined to success branches;
switched loadProjectOverrides / migrateProjectOverrides /
loadJsonConfig / mergeProjectOverrides to
DeepPartial<TaskplaneConfig>; changed spawnMergeAgentV2 return
type to Promise<void> (fire-and-forget).

Test-side highlights:
introduced shared tests/helpers/mock-orchestrator-config.ts
factories (makeOrchestratorConfig/makeTaskRunnerConfig) that
wrap DEFAULT_*_CONFIG defaults from types.ts so test mocks stay
in sync with the runtime schema; added expect.unreachable() and
optional 2nd message arg to expect() (Vitest-compat surface that
~190 sites already relied on); fixed phase-narrowing in 9.x
launch-window suite via typed OrchBatchPhase casts; updated
LaneRunnerConfig / PersistedTaskRecord / MergeResult /
BatchSummaryData / MinimalBatchState / WorkspaceRoutingConfig
fixtures to match current schemas; replaced legacy RuntimeAgentStatus
"complete" with canonical "exited"; converted it(name, fn, 30000) calls to it(name, { timeout: 30000 }, fn) for node:test
compatibility; declared mock.fn<(…args: any[]) => any>() so
mockImplementation accepts non-undefined returns.

Anti-shortcut
policy enforced: zero new as any casts; zero @ts-expect-error
added (the 3 unused-directive errors were removed); only legitimate
2-step as unknown as X widenings with justifying comments; no
garbage default values — every mock-object missing-field fix uses
a schema-defined value.

Pi-shim extended ExtensionContext
from any to a structural interface so ctx.ui.custom<T>()
typechecks at 4 settings-tui.ts call sites; ui left optional so
thin test mocks (e.g., { model: null }) still satisfy the type.

After the pass: npm run typecheck exits 0;
npm run lint / npm run format:check unchanged from baseline;
test suite 3627 passing / 1 skipped / 0 failed (TP-191
baseline 3624 + 3 new TP-195 regression tests for the
fix-the-bug paths). Strict mode remains out of scope — the
strictness ratchet (enabling strict: true /
noImplicitAny: true) is a separate post-TP-194 follow-up. With
this packet merged, TP-194’s typecheck-gate flip CRITICAL
pre-condition (“npm run typecheck exits 0 on main”) is
satisfied.
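
One plausible reading of the "error?: undefined on success branches" fix is that declaring the field on every branch of the union lets callers read it without narrowing first. The sketch below illustrates that pattern only; `LoadResult` and both helpers are hypothetical and do not come from the codebase.

```typescript
// Hypothetical result type; field names other than `error` are illustrative.
type LoadResult =
  | { ok: true; value: number; error?: undefined }
  | { ok: false; error: string };

// `error` is declared on every branch, so it can be read without narrowing first.
function errorOf(result: LoadResult): string | undefined {
  return result.error;
}

// Narrowing on the `ok` discriminant still works as usual.
function describeResult(result: LoadResult): string {
  if (result.ok) return `value=${result.value}`;
  return `error=${result.error}`;
}
```

Without the optional `error` on the success branch, `result.error` in `errorOf` would be a property-does-not-exist error, since the property would only be present on one member of the union.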
Code-quality formatter adoption (TP-193): Third of four sequenced
packets implementing the code-quality-gates spec
(docs/specifications/taskplane/code-quality-gates.md
section 6.3). Enabled the Biome formatter and applied it once across
the entire codebase in a single mechanical commit. Formatter rules
pinned in biome.json per spec section 6.3.1: indentStyle: "tab",
indentWidth: 1, lineWidth: 100, lineEnding: "lf",
quoteStyle: "double", trailingCommas: "all", semicolons: "always",
arrowParentheses: "always". Format pass touched 161 files
(every TS/MJS file in scope) with cosmetic-only changes — line
wrapping, trailing-comma insertions, single-param arrow parens, and a
small number of quote-style switches where Biome's smart-quote rule
picked the alternative quote when the primary was inside the string.
No semantic changes. Test resilience prep preceded the format
pass in a separate commit: introduced expect().toContainNormalized()
(whitespace + bracket-padding + trailing-comma normalized substring
match) and updated 22 distinct source-grep test assertions across
~20 test files to use the helper or pre-normalize source before
matching; bumped...


10 May 14:32

github-actions

v0.29.2

35c2784

v0.29.2

Internal

Migrate peerDependencies from @mariozechner/* to @earendil-works/* and mark them optional: every pi update was printing four npm warn deprecated lines (one for each @mariozechner/pi-* package the new pi packages tell npm they are deprecating). Pi v0.74.0+ ships under the @earendil-works scope; the legacy @mariozechner peer-dep entries in taskplane's package.json made npm resolve the deprecated packages and surface the warnings on every install. Fix: switch the four pi-related entries in peerDependencies to @earendil-works/pi-coding-agent, @earendil-works/pi-tui, @earendil-works/pi-ai (kept @sinclair/typebox unchanged — not pi-managed); add a peerDependenciesMeta block marking all three pi packages optional: true so npm doesn't generate unmet-peer warnings for users in transitional setups, and so we don't tell users they MUST have pi globally installed at npm-install time (pi is the runtime, not a strict install-time peer).

No source-code changes. The import statements in extensions/*.ts continue to reference @mariozechner/* because Pi's runtime extension loader (<pi>/dist/core/extensions/loader.js) bundles aliases for BOTH scopes — imports resolve identically regardless of which scope name is used. Changing the import statements would break compat for users still on Pi < v0.74.0 (the alias map was added in v0.74.0). The peerDependencies declaration is informational only; the runtime resolution is unaffected by either approach.

No tests changed; no behavior changed. Tests pass at the v0.29.1 baseline (3624 passing / 1 skipped / 0 failed).


10 May 14:14

github-actions

v0.29.1

f2d11ed

v0.29.1

Fixed

Runtime V2 spawn failures now visible (TP-190, #561): Previously,
when a Runtime V2 lane spawn failed at the very first call site (Pi CLI
not findable, worktree provisioning error, branch collision), the lane
was not transitioned to failed. The engine continued polling
indefinitely, the dashboard showed green/running lanes that had no
actual worker process, orch_status() reported executing, and no
supervisor alert fired. Recovery required the operator to manually
tail engine-worker stderr — not in any documented diagnostic place.
This bug masked the operator-side impact of #559 (orchestrator IPC
crash) and #560 (@earendil-works rename), making both look like
hangs rather than immediate spawn errors. Fix has four parts:
(1) State transition — the existing per-task try/catch in
executeLaneV2 now tags the failed LaneTaskOutcome with
exitDiagnostic.classification = "spawn_failure" (a new
ExitClassification value alongside process_crash, stall_timeout,
etc.) and writes a synthetic terminal RuntimeLaneSnapshot so
monitorLanes resolves the lane to terminal state instead of looping
forever on the never-written snapshot file (the actual root cause of
the silent hang). (2) No-retry policy — spawn_failure is
intentionally NOT in TIER0_RETRYABLE_CLASSIFICATIONS because
spawn-stage errors are never transient; a defense-in-depth early
return in attemptWorkerCrashRetry produces an operator-friendly log
line. (3) IPC alert — the task-failure supervisor alert payload
now carries context.exitCategory (and a "Spawn failure: … escalate
immediately" summary line when applicable) so the supervisor playbook
can branch on spawn-stage failures and escalate without retrying. The
same wiring is mirrored in resume.ts for /orch-resume parity.
(4) Phase transition — when every task in a wave fails with
classification === "spawn_failure", batchState.phase transitions
from "executing" to "failed" (not "paused", because the operator
cannot un-stick spawn failures without changing something external).
Validation: 33 new behavioral + helper tests in
extensions/tests/spawn-failure-visibility.test.ts; full fast suite
3620 pass / 1 skipped / 0 failed (+33 from baseline 3587);
cross-platform Node 24 CI.
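
Parts (2) and (4) of the fix can be sketched as two small checks. The classification names come from the note; `TaskOutcome`, `RETRYABLE_CLASSIFICATIONS`, and which values it contains are assumptions made for illustration, standing in for the real TIER0_RETRYABLE_CLASSIFICATIONS.

```typescript
// Classification names come from the note; which values are retryable here is illustrative.
type ExitClassification = "spawn_failure" | "process_crash" | "stall_timeout";

// Stand-in for TIER0_RETRYABLE_CLASSIFICATIONS; the key point is that spawn_failure is absent.
const RETRYABLE_CLASSIFICATIONS: ReadonlySet<ExitClassification> = new Set<ExitClassification>([
  "process_crash",
]);

interface TaskOutcome {
  taskId: string;
  status: "succeeded" | "failed";
  classification?: ExitClassification;
}

function isRetryable(classification: ExitClassification): boolean {
  return RETRYABLE_CLASSIFICATIONS.has(classification);
}

// True when the wave has tasks and every one of them failed at spawn time.
function isAllSpawnFailedWave(outcomes: TaskOutcome[]): boolean {
  return (
    outcomes.length > 0 &&
    outcomes.every((o) => o.status === "failed" && o.classification === "spawn_failure")
  );
}
```

When `isAllSpawnFailedWave` holds, the batch phase would move from "executing" to "failed" rather than "paused", matching the reasoning in part (4).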

Sage post-merge fold: two important correctness issues caught by
sage's review of the merged TP-190 work, both folded before public
release. (a) Residual hang on snapshot-write failure: the spawn-
failure catch's writeLaneSnapshot() is best-effort, but the original
comment claimed a 30-second staleness fallback would recover — not
true when snap == null (no file at all), because snap?.updatedAt
is undefined so staleMs == 0 and the 30-second check never fires.
Snapshot-write failure (disk full, permission, transient I/O) would
have left sessionAlive = true indefinitely, reintroducing the same
#561 hang. Fix: added a null-snapshot tracker-age fallback in
resolveTaskMonitorState — when snap == null AND the tracker has
observed the task for ≥ 60s (past startup grace), consult the
registry liveness check instead of defaulting to alive. (b)
Multi-segment edge in isAllLanesSpawnFailedWave: the
succeededTaskIds.length !== 0 gate was the terminal completion
projection, populated only when a multi-segment task reaches its
final segment. A wave with a multi-segment task succeeding on
segment 1 (with continuation scheduled) plus a single-segment task
spawn-failing would have falsely tripped phase=failed, burying real
progress. Fix: the helper now optionally accepts laneResults and
scans per-task outcomes for any status === "succeeded". 4 new
sage-fold tests in spawn-failure-visibility.test.ts cover both
edge cases. Final test count: 3624 passing (+4 over the 3620
worker-batch baseline).
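
The residual-hang fix in (a) can be pictured as follows. This is a sketch under assumptions: `isSessionAlive` and the `registryAlive` callback are hypothetical, and treating a stale snapshot by consulting the registry is a guess at the existing fallback rather than something the note spells out. Only the 30-second and 60-second thresholds are taken from the text.

```typescript
interface Snapshot {
  updatedAt: number; // epoch milliseconds
}

const STALE_MS = 30_000; // staleness window named in the note
const STARTUP_GRACE_MS = 60_000; // tracker-age threshold named in the note

// Returns whether the session should still be treated as alive.
// `registryAlive` stands in for the registry liveness check (hypothetical callback).
function isSessionAlive(
  snap: Snapshot | null,
  trackedSinceMs: number,
  nowMs: number,
  registryAlive: () => boolean,
): boolean {
  if (snap === null) {
    // No snapshot file at all: staleness cannot be computed, so use tracker age instead.
    if (nowMs - trackedSinceMs >= STARTUP_GRACE_MS) return registryAlive();
    return true; // still inside startup grace
  }
  const staleMs = nowMs - snap.updatedAt;
  return staleMs < STALE_MS ? true : registryAlive();
}
```

The null branch is the important one: before the fix, a missing snapshot produced a staleness of zero, so the session defaulted to alive forever.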

Polyrepo end-to-end verified by operator in
C:/dev/tp-test-workspace. The bug class that previously left
lanes silently "running" forever now surfaces immediately as a
visible failure with a meaningful phase=failed and task-failure
IPC alert.


10 May 03:10

github-actions

v0.29.0

9a191ff

v0.29.0

New

supervisor_takeover(reason) tool (TP-187, #538): Non-destructive
escape hatch for misbehaving batches. Pauses the running wave, drains
every per-agent on-disk outbox for the current batch, and marks all
active lanes as terminated so any in-transit zombie alerts are dropped
before they reach the supervisor's user-message queue. Worktrees,
branches, batch state, and sessions are all preserved — distinct from
orch_abort, which kills sessions and deletes state. Use this when
the batch is producing alert spam or has hit a death-spiral pattern
but you may still want to resume the same batch. After takeover, call
orch_status() to inspect, then either orch_resume(force=true) to
continue (alert suppression is lifted automatically on resume) or
orch_abort() to escalate to destructive shutdown. Documented in
templates/agents/supervisor.md alongside the existing orch_* tool
surface, plus a new section codifying the lane-runner's text-reply
parser semantics (close keywords skip / let it fail / close /
abort / stop are only treated as session-close directives when
they appear in a reply under 30 characters; longer messages are
always treated as instructional re-prompts).
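
The text-reply parser rule lends itself to a short sketch. The keywords and the 30-character threshold come from the note; trimming, lower-casing, and substring matching are assumptions about details the note does not specify, and `classifyReply` is a hypothetical name.

```typescript
const CLOSE_KEYWORDS = ["skip", "let it fail", "close", "abort", "stop"]; // from the note
const MAX_DIRECTIVE_LENGTH = 30; // replies at or above this length are treated as re-prompts

type ReplyKind = "close" | "instruction";

// Short replies containing a close keyword end the session; everything else is a re-prompt.
function classifyReply(reply: string): ReplyKind {
  const text = reply.trim().toLowerCase();
  if (text.length >= MAX_DIRECTIVE_LENGTH) return "instruction";
  return CLOSE_KEYWORDS.some((keyword) => text.includes(keyword)) ? "close" : "instruction";
}
```

The length gate keeps a long steering message that happens to contain "stop" or "skip" from being misread as a session-close directive.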

Fixed

Zombie supervisor alerts after lane termination (TP-187, #538):
Previously, when a worker lane was killed (no-progress threshold or
hard-fail), 3–5 "wants to exit" alerts that the worker emitted before
termination remained in the supervisor's user-message queue and the
agent's on-disk outbox, where they could be re-discovered later.
None of the documented operator responses (steer, skip, let it fail, orch_abort, orch_skip_task) reliably drained either path.
Fix has three parts: (1) at every lane-termination decision point
(no-progress kill in lane-runner.ts, hard-fail in engine.ts), the
agent's outbox is now synchronously drained — pending *.msg.json
files are moved to outbox/processed/ and other pending files (e.g.,
segment-expansion-*.json) are renamed to .drained so they are
invisible to subsequent discovery scans; (2) the engine emits a new
lane-terminated IPC message to the supervisor process, which keys
a per-batch suppression filter (terminatedLanes /
terminatedAgents Maps) that drops any subsequent supervisor-alert
whose context.laneNumber or context.agentId matches before it
reaches pi.sendUserMessage; (3) the engine emits a complementary
lane-respawned IPC at the start of each executeLaneV2 invocation
so a fresh task on a re-allocated lane number lifts the suppression.
The filter is also cleared on orch_resume(), on a new batch start,
and on supervisor_takeover()-then-resume. Implementation: new
drainAgentOutbox helper in mailbox.ts, LaneTerminatedInfo /
LaneTerminatedCallback types in types.ts, callback threading
through engine.ts / execution.ts / resume.ts / engine-worker.ts,
and IPC + filter wiring in extension.ts.
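
The per-batch suppression filter can be sketched as a small class keyed by lane number and agent ID. The Map names mirror the note; the class name, method names, and the stored timestamp values are assumptions for illustration, not the shape used in extension.ts.

```typescript
interface AlertContext {
  laneNumber?: number;
  agentId?: string;
}

class AlertSuppressionFilter {
  // Names mirror terminatedLanes / terminatedAgents from the note; storing a timestamp is an assumption.
  private terminatedLanes = new Map<number, number>();
  private terminatedAgents = new Map<string, number>();

  onLaneTerminated(laneNumber: number, agentId: string, at: number = Date.now()): void {
    this.terminatedLanes.set(laneNumber, at);
    this.terminatedAgents.set(agentId, at);
  }

  // A fresh task on a re-allocated lane lifts the suppression for that lane.
  onLaneRespawned(laneNumber: number, agentId: string): void {
    this.terminatedLanes.delete(laneNumber);
    this.terminatedAgents.delete(agentId);
  }

  // Called on resume or when a new batch starts.
  clear(): void {
    this.terminatedLanes.clear();
    this.terminatedAgents.clear();
  }

  // True when the alert belongs to a terminated lane or agent and should be dropped.
  shouldDrop(ctx: AlertContext): boolean {
    if (ctx.laneNumber !== undefined && this.terminatedLanes.has(ctx.laneNumber)) return true;
    if (ctx.agentId !== undefined && this.terminatedAgents.has(ctx.agentId)) return true;
    return false;
  }
}
```

In this sketch an alert is checked with `shouldDrop` before it would be forwarded to the supervisor, so zombie alerts from a killed lane never reach the user-message queue.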
orch_resume(force=true) cannot reattach after orch_abort()
(TP-187, #539): executeAbort() deletes .pi/batch-state.json to
enforce its destructive contract, but the runtime registry, per-agent
manifests, lane snapshots, worktrees, and branches all survive. With
no batch-state.json, loadBatchState() returned null and force-resume
returned the generic "no batch found" error, forcing operators into
~15 minutes of manual git surgery (fast-forward feature branches,
push, remove worktrees, edit STATUS, re-orch_start) just to do what
force-resume should have done. Fix adds a small batch-meta.json
runtime artifact written at batch-start to
.pi/runtime/<batchId>/batch-meta.json capturing the wave plan and
the few non-recoverable scalars (baseBranch, orchBranch, mode,
startedAt, totalWaves). On force-resume after abort, when
loadBatchState() returns null, the new
reconstructBatchStateFromRuntime() helper deterministically rebuilds
a validator-compliant PersistedBatchState from the surviving
artifacts: most-recent batch dir wins by mtime (lex tiebreak),
batch-meta.json provides wave topology and orchBranch, worker
manifests provide per-lane allocation, and the existing reconciliation
pass re-detects succeeded tasks via .DONE markers and STATUS.md.
When required artifacts are missing or validation fails, force-resume
fails loud with a new resumeNoStateAfterAbort message that names
the missing artifact and recommends orch_start <PROMPT.md> as the
recovery path. The non-force orch_resume() path is unchanged.
orch_abort itself remains semantically destructive — only
force-resume reads from the surviving runtime artifacts.
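
The "most-recent batch dir wins by mtime (lex tiebreak)" rule can be sketched as a pure selection over directory entries. `BatchDirEntry` and `pickMostRecentBatchDir` are hypothetical names, and the tiebreak direction (lexicographically greatest name wins) is an assumption, since the note does not say which way the tie is broken.

```typescript
interface BatchDirEntry {
  name: string;
  mtimeMs: number;
}

// Picks the most recently modified batch directory; ties break on name.
// Assumption: on equal mtime, the lexicographically greatest name wins.
function pickMostRecentBatchDir(entries: BatchDirEntry[]): BatchDirEntry | null {
  const sorted = [...entries].sort((a, b) => {
    if (a.mtimeMs !== b.mtimeMs) return b.mtimeMs - a.mtimeMs;
    if (a.name === b.name) return 0;
    return a.name < b.name ? 1 : -1;
  });
  return sorted[0] ?? null;
}
```

Keeping the selection deterministic matters here because the reconstructed state must be the same on every force-resume attempt against the same artifacts.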
Worker said: is empty in early no-progress alerts (TP-187, #540):
When a worker exits an iteration without producing a visible assistant
message (a known failure mode in the death-spiral pattern), the
worker-exit-intercept alert sent to the supervisor showed
Worker said: "" — leaving the supervisor with no signal about why
the worker is stuck on the iterations where intervention could still
help. By the time the field has content, the worker is already at
no-progress count 3 (kill threshold). Fix has two parts: (1)
templates/agents/task-worker.md now requires a one-sentence reason
before any silent exit-with-no-progress, with concrete examples; (2)
lane-runner.ts falls back to walking the worker's events.jsonl
backward to find the most recent non-empty assistant_message
payload when the current turn produced no visible output, and tags
the alert with which source (current-turn,
events-jsonl-fallback, or empty-sentinel) produced the
Worker said: field. The 500-character truncation invariant is
preserved.
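
The fallback for the Worker said: field can be sketched as a backward scan over the JSONL content. The three source tags and the 500-character limit come from the note; the event shape (`type` and `text` fields) and the `resolveWorkerSaid` name are assumptions, since the events.jsonl schema is not shown.

```typescript
type WorkerSaidSource = "current-turn" | "events-jsonl-fallback" | "empty-sentinel";

const MAX_WORKER_SAID_CHARS = 500; // truncation invariant from the note

// `eventsJsonl` is the raw file content; the event shape ({ type, text }) is assumed.
function resolveWorkerSaid(
  currentTurn: string,
  eventsJsonl: string,
): { text: string; source: WorkerSaidSource } {
  const clip = (s: string): string => s.slice(0, MAX_WORKER_SAID_CHARS);
  if (currentTurn.trim() !== "") return { text: clip(currentTurn), source: "current-turn" };

  // Walk the log backward to find the most recent non-empty assistant message.
  for (const raw of eventsJsonl.split("\n").reverse()) {
    const line = raw.trim();
    if (line === "") continue;
    try {
      const event = JSON.parse(line) as { type?: unknown; text?: unknown };
      if (
        event.type === "assistant_message" &&
        typeof event.text === "string" &&
        event.text.trim() !== ""
      ) {
        return { text: clip(event.text), source: "events-jsonl-fallback" };
      }
    } catch {
      // Skip malformed lines rather than failing the alert.
    }
  }
  return { text: "", source: "empty-sentinel" };
}
```

Tagging the source lets the supervisor tell a fresh statement from one recovered from an earlier turn.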
taskplane doctor no longer shows empty parens for pi installed ()
(TP-189-C / TP-185 follow-up): pi prints its --version output to
stderr, but bin/taskplane.mjs's getVersion() only captured
stdout via execSync(... { stdio: 'pipe' }), so the doctor display was
✅ pi installed () with empty parens. The fix extracts getVersion
to bin/get-version.mjs (testable ESM helper) and switches it to
spawnSync with stdio: ['ignore', 'pipe', 'pipe']. The new logic
prefers stdout but falls back to stderr when stdout is empty, and
preserves the prior fail-safe contract (returns null on subprocess
failure or non-zero exit — critical so shell error text isn't surfaced
as a fake version string). Manual verification: taskplane doctor now
shows ✅ pi installed (0.73.0). 7 new behavioral tests in
extensions/tests/cli-doctor-version-capture.test.ts cover the
stdout-precedence, stderr-fallback, trim, and null-on-failure cases.
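
A minimal sketch of the version-capture logic, assuming Node's `spawnSync` as named in the note. The signature and the choice to return null when both streams are empty are assumptions; the actual helper in bin/get-version.mjs may differ.

```typescript
import { spawnSync } from "node:child_process";

// Returns the trimmed version string, or null on any subprocess failure.
function getVersion(command: string, args: string[] = ["--version"]): string | null {
  const result = spawnSync(command, args, {
    stdio: ["ignore", "pipe", "pipe"], // capture both stdout and stderr
    encoding: "utf8",
  });
  // Fail safe: never surface shell error text as a fake version string.
  if (result.error || result.status !== 0) return null;

  const stdout = (result.stdout ?? "").trim();
  const stderr = (result.stderr ?? "").trim();
  if (stdout !== "") return stdout; // stdout takes precedence
  if (stderr !== "") return stderr; // fall back to stderr when stdout is empty
  return null; // assumption: nothing printed is treated like a failure
}
```

Checking the exit status before reading either stream is what keeps a failing command's stderr from being reported as a version.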
isStepMarkedComplete death-spiral guard now skips fenced code
blocks (TP-189-A3 / TP-186 follow-up): the helper that powers the
review_step REFUSED guard scanned STATUS.md line-by-line for the
literal **Status:** ✅ Complete pattern. If a step's body documented
that pattern inside a fenced code block (legitimate authoring of the
format itself), the guard would false-positive and refuse a legitimate
code review. The helper now uses CommonMark-aware fence tracking: it
recognizes both ``` and ~~~ fences, tracks the opener char + length,
and only closes on a matching delimiter (same char, length ≥ opener
length, no trailing non-whitespace text), so mixed-delimiter examples
(a ``` line inside a ~~~ fence, for instance) do not close it.
Step-heading detection is gated on being outside a fence so
a ### Step N: line inside a code-block sample is treated as content
rather than a step boundary. 6 new unit tests cover the edge cases.
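
The fence-tracking rule can be sketched as a scanner that returns only the lines outside fenced code blocks. This is a simplified sketch, not the project's helper: `linesOutsideFences` and `isStatusCompleteOutsideFences` are hypothetical names, and details such as info-string validation and per-step scoping are left out.

```typescript
// Matches an opening fence: up to 3 spaces of indent, then 3+ backticks or tildes.
const FENCE_OPEN = /^ {0,3}(`{3,}|~{3,})/;
// A closing fence may only be followed by whitespace.
const FENCE_CLOSE = /^ {0,3}(`{3,}|~{3,})\s*$/;

// Returns the lines of `markdown` that sit outside fenced code blocks.
function linesOutsideFences(markdown: string): string[] {
  const outside: string[] = [];
  let fenceChar: string | null = null;
  let fenceLen = 0;
  for (const line of markdown.split("\n")) {
    if (fenceChar === null) {
      const open = FENCE_OPEN.exec(line);
      if (open) {
        const marker = open[1] ?? "";
        fenceChar = marker.charAt(0);
        fenceLen = marker.length;
      } else {
        outside.push(line);
      }
      continue;
    }
    // Inside a fence: close only on the same char with length >= the opener.
    const close = FENCE_CLOSE.exec(line);
    const marker = close?.[1] ?? "";
    if (marker !== "" && marker.charAt(0) === fenceChar && marker.length >= fenceLen) {
      fenceChar = null;
    }
  }
  return outside;
}

function isStatusCompleteOutsideFences(markdown: string): boolean {
  return linesOutsideFences(markdown).some((l) => l.includes("**Status:** ✅ Complete"));
}
```

Because step headings would be detected on the same filtered lines, a heading that appears inside a sample block is never treated as a step boundary.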

Docs

templates/agents/task-worker.md reconciled with TP-186's Order of
Operations rule (TP-189-E): two older sections were ambiguous when
read alongside the new review-gated step-completion contract from
TP-186. (1) Resume Algorithm step 6 ("all items checked → proceed to
next step") now splits behavior by Review Level: 0/1 may proceed,
but 2/3 must commit the implementation, call
review_step(type="code"), and only flip the per-step **Status:**
heading after APPROVE — with a cross-reference to the Order of
Operations section. (2) The Checkpoint Discipline / Git commits
example commit message changed from feat(TASK-ID): complete Step N — description to feat(TASK-ID): step N implementation, plus
explicit Level 0/1 vs Level 2/3 paragraphs and a separate
chore(TASK-ID): step N complete (code review APPROVE) example for
the post-APPROVE status-flip commit. Both edits reuse canonical
wording from the Order of Operations + Recovery Recipe sections so
the existing source-pattern tests in
extensions/tests/worker-step-completion-protocol.test.ts continue to
pass; a new test 1.4b regression-guards the Resume Algorithm wording.

skills/create-taskplane-task/SKILL.md Complexity Assessment
augmented with Per-Step Reviews vs. Consolidated Reviews
(Checkpoint Markers) sub-section (TP-189-E): the existing rubric
documents Review Levels 0–3 but not the second axis — how many
reviews fire for a given level. PROMPT authors had been discovering
this empirically (e.g., TP-186 fired only 2 reviews via checkpoint
markers vs the default ~8 it would have fired without them). The new
sub-section makes the choice explicit: per-step is the default and
right for independent multi-feature work; consolidation via
**Plan-review checkpoint** / **Code review checkpoint** markers
is appropriate for single-deliverable tasks where the steps are
mechanical applications of one design. TP-186 is referenced as the
canonical consolidation example.

Internal

DEFAULT_WORKER_USER_TOOLS migrated to a shared lightweight
constants module (TP-189-B / TP-184 follow-up): the literal
`"rea...


07 May 02:46

github-actions

v0.28.8

c3edf21

v0.28.8

Enhanced

Dashboard: task title row widened to span cols 3–6 (#485 follow-up):
The task title subtitle introduced in v0.28.7 was constrained to the
100px-wide task-id column, which truncated most realistic titles
('Reviewer runs typec...') after just a few words. Restructured the
task-row grid to two rows: row 1 holds the primary cells (icon, actions,
task-id, status, duration, progress, step+telemetry), row 2 holds the
optional task-title-subtitle spanning cols 3–6 (~486px combined width
vs. the previous 100px). Stops before col 7 (task-step + telemetry) so
step info and worker stats stay visible alongside the title. Row 2 is
auto-sized and collapses to 0 height when no subtitle exists, so tasks with null
taskTitle look identical to the v0.28.7 single-line layout. Display-only
change — cannot affect orchestrator correctness.

Fixed

Code reviewer now runs project quality checks (typecheck/lint/format)
before deciding (TP-188, #541): Previously, the reviewer agent spawned
via review_step(type="code") evaluated changes through behavioural
inspection only. It did NOT run npm run typecheck / npm run lint /
npm run format:check, so code with TypeScript strict-mode errors or
lint failures could receive APPROVE — those issues then surfaced at the
worker's Testing & Verification step, blocking the entire batch. In one
observed production batch, a code review returned APPROVE for a step
that subsequently failed npm run typecheck with 5 strict-mode errors
in the test code the reviewer had just signed off on. Cost of catching
these earlier: one extra typecheck per code review. Cost of NOT
catching them: the entire investment in the affected step plus all
dependents. Fix is a prompt-only change to
templates/agents/task-reviewer.md: a new Quality-check verification
section (between How You Work and Verdict Criteria) instructs the
reviewer to (1) discover commands by reading
.pi/taskplane-config.json taskRunner.testing.commands first, then
fall back to package.json scripts for typecheck / lint /
format:check; (2) run any matching commands using its existing bash
tool (no allowlist change required — bash is already in the default
reviewer tool list); (3) surface failures as Issues Found with
severity important; (4) downgrade an otherwise-APPROVE verdict to
REVISE when any quality check fails. Plan reviews skip the section
entirely (no code exists yet to typecheck). Skip-silently rule: if
neither config nor package.json yields a relevant command, the
reviewer notes the skip in the Summary and proceeds normally rather
than blocking on absent infrastructure. 10 new source-pattern tests in
extensions/tests/reviewer-quality-checks.test.ts lock the section
shape, the hybrid discovery wording, and the verdict-downgrade rule.
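
The discovery order and verdict rule above can be sketched as two small pure functions. This is an illustration only: the real change is a prompt section that the reviewer agent follows, not code. The config schema (a name-to-command map), the function names, and the choice to consult package.json only when the config yields nothing are all assumptions of this sketch.

```typescript
// Hypothetical shapes: the notes name taskRunner.testing.commands but do
// not show its schema, so a name -> command map is assumed here.
interface TaskplaneConfig {
  taskRunner?: { testing?: { commands?: Record<string, string> } };
}
interface PackageJson {
  scripts?: Record<string, string>;
}

// Script names the notes list for the package.json fallback.
const FALLBACK_SCRIPTS = ["typecheck", "lint", "format:check"];

// Returns the commands a reviewer would run. An empty array means
// "note the skip in the Summary and proceed normally".
function discoverQualityCommands(
  config: TaskplaneConfig | null,
  pkg: PackageJson | null,
): string[] {
  const configured = Object.values(config?.taskRunner?.testing?.commands ?? {});
  if (configured.length > 0) return configured;

  const scripts = pkg?.scripts ?? {};
  return FALLBACK_SCRIPTS.filter((name) => name in scripts).map(
    (name) => `npm run ${name}`,
  );
}

// Verdict rule: any failed quality check downgrades APPROVE to REVISE.
type Verdict = "APPROVE" | "REVISE";
function applyQualityGate(verdict: Verdict, failedChecks: number): Verdict {
  return verdict === "APPROVE" && failedChecks > 0 ? "REVISE" : verdict;
}
```

With no config and a package.json that defines only a lint script, the sketch yields a single `npm run lint` command; with neither source available it yields an empty list, which corresponds to the skip-silently rule.
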

Windows worktree cleanup falls back to cmd rd /s /q when git hits
MAX_PATH (TP-188, #543): On Windows with default
core.longpaths = false, git worktree remove --force fails with
error: failed to delete '<path>': Filename too long when the worktree
contains a deep node_modules tree (most non-trivial Node projects).
Previously the orchestrator surfaced cleanup-incomplete via the
post-integration banner but didn't recover — the operator had to run
cmd /c "rd /s /q <path>" manually. Observed twice during a single
recovery flow on the user's Windows machine working with
emailgistics-astro (700+ npm deps). Fix adds two new exported helpers
in extensions/taskplane/worktree.ts: isWindowsMaxPathError(stderr)
(returns true only on win32 + /filename too long/i) and
runWindowsCmdRd(absolutePath) (invokes execFileSync("cmd", ["/c", "rd", "/s", "/q", winPath]) with forward slashes normalized to
backslashes for native Windows path semantics). The fallback fires
inside removeWorktree's retry loop when the predicate matches,
prunes git's bookkeeping on success so post-removal verification
passes, and falls through to the existing terminal/retry classification
on failure (with both git's stderr and cmd's stderr enriched into the
thrown error so operators can diagnose). Other error classes (lock
errors, permission denied, generic git errors) are unaffected. INFO-level
logs via execLog("cleanup", "worktree", ...) make the rescue path
visible in operator-facing output. 17 new tests in
extensions/tests/windows-worktree-cleanup-fallback.test.ts cover the
source-pattern wiring (helpers exist; removeWorktree calls them;
git worktree prune runs on fallback success; failure path enriches
the error), platform guard (returns false on linux/macOS), regex
case-insensitivity, and runWindowsCmdRd's mocked invocation. Tests
are platform-agnostic via child_process mocking so the suite passes
on every CI runner.
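
A minimal sketch of the two helpers described above, kept pure so it runs on any OS. The explicit `platform` parameter and the `buildCmdRdInvocation` name are assumptions of this sketch. The notes describe `isWindowsMaxPathError(stderr)` as taking only stderr, and `runWindowsCmdRd` as passing these arguments to `execFileSync` directly.

```typescript
// True only on win32 AND when stderr mentions "Filename too long"
// (case-insensitive). The platform argument is injected for testability;
// that is an assumption of this sketch, not the documented signature.
function isWindowsMaxPathError(stderr: string, platform: string): boolean {
  return platform === "win32" && /filename too long/i.test(stderr);
}

// Builds the invocation described for runWindowsCmdRd:
//   cmd /c rd /s /q <path>
// with forward slashes normalized to backslashes. The function name is
// hypothetical; it only assembles the arguments and deletes nothing.
function buildCmdRdInvocation(absolutePath: string): { file: string; args: string[] } {
  const winPath = absolutePath.replace(/\//g, "\\");
  return { file: "cmd", args: ["/c", "rd", "/s", "/q", winPath] };
}
```

Separating the predicate from the invocation keeps the fallback narrow: lock errors, permission errors, and generic git failures never match the predicate, so they continue through the existing retry classification.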

Internal

CI workflow upgraded to Node 24 LTS: .github/workflows/ci.yml was
on Node 22; release.yml had moved to Node 24 LTS during the v0.28.5
release work but ci.yml was not aligned. Two motivations converged. First,
a divergence in mock.module() semantics between Node 22 and Node 24 caused
TP-188's runWindowsCmdRd unit tests to fail on Node 22 CI while passing
locally on Node 24 (Node 24 aliases bare child_process and
node:child_process; Node 22 treats them as separate modules). Second, the
bump completes TP-189's Cluster D ahead of schedule. Moving ci.yml to
Node 24 addresses both.

07 May 01:52

github-actions

v0.28.7

4664376

v0.28.7

Enhanced

Dashboard: lane parallelization visible in wave indicator chips (#484):
Wave chips at the top of the dashboard now group tasks by lane within each
wave, joining same-lane tasks with → (serial) and different-lane tasks
with | (parallel). For example, W1 [TP-165, TP-166, TP-168, TP-167]
now reads W1 [TP-165 → TP-166 | TP-168 | TP-167], immediately revealing
that TP-165→TP-166 are serialized on lane 1 while TP-168 and TP-167 run in
parallel on lanes 2 and 3. Within each lane, tasks render in execution
order (per lane.taskIds). Hover tooltip on the chip exposes the
expanded multi-line lane breakdown. Future waves with no lane assignment
data fall back to the previous flat comma-separated display — no
regression for unprovisioned waves.
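
The grouping rule can be sketched as follows. The `Lane` shape and the function name are hypothetical; the notes only mention `lane.taskIds` and the two separators.

```typescript
// Hypothetical lane shape; only lane.taskIds is named in the notes.
interface Lane {
  taskIds: string[];
}

// Formats one wave chip. Tasks on the same lane are joined with " → "
// (serial); lanes are joined with " | " (parallel). Without lane data the
// chip falls back to the flat comma-separated list.
function formatWaveChip(wave: number, taskIds: string[], lanes: Lane[] | null): string {
  if (!lanes || lanes.length === 0) {
    return `W${wave} [${taskIds.join(", ")}]`;
  }
  const inWave = new Set(taskIds);
  const groups = lanes
    // Keep execution order within each lane, restricted to this wave.
    .map((lane) => lane.taskIds.filter((id) => inWave.has(id)).join(" → "))
    .filter((group) => group.length > 0);
  return `W${wave} [${groups.join(" | ")}]`;
}
```

Called with wave 1, the four task IDs from the example, and three lanes holding `[TP-165, TP-166]`, `[TP-168]`, and `[TP-167]`, the sketch produces `W1 [TP-165 → TP-166 | TP-168 | TP-167]`.
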

Dashboard: task title under task ID in lane view (#485): The lane
view now renders the human-readable task title (extracted from PROMPT.md's
# Task: <ID> - <title> first-line heading) as a smaller muted subtitle
beneath the task ID. Operators no longer need to remember what each
TP-XXX is. The title is read once from PROMPT.md and cached for the
server's lifetime (PROMPT.md is immutable above the --- divider).
Surfaced via a new taskTitle field on /api/state task records;
frontend falls back gracefully when the field is null.
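
A minimal sketch of the read-once title extraction, assuming the heading format quoted above. The regular expression, the path-keyed cache, and the injected `readFile` callback are assumptions made for illustration; the notes do not show the server's actual code.

```typescript
// Matches a first-line heading of the form "# Task: <ID> - <title>".
const TASK_HEADING = /^#\s*Task:\s*(\S+)\s+-\s+(.+?)\s*$/;

// Cache keyed by file path (an assumption). A null entry records that the
// file had no parseable heading, so it is not re-read either.
const titleCache = new Map<string, string | null>();

function extractTaskTitle(
  promptPath: string,
  readFile: (path: string) => string,
): string | null {
  const cached = titleCache.get(promptPath);
  if (cached !== undefined) return cached;

  const firstLine = readFile(promptPath).split(/\r?\n/, 1)[0] ?? "";
  const title = TASK_HEADING.exec(firstLine)?.[2] ?? null;
  titleCache.set(promptPath, title);
  return title;
}
```

Returning `null` rather than throwing matches the described behavior where the frontend degrades gracefully when `taskTitle` is absent.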

07 May 00:11

github-actions

v0.28.6

2417cfc

v0.28.6

Fixed

Worker death-spiral when code review returns REVISE on a step already
marked Complete in STATUS (TP-186, #537, #542): Previously, if a worker
set a step's **Status:** ✅ Complete heading in STATUS.md before calling
review_step(type="code"), and the reviewer returned REVISE, the worker
was caught in a state contradiction (STATUS says done, reviewer says not)
with no recovery recipe in the prompt. The worker would loop through 3
no-progress iterations until the orchestrator's safety mechanism killed the
lane, writing off the entire batch and requiring ~15 min of manual git
surgery per occurrence. The fix is structural: (1) the base worker prompt
(templates/agents/task-worker.md) now contains an explicit Order of
Operations rule that mandates code review BEFORE marking a step
Complete, a Recovery Recipe for the case when the rule is
accidentally violated (revert STATUS → commit → handle REVISE through
the normal flow), and a Forbidden callout naming the death-spiral
anti-pattern alongside the existing "NEVER add, remove, or renumber
steps" family of MUST-NOT rules; (2) the engine-side review_step tool
now refuses to run on a step already marked **Status:** ✅ Complete,
returning a REFUSED verdict that points the worker at the Recovery
Recipe (the refusal applies to code and test review types only — plan
reviews fire pre-implementation and are correctly exempt). Until this
fix shipped, Review Level ≥ 2 was effectively unsafe in production. 14
new tests in worker-step-completion-protocol.test.ts. Supersedes the
partial diagnosis in #510. Thanks to the production batch
20260506T105850 against emailgistics-astro for surfacing the
reproducer.
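
The engine-side refusal can be sketched as a small guard. The function name, the `PROCEED` outcome, and the way the step's STATUS.md text is passed in are hypothetical; only the `REFUSED` verdict, the `**Status:** ✅ Complete` marker, and the plan-review exemption come from the notes above.

```typescript
type ReviewType = "plan" | "code" | "test";
type ReviewGuardResult =
  | { verdict: "REFUSED"; reason: string }
  | { verdict: "PROCEED" }; // hypothetical "carry on with the review" outcome

// Refuses code and test reviews for a step whose STATUS.md section is
// already marked Complete. Plan reviews run before implementation and
// are exempt.
function guardReviewStep(type: ReviewType, stepStatusText: string): ReviewGuardResult {
  const markedComplete = /\*\*Status:\*\*\s*✅\s*Complete/.test(stepStatusText);
  if (markedComplete && type !== "plan") {
    return {
      verdict: "REFUSED",
      reason:
        "Step is already marked Complete. Follow the Recovery Recipe: " +
        "revert STATUS, commit, then handle REVISE through the normal flow.",
    };
  }
  return { verdict: "PROCEED" };
}
```

Enforcing the rule in the tool as well as in the prompt means a worker that ignores the Order of Operations rule gets an explicit pointer to the recovery path instead of a contradictory REVISE.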

06 May 21:16

HenryLach

v0.28.5

eacbcd8

v0.28.5

Fixed

Pi no longer hard-blocks startup with a red error when run in directories
that aren't configured for Taskplane (TP-183, #523): Previously, launching
pi in any non-git directory (or any directory without
.pi/taskplane-workspace.yaml / taskplane-config.json) raised a verbose
red WORKSPACE_SETUP_REQUIRED notification at session_start. For users who
only want Taskplane in some projects, this was wrong UX. The
orchestrator now soft-fails the WORKSPACE_SETUP_REQUIRED case
specifically: no error notification, status line shows the quiet
🔀 Orchestrator · disabled (no taskplane config in workspace) indicator,
orchestrator commands stay gracefully disabled (and still explain why if
invoked, via the existing requireExecCtx guard). Configuration errors in
workspaces that ARE set up — WORKSPACE_FILE_PARSE_ERROR,
WORKSPACE_SCHEMA_INVALID, WORKSPACE_REPO_PATH_NOT_FOUND, and every
other WorkspaceConfigErrorCode — still surface loudly with the existing
red notify and ❌ startup failed (workspace config error) status line, so
real misconfigurations remain visible. Throw behavior of
buildExecutionContext is unchanged — only the display in extension.ts
changes. 6 new tests in orchestrator-startup-uxv2.test.ts (3 scenarios,
6 fine-grained checks). Thanks to @mwickens for the report.
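
The display split can be sketched as a mapping from error code to what the operator sees. The error codes and status-line strings are taken from the notes above; the `StartupDisplay` shape and the function name are hypothetical, and the union lists only the codes named here.

```typescript
// Subset of WorkspaceConfigErrorCode values named in the release notes.
type WorkspaceConfigErrorCode =
  | "WORKSPACE_SETUP_REQUIRED"
  | "WORKSPACE_FILE_PARSE_ERROR"
  | "WORKSPACE_SCHEMA_INVALID"
  | "WORKSPACE_REPO_PATH_NOT_FOUND";

// Hypothetical description of what the extension shows at session_start.
interface StartupDisplay {
  notifyError: boolean; // whether to raise the red error notification
  statusLine: string;
}

function startupDisplayFor(code: WorkspaceConfigErrorCode): StartupDisplay {
  if (code === "WORKSPACE_SETUP_REQUIRED") {
    // Soft-fail: the directory is simply not configured for Taskplane.
    return {
      notifyError: false,
      statusLine: "🔀 Orchestrator · disabled (no taskplane config in workspace)",
    };
  }
  // Every other code is a real misconfiguration and stays loud.
  return {
    notifyError: true,
    statusLine: "❌ startup failed (workspace config error)",
  };
}
```

Only the presentation branches on the code; the error is still thrown as before, which is why callers of `buildExecutionContext` are unaffected.
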

Workers can now invoke review_step, notify_supervisor,
escalate_to_supervisor, and request_segment_expansion (TP-184, #530):
Previously these engine-internal coordination tools were missing from the
worker's hardcoded --tools allowlist, so pi's tool gate filtered them out
at the worker. The visible symptom: plan/code/test reviews silently never
fired at Review Level ≥ 1, supervisor steering replies were impossible, and
multi-repo segment-expansion requests were unreachable. The bridge tools
are now always appended to the worker allowlist regardless of
taskRunner.worker.tools config; the user-tools default is unchanged.
Introduces three new exports in agent-host.ts: ENGINE_BRIDGE_TOOLS
(canonical list of engine-internal tools), DEFAULT_WORKER_USER_TOOLS
(the user-tools default literal), and buildWorkerToolsAllowlist()
(combines user portion with bridge tools, deduplicated). Called exactly
once at the lane-runner spawn site. Defense-in-depth: lane-runner now
warns (via logExecution) if any bridge tool is missing from the final
allowlist. 14 new tests in worker-tools-allowlist.test.ts.
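
A minimal sketch of the allowlist merge. The four bridge tool names come from the notes above; the comma-separated string format, the placeholder value of `DEFAULT_WORKER_USER_TOOLS`, and the `missingBridgeTools` helper are assumptions, since the notes do not show the actual literals or signatures.

```typescript
// Engine-internal tools named in the release notes.
const ENGINE_BRIDGE_TOOLS = [
  "review_step",
  "notify_supervisor",
  "escalate_to_supervisor",
  "request_segment_expansion",
];

// Placeholder only: the real default literal is not shown in the notes.
const DEFAULT_WORKER_USER_TOOLS = "read,write,edit,bash";

// Combines the user-configured tools with the bridge tools, deduplicated,
// preserving first-seen order.
function buildWorkerToolsAllowlist(userTools: string = DEFAULT_WORKER_USER_TOOLS): string {
  const user = userTools
    .split(",")
    .map((tool) => tool.trim())
    .filter((tool) => tool.length > 0);
  return Array.from(new Set([...user, ...ENGINE_BRIDGE_TOOLS])).join(",");
}

// Defense-in-depth check (hypothetical helper): lists any bridge tools
// absent from a final allowlist so the caller can log a warning.
function missingBridgeTools(allowlist: string): string[] {
  const present = new Set(allowlist.split(","));
  return ENGINE_BRIDGE_TOOLS.filter((tool) => !present.has(tool));
}
```

Appending the bridge tools after the user portion means a restrictive `taskRunner.worker.tools` setting can narrow the user tools without accidentally disabling reviews or supervisor escalation.
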

Preflight pi check no longer misreports cold-start timeouts as "Pi not
found" (TP-185): execCheck now classifies failures by mode (not-found,
timeout, exit-code, signal, unknown) instead of treating every
failure as missing-binary. The pi preflight now uses a 30s timeout (up
from 10s) and retries once on timeout to absorb cold-start variance — mise
shim resolution, Node bootstrap, AV process-launch scanning, and pi's own
startup can together exceed 10s on a fresh first run, especially on Windows.
Failure messages and hints are now tailored to the actual error kind
(e.g. timeouts say "Pi did not respond within 30s" + diagnostic guidance,
rather than the misleading "Install pi" hint). Detects missing binaries on
both POSIX (ENOENT/exit 127) and Windows (cmd.exe "is not recognized")
shells. 9 new tests in exec-check-error-classification.test.ts covering
every classification path including regression guards against the original
bug. Backward compatible: existing callers reading { ok, stdout } are
unaffected.
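
The classification and single retry can be sketched as below. The five mode names come from the notes above; the `ExecFailure` record, both function names, and the precedence of the checks are assumptions. In particular, timeout is tested before signal here because a timed-out child process is usually terminated by a signal, which would otherwise mask the real cause.

```typescript
type FailureKind = "not-found" | "timeout" | "exit-code" | "signal" | "unknown";

// Hypothetical normalized view of a failed process launch.
interface ExecFailure {
  errorCode?: string; // e.g. "ENOENT" or "ETIMEDOUT"
  exitCode?: number | null;
  signal?: string | null;
  timedOut?: boolean;
  stderr?: string;
}

function classifyExecFailure(failure: ExecFailure): FailureKind {
  if (failure.timedOut || failure.errorCode === "ETIMEDOUT") return "timeout";
  // POSIX: ENOENT or shell exit 127. Windows cmd.exe: "is not recognized".
  if (
    failure.errorCode === "ENOENT" ||
    failure.exitCode === 127 ||
    /is not recognized/i.test(failure.stderr ?? "")
  ) {
    return "not-found";
  }
  if (failure.signal) return "signal";
  if (typeof failure.exitCode === "number" && failure.exitCode !== 0) return "exit-code";
  return "unknown";
}

// Runs a check and retries exactly once if the first attempt timed out.
// `run` returns null on success or a failure record otherwise.
function checkWithOneRetry(run: () => ExecFailure | null): FailureKind | "ok" {
  let failure = run();
  if (failure && classifyExecFailure(failure) === "timeout") {
    failure = run();
  }
  return failure ? classifyExecFailure(failure) : "ok";
}
```

Keeping the classifier separate from the retry wrapper lets each failure kind carry its own message and hint, so a timeout is never reported with an "Install pi" suggestion.
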

Worker model/thinking/tools from preferences now flow through to spawned
workers (TP-181, #522): taskRunner.worker.{model,thinking,tools} in
preferences.json (and project config) are now threaded from
TaskRunnerConfig through executeWave → executeLaneV2 to the worker
subprocess via TASKPLANE_WORKER_{MODEL,THINKING,TOOLS} env vars. Previously
LaneRunnerConfig.workerModel was hardcoded to "" and the user-configured
worker model was silently ignored. Mirrors the existing reviewer pipeline
established in TP-160. New buildWorkerEnv() helper, plumbed through
engine.ts, execution.ts, and resume.ts. 11 new tests in
worker-model.test.ts. Thanks to @NerfEko.
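
A minimal sketch of the environment plumbing, assuming a plain settings object. The three environment variable names come from the notes above; the `WorkerSettings` shape, the signature shown for `buildWorkerEnv`, and the decision to omit unset values rather than export empty strings are assumptions.

```typescript
// Hypothetical view of taskRunner.worker.{model,thinking,tools}.
interface WorkerSettings {
  model?: string;
  thinking?: string;
  tools?: string;
}

type Env = Record<string, string | undefined>;

// Copies the base environment and adds TASKPLANE_WORKER_* variables for
// every setting that is actually configured.
function buildWorkerEnv(worker: WorkerSettings, baseEnv: Env = {}): Env {
  const env: Env = { ...baseEnv };
  if (worker.model) env.TASKPLANE_WORKER_MODEL = worker.model;
  if (worker.thinking) env.TASKPLANE_WORKER_THINKING = worker.thinking;
  if (worker.tools) env.TASKPLANE_WORKER_TOOLS = worker.tools;
  return env;
}
```

Leaving a variable unset when the preference is empty lets the worker subprocess fall back to its own default instead of receiving an empty model name, which is the failure mode the hardcoded `""` produced before this fix.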

Contributors

mwickens and NerfEko

21 Apr 00:31

HenryLach

v0.28.4

f4770ee

v0.28.4

Fixed

Settings TUI: Agent Extensions section shows "Toggle extensions per agent type" instead of generic "Read-only collection/record fields" label.
