Releases: HenryLach/taskplane
v0.30.1
Enhanced
-
Dashboard segment-level progress indicators (TP-197, #464): Multi-segment
task rows now show a horizontal pill row of per-segment status badges —
one pill per segment with a status icon (✅ succeeded · ⏳ running · ⬚
pending · ❌ failed · ⏸ stalled · ↷ skipped) plus the segment’s repo
ID. The currently-executing segment is visually emphasized. This closes
the operator-visibility gap introduced by TP-145’s .DONE suppression for
non-final segments: previously, multi-segment lanes sat “running” with
no segment-level signal during the suppression window, which made wave 2+
batches where all tasks were mid-segment appear stuck. With the pill row
in place, operators can see at a glance which segments have finished,
which is running, and which remain. The progress bar itself is unchanged
— TP-174 already made it segment-scoped via the V2 lane snapshot’s
per-segment counts; the new pill row provides the missing context that
makes the existing bar legible as “current segment’s progress.”
Backwards-compatibility: single-segment tasks render an empty pill row
(auto-collapsed grid sub-row), so the DOM and visual layout for
non-segmented batches are identical to before. The pill row lives in a
new grid row 3 of .task-row (cols 3–7), mirroring the
task-title-subtitle pattern from TP-485, and is intentionally placed
outside the .task-step cell so the existing @media (max-width: 900px)
rule that hides .task-step does not hide segment context on narrow
viewports. No dashboard/server.cjs change was required: the existing
API response already exposed batch.segments[], task.segmentIds, and
runtimeLaneSnapshots[*].segmentId.
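The status-to-icon mapping behind the pill row can be sketched as follows. This is a minimal illustration rather than the dashboard's actual code: `SegmentStatus`, `SegmentPill`, and `renderPillRow` are hypothetical names, and only the six icons, the per-segment repo ID, and the empty row for single-segment tasks come from the notes above.

```typescript
// Hypothetical status union; the six values mirror the icons listed in the notes.
type SegmentStatus = "succeeded" | "running" | "pending" | "failed" | "stalled" | "skipped";

// Hypothetical pill model: one entry per segment of a task.
interface SegmentPill {
  repoId: string;
  status: SegmentStatus;
  active: boolean; // true for the currently-executing segment
}

const STATUS_ICON: Record<SegmentStatus, string> = {
  succeeded: "✅",
  running: "⏳",
  pending: "⬚",
  failed: "❌",
  stalled: "⏸",
  skipped: "↷",
};

// Render one text pill per segment. Single-segment tasks yield an empty row,
// matching the backwards-compatibility behaviour described in the notes.
function renderPillRow(pills: SegmentPill[]): string[] {
  if (pills.length <= 1) return [];
  return pills.map((p) => {
    const label = `${STATUS_ICON[p.status]} ${p.repoId}`;
    return p.active ? `[${label}]` : label; // brackets stand in for visual emphasis
  });
}
```

Keeping the mapping in a `Record<SegmentStatus, string>` means the compiler flags any status that lacks an icon.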
Fixed
-
Multi-segment engine hardening (TP-196, #462 + #502 + #503 + #508):
closes four follow-up issues from the multi-repo task execution rollout
with a single coherent hardening pass against the multi-segment engine.
-
.DONE authority guards (#462): three defense-in-depth checks now
refuse to honor a stale or premature .DONE in multi-segment tasks:
(a) resolveTaskMonitorState (execution.ts) accepts an optional
multiSegmentContext: { isFinalSegment, segmentId } parameter; when
isFinalSegment === false and .DONE is present, Priority 1 is
skipped and a WARN is logged via execLog; monitorLanes populates
this context from task.segmentIds + task.activeSegmentId. (b)
collectDoneTaskIdsForResume (resume.ts) now refuses to add a
taskId to the done set when persisted segment records exist AND any
segment is not succeeded / skipped; the task re-reconciles instead
of silently being marked complete. (c) A new exported
checkDoneAuthoritySafeguard helper (discovery.ts) emits a
doctor-style console.warn when .DONE coexists with unchecked
STATUS.md checkboxes during area scans. The pre-existing TP-135
"keeps .DONE authoritative even when segment frontier is incomplete"
test was updated to assert the inverted (post-#462) contract.
-
SegmentScopeMode unification (#502 + #503): promotes the
FULL_TASK / SEGMENT_SCOPED decision to a first-class
SegmentScopeMode = "FULL_TASK" | "SEGMENT_SCOPED" type in types.ts
plus a
computeSegmentScopeMode(stepSegmentMap, repoStepNumbers, currentRepoId, currentStepNumber)
helper in lane-runner.ts. The
iteration loop now derives both the authoritative segmentScopeMode
and the legacy isSegmentScoped boolean alias from one call, and
the segment-prompt injection block is gated on isSegmentScoped
instead of the previous scattered
stepSegmentMap && currentRepoId && repoStepNumbers && remainingSteps.length > 0
composite. New behavioural regression suite
(extensions/tests/segment-scope-mode-prompt.test.ts, 9 tests
across 4 describe blocks) mocks spawnAgent to capture the worker
prompt + env + system prompt and verifies the FULL_TASK,
SEGMENT_SCOPED, polyrepo single-segment, and legacy/partial-marker
contracts end-to-end.
-
Wasted-iteration elimination (#508): lane-runner now performs
an explicit pre-spawn segment-completion check between the existing
remainingSteps.length === 0 guard and the totalIterations++
increment, delegating to a new pure helper
shouldSkipSpawnForCompleteSegment(statusContent, repoStepNumbers, currentRepoId).
When every segment-scoped step for the active repo
is already complete, the loop logs "Pre-spawn segment-completion check"
and breaks before incurring a worker spawn. Behavioural
test (extensions/tests/early-exit-segment-spawn-skip.test.ts)
mocks agent-host.spawnAgent via mock.module and asserts
spawnAgentCallCount === 0 for a fixture worktree whose checkboxes
are pre-checked.
-
Validation: typecheck / lint / format:check all exit 0. Fast
test suite passes at 3678 / 0 fail / 1 skip: net +51 new tests
spread across 3 new test files plus targeted updates to
segment-scoped-lane-runner.test.ts, resume-segment-frontier.test.ts,
and engine-runtime-v2-routing.test.ts (slice-window widening for
the longer resolveTaskMonitorState body).
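Guard (b) above, the resume-time refusal to trust .DONE while segments are unfinished, can be sketched as a pure function. This is an assumption-laden illustration, not the code in resume.ts: `SegmentRecord` and `collectTrustedDoneTaskIds` are hypothetical names, and only the "succeeded" / "skipped" rule comes from the notes.

```typescript
// Hypothetical shape of a persisted segment record; only the status values
// "succeeded" and "skipped" are taken from the release notes.
interface SegmentRecord {
  taskId: string;
  status: string;
}

// Returns the task IDs that may be treated as done on resume.
// A .DONE marker is trusted only when the task has no persisted segment
// records, or when every record is "succeeded" or "skipped".
function collectTrustedDoneTaskIds(
  taskIdsWithDoneMarker: string[],
  segmentRecords: SegmentRecord[],
): Set<string> {
  const done = new Set<string>();
  for (const taskId of taskIdsWithDoneMarker) {
    const records = segmentRecords.filter((r) => r.taskId === taskId);
    // every() is true for an empty list, so tasks without records pass.
    const allSettled = records.every(
      (r) => r.status === "succeeded" || r.status === "skipped",
    );
    if (allSettled) done.add(taskId);
  }
  return done;
}
```

A task that is left out of the returned set is simply re-reconciled on resume instead of being silently marked complete.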
v0.30.0
Fixed
- Preflight cleanup feature now actually runs (TP-195): runOrchBatch
in extensions/taskplane/engine.ts referenced sweepStaleArtifacts,
formatPreflightSweep, rotateSupervisorLogs, and formatLogRotation
inside the preflight-cleanup try-block, but those identifiers were
never imported from ./cleanup.ts. At runtime the first reference
threw a ReferenceError that the enclosing catch-all swallowed, so
Layers 2–5 of preflight cleanup (age-based artifact sweep, supervisor
log rotation, telemetry size cap, prior-batch artifact cleanup) had
silently been a no-op since TP-065 / #221 (․2024-09). The missing
imports were uncovered by the TP-191 typecheck script; this fix adds
them so the advertised cleanup runs on every batch. Regression test:
tests/lane-runner-v2.test.ts 3.10 asserts the four helpers are
imported from ./cleanup.ts.
- max_worker_minutes config field is honored (TP-195): Lane-runner
config in executeLaneV2 (extensions/taskplane/execution.ts) was
reading a non-existent config.failure?.maxWorkerMinutes camelCase
alias (always undefined), silently ignoring any operator-set value
on OrchestratorConfig.failure.max_worker_minutes and always falling
through to the hard-coded 120-minute default. Fixed to read the
canonical snake_case field. Operators with max_worker_minutes
configured in .pi/taskplane-config.json will now have their
configured limit honored; default of 120 preserved when the field is
unset. Regression test: tests/lane-runner-v2.test.ts 3.9 asserts
the corrected accessor and absence of the typo.
- Resume’s failed-task supervisor-alert path no longer crashes
(TP-195): When /orch-resume encountered a failed task during a
wave, the supervisor-alert emission block in resume.ts called
batchState.tasks.find(…), but OrchBatchRuntimeState has no
tasks field (only PersistedBatchState does). The runtime call
would throw TypeError: undefined.find is not a function. The
failed-task path was never covered by tests, so the crash never
surfaced. Replaced with a lookup against
laneForTask?.tasks.find(…)?.task; the lane-allocated ParsedTask
payload carries the same segmentIds / activeSegmentId data the alert needs.
Regression test: tests/resume-bug-fixes.test.ts 4.1.
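The max_worker_minutes fix comes down to reading the right key. A minimal sketch, assuming a trimmed-down config shape: `OrchestratorConfigSketch` and `resolveMaxWorkerMinutes` are hypothetical names, while the snake_case field and the 120-minute default come from the notes above.

```typescript
// Hypothetical, trimmed-down config shape; only failure.max_worker_minutes
// and the 120-minute default are taken from the release notes.
interface OrchestratorConfigSketch {
  failure?: { max_worker_minutes?: number };
}

const DEFAULT_MAX_WORKER_MINUTES = 120;

function resolveMaxWorkerMinutes(config: OrchestratorConfigSketch): number {
  // Read the canonical snake_case field. A camelCase alias such as
  // maxWorkerMinutes does not exist on the config, so reading it would
  // always produce undefined and fall through to the default.
  return config.failure?.max_worker_minutes ?? DEFAULT_MAX_WORKER_MINUTES;
}
```

With a typed config and a passing typecheck, the misspelled accessor would be a compile-time error (TS2339) rather than a silent fallback.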
Internal
-
Code-quality gates active (TP-194)
The final task packet implementing the code-quality-gates spec
(docs/specifications/taskplane/code-quality-gates.md,
section 6.4). Flips three static-analysis checks from advisory to
required CI gates: Typecheck (new; tsc --noEmit against
extensions/tsconfig.ci.json), Lint (Biome) (was already wired
but ran with continue-on-error: true until now), and
Format check (Biome) (new; biome format --no-errors-on-unmatched .).
.github/workflows/ci.yml runs the three steps in order before the
existing Run tests step inside the single ci job, so any failure
short-circuits the rest of the pipeline. The existing required ci
branch-protection context already covers the new gates because a
step failure fails the whole job.
Reviewer-agent activation: the TP-188 quality-check verification
section in templates/agents/task-reviewer.md is now fully active.
The temporary activation note added in TP-191 (which previously
surfaced quality-check failures as Issues Found without downgrading
the verdict) is removed; failing typecheck/lint/format:check now
unconditionally downgrades APPROVE → REVISE during code review.
Documentation updates: AGENTS.md adds the three commands to the
validation checklist; docs/maintainers/release-process.md adds
them to the pre-release checks and pre-release checklist;
docs/maintainers/development-setup.md gets a new
"Code-quality gates (required for every PR)" section. The
long-missing lint:fix npm script (referenced by these docs) is
added to package.json.
Operator handoff (verification-only): no branch-protection
changes are required. After this PR merges, verify via
gh api repos/HenryLach/taskplane/branches/main/protection
that required_status_checks.contexts still contains ci (it
does today). If at some future point per-gate visibility in
branch protection is desirable, the follow-up is to split the
gates into separate jobs in ci.yml, which is out of scope for TP-194
per the spec's Tier-1.5 follow-up list.
-
Code-quality typecheck cleanup (TP-195): Fourth of four sequenced
packets implementing the code-quality-gates spec
(docs/specifications/taskplane/code-quality-gates.md).
Cleaned up the 264 typecheck errors that TP-191 surfaced when it
first made npm run typecheck runnable, so TP-194’s gate flip can
promote typecheck from advisory to a CI gate. Final state:
npm run typecheck exits 0 against extensions/tsconfig.ci.json at
the current strictness (strict: false, noImplicitAny: false).
Per-category breakdown of fixes (top categories at task start):
TS2339 (63): property-not-exist; TS2741 (52): mock-object missing
required fields; TS2345 (30): caller-shape mismatch; TS2554 (23):
signature drift; TS2367 (21): unintentional comparison; TS2322 (19):
assignment mismatch; TS2739 (12): type missing properties; plus
smaller TS2769/TS2353/TS2352/TS2559/TS2347/TS2578/TS2304/TS2871/
TS2694 counts. Source-side highlights: 4 latent bugs uncovered
and fixed (preflight-cleanup-feature no-op, max_worker_minutes
typo, resume failed-task crash, plus an extension.ts dashboard
change-detection that was reading non-existent fields and only ever
refreshing on currentTaskId; the dead comparisons were dropped,
observable behavior unchanged); widened execLog’s extra
parameter from Record<string, string | number | boolean> to
Record<string, unknown> (callers were already passing arrays/
objects; template-string stringification preserved); re-exported
RuntimeRegistry from process-registry.ts; documented optional
batchId? field on OrchestratorConfig.orchestrator; added
EXEC_MISSING_TASK_FOLDER to ExecutionErrorCode; fixed
discriminated-union narrowing under non-strict mode by adding
reason?: undefined / error?: undefined to success branches;
switched loadProjectOverrides / migrateProjectOverrides /
loadJsonConfig / mergeProjectOverrides to
DeepPartial<TaskplaneConfig>; changed spawnMergeAgentV2 return
type to Promise<void> (fire-and-forget). Test-side highlights:
introduced shared tests/helpers/mock-orchestrator-config.ts
factories (makeOrchestratorConfig / makeTaskRunnerConfig) that
wrap DEFAULT_*_CONFIG defaults from types.ts so test mocks stay
in sync with the runtime schema; added expect.unreachable() and
optional 2nd message arg to expect() (Vitest-compat surface that
~190 sites already relied on); fixed phase-narrowing in 9.x
launch-window suite via typed OrchBatchPhase casts; updated
LaneRunnerConfig / PersistedTaskRecord / MergeResult /
BatchSummaryData / MinimalBatchState / WorkspaceRoutingConfig
fixtures to match current schemas; replaced legacy RuntimeAgentStatus
"complete" with canonical "exited"; converted it(name, fn, 30000)
calls to it(name, { timeout: 30000 }, fn) for node:test
compatibility; declared mock.fn<(...args: any[]) => any>() so
mockImplementation accepts non-undefined returns. Anti-shortcut
policy enforced: zero new as any casts; zero @ts-expect-error
added (the 3 unused-directive errors were removed); only legitimate
2-step as unknown as X widenings with justifying comments; no
garbage default values: every mock-object missing-field fix uses
a schema-defined value. Pi-shim extended ExtensionContext
from any to a structural interface so ctx.ui.custom<T>()
typechecks at 4 settings-tui.ts call sites; ui left optional so
thin test mocks (e.g., { model: null }) still satisfy the type.
After the pass: npm run typecheck exits 0;
npm run lint / npm run format:check unchanged from baseline;
test suite 3627 passing / 1 skipped / 0 failed (TP-191
baseline 3624 + 3 new TP-195 regression tests for the
fix-the-bug paths). Strict mode remains out of scope: the
strictness ratchet (enabling strict: true /
noImplicitAny: true) is a separate post-TP-194 follow-up. With
this packet merged, TP-194’s typecheck-gate flip CRITICAL
pre-condition (“npm run typecheck exits 0 on main”) is
satisfied.
-
Code-quality formatter adoption (TP-193): Third of four sequenced
packets implementing the code-quality-gates spec
(docs/specifications/taskplane/code-quality-gates.md,
section 6.3). Enabled the Biome formatter and applied it once across
the entire codebase in a single mechanical commit. Formatter rules
pinned in biome.json per spec section 6.3.1: indentStyle: "tab",
indentWidth: 1, lineWidth: 100, lineEnding: "lf",
quoteStyle: "double", trailingCommas: "all", semicolons: "always",
arrowParentheses: "always". Format pass touched 161 files
(every TS/MJS file in scope) with cosmetic-only changes: line
wrapping, trailing-comma insertions, single-param arrow parens, and a
small number of quote-style switches where Biome's smart-quote rule
picked the alternative quote when the primary was inside the string.
No semantic changes. Test resilience prep preceded the format
pass in a separate commit: introduced expect().toContainNormalized()
(whitespace + bracket-padding + trailing-comma normalized substring
match) and updated 22 distinct source-grep test assertions across
~20 test files to use the helper or pre-normalize source before
matching; bumped...
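The idea behind the normalized substring match used for the source-grep assertions can be sketched as below. The helper's real normalization rules are not shown in these notes, so `normalizeSource` and `containsNormalized` are hypothetical and cover only the three aspects named above: whitespace, bracket padding, and trailing commas.

```typescript
// Hypothetical normalizer: collapses whitespace runs, removes padding just
// inside brackets, and drops a trailing comma before a closing bracket.
function normalizeSource(text: string): string {
  return text
    .replace(/\s+/g, " ") // newlines, tabs, and space runs become one space
    .replace(/([(\[{]) /g, "$1") // "( a" -> "(a"
    .replace(/ ([)\]}])/g, "$1") // "a )" -> "a)"
    .replace(/,([)\]}])/g, "$1") // "a,)" -> "a)"
    .trim();
}

// Substring match that ignores formatter-only differences on both sides.
function containsNormalized(haystack: string, needle: string): boolean {
  return normalizeSource(haystack).includes(normalizeSource(needle));
}
```

An assertion written against the compact form `foo(a, b)` then still matches after a formatter wraps the call across lines and adds a trailing comma.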
v0.29.2
Internal
-
Migrate peerDependencies from @mariozechner/* to @earendil-works/* and mark them optional: every pi update was printing four npm warn deprecated lines (one for each @mariozechner/pi-* package the new pi packages tell npm they are deprecating). Pi v0.74.0+ ships under the @earendil-works scope; the legacy @mariozechner peer-dep entries in taskplane's package.json made npm resolve the deprecated packages and surface the warnings on every install. Fix: switch the four pi-related entries in peerDependencies to @earendil-works/pi-coding-agent, @earendil-works/pi-tui, @earendil-works/pi-ai (kept @sinclair/typebox unchanged since it is not pi-managed); add a peerDependenciesMeta block marking all three pi packages optional: true so npm doesn't generate unmet-peer warnings for users in transitional setups, and so we don't tell users they MUST have pi globally installed at npm-install time (pi is the runtime, not a strict install-time peer).
No source-code changes. The import statements in extensions/*.ts continue to reference @mariozechner/* because Pi's runtime extension loader (<pi>/dist/core/extensions/loader.js) bundles aliases for BOTH scopes, so imports resolve identically regardless of which scope name is used. Changing the import statements would break compat for users still on Pi < v0.74.0 (the alias map was added in v0.74.0). The peerDependencies declaration is informational only; the runtime resolution is unaffected by either approach.
No tests changed; no behavior changed. Tests pass at the v0.29.1 baseline (3624 passing / 1 skipped / 0 failed).
v0.29.1
Fixed
-
Runtime V2 spawn failures now visible (TP-190, #561): Previously,
when a Runtime V2 lane spawn failed at the very first call site (Pi CLI
not findable, worktree provisioning error, branch collision), the lane
was not transitioned to failed. The engine continued polling
indefinitely, the dashboard showed green/running lanes that had no
actual worker process, orch_status() reported executing, and no
supervisor alert fired. Recovery required the operator to manually
tail engine-worker stderr, which is not in any documented diagnostic place.
This bug masked the operator-side impact of #559 (orchestrator IPC
crash) and #560 (@earendil-works rename), making both look like
hangs rather than immediate spawn errors. Fix has four parts:
(1) State transition: the existing per-task try/catch in
executeLaneV2 now tags the failed LaneTaskOutcome with
exitDiagnostic.classification = "spawn_failure" (a new
ExitClassification value alongside process_crash, stall_timeout,
etc.) and writes a synthetic terminal RuntimeLaneSnapshot so
monitorLanes resolves the lane to terminal state instead of looping
forever on the never-written snapshot file (the actual root cause of
the silent hang). (2) No-retry policy: spawn_failure is
intentionally NOT in TIER0_RETRYABLE_CLASSIFICATIONS because
spawn-stage errors are never transient; a defense-in-depth early
return in attemptWorkerCrashRetry produces an operator-friendly log
line. (3) IPC alert: the task-failure supervisor alert payload
now carries context.exitCategory (and a "Spawn failure: … escalate
immediately" summary line when applicable) so the supervisor playbook
can branch on spawn-stage failures and escalate without retrying. The
same wiring is mirrored in resume.ts for /orch-resume parity.
(4) Phase transition: when every task in a wave fails with
classification === "spawn_failure", batchState.phase transitions
from "executing" to "failed" (not "paused", because the operator
cannot un-stick spawn failures without changing something external).
Validation: 33 new behavioral + helper tests in
extensions/tests/spawn-failure-visibility.test.ts; full fast suite
3620 pass / 1 skipped / 0 failed (+33 from baseline 3587);
cross-platform Node 24 CI.
Sage post-merge fold: two important correctness issues caught by
sage's review of the merged TP-190 work, both folded before public
release. (a) Residual hang on snapshot-write failure: the
spawn-failure catch's writeLaneSnapshot() is best-effort, but the original
comment claimed a 30-second staleness fallback would recover. That is not
true when snap == null (no file at all), because snap?.updatedAt
is undefined so staleMs == 0 and the 30-second check never fires.
Snapshot-write failure (disk full, permission, transient I/O) would
have left sessionAlive = true indefinitely, reintroducing the same
#561 hang. Fix: added a null-snapshot tracker-age fallback in
resolveTaskMonitorState: when snap == null AND the tracker has
observed the task for ≥ 60s (past startup grace), consult the
registry liveness check instead of defaulting to alive. (b)
Multi-segment edge in isAllLanesSpawnFailedWave: the
succeededTaskIds.length !== 0 gate relied on the terminal completion
projection, which is populated only when a multi-segment task reaches its
final segment. A wave with a multi-segment task succeeding on
segment 1 (with continuation scheduled) plus a single-segment task
spawn-failing would have falsely tripped phase=failed, burying real
progress. Fix: the helper now optionally accepts laneResults and
scans per-task outcomes for any status === "succeeded". 4 new
sage-fold tests in spawn-failure-visibility.test.ts cover both
edge cases. Final test count: 3624 passing (+4 over the 3620
worker-batch baseline).
Polyrepo end-to-end verified by operator in
C:/dev/tp-test-workspace. The bug class that previously left
lanes silently "running" forever now surfaces immediately as a
visible failure with a meaningful phase=failed and task-failure
IPC alert.
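The wave-level decision from part (4), including the multi-segment edge case from the sage fold, can be sketched as a pure predicate. This is a minimal sketch under assumptions: `TaskOutcomeSketch` and `isAllSpawnFailedWave` are hypothetical, and only the "spawn_failure" classification and the scan for any "succeeded" outcome come from the notes.

```typescript
// Hypothetical per-task outcome shape; "spawn_failure" and "succeeded" are
// the only values taken from the release notes.
interface TaskOutcomeSketch {
  taskId: string;
  status: string; // e.g. "succeeded" or "failed"
  classification?: string; // e.g. "spawn_failure" or "process_crash"
}

// True only when the wave has at least one outcome, none of them succeeded,
// and every failure is a spawn failure. Scanning per-task outcomes (rather
// than a terminal completion list) keeps a segment-level success visible
// even when the task as a whole has not reached its final segment.
function isAllSpawnFailedWave(outcomes: TaskOutcomeSketch[]): boolean {
  if (outcomes.length === 0) return false;
  if (outcomes.some((o) => o.status === "succeeded")) return false;
  return outcomes.every(
    (o) => o.status === "failed" && o.classification === "spawn_failure",
  );
}
```

When the predicate holds, the batch phase would move from "executing" to "failed"; in every other case the existing phase logic applies.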
v0.29.0
New
supervisor_takeover(reason) tool (TP-187, #538): Non-destructive
escape hatch for misbehaving batches. Pauses the running wave, drains
every per-agent on-disk outbox for the current batch, and marks all
active lanes as terminated so any in-transit zombie alerts are dropped
before they reach the supervisor's user-message queue. Worktrees,
branches, batch state, and sessions are all preserved, unlike with
orch_abort, which kills sessions and deletes state. Use this when
the batch is producing alert spam or has hit a death-spiral pattern
but you may still want to resume the same batch. After takeover, call
orch_status() to inspect, then either orch_resume(force=true) to
continue (alert suppression is lifted automatically on resume) or
orch_abort() to escalate to destructive shutdown. Documented in
templates/agents/supervisor.md alongside the existing orch_* tool
surface, plus a new section codifying the lane-runner's text-reply
parser semantics (close keywords skip / let it fail / close /
abort / stop are only treated as session-close directives when
they appear in a reply under 30 characters; longer messages are
always treated as instructional re-prompts).
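The text-reply parser rule described above (close keywords only count in replies under 30 characters) can be illustrated with a short sketch. The matching strategy is an assumption: `classifySupervisorReply` is a hypothetical name, and the real lane-runner parser may tokenize or match differently.

```typescript
// Keywords and the 30-character threshold are taken from the release notes.
const CLOSE_KEYWORDS = ["skip", "let it fail", "close", "abort", "stop"];
const MAX_DIRECTIVE_LENGTH = 30; // replies at or above this length are re-prompts

type ReplyKind = "close-session" | "re-prompt";

// Assumption: keywords are matched case-insensitively as substrings of the
// trimmed reply. Only short replies can close the session.
function classifySupervisorReply(reply: string): ReplyKind {
  const text = reply.trim().toLowerCase();
  const isShort = text.length < MAX_DIRECTIVE_LENGTH;
  const hasKeyword = CLOSE_KEYWORDS.some((keyword) => text.includes(keyword));
  return isShort && hasKeyword ? "close-session" : "re-prompt";
}
```

The length gate is what lets a supervisor write a long instruction that happens to contain "stop" or "skip" without accidentally ending the worker session.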
Fixed
-
Zombie supervisor alerts after lane termination (TP-187, #538):
Previously, when a worker lane was killed (no-progress threshold or
hard-fail), 3–5 "wants to exit" alerts that the worker emitted before
termination remained in the supervisor's user-message queue and the
agent's on-disk outbox, where they could be re-discovered later.
None of the documented operator responses (steer, skip, let it fail,
orch_abort, orch_skip_task) reliably drained either path.
Fix has three parts: (1) at every lane-termination decision point
(no-progress kill in lane-runner.ts, hard-fail in engine.ts), the
agent's outbox is now synchronously drained: pending *.msg.json
files are moved to outbox/processed/ and other pending files (e.g.,
segment-expansion-*.json) are renamed to .drained so they are
invisible to subsequent discovery scans; (2) the engine emits a new
lane-terminated IPC message to the supervisor process, which keys
a per-batch suppression filter (terminatedLanes /
terminatedAgents Maps) that drops any subsequent supervisor-alert
whose context.laneNumber or context.agentId matches before it
reaches pi.sendUserMessage; (3) the engine emits a complementary
lane-respawned IPC at the start of each executeLaneV2 invocation
so a fresh task on a re-allocated lane number lifts the suppression.
The filter is also cleared on orch_resume(), on a new batch start,
and on supervisor_takeover()-then-resume. Implementation: new
drainAgentOutbox helper in mailbox.ts, LaneTerminatedInfo /
LaneTerminatedCallback types in types.ts, callback threading
through engine.ts / execution.ts / resume.ts / engine-worker.ts,
and IPC + filter wiring in extension.ts.
-
orch_resume(force=true) cannot reattach after orch_abort()
(TP-187, #539): executeAbort() deletes .pi/batch-state.json to
enforce its destructive contract, but the runtime registry, per-agent
manifests, lane snapshots, worktrees, and branches all survive. With
no batch-state.json, loadBatchState() returned null and force-resume
returned the generic "no batch found" error, forcing operators into
~15 minutes of manual git surgery (fast-forward feature branches,
push, remove worktrees, edit STATUS, re-orch_start) just to do what
force-resume should have done. Fix adds a small batch-meta.json
runtime artifact written at batch-start to
.pi/runtime/<batchId>/batch-meta.json capturing the wave plan and
the few non-recoverable scalars (baseBranch, orchBranch, mode,
startedAt, totalWaves). On force-resume after abort, when
loadBatchState() returns null, the new
reconstructBatchStateFromRuntime() helper deterministically rebuilds
a validator-compliant PersistedBatchState from the surviving
artifacts: most-recent batch dir wins by mtime (lex tiebreak),
batch-meta.json provides wave topology and orchBranch, worker
manifests provide per-lane allocation, and the existing reconciliation
pass re-detects succeeded tasks via .DONE markers and STATUS.md.
When required artifacts are missing or validation fails, force-resume
fails loud with a new resumeNoStateAfterAbort message that names
the missing artifact and recommends orch_start <PROMPT.md> as the
recovery path. The non-force orch_resume() path is unchanged.
orch_abort itself remains semantically destructive; only
force-resume reads from the surviving runtime artifacts.
-
Worker said: is empty in early no-progress alerts (TP-187, #540):
When a worker exits an iteration without producing a visible assistant
message (a known failure mode in the death-spiral pattern), the
worker-exit-intercept alert sent to the supervisor showed
Worker said: "", leaving the supervisor with no signal about why
the worker is stuck on the iterations where intervention could still
help. By the time the field has content, the worker is already at
no-progress count 3 (kill threshold). Fix has two parts: (1)
templates/agents/task-worker.md now requires a one-sentence reason
before any silent exit-with-no-progress, with concrete examples; (2)
lane-runner.ts falls back to walking the worker's events.jsonl
backward to find the most recent non-empty assistant_message
payload when the current turn produced no visible output, and tags
the alert with which source (current-turn,
events-jsonl-fallback, or empty-sentinel) produced the
Worker said: field. The 500-character truncation invariant is
preserved.
-
taskplane doctor no longer shows empty parens for pi installed ()
(TP-189-C / TP-185 follow-up): pi prints its --version output to
stderr, but bin/taskplane.mjs's getVersion() only captured
stdout via execSync(... { stdio: 'pipe' }), so the doctor display was
✅ pi installed () with empty parens. The fix extracts getVersion
to bin/get-version.mjs (testable ESM helper) and switches it to
spawnSync with stdio: ['ignore', 'pipe', 'pipe']. The new logic
prefers stdout but falls back to stderr when stdout is empty, and
preserves the prior fail-safe contract (returns null on subprocess
failure or non-zero exit, which is critical so shell error text isn't
surfaced as a fake version string). Manual verification: taskplane doctor now
shows ✅ pi installed (0.73.0). 7 new behavioral tests in
extensions/tests/cli-doctor-version-capture.test.ts cover the
stdout-precedence, stderr-fallback, trim, and null-on-failure cases.
-
isStepMarkedComplete death-spiral guard now skips fenced code
blocks (TP-189-A3 / TP-186 follow-up): the helper that powers the
review_step REFUSED guard scanned STATUS.md line-by-line for the
literal **Status:** ✅ Complete pattern. If a step's body documented
that pattern inside a fenced code block (legitimate authoring of the
format itself), the guard would false-positive and refuse a legitimate
code review. The helper now uses CommonMark-aware fence tracking:
recognizes both ``` and ~~~ fences, tracks the opener char + length,
and only closes on a matching delimiter (same char, length ≥ opener
length, no trailing non-whitespace text). A mismatched delimiter inside an open fence (a mixed-delimiter example) therefore does not close it. Step-heading detection is gated on being outside a fence so a `### Step N:` line inside a code-block sample is treated as content rather than a step boundary. 6 new unit tests cover the edge cases.
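The fence-tracking rule above (same delimiter character, closing run at least as long as the opener, nothing but whitespace after it) can be sketched as follows. This is an illustration, not the actual isStepMarkedComplete implementation: `linesOutsideFences` and `hasCompleteMarkerOutsideFences` are hypothetical names, and step-heading scoping is left out for brevity.

```typescript
// A fence line: up to 3 spaces of indent, then a run of 3+ backticks or tildes.
const FENCE = /^ {0,3}(`{3,}|~{3,})(.*)$/;
const STATUS_COMPLETE = "**Status:** ✅ Complete";

// Returns only the lines that sit outside fenced code blocks.
function linesOutsideFences(markdown: string): string[] {
  const outside: string[] = [];
  let open: { char: string; length: number } | null = null;
  for (const line of markdown.split(/\r?\n/)) {
    const m = FENCE.exec(line);
    if (open === null) {
      // A backtick fence may not carry backticks in its info string.
      if (m && !(m[1][0] === "`" && m[2].includes("`"))) {
        open = { char: m[1][0], length: m[1].length };
      } else {
        outside.push(line);
      }
    } else if (m && m[1][0] === open.char && m[1].length >= open.length && m[2].trim() === "") {
      open = null; // matching closer: same char, long enough, no trailing text
    }
    // Any other line while a fence is open is code-block content and is skipped.
  }
  return outside;
}

function hasCompleteMarkerOutsideFences(markdown: string): boolean {
  return linesOutsideFences(markdown).some((line) => line.includes(STATUS_COMPLETE));
}
```

With this tracking, a STATUS.md that documents the completion pattern inside a fenced sample no longer triggers the guard, while a real completion line outside any fence still does.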
Docs
templates/agents/task-worker.md reconciled with TP-186's Order of
Operations rule (TP-189-E): two older sections were ambiguous when
read alongside the new review-gated step-completion contract from
TP-186. (1) Resume Algorithm step 6 ("all items checked → proceed to
next step") now splits behavior by Review Level: 0/1 may proceed,
but 2/3 must commit the implementation, call
review_step(type="code"), and only flip the per-step **Status:**
heading after APPROVE, with a cross-reference to the Order of
Operations section. (2) The Checkpoint Discipline / Git commits
example commit message changed from
feat(TASK-ID): complete Step N — description to
feat(TASK-ID): step N implementation, plus
explicit Level 0/1 vs Level 2/3 paragraphs and a separate
chore(TASK-ID): step N complete (code review APPROVE) example for
the post-APPROVE status-flip commit. Both edits reuse canonical
wording from the Order of Operations + Recovery Recipe sections so
the existing source-pattern tests in
extensions/tests/worker-step-completion-protocol.test.ts continue to
pass; a new test 1.4b regression-guards the Resume Algorithm wording.
skills/create-taskplane-task/SKILL.md Complexity Assessment
augmented with Per-Step Reviews vs. Consolidated Reviews
(Checkpoint Markers) sub-section (TP-189-E): the existing rubric
documents Review Levels 0–3 but not the second axis — how many
reviews fire for a given level. PROMPT authors had been discovering
this empirically (e.g., TP-186 fired only 2 reviews via checkpoint
markers vs the default ~8 it would have fired without them). The new
sub-section makes the choice explicit: per-step is the default and
right for independent multi-feature work; consolidation via
**Plan-review checkpoint** / **Code review checkpoint** markers
is appropriate for single-deliverable tasks where the steps are
mechanical applications of one design. TP-186 is referenced as the
canonical consolidation example.
Internal
DEFAULT_WORKER_USER_TOOLS migrated to a shared lightweight
constants module (TP-189-B / TP-184 follow-up): the literal
`"rea...
v0.28.8
Enhanced
- Dashboard: task title row widened to span cols 3–6 (#485 follow-up):
The task title subtitle introduced in v0.28.7 was constrained to the
100px-wide task-id column, which truncated most realistic titles
('Reviewer runs typec...') after just a few words. Restructured the
task-row grid to two rows: row 1 holds the primary cells (icon, actions,
task-id, status, duration, progress, step+telemetry), row 2 holds the
optional task-title-subtitle spanning cols 3–6 (~486px combined width
vs. the previous 100px). Stops before col 7 (task-step + telemetry) so
step info and worker stats stay visible alongside the title. Auto row 2
collapses to 0 height when no subtitle exists, so tasks with null
taskTitle look identical to the v0.28.7 single-line layout. Display-only
change — cannot affect orchestrator correctness.
Fixed
- Code reviewer now runs project quality checks (typecheck/lint/format)
before deciding (TP-188, #541): Previously, the reviewer agent spawned
via review_step(type="code") evaluated changes through behavioural
inspection only. It did NOT run npm run typecheck / npm run lint /
npm run format:check, so code with TypeScript strict-mode errors or
lint failures could receive APPROVE; those issues then surfaced at the
worker's Testing & Verification step, blocking the entire batch. In one
observed production batch, a code review returned APPROVE for a step
that subsequently failed npm run typecheck with 5 strict-mode errors
in the test code the reviewer had just signed off on. Cost of catching
these earlier: one extra typecheck per code review. Cost of NOT
catching them: the entire investment in the affected step plus all
dependents. Fix is a prompt-only change to
templates/agents/task-reviewer.md: a new Quality-check verification
section (between How You Work and Verdict Criteria) instructs the
reviewer to (1) discover commands by reading
.pi/taskplane-config.json taskRunner.testing.commands first, then
fall back to package.json scripts for typecheck / lint /
format:check; (2) run any matching commands using its existing bash
tool (no allowlist change required, since bash is already in the default
reviewer tool list); (3) surface failures as Issues Found with
severity important; (4) downgrade an otherwise-APPROVE verdict to
REVISE when any quality check fails. Plan reviews skip the section
entirely (no code exists yet to typecheck). Skip-silently rule: if
neither config nor package.json yields a relevant command, the
reviewer notes the skip in the Summary and proceeds normally rather
than blocking on absent infrastructure. 10 new source-pattern tests in
extensions/tests/reviewer-quality-checks.test.ts lock the section
shape, the hybrid discovery wording, and the verdict-downgrade rule.
- Windows worktree cleanup falls back to
cmd rd /s /qwhen git hits
MAX_PATH (TP-188, #543): On Windows with default
core.longpaths = false,git worktree remove --forcefails with
error: failed to delete '<path>': Filename too longwhen the worktree
contains a deepnode_modulestree (most non-trivial Node projects).
Previously the orchestrator surfaced cleanup-incomplete via the
post-integration banner but didn't recover, so the operator had to run
`cmd /c "rd /s /q <path>"` manually. Observed twice during a single
recovery flow on the user's Windows machine working with
emailgistics-astro (700+ npm deps). The fix adds two new exported helpers
in `extensions/taskplane/worktree.ts`: `isWindowsMaxPathError(stderr)`
(returns true only on win32 + `/filename too long/i`) and
`runWindowsCmdRd(absolutePath)` (invokes
`execFileSync("cmd", ["/c", "rd", "/s", "/q", winPath])` with forward
slashes normalized to backslashes for native Windows path semantics).
The fallback fires inside `removeWorktree`'s retry loop when the
predicate matches,
prunes git's bookkeeping on success so post-removal verification
passes, and falls through to the existing terminal/retry classification
on failure (with both git's stderr and cmd's stderr enriched into the
thrown error so operators can diagnose). Other error classes (lock
errors, permission denied, generic git errors) are unaffected. INFO-level
logs via `execLog("cleanup", "worktree", ...)` make the rescue path
visible in operator-facing output. 17 new tests in
`extensions/tests/windows-worktree-cleanup-fallback.test.ts` cover the
source-pattern wiring (helpers exist; `removeWorktree` calls them;
`git worktree prune` runs on fallback success; failure path enriches
the error), platform guard (returns false on linux/macOS), regex
case-insensitivity, and `runWindowsCmdRd`'s mocked invocation. Tests
are platform-agnostic via `child_process` mocking so the suite passes
on every CI runner.
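
The Windows fallback described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `worktree.ts`: the `platform` parameter, the `toWindowsPath` helper, and the `removeWithFallback` wrapper (with its injected `gitRemove` / `gitPrune` callbacks) are hypothetical names added so the flow can be exercised on any OS.

```typescript
import { execFileSync } from "node:child_process";

// Assumption: the `platform` parameter exists only for testability here;
// the release note describes a single-argument isWindowsMaxPathError(stderr).
export function isWindowsMaxPathError(
  stderr: string,
  platform: string = process.platform,
): boolean {
  return platform === "win32" && /filename too long/i.test(stderr);
}

// Hypothetical helper: normalize forward slashes to backslashes.
export function toWindowsPath(absolutePath: string): string {
  return absolutePath.replace(/\//g, "\\");
}

// Runs `cmd /c rd /s /q <path>`; execFileSync throws on a non-zero exit.
export function runWindowsCmdRd(absolutePath: string): void {
  execFileSync("cmd", ["/c", "rd", "/s", "/q", toWindowsPath(absolutePath)], {
    stdio: "pipe",
  });
}

// Hypothetical wrapper showing where the fallback sits. `gitRemove` and
// `gitPrune` stand in for the real git invocations.
export function removeWithFallback(
  path: string,
  gitRemove: (p: string) => void,
  gitPrune: () => void,
  rd: (p: string) => void = runWindowsCmdRd,
  platform: string = process.platform,
): "git" | "cmd-fallback" {
  try {
    gitRemove(path);
    return "git";
  } catch (err) {
    const gitStderr = err instanceof Error ? err.message : String(err);
    // Other error classes (locks, permissions, ...) are left untouched.
    if (!isWindowsMaxPathError(gitStderr, platform)) throw err;
    try {
      rd(path);
    } catch (rdErr) {
      const cmdStderr = rdErr instanceof Error ? rdErr.message : String(rdErr);
      // Surface both stderr streams so the operator can diagnose.
      throw new Error(`git: ${gitStderr}\ncmd: ${cmdStderr}`);
    }
    gitPrune(); // clear git's worktree bookkeeping once the directory is gone
    return "cmd-fallback";
  }
}
```

Injecting `rd` and `platform` keeps the sketch testable without spawning `cmd`, which mirrors the platform-agnostic intent of the mocked tests mentioned above.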
Internal
- CI workflow upgraded to Node 24 LTS: `.github/workflows/ci.yml` was
on Node 22; `release.yml` had moved to Node 24 LTS during the v0.28.5
release work but `ci.yml` was not aligned. Two motivations converged: the
Node 22 / Node 24 `mock.module()` semantics divergence caused TP-188's
`runWindowsCmdRd` unit tests to fail on Node 22 CI while passing locally
on Node 24 (Node 24 aliases bare `child_process` and `node:child_process`;
Node 22 treats them as separate modules). Bumping ci.yml to Node 24 fixes
the test mock portability AND completes TP-189's Cluster D ahead of
schedule.
v0.28.7
Enhanced
- Dashboard: lane parallelization visible in wave indicator chips (#484):
Wave chips at the top of the dashboard now group tasks by lane within each
wave, joining same-lane tasks with `→` (serial) and different-lane tasks
with `|` (parallel). For example, `W1 [TP-165, TP-166, TP-168, TP-167]`
now reads `W1 [TP-165 → TP-166 | TP-168 | TP-167]`, immediately revealing
that TP-165→TP-166 are serialized on lane 1 while TP-168 and TP-167 run in
parallel on lanes 2 and 3. Within each lane, tasks render in execution
order (per `lane.taskIds`). Hover tooltip on the chip exposes the
expanded multi-line lane breakdown. Future waves with no lane assignment
data fall back to the previous flat comma-separated display, so there is
no regression for unprovisioned waves.
- Dashboard: task title under task ID in lane view (#485): The lane
view now renders the human-readable task title (extracted from PROMPT.md's
`# Task: <ID> - <title>` first-line heading) as a smaller muted subtitle
beneath the task ID. Operators no longer need to remember what each
TP-XXX is. The title is read once from PROMPT.md and cached for the
server's lifetime (PROMPT.md is immutable above the `---` divider).
Surfaced via a new `taskTitle` field on `/api/state` task records;
frontend falls back gracefully when the field is null.
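
As a rough illustration of the wave-chip grouping from the first v0.28.7 item (#484), the sketch below joins same-lane tasks with `→` and different lanes with `|`, and falls back to the flat display when no lane data exists. The `Lane` interface and `formatWaveChip` function are hypothetical; the release note only confirms that lanes carry `taskIds` in execution order.

```typescript
// Hypothetical shape: only `taskIds` (in execution order) is confirmed
// by the release note.
interface Lane {
  taskIds: string[];
}

export function formatWaveChip(
  waveLabel: string,
  waveTaskIds: string[],
  lanes: Lane[] | undefined,
): string {
  const flat = `${waveLabel} [${waveTaskIds.join(", ")}]`;
  // Future waves without lane assignments keep the flat display.
  if (!lanes || lanes.length === 0) return flat;

  const inWave = new Set(waveTaskIds);
  const groups = lanes
    // Keep only this wave's tasks, preserving each lane's execution order.
    .map((lane) => lane.taskIds.filter((id) => inWave.has(id)))
    .filter((ids) => ids.length > 0)
    // Same-lane tasks run serially.
    .map((ids) => ids.join(" → "));

  if (groups.length === 0) return flat;
  // Different lanes run in parallel.
  return `${waveLabel} [${groups.join(" | ")}]`;
}
```

Called with the example from the note (lane 1 holding TP-165 and TP-166, lanes 2 and 3 holding TP-168 and TP-167), this yields `W1 [TP-165 → TP-166 | TP-168 | TP-167]`.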
v0.28.6
Fixed
- Worker death-spiral when code review returns REVISE on a step already
marked Complete in STATUS (TP-186, #537, #542): Previously, if a worker
set a step's `**Status:** ✅ Complete` heading in STATUS.md before calling
`review_step(type="code")`, and the reviewer returned `REVISE`, the worker
was caught in a state contradiction (STATUS says done, reviewer says not)
with no recovery recipe in the prompt. The worker would loop through 3
no-progress iterations and the orch's safety mechanism would kill the
lane — the entire batch was a write-off, requiring ~15 min of manual git
surgery per occurrence. The fix is structural: (1) the base worker prompt
(templates/agents/task-worker.md) now contains an explicit Order of
Operations rule that mandates code review BEFORE marking a step
Complete, a Recovery Recipe for the case when the rule is
accidentally violated (revert STATUS → commit → handle REVISE through
the normal flow), and a Forbidden callout naming the death-spiral
anti-pattern alongside the existing "NEVER add, remove, or renumber
steps" family of MUST-NOT rules; (2) the engine-side `review_step` tool
now refuses to run on a step already marked `**Status:** ✅ Complete`,
returning a `REFUSED` verdict that points the worker at the Recovery
Recipe (the refusal applies to `code` and `test` review types only; plan
reviews fire pre-implementation and are correctly exempt). Until this
fix shipped, Review Level ≥ 2 was effectively unsafe in production. 14
new tests in `worker-step-completion-protocol.test.ts`. Supersedes the
partial diagnosis in #510. Thanks to the production batch
`20260506T105850` against `emailgistics-astro` for surfacing the
reproducer.
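
The engine-side refusal can be pictured with a minimal sketch like the one below. It assumes a STATUS.md layout in which each step has a `Step N` markdown heading followed by a `**Status:**` line; that layout, the `guardReviewStep` and `isStepMarkedComplete` names, and the `PROCEED` result are all hypothetical and chosen for illustration only.

```typescript
type ReviewType = "plan" | "code" | "test";

type GuardResult =
  | { verdict: "REFUSED"; reason: string }
  | { verdict: "PROCEED" }; // hypothetical marker for "run the review"

// Assumption: each step is introduced by a "Step N" markdown heading and
// followed by a "**Status:** ..." line. The real STATUS.md may differ.
export function isStepMarkedComplete(statusMd: string, stepNumber: number): boolean {
  const headingRe = /^#{1,6}\s+Step\s+(\d+)\b/;
  let inStep = false;
  for (const line of statusMd.split(/\r?\n/)) {
    const match = headingRe.exec(line);
    if (match) {
      inStep = Number(match[1]) === stepNumber;
      continue;
    }
    if (inStep && line.includes("**Status:**")) {
      return line.includes("✅ Complete");
    }
  }
  return false;
}

export function guardReviewStep(
  statusMd: string,
  stepNumber: number,
  type: ReviewType,
): GuardResult {
  // Plan reviews fire before implementation, so they are exempt.
  if (type === "plan") return { verdict: "PROCEED" };
  if (isStepMarkedComplete(statusMd, stepNumber)) {
    return {
      verdict: "REFUSED",
      reason:
        `Step ${stepNumber} is already marked Complete; ` +
        "revert STATUS, commit, then request the review again.",
    };
  }
  return { verdict: "PROCEED" };
}
```

Refusing up front turns the STATUS-versus-reviewer contradiction into an explicit signal the worker can act on, instead of a loop of no-progress iterations.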
v0.28.5
Fixed
- Pi no longer hard-blocks startup with a red error when run in directories
that aren't configured for Taskplane (TP-183, #523): Previously, launching
pi in any non-git directory (or any directory without
`.pi/taskplane-workspace.yaml` / `taskplane-config.json`) raised a verbose
red `WORKSPACE_SETUP_REQUIRED` notification at session_start. For users who
only want Taskplane in some projects, this was the wrong UX. The
orchestrator now soft-fails the `WORKSPACE_SETUP_REQUIRED` case
specifically: no error notification, status line shows the quiet
`🔀 Orchestrator · disabled (no taskplane config in workspace)` indicator,
orchestrator commands stay gracefully disabled (and still explain why if
invoked, via the existing `requireExecCtx` guard). Configuration errors in
workspaces that ARE set up (`WORKSPACE_FILE_PARSE_ERROR`,
`WORKSPACE_SCHEMA_INVALID`, `WORKSPACE_REPO_PATH_NOT_FOUND`, and every
other `WorkspaceConfigErrorCode`) still surface loudly with the existing
red notify and `❌ startup failed (workspace config error)` status line, so
real misconfigurations remain visible. Throw behavior of
`buildExecutionContext` is unchanged; only the display in `extension.ts`
changes. 6 new tests in `orchestrator-startup-uxv2.test.ts` (3 scenarios,
6 fine-grained checks). Thanks to @mwickens for the report.
- Workers can now invoke `review_step`, `notify_supervisor`,
`escalate_to_supervisor`, and `request_segment_expansion` (TP-184, #530):
Previously these engine-internal coordination tools were missing from
the worker's hardcoded `--tools` allowlist, so pi's tool gate filtered
them out at the
worker. The visible symptom: plan/code/test reviews silently never fired
at Review Level >= 1, supervisor steering replies were impossible, and
multi-repo segment-expansion requests were unreachable. The bridge tools
are now always appended to the worker allowlist regardless of
`taskRunner.worker.tools` config; the user-tools default is unchanged.
Introduces three new exports in `agent-host.ts`: `ENGINE_BRIDGE_TOOLS`
(canonical list of engine-internal tools), `DEFAULT_WORKER_USER_TOOLS`
(the user-tools default literal), and `buildWorkerToolsAllowlist()`
(combines user portion with bridge tools, deduplicated). Called exactly
once at the lane-runner spawn site. Defense-in-depth: lane-runner now
warns (via `logExecution`) if any bridge tool is missing from the final
allowlist. 14 new tests in `worker-tools-allowlist.test.ts`.
- Preflight `pi` check no longer misreports cold-start timeouts as "Pi not
found" (TP-185): `execCheck` now classifies failures by mode (`not-found`,
`timeout`, `exit-code`, `signal`, `unknown`) instead of treating every
failure as missing-binary. The `pi` preflight now uses a 30s timeout (up
from 10s) and retries once on timeout to absorb cold-start variance — mise
shim resolution, Node bootstrap, AV process-launch scanning, and pi's own
startup can together exceed 10s on a fresh first run, especially on Windows.
Failure messages and hints are now tailored to the actual error kind
(e.g. timeouts say "Pi did not respond within 30s" + diagnostic guidance,
rather than the misleading "Install pi" hint). Detects missing binaries on
both POSIX (ENOENT/exit 127) and Windows (cmd.exe `"is not recognized"`)
shells. 9 new tests in `exec-check-error-classification.test.ts` covering
every classification path including regression guards against the original
bug. Backward compatible: existing callers reading `{ ok, stdout }` are
unaffected.
- Worker model/thinking/tools from preferences now flow through to spawned
workers (TP-181, #522): `taskRunner.worker.{model,thinking,tools}` in
`preferences.json` (and project config) are now threaded from
`TaskRunnerConfig` through `executeWave` → `executeLaneV2` to the worker
subprocess via `TASKPLANE_WORKER_{MODEL,THINKING,TOOLS}` env vars. Previously
`LaneRunnerConfig.workerModel` was hardcoded to `""` and the user-configured
worker model was silently ignored. Mirrors the existing reviewer pipeline
established in TP-160. New `buildWorkerEnv()` helper, plumbed through
`engine.ts`, `execution.ts`, and `resume.ts`. 11 new tests in
worker-model.test.ts. Thanks to @NerfEko.
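
The two worker-spawn changes in this release (TP-184 and TP-181) can be sketched together as below. Everything here is an approximation: the signatures, the comma-separated tools format, the `WorkerPrefs` shape, and the `DEFAULT_WORKER_USER_TOOLS` value are assumptions, and the bridge list contains only the four tools named in the note.

```typescript
// The four tools named in the TP-184 note; the real canonical list may differ.
export const ENGINE_BRIDGE_TOOLS: readonly string[] = [
  "review_step",
  "notify_supervisor",
  "escalate_to_supervisor",
  "request_segment_expansion",
];

// Placeholder value: the actual default literal is not given in the note.
export const DEFAULT_WORKER_USER_TOOLS = "read,write,edit,bash";

// Assumption: tools are passed as a comma-separated string.
export function buildWorkerToolsAllowlist(
  userTools: string = DEFAULT_WORKER_USER_TOOLS,
): string {
  const user = userTools
    .split(",")
    .map((tool) => tool.trim())
    .filter((tool) => tool.length > 0);
  // Bridge tools are always appended; the Set removes duplicates while
  // preserving first-seen order.
  return Array.from(new Set([...user, ...ENGINE_BRIDGE_TOOLS])).join(",");
}

// Hypothetical shape of the taskRunner.worker preferences.
export interface WorkerPrefs {
  model?: string;
  thinking?: string;
  tools?: string;
}

// Maps configured preferences onto the TASKPLANE_WORKER_* env vars,
// skipping unset or empty values so defaults still apply downstream.
export function buildWorkerEnv(worker: WorkerPrefs): Record<string, string> {
  const env: Record<string, string> = {};
  if (worker.model) env["TASKPLANE_WORKER_MODEL"] = worker.model;
  if (worker.thinking) env["TASKPLANE_WORKER_THINKING"] = worker.thinking;
  if (worker.tools) env["TASKPLANE_WORKER_TOOLS"] = worker.tools;
  return env;
}
```

For instance, `buildWorkerToolsAllowlist("bash,review_step")` keeps the user's `bash`, keeps the single `review_step`, and appends the remaining three bridge tools.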