Releases: HenryLach/taskplane

v0.30.1

13 May 02:32

Enhanced

  • Dashboard segment-level progress indicators (TP-197, #464): Multi-segment
    task rows now show a horizontal pill row of per-segment status badges —
    one pill per segment with a status icon (✅ succeeded · ⏳ running · ⬚
    pending · ❌ failed · ⏸ stalled · ↷ skipped) plus the segment’s repo
    ID. The currently-executing segment is visually emphasized. This closes
    the operator-visibility gap introduced by TP-145’s .DONE suppression for
    non-final segments: previously, multi-segment lanes sat “running” with
    no segment-level signal during the suppression window, which made wave 2+
    batches where all tasks were mid-segment appear stuck. With the pill row
    in place, operators can see at a glance which segments have finished,
    which is running, and which remain. The progress bar itself is unchanged
    — TP-174 already made it segment-scoped via the V2 lane snapshot’s
    per-segment counts; the new pill row provides the missing context that
    makes the existing bar legible as “current segment’s progress.”

    Backwards-compatibility: single-segment tasks render an empty pill row
    (auto-collapsed grid sub-row), so the DOM and visual layout for
    non-segmented batches are identical to before. The pill row lives in a
    new grid row 3 of .task-row (cols 3–7), mirroring the
    task-title-subtitle pattern from TP-485, and is intentionally placed
    outside the .task-step cell so the existing @media (max-width: 900px)
    rule that hides .task-step does not hide segment context on narrow
    viewports. No dashboard/server.cjs change was required — the existing
    API response already exposed batch.segments[], task.segmentIds, and
    runtimeLaneSnapshots[*].segmentId.
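
As a rough illustration of the pill row described above, the sketch below maps each segment to a label made of its status icon and repo ID. The icons come from the note; the type names, the `renderPillRow` function, and the bracket-style emphasis are hypothetical stand-ins for the dashboard's real markup.

```typescript
// Status values and icons are taken from the release note; everything else
// (type names, function name, bracket emphasis) is an assumption.
type SegmentStatus = "succeeded" | "running" | "pending" | "failed" | "stalled" | "skipped";

const STATUS_ICON: Record<SegmentStatus, string> = {
  succeeded: "✅",
  running: "⏳",
  pending: "⬚",
  failed: "❌",
  stalled: "⏸",
  skipped: "↷",
};

interface SegmentPill {
  repoId: string;
  status: SegmentStatus;
}

// Returns one label per segment, or an empty list for single-segment tasks
// so the pill row can collapse.
function renderPillRow(segments: SegmentPill[], activeRepoId?: string): string[] {
  if (segments.length <= 1) return [];
  return segments.map((s) => {
    const label = `${STATUS_ICON[s.status]} ${s.repoId}`;
    // The dashboard emphasizes the active segment visually; brackets stand in here.
    return s.repoId === activeRepoId ? `[${label}]` : label;
  });
}
```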

Fixed

  • Multi-segment engine hardening (TP-196, #462 + #502 + #503 + #508):
    closes four follow-up issues from the multi-repo task execution rollout
    with a single coherent hardening pass against the multi-segment engine.

    • .DONE authority guards (#462) — three defense-in-depth checks now
      refuse to honor a stale or premature .DONE in multi-segment tasks:
      (a) resolveTaskMonitorState (execution.ts) accepts an optional
      multiSegmentContext: { isFinalSegment, segmentId } parameter; when
      isFinalSegment === false and .DONE is present, Priority 1 is
      skipped and a WARN is logged via execLog; monitorLanes populates
      this context from task.segmentIds + task.activeSegmentId. (b)
      collectDoneTaskIdsForResume (resume.ts) now refuses to add a
      taskId to the done set when persisted segment records exist AND any
      segment is not succeeded/skipped — the task re-reconciles instead
      of silently being marked complete. (c) A new exported
      checkDoneAuthoritySafeguard helper (discovery.ts) emits a
      doctor-style console.warn when .DONE coexists with unchecked
      STATUS.md checkboxes during area scans. The pre-existing TP-135
      "keeps .DONE authoritative even when segment frontier is incomplete"
      test was updated to assert the inverted (post-#462) contract.
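
Guards (a) and (b) above can be pictured as two small predicates. This is a minimal sketch under stated assumptions: the `multiSegmentContext` shape follows the note, while `shouldHonorDone`, `isDoneForResume`, and the `SegmentState` union are hypothetical names, not the functions in execution.ts or resume.ts.

```typescript
// Hypothetical state union; the note only names succeeded/skipped as terminal-ok.
type SegmentState = "pending" | "running" | "succeeded" | "failed" | "stalled" | "skipped";

// Shape taken from the note: { isFinalSegment, segmentId }.
interface MultiSegmentContext {
  isFinalSegment: boolean;
  segmentId: string;
}

// Guard (a): a .DONE marker only counts when the active segment is the final one.
function shouldHonorDone(doneExists: boolean, ctx?: MultiSegmentContext): boolean {
  if (!doneExists) return false;
  if (ctx !== undefined && ctx.isFinalSegment === false) return false; // premature .DONE
  return true;
}

// Guard (b): on resume, a task with persisted segment records is only treated
// as done when every segment is succeeded or skipped.
function isDoneForResume(doneExists: boolean, segmentStates: SegmentState[]): boolean {
  if (!doneExists) return false;
  if (segmentStates.length === 0) return true; // no segment records, .DONE stands
  return segmentStates.every((s) => s === "succeeded" || s === "skipped");
}
```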

    • SegmentScopeMode unification (#502 + #503) — promotes the
      FULL_TASK / SEGMENT_SCOPED decision to a first-class
      SegmentScopeMode = "FULL_TASK" | "SEGMENT_SCOPED" type in types.ts
      plus a computeSegmentScopeMode(stepSegmentMap, repoStepNumbers, currentRepoId, currentStepNumber) helper in lane-runner.ts. The
      iteration loop now derives both the authoritative segmentScopeMode
      and the legacy isSegmentScoped boolean alias from one call, and
      the segment-prompt injection block is gated on isSegmentScoped
      instead of the previous scattered stepSegmentMap && currentRepoId && repoStepNumbers && remainingSteps.length > 0 composite. New
      behavioural regression suite
      (extensions/tests/segment-scope-mode-prompt.test.ts, 9 tests
      across 4 describe blocks) mocks spawnAgent to capture the worker
      prompt + env + system prompt and verifies the FULL_TASK,
      SEGMENT_SCOPED, polyrepo single-segment, and legacy/partial-marker
      contracts end-to-end.

    • Wasted-iteration elimination (#508) — lane-runner now performs
      an explicit pre-spawn segment-completion check between the existing
      remainingSteps.length === 0 guard and the totalIterations++
      increment, delegating to a new pure helper
      shouldSkipSpawnForCompleteSegment(statusContent, repoStepNumbers, currentRepoId). When every segment-scoped step for the active repo
      is already complete, the loop logs "Pre-spawn segment-completion check" and breaks before incurring a worker spawn. Behavioural
      test (extensions/tests/early-exit-segment-spawn-skip.test.ts)
      mocks agent-host.spawnAgent via mock.module and asserts
      spawnAgentCallCount === 0 for a fixture worktree whose checkboxes
      are pre-checked.
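
One way to picture the pre-spawn check is the sketch below. It assumes STATUS.md uses `### Step N:` headings with `- [ ]` / `- [x]` checkboxes and that `repoStepNumbers` maps a repo ID to its step numbers; both are assumptions, and `shouldSkipSpawnSketch` is a hypothetical stand-in for the real helper.

```typescript
// Assumed shapes: repoStepNumbers maps repo ID -> step numbers owned by that repo.
function shouldSkipSpawnSketch(
  statusContent: string,
  repoStepNumbers: Record<string, number[]>,
  currentRepoId: string,
): boolean {
  const steps = repoStepNumbers[currentRepoId] ?? [];
  if (steps.length === 0) return false; // nothing scoped to this repo, do not skip

  const wanted = new Set(steps);
  let currentStep: number | null = null;
  let sawCheckbox = false;

  for (const line of statusContent.split("\n")) {
    const stepHeading = /^###\s+Step\s+(\d+)/.exec(line);
    if (stepHeading) {
      currentStep = Number(stepHeading[1]);
      continue;
    }
    if (/^#{1,6}\s/.test(line)) {
      currentStep = null; // any other heading ends the current step
      continue;
    }
    if (currentStep === null || !wanted.has(currentStep)) continue;
    const box = /^\s*[-*]\s+\[( |x|X)\]/.exec(line);
    if (!box) continue;
    sawCheckbox = true;
    if (box[1] === " ") return false; // an unchecked item remains
  }
  // Skip the spawn only when at least one checkbox was seen and none were open.
  return sawCheckbox;
}
```

A caller would run this before incrementing its iteration counter and break out of the loop when it returns true.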

    • Validation: typecheck / lint / format:check all exit 0. Fast
      test suite passes at 3678 / 0 fail / 1 skip — net +51 new tests
      spread across 3 new test files plus targeted updates to
      segment-scoped-lane-runner.test.ts, resume-segment-frontier.test.ts,
      and engine-runtime-v2-routing.test.ts (slice-window widening for
      the longer resolveTaskMonitorState body).

v0.30.0

10 May 23:10

Fixed

  • Preflight cleanup feature now actually runs (TP-195): runOrchBatch
    in extensions/taskplane/engine.ts referenced sweepStaleArtifacts,
    formatPreflightSweep, rotateSupervisorLogs, and formatLogRotation
    inside the preflight-cleanup try-block, but those identifiers were
    never imported from ./cleanup.ts. At runtime the first reference
    threw a ReferenceError that the enclosing catch-all swallowed, so
    Layers 2–5 of preflight cleanup (age-based artifact sweep, supervisor
    log rotation, telemetry size cap, prior-batch artifact cleanup) had
    been silently a no-op since TP-065 / #221 (2024-09). The missing
    imports were uncovered by the TP-191 typecheck script; this fix adds
    them so the advertised cleanup runs on every batch. Regression test:
    tests/lane-runner-v2.test.ts 3.10 asserts the four helpers are
    imported from ./cleanup.ts.
  • max_worker_minutes config field is honored (TP-195): Lane-runner
    config in executeLaneV2 (extensions/taskplane/execution.ts) was
    reading a non-existent config.failure?.maxWorkerMinutes camelCase
    alias — always undefined — silently ignoring any operator-set value
    on OrchestratorConfig.failure.max_worker_minutes and always falling
    through to the hard-coded 120-minute default. Fixed to read the
    canonical snake_case field. Operators with max_worker_minutes
    configured in .pi/taskplane-config.json will now have their
    configured limit honored; default of 120 preserved when the field is
    unset. Regression test: tests/lane-runner-v2.test.ts 3.9 asserts
    the corrected accessor and absence of the typo.
  • Resume’s failed-task supervisor-alert path no longer crashes
    (TP-195): When /orch-resume encountered a failed task during a
    wave, the supervisor-alert emission block in resume.ts called
    batchState.tasks.find(…), but OrchBatchRuntimeState has no
    tasks field (only PersistedBatchState does). The runtime call
    would throw TypeError: Cannot read properties of undefined
    (reading 'find'). The failed-task path was never covered by
    tests, so the crash never surfaced. Replaced with a lookup against
    laneForTask?.tasks.find(…)?.task; the lane-allocated ParsedTask
    payload carries the same segmentIds/activeSegmentId data the alert
    needs. Regression test: tests/resume-bug-fixes.test.ts 4.1.
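
The max_worker_minutes fix above comes down to reading the snake_case field and falling back to 120 when it is unset. A minimal sketch, with hypothetical interface and function names (the real types live in the taskplane sources):

```typescript
// Hypothetical, trimmed-down config shapes for illustration only.
interface FailureConfigSketch {
  max_worker_minutes?: number;
}
interface OrchestratorConfigSketch {
  failure?: FailureConfigSketch;
}

const DEFAULT_MAX_WORKER_MINUTES = 120; // default stated in the note

function resolveMaxWorkerMinutes(config: OrchestratorConfigSketch): number {
  // Read the canonical snake_case field. A camelCase alias does not exist on
  // the config object, so reading it would always yield undefined.
  return config.failure?.max_worker_minutes ?? DEFAULT_MAX_WORKER_MINUTES;
}
```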

Internal

  • Code-quality gates active (TP-194)

    The final task packet implementing the code-quality-gates spec
    (docs/specifications/taskplane/code-quality-gates.md,
    section 6.4). Flips three static-analysis checks from advisory to
    required CI gates: Typecheck (new — tsc --noEmit against
    extensions/tsconfig.ci.json), Lint (Biome) (was already wired
    but ran with continue-on-error: true until now), and
    Format check (Biome) (new — biome format --no-errors-on-unmatched .).
    .github/workflows/ci.yml runs the three steps in order before the
    existing Run tests step inside the single ci job, so any failure
    short-circuits the rest of the pipeline. The existing required ci
    branch-protection context already covers the new gates because a
    step failure fails the whole job.

    Reviewer-agent activation: the TP-188 quality-check verification
    section in templates/agents/task-reviewer.md is now fully active.
    The temporary activation note added in TP-191 (which previously
    surfaced quality-check failures as Issues Found without downgrading
    the verdict) is removed; failing typecheck/lint/format:check now
    unconditionally downgrades APPROVE → REVISE during code review.
    Documentation updates: AGENTS.md adds the three commands to the
    validation checklist; docs/maintainers/release-process.md adds
    them to the pre-release checks and pre-release checklist;
    docs/maintainers/development-setup.md gets a new
    "Code-quality gates (required for every PR)" section. The
    long-missing lint:fix npm script (referenced by these docs) is
    added to package.json.
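
The downgrade rule itself is enforced through the reviewer prompt rather than code, but its logic can be modeled in a few lines. The types and the `applyQualityGate` name are hypothetical, and only the two verdicts named in the note are modeled.

```typescript
// Only the two verdicts named in the note are modeled here.
type Verdict = "APPROVE" | "REVISE";

interface QualityCheckResult {
  command: string; // e.g. "npm run typecheck"
  exitCode: number;
}

// Any failing quality check turns an otherwise-APPROVE verdict into REVISE.
function applyQualityGate(verdict: Verdict, checks: QualityCheckResult[]): Verdict {
  const anyFailed = checks.some((c) => c.exitCode !== 0);
  return verdict === "APPROVE" && anyFailed ? "REVISE" : verdict;
}
```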

    Operator handoff (verification-only): no branch-protection
    changes are required. After this PR merges, verify via
    gh api repos/HenryLach/taskplane/branches/main/protection
    that required_status_checks.contexts still contains ci (it
    does today). If at some future point per-gate visibility in
    branch protection is desirable, the follow-up is to split the
    gates into separate jobs in ci.yml — out of scope for TP-194
    per the spec's Tier-1.5 follow-up list.

  • Code-quality typecheck cleanup (TP-195): Fourth of four sequenced
    packets implementing the code-quality-gates spec
    (docs/specifications/taskplane/code-quality-gates.md).
    Cleaned up the 264 typecheck errors that TP-191 surfaced when it
    first made npm run typecheck runnable, so TP-194’s gate flip can
    promote typecheck from advisory to a CI gate. Final state:
    npm run typecheck exits 0 against extensions/tsconfig.ci.json at
    the current strictness (strict: false, noImplicitAny: false).
    Per-category breakdown of fixes (top categories at task start):
    TS2339 (63) — property-not-exist; TS2741 (52) — mock-object missing
    required fields; TS2345 (30) — caller-shape mismatch; TS2554 (23) —
    signature drift; TS2367 (21) — unintentional comparison; TS2322 (19)
    — assignment mismatch; TS2739 (12) — type missing properties; plus
    smaller TS2769/TS2353/TS2352/TS2559/TS2347/TS2578/TS2304/TS2871/
    TS2694 counts. Source-side highlights: 4 latent bugs uncovered
    and fixed (preflight-cleanup-feature no-op, max_worker_minutes
    typo, resume failed-task crash, plus an extension.ts dashboard
    change-detection that was reading non-existent fields and only ever
    refreshing on currentTaskId — dropped the dead comparisons,
    observable behavior unchanged); widened execLog’s extra
    parameter from Record<string, string | number | boolean> to
    Record<string, unknown> (callers were already passing arrays/
    objects; template-string stringification preserved); re-exported
    RuntimeRegistry from process-registry.ts; documented optional
    batchId? field on OrchestratorConfig.orchestrator; added
    EXEC_MISSING_TASK_FOLDER to ExecutionErrorCode; fixed
    discriminated-union narrowing under non-strict mode by adding
    reason?: undefined / error?: undefined to success branches;
    switched loadProjectOverrides / migrateProjectOverrides /
    loadJsonConfig / mergeProjectOverrides to
    DeepPartial<TaskplaneConfig>; changed spawnMergeAgentV2 return
    type to Promise<void> (fire-and-forget). Test-side highlights:
    introduced shared tests/helpers/mock-orchestrator-config.ts
    factories (makeOrchestratorConfig/makeTaskRunnerConfig) that
    wrap DEFAULT_*_CONFIG defaults from types.ts so test mocks stay
    in sync with the runtime schema; added expect.unreachable() and
    optional 2nd message arg to expect() (Vitest-compat surface that
    ~190 sites already relied on); fixed phase-narrowing in 9.x
    launch-window suite via typed OrchBatchPhase casts; updated
    LaneRunnerConfig / PersistedTaskRecord / MergeResult /
    BatchSummaryData / MinimalBatchState / WorkspaceRoutingConfig
    fixtures to match current schemas; replaced legacy RuntimeAgentStatus
    "complete" with canonical "exited"; converted it(name, fn, 30000) calls to it(name, { timeout: 30000 }, fn) for node:test
    compatibility; declared mock.fn<(…args: any[]) => any>() so
    mockImplementation accepts non-undefined returns. Anti-shortcut
    policy enforced: zero new as any casts; zero @ts-expect-error
    added (the 3 unused-directive errors were removed); only legitimate
    2-step as unknown as X widenings with justifying comments; no
    garbage default values — every mock-object missing-field fix uses
    a schema-defined value. Pi-shim extended ExtensionContext
    from any to a structural interface so ctx.ui.custom<T>()
    typechecks at 4 settings-tui.ts call sites; ui left optional so
    thin test mocks (e.g., { model: null }) still satisfy the type.
    After the pass: npm run typecheck exits 0;
    npm run lint / npm run format:check unchanged from baseline;
    test suite 3627 passing / 1 skipped / 0 failed (TP-191
    baseline 3624 + 3 new TP-195 regression tests for the
    fix-the-bug paths). Strict mode remains out of scope — the
    strictness ratchet (enabling strict: true /
    noImplicitAny: true) is a separate post-TP-194 follow-up. With
    this packet merged, TP-194’s typecheck-gate flip CRITICAL
    pre-condition (“npm run typecheck exits 0 on main”) is
    satisfied.
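
The discriminated-union fix mentioned above (adding `reason?: undefined` to success branches) is easiest to see on a small example. The `SpawnResultSketch` type and `failureReason` function are hypothetical; the point is that once every branch declares the property, reading it typechecks even where the compiler does not narrow on the discriminant.

```typescript
// Hypothetical result type. The success branch declares `reason?: undefined`
// so that `reason` exists on every member of the union.
type SpawnResultSketch =
  | { ok: true; pid: number; reason?: undefined }
  | { ok: false; reason: string };

function failureReason(result: SpawnResultSketch): string | null {
  // No narrowing on `ok` is needed: the property is declared on both branches.
  return result.reason ?? null;
}
```

Without the extra optional field, `result.reason` on the un-narrowed union is a TS2339 error, because the success branch has no such property.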

  • Code-quality formatter adoption (TP-193): Third of four sequenced
    packets implementing the code-quality-gates spec
    (docs/specifications/taskplane/code-quality-gates.md
    section 6.3). Enabled the Biome formatter and applied it once across
    the entire codebase in a single mechanical commit. Formatter rules
    pinned in biome.json per spec section 6.3.1: indentStyle: "tab",
    indentWidth: 1, lineWidth: 100, lineEnding: "lf",
    quoteStyle: "double", trailingCommas: "all", semicolons: "always",
    arrowParentheses: "always". Format pass touched 161 files
    (every TS/MJS file in scope) with cosmetic-only changes — line
    wrapping, trailing-comma insertions, single-param arrow parens, and a
    small number of quote-style switches where Biome's smart-quote rule
    picked the alternative quote when the primary was inside the string.
    No semantic changes. Test resilience prep preceded the format
    pass in a separate commit: introduced expect().toContainNormalized()
    (whitespace + bracket-padding + trailing-comma normalized substring
    match) and updated 22 distinct source-grep test assertions across
    ~20 test files to use the helper or pre-normalize source before
    matching; bumped...

v0.29.2

10 May 14:32

Internal

  • Migrate peerDependencies from @mariozechner/* to @earendil-works/* and mark them optional: every pi update was printing four npm warn deprecated lines (one for each @mariozechner/pi-* package the new pi packages tell npm they are deprecating). Pi v0.74.0+ ships under the @earendil-works scope; the legacy @mariozechner peer-dep entries in taskplane's package.json made npm resolve the deprecated packages and surface the warnings on every install. Fix: switch the four pi-related entries in peerDependencies to @earendil-works/pi-coding-agent, @earendil-works/pi-tui, @earendil-works/pi-ai (kept @sinclair/typebox unchanged — not pi-managed); add a peerDependenciesMeta block marking all three pi packages optional: true so npm doesn't generate unmet-peer warnings for users in transitional setups, and so we don't tell users they MUST have pi globally installed at npm-install time (pi is the runtime, not a strict install-time peer).

    No source-code changes. The import statements in extensions/*.ts continue to reference @mariozechner/* because Pi's runtime extension loader (<pi>/dist/core/extensions/loader.js) bundles aliases for BOTH scopes — imports resolve identically regardless of which scope name is used. Changing the import statements would break compat for users still on Pi < v0.74.0 (the alias map was added in v0.74.0). The peerDependencies declaration is informational only; the runtime resolution is unaffected by either approach.

    No tests changed; no behavior changed. Tests pass at the v0.29.1 baseline (3624 passing / 1 skipped / 0 failed).

v0.29.1

10 May 14:14

Fixed

  • Runtime V2 spawn failures now visible (TP-190, #561): Previously,
    when a Runtime V2 lane spawn failed at the very first call site (Pi CLI
    not findable, worktree provisioning error, branch collision), the lane
    was not transitioned to failed. The engine continued polling
    indefinitely, the dashboard showed green/running lanes that had no
    actual worker process, orch_status() reported executing, and no
    supervisor alert fired. Recovery required the operator to manually
    tail engine-worker stderr, a step that no diagnostic documentation covered.
    This bug masked the operator-side impact of #559 (orchestrator IPC
    crash) and #560 (@earendil-works rename), making both look like
    hangs rather than immediate spawn errors. Fix has four parts:
    (1) State transition — the existing per-task try/catch in
    executeLaneV2 now tags the failed LaneTaskOutcome with
    exitDiagnostic.classification = "spawn_failure" (a new
    ExitClassification value alongside process_crash, stall_timeout,
    etc.) and writes a synthetic terminal RuntimeLaneSnapshot so
    monitorLanes resolves the lane to terminal state instead of looping
    forever on the never-written snapshot file (the actual root cause of
    the silent hang). (2) No-retry policy: spawn_failure is
    intentionally NOT in TIER0_RETRYABLE_CLASSIFICATIONS because
    spawn-stage errors are never transient; a defense-in-depth early
    return in attemptWorkerCrashRetry produces an operator-friendly log
    line. (3) IPC alert — the task-failure supervisor alert payload
    now carries context.exitCategory (and a "Spawn failure: … escalate
    immediately" summary line when applicable) so the supervisor playbook
    can branch on spawn-stage failures and escalate without retrying. The
    same wiring is mirrored in resume.ts for /orch-resume parity.
    (4) Phase transition — when every task in a wave fails with
    classification === "spawn_failure", batchState.phase transitions
    from "executing" to "failed" (not "paused", because the operator
    cannot un-stick spawn failures without changing something external).
    Validation: 33 new behavioral + helper tests in
    extensions/tests/spawn-failure-visibility.test.ts; full fast suite
    3620 pass / 1 skipped / 0 failed (+33 from baseline 3587);
    cross-platform Node 24 CI.

    Sage post-merge fold: two important correctness issues caught by
    sage's review of the merged TP-190 work, both folded before public
    release. (a) Residual hang on snapshot-write failure: the spawn-
    failure catch's writeLaneSnapshot() is best-effort, but the original
    comment claimed a 30-second staleness fallback would recover — not
    true when snap == null (no file at all), because snap?.updatedAt
    is undefined so staleMs == 0 and the 30-second check never fires.
    Snapshot-write failure (disk full, permission, transient I/O) would
    have left sessionAlive = true indefinitely, reintroducing the same
    #561 hang. Fix: added a null-snapshot tracker-age fallback in
    resolveTaskMonitorState — when snap == null AND the tracker has
    observed the task for ≥ 60s (past startup grace), consult the
    registry liveness check instead of defaulting to alive. (b)
    Multi-segment edge in isAllLanesSpawnFailedWave: the
    succeededTaskIds.length !== 0 gate was the terminal completion
    projection, populated only when a multi-segment task reaches its
    final segment. A wave with a multi-segment task succeeding on
    segment 1 (with continuation scheduled) plus a single-segment task
    spawn-failing would have falsely tripped phase=failed, burying real
    progress. Fix: the helper now optionally accepts laneResults and
    scans per-task outcomes for any status === "succeeded". 4 new
    sage-fold tests in spawn-failure-visibility.test.ts cover both
    edge cases. Final test count: 3624 passing (+4 over the 3620
    worker-batch baseline).
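
The wave-level decision from part (4), including the multi-segment edge case from the post-merge fold, might look roughly like this. The outcome shape and the `isAllSpawnFailed` name are assumptions and do not reproduce the real isAllLanesSpawnFailedWave signature.

```typescript
// Hypothetical per-task outcome; only the fields the rule needs are modeled.
interface TaskOutcomeSketch {
  taskId: string;
  status: "succeeded" | "failed" | "skipped";
  classification?: string; // e.g. "spawn_failure", "process_crash"
}

// True only when the wave has tasks and every one of them failed at spawn.
function isAllSpawnFailed(outcomes: TaskOutcomeSketch[]): boolean {
  if (outcomes.length === 0) return false;
  // Explicit for clarity: any success, including a non-final segment of a
  // multi-segment task, means the wave made real progress.
  if (outcomes.some((o) => o.status === "succeeded")) return false;
  return outcomes.every((o) => o.status === "failed" && o.classification === "spawn_failure");
}
```

A caller would move the batch phase from "executing" to "failed" only when this returns true.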

    Polyrepo end-to-end verified by operator in
    C:/dev/tp-test-workspace. The bug class that previously left
    lanes silently "running" forever now surfaces immediately as a
    visible failure with a meaningful phase=failed and task-failure
    IPC alert.

v0.29.0

10 May 03:10

New

  • supervisor_takeover(reason) tool (TP-187, #538): Non-destructive
    escape hatch for misbehaving batches. Pauses the running wave, drains
    every per-agent on-disk outbox for the current batch, and marks all
    active lanes as terminated so any in-transit zombie alerts are dropped
    before they reach the supervisor's user-message queue. Worktrees,
    branches, batch state, and sessions are all preserved — distinct from
    orch_abort, which kills sessions and deletes state. Use this when
    the batch is producing alert spam or has hit a death-spiral pattern
    but you may still want to resume the same batch. After takeover, call
    orch_status() to inspect, then either orch_resume(force=true) to
    continue (alert suppression is lifted automatically on resume) or
    orch_abort() to escalate to destructive shutdown. Documented in
    templates/agents/supervisor.md alongside the existing orch_* tool
    surface, plus a new section codifying the lane-runner's text-reply
    parser semantics (close keywords skip / let it fail / close /
    abort / stop are only treated as session-close directives when
    they appear in a reply under 30 characters; longer messages are
    always treated as instructional re-prompts).
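
The text-reply rule above can be sketched as a small classifier. The keyword list and the 30-character threshold come from the note; case-insensitive substring matching, trimming before measuring, and the names used here are assumptions about details the note does not spell out.

```typescript
// Keywords and threshold are from the note; matching details are assumed.
const CLOSE_KEYWORDS = ["skip", "let it fail", "close", "abort", "stop"];
const MAX_DIRECTIVE_LENGTH = 30;

type ReplyKind = "close-session" | "re-prompt";

function classifySupervisorReply(reply: string): ReplyKind {
  const text = reply.trim().toLowerCase();
  // Replies of 30 characters or more are always instructional re-prompts.
  if (text.length >= MAX_DIRECTIVE_LENGTH) return "re-prompt";
  const hasKeyword = CLOSE_KEYWORDS.some((k) => text.includes(k));
  return hasKeyword ? "close-session" : "re-prompt";
}
```

The length gate is what keeps a sentence such as "please stop editing the config and rerun the tests" from closing the session by accident.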

Fixed

  • Zombie supervisor alerts after lane termination (TP-187, #538):
    Previously, when a worker lane was killed (no-progress threshold or
    hard-fail), 3–5 "wants to exit" alerts that the worker emitted before
    termination remained in the supervisor's user-message queue and the
    agent's on-disk outbox, where they could be re-discovered later.
    None of the documented operator responses (steer, skip, let it fail, orch_abort, orch_skip_task) reliably drained either path.
    Fix has three parts: (1) at every lane-termination decision point
    (no-progress kill in lane-runner.ts, hard-fail in engine.ts), the
    agent's outbox is now synchronously drained — pending *.msg.json
    files are moved to outbox/processed/ and other pending files (e.g.,
    segment-expansion-*.json) are renamed to .drained so they are
    invisible to subsequent discovery scans; (2) the engine emits a new
    lane-terminated IPC message to the supervisor process, which keys
    a per-batch suppression filter (terminatedLanes /
    terminatedAgents Maps) that drops any subsequent supervisor-alert
    whose context.laneNumber or context.agentId matches before it
    reaches pi.sendUserMessage; (3) the engine emits a complementary
    lane-respawned IPC at the start of each executeLaneV2 invocation
    so a fresh task on a re-allocated lane number lifts the suppression.
    The filter is also cleared on orch_resume(), on a new batch start,
    and on supervisor_takeover()-then-resume. Implementation: new
    drainAgentOutbox helper in mailbox.ts, LaneTerminatedInfo /
    LaneTerminatedCallback types in types.ts, callback threading
    through engine.ts / execution.ts / resume.ts / engine-worker.ts,
    and IPC + filter wiring in extension.ts.
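
Part (2) and part (3) of the fix describe a filter keyed by lane number and agent ID. The class below is a minimal sketch of that idea; the Map names follow the note, while the class, its methods, and the timestamp values are hypothetical.

```typescript
// Alert context fields named in the note.
interface AlertContextSketch {
  laneNumber?: number;
  agentId?: string;
}

class AlertSuppressionFilter {
  // Values are termination timestamps (an assumption; the note names only the Maps).
  private terminatedLanes = new Map<number, number>();
  private terminatedAgents = new Map<string, number>();

  onLaneTerminated(laneNumber: number, agentId: string, at: number = Date.now()): void {
    this.terminatedLanes.set(laneNumber, at);
    this.terminatedAgents.set(agentId, at);
  }

  // A fresh task on a re-allocated lane lifts the suppression.
  onLaneRespawned(laneNumber: number, agentId: string): void {
    this.terminatedLanes.delete(laneNumber);
    this.terminatedAgents.delete(agentId);
  }

  // Called on resume or when a new batch starts.
  clear(): void {
    this.terminatedLanes.clear();
    this.terminatedAgents.clear();
  }

  // True when the alert belongs to a terminated lane or agent and should be dropped.
  shouldDrop(ctx: AlertContextSketch): boolean {
    if (ctx.laneNumber !== undefined && this.terminatedLanes.has(ctx.laneNumber)) return true;
    if (ctx.agentId !== undefined && this.terminatedAgents.has(ctx.agentId)) return true;
    return false;
  }
}
```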

  • orch_resume(force=true) cannot reattach after orch_abort()
    (TP-187, #539): executeAbort() deletes .pi/batch-state.json to
    enforce its destructive contract, but the runtime registry, per-agent
    manifests, lane snapshots, worktrees, and branches all survive. With
    no batch-state.json, loadBatchState() returned null and force-resume
    returned the generic "no batch found" error, forcing operators into
    ~15 minutes of manual git surgery (fast-forward feature branches,
    push, remove worktrees, edit STATUS, re-orch_start) just to do what
    force-resume should have done. Fix adds a small batch-meta.json
    runtime artifact written at batch-start to
    .pi/runtime/<batchId>/batch-meta.json capturing the wave plan and
    the few non-recoverable scalars (baseBranch, orchBranch, mode,
    startedAt, totalWaves). On force-resume after abort, when
    loadBatchState() returns null, the new
    reconstructBatchStateFromRuntime() helper deterministically rebuilds
    a validator-compliant PersistedBatchState from the surviving
    artifacts: most-recent batch dir wins by mtime (lex tiebreak),
    batch-meta.json provides wave topology and orchBranch, worker
    manifests provide per-lane allocation, and the existing reconciliation
    pass re-detects succeeded tasks via .DONE markers and STATUS.md.
    When required artifacts are missing or validation fails, force-resume
    fails loud with a new resumeNoStateAfterAbort message that names
    the missing artifact and recommends orch_start <PROMPT.md> as the
    recovery path. The non-force orch_resume() path is unchanged.
    orch_abort itself remains semantically destructive — only
    force-resume reads from the surviving runtime artifacts.
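
The "most-recent batch dir wins by mtime (lex tiebreak)" rule can be sketched as a pure selection over directory metadata. The note does not say which direction the lexicographic tiebreak runs; this sketch assumes the lexicographically greater name wins, and all names here are hypothetical.

```typescript
// Hypothetical metadata record; in practice it would come from a directory scan.
interface BatchDirInfo {
  name: string;
  mtimeMs: number;
}

function pickMostRecentBatchDir(dirs: BatchDirInfo[]): string | null {
  const sorted = [...dirs].sort((a, b) => {
    if (a.mtimeMs !== b.mtimeMs) return b.mtimeMs - a.mtimeMs; // newest first
    // Assumed tiebreak: lexicographically greater name first.
    return a.name < b.name ? 1 : a.name > b.name ? -1 : 0;
  });
  return sorted[0]?.name ?? null;
}
```

Sorting a copy keeps the selection deterministic without mutating the caller's list.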

  • Worker said: is empty in early no-progress alerts (TP-187, #540):
    When a worker exits an iteration without producing a visible assistant
    message (a known failure mode in the death-spiral pattern), the
    worker-exit-intercept alert sent to the supervisor showed
    Worker said: "" — leaving the supervisor with no signal about why
    the worker is stuck on the iterations where intervention could still
    help. By the time the field has content, the worker is already at
    no-progress count 3 (kill threshold). Fix has two parts: (1)
    templates/agents/task-worker.md now requires a one-sentence reason
    before any silent exit-with-no-progress, with concrete examples; (2)
    lane-runner.ts falls back to walking the worker's events.jsonl
    backward to find the most recent non-empty assistant_message
    payload when the current turn produced no visible output, and tags
    the alert with which source (current-turn,
    events-jsonl-fallback, or empty-sentinel) produced the
    Worker said: field. The 500-character truncation invariant is
    preserved.
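
The fallback in part (2) amounts to a backward scan over JSONL lines. The sketch below assumes each event is a JSON object with a `type` field and a `text` field holding the assistant message, and it returns an empty string for the sentinel case; the real event schema and sentinel text are not given in the note.

```typescript
// Source tags are from the note; the event shape is an assumption.
type WorkerSaidSource = "current-turn" | "events-jsonl-fallback" | "empty-sentinel";

const MAX_WORKER_SAID_CHARS = 500; // truncation invariant from the note

function resolveWorkerSaid(
  currentTurnText: string,
  eventsJsonl: string,
): { text: string; source: WorkerSaidSource } {
  const truncate = (s: string): string => s.slice(0, MAX_WORKER_SAID_CHARS);
  if (currentTurnText.trim() !== "") {
    return { text: truncate(currentTurnText), source: "current-turn" };
  }

  // Walk the event log backward to find the most recent non-empty message.
  const lines = eventsJsonl.split("\n");
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = (lines[i] ?? "").trim();
    if (line === "") continue;
    try {
      const event = JSON.parse(line) as { type?: unknown; text?: unknown };
      if (
        event.type === "assistant_message" &&
        typeof event.text === "string" &&
        event.text.trim() !== ""
      ) {
        return { text: truncate(event.text), source: "events-jsonl-fallback" };
      }
    } catch {
      // Skip malformed lines, for example a partially written last line.
    }
  }
  return { text: "", source: "empty-sentinel" };
}
```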

  • taskplane doctor no longer shows empty parens for pi installed ()
    (TP-189-C / TP-185 follow-up): pi prints its --version output to
    stderr, but bin/taskplane.mjs's getVersion() only captured
    stdout via execSync(... { stdio: 'pipe' }), so the doctor display was
    ✅ pi installed () with empty parens. The fix extracts getVersion
    to bin/get-version.mjs (testable ESM helper) and switches it to
    spawnSync with stdio: ['ignore', 'pipe', 'pipe']. The new logic
    prefers stdout but falls back to stderr when stdout is empty, and
    preserves the prior fail-safe contract (returns null on subprocess
    failure or non-zero exit — critical so shell error text isn't surfaced
    as a fake version string). Manual verification: taskplane doctor now
    shows ✅ pi installed (0.73.0). 7 new behavioral tests in
    extensions/tests/cli-doctor-version-capture.test.ts cover the
    stdout-precedence, stderr-fallback, trim, and null-on-failure cases.
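
The selection logic described above can be isolated as a pure function over a subprocess result. The field names mirror what Node's `spawnSync` returns when an encoding is set, but the `pickVersion` name and the `VersionProbeResult` interface are hypothetical, and this sketch does not reproduce bin/get-version.mjs.

```typescript
// Mirrors the relevant fields of a spawnSync result with string output.
interface VersionProbeResult {
  status: number | null;
  stdout: string;
  stderr: string;
  error?: Error;
}

function pickVersion(result: VersionProbeResult): string | null {
  // Fail safe: never surface shell error text as a version string.
  if (result.error !== undefined || result.status !== 0) return null;
  const out = result.stdout.trim();
  if (out !== "") return out; // stdout takes precedence
  const err = result.stderr.trim();
  return err !== "" ? err : null; // fall back to stderr when stdout is empty
}
```

Keeping the choice separate from the subprocess call makes each case testable without spawning anything.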

  • isStepMarkedComplete death-spiral guard now skips fenced code
    blocks (TP-189-A3 / TP-186 follow-up): the helper that powers the
    review_step REFUSED guard scanned STATUS.md line-by-line for the
    literal **Status:** ✅ Complete pattern. If a step's body documented
    that pattern inside a fenced code block (legitimate authoring of the
    format itself), the guard would false-positive and refuse a legitimate
    code review. The helper now uses CommonMark-aware fence tracking:
    recognizes both ``` and ~~~ fences, tracks the opener char + length,
    and only closes on a matching delimiter (same char, length ≥ opener
    length, no trailing non-whitespace text). Mixed-delimiter examples and

    close it. Step-heading detection is gated on being outside a fence so
    a `### Step N:` line inside a code-block sample is treated as content
    rather than a step boundary. 6 new unit tests cover the edge cases.
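
The fence-tracking rule above (same delimiter character, closing run at least as long as the opener, nothing but whitespace after it) can be sketched as a line filter. This is a simplified illustration, not the isStepMarkedComplete implementation: the function names are hypothetical, and CommonMark details such as info-string restrictions are ignored.

```typescript
// Matches a line that starts (after up to 3 spaces) with a run of 3+ backticks or tildes.
const FENCE_LINE = /^ {0,3}(`{3,}|~{3,})(.*)$/;

interface OpenFence {
  char: string;
  length: number;
}

// Returns only the lines that sit outside fenced code blocks.
function linesOutsideFences(markdown: string): string[] {
  const outside: string[] = [];
  let open: OpenFence | null = null;

  for (const line of markdown.split("\n")) {
    const match = FENCE_LINE.exec(line);
    const run = match?.[1] ?? "";
    const rest = match?.[2] ?? "";
    if (open === null) {
      if (run !== "") {
        open = { char: run.charAt(0), length: run.length }; // remember the opener
      } else {
        outside.push(line);
      }
    } else if (run.charAt(0) === open.char && run.length >= open.length && rest.trim() === "") {
      open = null; // only a matching delimiter closes the fence
    }
  }
  return outside;
}

// The marker only counts when it appears outside every fence.
function hasCompleteMarker(markdown: string): boolean {
  return linesOutsideFences(markdown).some((l) => l.includes("**Status:** ✅ Complete"));
}
```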
    
    

Docs

  • templates/agents/task-worker.md reconciled with TP-186's Order of
    Operations rule (TP-189-E): two older sections were ambiguous when
    read alongside the new review-gated step-completion contract from
    TP-186. (1) Resume Algorithm step 6 ("all items checked → proceed to
    next step") now splits behavior by Review Level: 0/1 may proceed,
    but 2/3 must commit the implementation, call
    review_step(type="code"), and only flip the per-step **Status:**
    heading after APPROVE — with a cross-reference to the Order of
    Operations section. (2) The Checkpoint Discipline / Git commits
    example commit message changed from feat(TASK-ID): complete Step N — description to feat(TASK-ID): step N implementation, plus
    explicit Level 0/1 vs Level 2/3 paragraphs and a separate
    chore(TASK-ID): step N complete (code review APPROVE) example for
    the post-APPROVE status-flip commit. Both edits reuse canonical
    wording from the Order of Operations + Recovery Recipe sections so
    the existing source-pattern tests in
    extensions/tests/worker-step-completion-protocol.test.ts continue to
    pass; a new test 1.4b regression-guards the Resume Algorithm wording.
  • skills/create-taskplane-task/SKILL.md Complexity Assessment
    augmented with Per-Step Reviews vs. Consolidated Reviews
    (Checkpoint Markers) sub-section (TP-189-E): the existing rubric
    documents Review Levels 0–3 but not the second axis — how many
    reviews fire for a given level. PROMPT authors had been discovering
    this empirically (e.g., TP-186 fired only 2 reviews via checkpoint
    markers vs the default ~8 it would have fired without them). The new
    sub-section makes the choice explicit: per-step is the default and
    right for independent multi-feature work; consolidation via
    **Plan-review checkpoint** / **Code review checkpoint** markers
    is appropriate for single-deliverable tasks where the steps are
    mechanical applications of one design. TP-186 is referenced as the
    canonical consolidation example.

Internal

  • DEFAULT_WORKER_USER_TOOLS migrated to a shared lightweight
    constants module (TP-189-B / TP-184 follow-up): the literal
    `"rea...

v0.28.8

07 May 02:46

Enhanced

  • Dashboard: task title row widened to span cols 3–6 (#485 follow-up):
    The task title subtitle introduced in v0.28.7 was constrained to the
    100px-wide task-id column, which truncated most realistic titles
    ('Reviewer runs typec...') after just a few words. Restructured the
    task-row grid to two rows: row 1 holds the primary cells (icon, actions,
    task-id, status, duration, progress, step+telemetry), row 2 holds the
    optional task-title-subtitle spanning cols 3–6 (~486px combined width
    vs. the previous 100px). Stops before col 7 (task-step + telemetry) so
    step info and worker stats stay visible alongside the title. Row 2 is
    auto-sized and collapses to 0 height when no subtitle exists, so tasks
    with null taskTitle look identical to the v0.28.7 single-line layout.
    Display-only
    change — cannot affect orchestrator correctness.

Fixed

  • Code reviewer now runs project quality checks (typecheck/lint/format)
    before deciding (TP-188, #541): Previously, the reviewer agent spawned
    via review_step(type="code") evaluated changes through behavioural
    inspection only. It did NOT run npm run typecheck / npm run lint /
    npm run format:check, so code with TypeScript strict-mode errors or
    lint failures could receive APPROVE — those issues then surfaced at the
    worker's Testing & Verification step, blocking the entire batch. In one
    observed production batch, a code review returned APPROVE for a step
    that subsequently failed npm run typecheck with 5 strict-mode errors
    in the test code the reviewer had just signed off on. Cost of catching
    these earlier: one extra typecheck per code review. Cost of NOT
    catching them: the entire investment in the affected step plus all
    dependents. Fix is a prompt-only change to
    templates/agents/task-reviewer.md: a new Quality-check verification
    section (between How You Work and Verdict Criteria) instructs the
    reviewer to (1) discover commands by reading
    .pi/taskplane-config.json taskRunner.testing.commands first, then
    fall back to package.json scripts for typecheck / lint /
    format:check; (2) run any matching commands using its existing bash
    tool (no allowlist change required — bash is already in the default
    reviewer tool list); (3) surface failures as Issues Found with
    severity important; (4) downgrade an otherwise-APPROVE verdict to
    REVISE when any quality check fails. Plan reviews skip the section
    entirely (no code exists yet to typecheck). Skip-silently rule: if
    neither config nor package.json yields a relevant command, the
    reviewer notes the skip in the Summary and proceeds normally rather
    than blocking on absent infrastructure. 10 new source-pattern tests in
    extensions/tests/reviewer-quality-checks.test.ts lock the section
    shape, the hybrid discovery wording, and the verdict-downgrade rule.
  • Windows worktree cleanup falls back to cmd rd /s /q when git hits
    MAX_PATH (TP-188, #543): On Windows with default
    core.longpaths = false, git worktree remove --force fails with
    error: failed to delete '<path>': Filename too long when the worktree
    contains a deep node_modules tree (most non-trivial Node projects).
    Previously the orchestrator surfaced cleanup-incomplete via the
    post-integration banner but didn't recover — the operator had to run
    cmd /c "rd /s /q <path>" manually. Observed twice during a single
    recovery flow on the user's Windows machine working with
    emailgistics-astro (700+ npm deps). Fix adds two new exported helpers
    in extensions/taskplane/worktree.ts: isWindowsMaxPathError(stderr)
    (returns true only on win32 + /filename too long/i) and
    runWindowsCmdRd(absolutePath) (invokes execFileSync("cmd", ["/c", "rd", "/s", "/q", winPath]) with forward slashes normalized to
    backslashes for native Windows path semantics). The fallback fires
    inside removeWorktree's retry loop when the predicate matches,
    prunes git's bookkeeping on success so post-removal verification
    passes, and falls through to the existing terminal/retry classification
    on failure (with both git's stderr and cmd's stderr enriched into the
    thrown error so operators can diagnose). Other error classes (lock
    errors, permission denied, generic git errors) are unaffected. INFO-level
    logs via execLog("cleanup", "worktree", ...) make the rescue path
    visible in operator-facing output. 17 new tests in
    extensions/tests/windows-worktree-cleanup-fallback.test.ts cover the
    source-pattern wiring (helpers exist; removeWorktree calls them;
    git worktree prune runs on fallback success; failure path enriches
    the error), platform guard (returns false on linux/macOS), regex
    case-insensitivity, and runWindowsCmdRd's mocked invocation. Tests
    are platform-agnostic via child_process mocking so the suite passes
    on every CI runner.
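
The two helpers named in the MAX_PATH entry above might look roughly like the following TypeScript sketch. The helper names, the `/filename too long/i` pattern, and the `cmd /c rd /s /q` invocation come from the release note. The `platform` parameter, the separate `toWindowsPath` function, and the `stdio` option are assumptions added for illustration; this is not the code in extensions/taskplane/worktree.ts.

```typescript
import { execFileSync } from "node:child_process";

// Predicate described in the release note: true only on win32 when git's
// stderr reports "Filename too long". The `platform` parameter is an
// assumption added so the logic can be exercised on any OS.
export function isWindowsMaxPathError(
  stderr: string,
  platform: string = process.platform,
): boolean {
  return platform === "win32" && /filename too long/i.test(stderr);
}

// Hypothetical helper: normalize forward slashes to backslashes so that
// cmd.exe receives a native Windows path.
export function toWindowsPath(absolutePath: string): string {
  return absolutePath.replace(/\//g, "\\");
}

// Fallback removal via `cmd /c rd /s /q <path>`. execFileSync throws on a
// non-zero exit code, so the caller decides how to classify a failure.
export function runWindowsCmdRd(absolutePath: string): void {
  const winPath = toWindowsPath(absolutePath);
  execFileSync("cmd", ["/c", "rd", "/s", "/q", winPath], { stdio: "pipe" });
}
```

In this sketch a caller would invoke `runWindowsCmdRd` only after `isWindowsMaxPathError` matches the stderr of a failed `git worktree remove --force`, then run `git worktree prune` on success, as the entry describes.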

Internal

  • CI workflow upgraded to Node 24 LTS: .github/workflows/ci.yml was
    on Node 22; release.yml had moved to Node 24 LTS during the v0.28.5
    release work but ci.yml was not aligned. Two motivations converged: the
    Node 22 / Node 24 mock.module() semantics divergence caused TP-188's
    runWindowsCmdRd unit tests to fail on Node 22 CI while passing locally
    on Node 24 (Node 24 aliases bare child_process and node:child_process;
    Node 22 treats them as separate modules). Bumping ci.yml to Node 24 fixes
    the test mock portability AND completes TP-189's Cluster D ahead of
    schedule.

v0.28.7

07 May 01:52

Enhanced

  • Dashboard: lane parallelization visible in wave indicator chips (#484):
    Wave chips at the top of the dashboard now group tasks by lane within each
    wave, joining same-lane tasks with → (serial) and different-lane tasks
    with | (parallel). For example, W1 [TP-165, TP-166, TP-168, TP-167]
    now reads W1 [TP-165 → TP-166 | TP-168 | TP-167], immediately revealing
    that TP-165→TP-166 are serialized on lane 1 while TP-168 and TP-167 run in
    parallel on lanes 2 and 3. Within each lane, tasks render in execution
    order (per lane.taskIds). Hover tooltip on the chip exposes the
    expanded multi-line lane breakdown. Future waves with no lane assignment
    data fall back to the previous flat comma-separated display — no
    regression for unprovisioned waves.
  • Dashboard: task title under task ID in lane view (#485): The lane
    view now renders the human-readable task title (extracted from PROMPT.md's
    # Task: <ID> - <title> first-line heading) as a smaller muted subtitle
    beneath the task ID. Operators no longer need to remember what each
    TP-XXX is. The title is read once from PROMPT.md and cached for the
    server's lifetime (PROMPT.md is immutable above the --- divider).
    Surfaced via a new taskTitle field on /api/state task records;
    frontend falls back gracefully when the field is null.
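
The wave-chip grouping from #484 above can be illustrated with a short TypeScript sketch. Only `lane.taskIds`, the `→` and `|` separators, and the flat fallback come from the release note. The function name `formatWaveChip`, the `LaneAssignment` interface, and the parameter layout are hypothetical and are not the dashboard's actual code.

```typescript
// Hypothetical shape: only `taskIds` is mentioned in the release note.
interface LaneAssignment {
  taskIds: string[];
}

// Build the chip label for one wave. Tasks sharing a lane are joined with
// " → " (serial); lanes are joined with " | " (parallel).
function formatWaveChip(
  waveNumber: number,
  waveTaskIds: string[],
  lanes: LaneAssignment[] | undefined,
): string {
  const inWave = new Set(waveTaskIds);

  // Keep each lane's execution order, restricted to tasks in this wave.
  const laneGroups = (lanes ?? [])
    .map((lane) => lane.taskIds.filter((id) => inWave.has(id)))
    .filter((ids) => ids.length > 0);

  // No lane assignment data yet (future wave): flat comma-separated list.
  if (laneGroups.length === 0) {
    return `W${waveNumber} [${waveTaskIds.join(", ")}]`;
  }

  const body = laneGroups.map((ids) => ids.join(" → ")).join(" | ");
  return `W${waveNumber} [${body}]`;
}
```

With lanes `[TP-165, TP-166]`, `[TP-168]`, and `[TP-167]`, this returns `W1 [TP-165 → TP-166 | TP-168 | TP-167]`; with no lane data it returns `W1 [TP-165, TP-166, TP-168, TP-167]`.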

v0.28.6

07 May 00:11

Fixed

  • Worker death-spiral when code review returns REVISE on a step already
    marked Complete in STATUS (TP-186, #537, #542): Previously, if a worker
    set a step's **Status:** ✅ Complete heading in STATUS.md before calling
    review_step(type="code"), and the reviewer returned REVISE, the worker
    was caught in a state contradiction (STATUS says done, reviewer says not)
    with no recovery recipe in the prompt. The worker would loop through 3
    no-progress iterations and the orch's safety mechanism would kill the
    lane — the entire batch was a write-off, requiring ~15 min of manual git
    surgery per occurrence. The fix is structural: (1) the base worker prompt
    (templates/agents/task-worker.md) now contains an explicit Order of
    Operations rule that mandates code review BEFORE marking a step
    Complete, a Recovery Recipe for the case when the rule is
    accidentally violated (revert STATUS → commit → handle REVISE through
    the normal flow), and a Forbidden callout naming the death-spiral
    anti-pattern alongside the existing "NEVER add, remove, or renumber
    steps" family of MUST-NOT rules; (2) the engine-side review_step tool
    now refuses to run on a step already marked **Status:** ✅ Complete,
    returning a REFUSED verdict that points the worker at the Recovery
    Recipe (the refusal applies to code and test review types only — plan
    reviews fire pre-implementation and are correctly exempt). Until this
    fix shipped, Review Level ≥ 2 was effectively unsafe in production. 14
    new tests in worker-step-completion-protocol.test.ts. Supersedes the
    partial diagnosis in #510. Thanks to the production batch
    20260506T105850 against emailgistics-astro for surfacing the
    reproducer.
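
The engine-side refusal described above can be sketched as a small guard in TypeScript. The `REFUSED` verdict, the `**Status:** ✅ Complete` marker, and the plan-review exemption come from the release note. The function name `guardReviewStep`, its signature, and the message text are hypothetical; the note does not describe how the engine reads STATUS.md, so the caller passes the status line in.

```typescript
type ReviewType = "plan" | "code" | "test";

interface RefusedVerdict {
  verdict: "REFUSED";
  message: string;
}

// Marker text as quoted in the release note.
const COMPLETE_MARKER = "**Status:** ✅ Complete";

// Returns a REFUSED verdict when a code or test review is requested for a
// step that is already marked Complete, or null when the review may run.
function guardReviewStep(
  reviewType: ReviewType,
  stepStatusLine: string,
): RefusedVerdict | null {
  // Plan reviews fire before implementation and are exempt.
  if (reviewType === "plan") return null;

  if (!stepStatusLine.includes(COMPLETE_MARKER)) return null;

  return {
    verdict: "REFUSED",
    // Illustrative wording only, not the engine's actual message.
    message:
      "Step is already marked Complete. Follow the Recovery Recipe: " +
      "revert STATUS, commit, then request the review again.",
  };
}
```

Refusing before the reviewer is spawned keeps the worker from ever reaching the contradictory state in which STATUS.md says a step is done while the reviewer says REVISE.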

v0.28.5

06 May 21:16

Fixed

  • Pi no longer hard-blocks startup with a red error when run in directories
    that aren't configured for Taskplane (TP-183, #523): Previously, launching
    pi in any non-git directory (or any directory without
    .pi/taskplane-workspace.yaml / taskplane-config.json) raised a verbose
    red WORKSPACE_SETUP_REQUIRED notification at session_start. For users who
    only want Taskplane in some projects, this was wrong UX. The
    orchestrator now soft-fails the WORKSPACE_SETUP_REQUIRED case
    specifically: no error notification, status line shows the quiet
    🔀 Orchestrator · disabled (no taskplane config in workspace) indicator,
    orchestrator commands stay gracefully disabled (and still explain why if
    invoked, via the existing requireExecCtx guard). Configuration errors in
    workspaces that ARE set up — WORKSPACE_FILE_PARSE_ERROR,
    WORKSPACE_SCHEMA_INVALID, WORKSPACE_REPO_PATH_NOT_FOUND, and every
    other WorkspaceConfigErrorCode — still surface loudly with the existing
    red notify and ❌ startup failed (workspace config error) status line, so
    real misconfigurations remain visible. Throw behavior of
    buildExecutionContext is unchanged — only the display in extension.ts
    changes. 6 new tests in orchestrator-startup-uxv2.test.ts (3 scenarios,
    6 fine-grained checks). Thanks to @mwickens for the report.
  • Workers can now invoke review_step, notify_supervisor,
    escalate_to_supervisor, and request_segment_expansion (TP-184, #530):
    Previously these
    engine-internal coordination tools were missing from the worker's
    hardcoded --tools allowlist, so pi's tool gate filtered them out at the
    worker. The visible symptom: plan/code/test reviews silently never fired
    at Review Level >= 1, supervisor steering replies were impossible, and
    multi-repo segment-expansion requests were unreachable. The bridge tools
    are now always appended to the worker allowlist regardless of
    taskRunner.worker.tools config; the user-tools default is unchanged.
    Introduces three new exports in agent-host.ts: ENGINE_BRIDGE_TOOLS
    (canonical list of engine-internal tools), DEFAULT_WORKER_USER_TOOLS
    (the user-tools default literal), and buildWorkerToolsAllowlist()
    (combines user portion with bridge tools, deduplicated). Called exactly
    once at the lane-runner spawn site. Defense-in-depth: lane-runner now
    warns (via logExecution) if any bridge tool is missing from the final
    allowlist. 14 new tests in worker-tools-allowlist.test.ts.
  • Preflight pi check no longer misreports cold-start timeouts as "Pi not
    found" (TP-185): execCheck now classifies failures by mode (not-found,
    timeout, exit-code, signal, unknown) instead of treating every
    failure as missing-binary. The pi preflight now uses a 30s timeout (up
    from 10s) and retries once on timeout to absorb cold-start variance — mise
    shim resolution, Node bootstrap, AV process-launch scanning, and pi's own
    startup can together exceed 10s on a fresh first run, especially on Windows.
    Failure messages and hints are now tailored to the actual error kind
    (e.g. timeouts say "Pi did not respond within 30s" + diagnostic guidance,
    rather than the misleading "Install pi" hint). Detects missing binaries on
    both POSIX (ENOENT/exit 127) and Windows (cmd.exe "is not recognized")
    shells. 9 new tests in exec-check-error-classification.test.ts covering
    every classification path including regression guards against the original
    bug. Backward compatible: existing callers reading { ok, stdout } are
    unaffected.
  • Worker model/thinking/tools from preferences now flow through to spawned
    workers (TP-181, #522): taskRunner.worker.{model,thinking,tools} in
    preferences.json (and project config) are now threaded from
    TaskRunnerConfig through executeWave → executeLaneV2 to the worker
    subprocess via TASKPLANE_WORKER_{MODEL,THINKING,TOOLS} env vars. Previously
    LaneRunnerConfig.workerModel was hardcoded to "" and the user-configured
    worker model was silently ignored. Mirrors the existing reviewer pipeline
    established in TP-160. New buildWorkerEnv() helper, plumbed through
    engine.ts, execution.ts, and resume.ts. 11 new tests in
    worker-model.test.ts. Thanks to @NerfEko.
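
The allowlist construction from the TP-184 entry above (user tools plus always-appended bridge tools, deduplicated) can be sketched in TypeScript as follows. The names `ENGINE_BRIDGE_TOOLS` and `buildWorkerToolsAllowlist` come from the release note. The list contents are assumed from the four tools named in that entry, and the comma-separated string format and the `missingBridgeTools` helper are assumptions for illustration, not the code in agent-host.ts.

```typescript
// Assumed contents: the four tools named in the TP-184 entry. The real
// canonical list may differ.
export const ENGINE_BRIDGE_TOOLS: readonly string[] = [
  "review_step",
  "notify_supervisor",
  "escalate_to_supervisor",
  "request_segment_expansion",
];

// Combine the user-configured tools with the bridge tools. Assumes a
// comma-separated string, trims whitespace, drops empties and duplicates,
// and keeps first-seen order.
export function buildWorkerToolsAllowlist(userTools: string): string {
  const seen = new Set<string>();
  const ordered: string[] = [];
  for (const raw of [...userTools.split(","), ...ENGINE_BRIDGE_TOOLS]) {
    const name = raw.trim();
    if (name.length === 0 || seen.has(name)) continue;
    seen.add(name);
    ordered.push(name);
  }
  return ordered.join(",");
}

// Hypothetical defense-in-depth check: which bridge tools are absent from a
// final allowlist. A caller could log a warning when this is non-empty.
export function missingBridgeTools(allowlist: string): string[] {
  const present = new Set(allowlist.split(",").map((t) => t.trim()));
  return ENGINE_BRIDGE_TOOLS.filter((t) => !present.has(t));
}
```

Because the bridge tools are appended after the user portion, a custom `taskRunner.worker.tools` value can narrow the user tools without removing the coordination tools, which matches the behavior the entry describes.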

v0.28.4

21 Apr 00:31

Fixed

  • Settings TUI: Agent Extensions section shows "Toggle extensions per agent type" instead of generic "Read-only collection/record fields" label.