ci(release): require npm trusted publishing #505

Merged
tomdps merged 201 commits into main from dev on Jun 17, 2026

tomdps (Collaborator) commented on Jun 16, 2026

Summary

  • promote the OIDC-only npm publishing policy from dev to main
  • release workflow no longer passes NPM_TOKEN
  • publishing docs now require the safe bootstrap path: one manual interactive 2FA publish, then npm trusted publishing

Verification already completed

  • PR #504 (ci(release): require npm trusted publishing): CI passed
  • dev merge queue CI passed
  • npm publish --dry-run passed locally
  • GitHub Actions NPM_TOKEN secret was deleted
  • @the-open-engine/zeroshot@5.4.0 exists on npm with latest pointing at 5.4.0

Hold before merge

Package creation is complete. Do not merge this PR until npm trusted publishing is configured for:

  • owner: the-open-engine
  • repo: zeroshot
  • workflow filename: release.yml
  • action: npm publish

EivMeyer and others added 30 commits January 16, 2026 14:59
## Summary
- Add `eslint-plugin-security` and `eslint-plugin-sonarjs` for code
quality
- Set `noInlineConfig: true` - prevents eslint-disable (fix code, not
rules)
- Fix 26 warnings (92 → 66 remaining)

## Changes
- **no-param-reassign (9 → 0)**: Use local variables instead of mutating
params
- **no-nested-conditional (11 → 0)**: Extract nested ternaries to helper
functions
- **detect-unsafe-regex (1 fixed)**: Replace ReDoS-vulnerable nested
quantifier
- **no-invariant-returns (2 → 0)**: File override for message handler
pattern
- **Misc (3)**: Dead eslint-disable, collection size, character class
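
The first two fixes in the list above follow a common pattern. A minimal sketch, assuming hypothetical function names (`normalizeOptions`, `severityLabel`) and a made-up default timeout that are not taken from the repository:

```javascript
// Hypothetical examples; names and values are illustrative, not from the codebase.

// no-param-reassign: copy into a local instead of mutating the parameter.
function normalizeOptions(options) {
  const normalized = { ...options };
  if (normalized.timeout === undefined) {
    normalized.timeout = 30000; // assumed default, for illustration only
  }
  return normalized;
}

// no-nested-conditional: a helper with early returns replaces a nested
// ternary such as `errors > 0 ? 'error' : warnings > 0 ? 'warning' : 'ok'`.
function severityLabel(errorCount, warningCount) {
  if (errorCount > 0) return 'error';
  if (warningCount > 0) return 'warning';
  return 'ok';
}
```

The caller's object is left untouched in the first case, and the second keeps each branch on its own line.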

## Remaining Warnings (66)
Architectural (complexity, max-lines) or false positives for CLI
patterns.

Closes #100

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Summary

Reduces planner verbosity from 12KB to ~3KB, fixing "No messages
returned" Claude CLI bug.

## Problem

Planner generates verbose plans (12KB+) that bloat worker context:
- Paragraphs explaining WHY instead of WHAT
- Redundant file descriptions
- Code examples for trivial changes
- 200+ word acceptance criteria

This triggers Claude CLI bug: "No messages returned"

## Solution

Added **OUTPUT CONCISENESS (CRITICAL)** section to planner prompt:
- Target: <3000 words total
- Forbidden: paragraphs, background, redundant descriptions
- Required: bullet points, imperative commands, <50 word criteria
- Examples: verbose (bad) vs concise (good)

## Expected Impact

- PLAN_READY message: 12KB → 3-4KB (~70% reduction)
- Worker context stays under threshold
- No more "No messages returned" errors

## Testing

- ✅ Template parses correctly (no JSON errors)
- ✅ Pre-commit validation passes
- ✅ All emojis preserved (UTF-8 encoding correct)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Fixes CI security audit failures by:

- Adding `--omit=dev` flag to `npm audit` command (only audit production
dependencies)
- Adding npm overrides for vulnerable dev dependencies (diff, tar,
undici)

These vulnerabilities are in dev dependencies (semantic-release, mocha)
and don't affect the published package. The `--omit=dev` flag allows CI
to pass while still catching real production security issues.

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
…r status reports (#107)

## Summary
- Remove phase-focused instructions from planner - output flat numbered
steps instead
- Tighten CRITICAL classification - only when DIRECTLY modifying
auth/billing/secrets/destructive ops
- Fix worker outputting status reports instead of doing work - add "DO
THE WORK. DON'T REPORT STATUS."

## Changes

### Planner
- Remove SCOPE REDUCTION, SILENT PHASE OMISSION sections
- Remove delegation/sub-agent schema
- Add simple flat numbered steps format
- Forbid: "Phase 1", "Phase 2", "Future work", delegation

### Worker  
- Add prominent "DO THE WORK. DON'T REPORT STATUS." warning
- Forbid outputs like "Infrastructure exists but 0% completed..."
- Require every response to include tool calls that MAKE CHANGES

### Conductor
- Tighten CRITICAL: only when code DIRECTLY modifies auth logic, payment
processing, secrets, destructive DB ops
- Reverse bias: "If unsure between STANDARD and CRITICAL, choose
STANDARD"
- Add cost context: CRITICAL uses Opus + 4 validators = expensive
- Add NOT CRITICAL examples: refactoring, types, tests, code
organization

## Test plan
- [x] JSON syntax valid
- [x] All tests pass
- [ ] Test planner produces flat steps (manual)
- [ ] Test conductor classifies refactors as STANDARD (manual)
- [ ] Test worker executes instead of reporting (manual)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Problem

The adversarial-tester validator caused 21-minute Claude API calls that
ended with `No messages returned` errors when validating large
implementations (197+ routes).

**Incident:** Cluster `rushing-sphinx-39` crashed at 15:41:57 with:
```
Error: No messages returned
    at aAB (/$bunfs/root/claude:5327:813)
```

**Root cause:** Prompt too large for Claude API context window.

## Solution

Removed adversarial-tester agent definition from full-workflow.json
(lines 580-705).

**New validator lineup:**
- validator-requirements (always)
- validator-code (validator_count >= 2)
- validator-security (validator_count >= 3)
- validator-tester (validator_count >= 4)

**With default validator_count=2:**
- 4 agents total: planner, worker, validator-requirements,
validator-code
- Down from 5 agents (removed adversarial-tester)
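
The lineup above can be read as a threshold table. A minimal sketch, assuming a hypothetical `activeAgents` helper (the template's real selection mechanism is not shown in this PR):

```javascript
// Hypothetical helper: thresholds mirror the list above; the function name
// and data shape are assumptions, not the template's actual mechanism.
const VALIDATOR_THRESHOLDS = [
  { id: 'validator-requirements', minCount: 0 }, // always included
  { id: 'validator-code', minCount: 2 },
  { id: 'validator-security', minCount: 3 },
  { id: 'validator-tester', minCount: 4 },
];

function activeAgents(validatorCount) {
  const validators = VALIDATOR_THRESHOLDS
    .filter((v) => validatorCount >= v.minCount)
    .map((v) => v.id);
  return ['planner', 'worker', ...validators];
}
```

With the default `validator_count` of 2, `activeAgents(2)` returns the four agents listed above.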

## Verification

- ✅ Template validation passed
- ✅ JSON structure valid
- ✅ All 6 remaining agents present
- ✅ Pre-commit hooks passed (prettier, typecheck, template validation)
Add single-session execution scope constraint to planner
Branch protection requires PRs, but semantic-release git plugin pushes
directly. Remove it - npm publish and GitHub release still work.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Eivind Meyer <eivind.meyer@ksat.no>
Co-authored-by: Michael Eichelbeck <141341133+mkceichelbeck@users.noreply.github.com>
Co-authored-by: tomdps <tom.dupuis24@gmail.com>
Co-authored-by: tomdps <60640908+tomdps@users.noreply.github.com>
Co-authored-by: Michael Eichelbeck <michael.eichelbeck.ext@wtsde.onmicrosoft.de>
…rt (#116)

Updates the README announcement to highlight:
- OpenCode CLI support
- Multi-platform issue backends (GitHub, GitLab, Jira, Azure DevOps)
Removes 'mix providers in multi-agent workflows' claim - technically
supported but not practically used or tested.
Resolves conflicts between main and dev, keeps fixed provider claim

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Eivind Meyer <eivind.meyer@ksat.no>
Co-authored-by: Michael Eichelbeck <141341133+mkceichelbeck@users.noreply.github.com>
Co-authored-by: tomdps <tom.dupuis24@gmail.com>
Co-authored-by: tomdps <60640908+tomdps@users.noreply.github.com>
Co-authored-by: Michael Eichelbeck <michael.eichelbeck.ext@wtsde.onmicrosoft.de>
## Summary
- add context metrics helper and emit JSON metrics with truncation
tracking
- refactor context builder to compute section breakdowns without
changing prompt output
- add unit + integration tests for context metrics and ledger emission

## Testing
- npm run lint (warnings only, existing)
- npm run typecheck
- npm run test:all (fails in this environment: existing suite failures +
env issues; ran before fixing the new integration test)
- npx mocha --no-config tests/integration/context-metrics.test.js
--timeout 180000
Closes #128.

## Summary
- Implement context source selection semantics (latest/oldest/all) with
amount/limit handling and updated validation.
- Apply the same selection logic to sub-cluster parent topic selection
and templates.
- Fix PR-mode completion for git-pusher, output extraction, and
orchestrator storage dir handling.
- Avoid blessed-contrib picture widget import in TUI layouts to prevent
Node 20 MemoryReadableStream crash.
- Add/adjust tests for new selection behavior and template expectations.
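
The latest/oldest/all semantics in the first bullet could look roughly like this. It is a sketch under stated assumptions: the option names `strategy` and `amount`, the oldest-first ordering, and the error handling are all hypothetical.

```javascript
// Hypothetical sketch; `strategy` and `amount` are assumed option names.
// Messages are assumed to be ordered oldest-first.
function selectMessages(messages, { strategy = 'latest', amount = 1 } = {}) {
  if (strategy === 'all') {
    return messages.slice();
  }
  if (!Number.isInteger(amount) || amount < 1) {
    throw new RangeError('amount must be a positive integer');
  }
  if (strategy === 'oldest') {
    return messages.slice(0, amount);
  }
  if (strategy === 'latest') {
    return messages.slice(-amount);
  }
  throw new Error(`unknown strategy: ${strategy}`);
}
```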

## Testing
- npm test
- npm run lint (warnings only)
## Summary
- replace legacy truncation with context pack budgeting
- add pack metrics + priority/compact validation
- update templates and add tests for pack behavior

## Testing
- npm run lint
- npm run test:all

Fixes #131
## Summary
- add state snapshot builder and publisher with durable STATE_SNAPSHOT
updates
- wire snapshotter into orchestrator start/load/resume and stop/kill
paths
- update base templates/context validation and add tests

## Testing
- npm run lint
- npm run test:all
## Summary
- document context selection, packs, state snapshot, and metrics with
Mermaid diagrams
- align contextStrategy sources to explicit latest semantics and add
STATE_SNAPSHOT for debug investigator
- update contributor example and link new docs

## Testing
- npm run lint (warnings only)
- npm run test:all
- npm run validate:templates
- npm run typecheck (pre-commit hook)
## Summary
- only apply provider override when explicitly set (CLI flag/env)
- add unit test covering provider override resolution
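
A minimal sketch of the override rule, assuming hypothetical parameter names and assuming the CLI flag takes precedence over the environment (the PR does not state the precedence):

```javascript
// Hypothetical sketch; parameter names and flag-over-env precedence are
// assumptions made for illustration.
function resolveProvider({ cliProvider, envProvider, configuredProvider }) {
  // An override counts only when it was explicitly supplied; empty or
  // undefined values fall through to the configured provider.
  const override = cliProvider || envProvider;
  return override || configuredProvider;
}
```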

## Testing
- npx mocha tests/unit/cli-provider-override.test.js

Fixes #140
## Summary
- detect platform-mismatch CANNOT_VALIDATE results and retry validators
in docker isolation
- skip platform-mismatch reasons when validator runs in docker
- add platform mismatch detection tests

## Testing
- npm run lint
- npm run test

Fixes #142
)

## Summary
- Bump codex provider reasoning effort levels for better quality on
complex tasks
- level1: low → medium
- level2: medium → high  
- level3: high → xhigh

The `xhigh` reasoning effort allows the model to think longer for better
answers on complex tasks.

## Test plan
- [x] Existing tests pass (reasoning effort validation allows all four
values)
- [ ] Manual test with `zeroshot run --provider codex` to verify xhigh
is passed correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Summary
- fail fast on Windows in preflight with clear guidance
- document Windows deferral rationale in README

## Testing
- npm run typecheck
- npm run validate:templates
- npm run lint (warnings only)
## Summary
- load persisted clusters before CLI resume fallback to task
- add unit test to guard resume loading

## Testing
- npx mocha tests/unit/cli-resume-loads-clusters.test.js

Fixes #103
## Summary
- detect "No messages returned" from Claude CLI in task watchers,
terminate the child, and mark the task failed
- surface the error in agent error context and grant a one-time retry
for this transient failure
- add unit coverage for fatal error detection

## Testing
- `npx mocha tests/unit/claude-fatal-error-detection.test.js` (failed:
mocha picked up the full suite due to .mocharc; failures include missing
better-sqlite3 and existing test failures in this environment)
Adds PRD and multi-stage implementation plan for the Ink-based Zeroshot
TUI replacement.

Docs:
- docs/tui-v2/PRD.md
- docs/tui-v2/IMPLEMENTATION_PLAN.md
## Summary
- log detailed diagnostics when Claude CLI returns "No messages
returned"
- include latest Claude debug file path + tail, status output tail, and
task metadata

## Testing
- pre-commit hooks (eslint/prettier, typecheck, template validation)
- pre-push lint + typecheck
…160)

## Summary

The `validator-requirements` agent was crashing with
`error_max_structured_output_retries` because its JSON schema was too
complex for Claude CLI's `--json-schema` structured output feature.

**Root cause:** The nested `criteriaResults` array (objects with nested
`evidence` object and enum constraints) was too hard for the model to
produce reliably. After 5 internal retries, the CLI threw the error.

**Fix:** Make `criteriaResults` optional (removed from `required` array)
in both:
- `full-workflow.json`
- `quick-validation.json`

This means:
- `approved` and `summary` are still required (model produces these
correctly)
- `criteriaResults` becomes best-effort (produced when model can,
gracefully omitted when not)
- Downstream consumers already handle missing/partial `criteriaResults`
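
The effect of dropping `criteriaResults` from `required` can be illustrated with a trimmed-down schema. Only the three field names come from the description above; the field types, the `validatorOutputSchema` constant, and the `missingRequired` helper are assumptions for illustration.

```javascript
// Hypothetical, trimmed-down schema: field types are assumed.
const validatorOutputSchema = {
  type: 'object',
  properties: {
    approved: { type: 'boolean' },
    summary: { type: 'string' },
    criteriaResults: { type: 'array', items: { type: 'object' } },
  },
  // criteriaResults is intentionally absent here, so it is best-effort.
  required: ['approved', 'summary'],
};

// Minimal required-field check (not a full JSON Schema validator).
function missingRequired(schema, value) {
  return schema.required.filter((key) => !(key in value));
}
```

An output that omits `criteriaResults` passes this check, while one that omits `summary` does not.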

Fixes #159

## Test plan
- [x] Validated templates pass (`npm run validate:templates`)
- [ ] Re-run a STANDARD task to verify validator-requirements completes
without crashing

---
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Summary
- Provider-level retryable error detection (Anthropic, OpenAI, Google)
- Orchestrator robustness improvements with proper error propagation
- Agent lifecycle improvements with better state management
- TUI renderer enhancements for error visibility
- Settings handling improvements

## Test plan
- [x] Unit tests for provider retryable errors
- [x] Orchestrator tests for error scenarios

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
## Summary
- avoid treating "Task not found"/"Process terminated" substrings inside
JSON logs as fatal
- only treat standalone fatal lines as no-output
- add output-extraction unit tests for fatal-string handling
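
The difference between a standalone fatal line and a substring inside a JSON log line can be sketched as follows. The marker strings come from the summary above; the matching rule (the whole trimmed line equals a marker) and the function names are assumptions, not the repository's implementation.

```javascript
// Hypothetical sketch; the exact-match rule is an assumption.
const FATAL_MARKERS = ['Task not found', 'Process terminated'];

function isFatalLine(line) {
  const trimmed = line.trim();
  return FATAL_MARKERS.some((marker) => trimmed === marker);
}

function hasFatalOutput(output) {
  // Check each line on its own so markers embedded in JSON are ignored.
  return output.split(/\r?\n/).some(isFatalLine);
}
```

Under this rule a log line such as `{"text":"Task not found"}` is not fatal, while a bare `Task not found` line is.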

## Testing
- npm test

Fixes #165
tomdps and others added 11 commits June 16, 2026 20:29
## Summary
- switch package metadata, docs, update checker, and provider helper
metadata to @the-open-engine/zeroshot
- update the release smoke cleanup to uninstall the new scoped package
- keep NPM_TOKEN as fallback when trusted publishing cannot publish the
first package version

## Verification
- npm run check:agent-cli-provider:ci
- npx eslint cli/lib/update-checker.js
- temp-prefix npm pack/install smoke: zeroshot --version, --help, list
- npm publish --dry-run
## Summary
- Record origin/main as merged into current dev so the release PR can
compare cleanly.
- No file content changes relative to dev; this is an ancestry-only sync
commit.

## Verification
- PR #498 CI passed and merged via merge queue.
- dev push CI passed.
- push hook lint and typecheck passed for this branch.

---------

Co-authored-by: Eivind Meyer <eiv.meyer@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Eivind Meyer <eivind.meyer@ksat.no>
Co-authored-by: Michael Eichelbeck <141341133+mkceichelbeck@users.noreply.github.com>
Co-authored-by: Michael Eichelbeck <michael.eichelbeck.ext@wtsde.onmicrosoft.de>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-38-53.eu-north-1.compute.internal>
Co-authored-by: Eivind <eivind@covibes.ai>
Co-authored-by: CI Test <ci-test@covibes.ai>
Co-authored-by: Codex <codex@example.com>
## Summary
- Preserve origin/main as actual ancestry of dev so the protected
dev-to-main release PR can merge cleanly.
- No file content changes relative to current dev; the resolved tree
keeps the @the-open-engine package metadata and the trusted-publishing
release workflow.

## Why
- The dev merge queue squash-merges PRs, which strips merge parentage.
- The release PR requires dev to be cleanly mergeable into main, so this
sync must land as a merge commit rather than a squash.

## Verification
- push hook lint and typecheck passed for this branch.
- origin/dev package metadata is @the-open-engine/zeroshot.
- origin/main currently still has @covibes/zeroshot, which this release
PR will replace.

---------

Co-authored-by: Eivind Meyer <eiv.meyer@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Eivind Meyer <eivind.meyer@ksat.no>
Co-authored-by: Michael Eichelbeck <141341133+mkceichelbeck@users.noreply.github.com>
Co-authored-by: Michael Eichelbeck <michael.eichelbeck.ext@wtsde.onmicrosoft.de>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-38-53.eu-north-1.compute.internal>
Co-authored-by: Eivind <eivind@covibes.ai>
Co-authored-by: CI Test <ci-test@covibes.ai>
Co-authored-by: Codex <codex@example.com>
Preserve origin/main ancestry on current dev so the protected dev-to-main release PR can merge cleanly.
…-merge-release-2

chore(release): merge main into dev
## Summary
- remove the NPM_TOKEN fallback from the release workflow so npm
publishing is OIDC-only
- document the safe bootstrap path: one manual 2FA publish to create the
package, then trusted publishing
- delete references to automation-token fallback setup

## Verification
- npm publish --dry-run
- commit hook: typecheck + template validation
- push hook: lint + typecheck

## Notes
- GitHub Actions secret NPM_TOKEN has been deleted from this repository.
- @the-open-engine/zeroshot is still not present on npm; npm requires
the package to exist before trusted publishing can be attached.
tomdps and others added 2 commits June 16, 2026 23:01
## Summary
- no-content sync merge that records current main as an ancestor of dev
- keeps dev's OIDC-only release workflow/docs tree intact
- unblocks the dev -> main promotion PR #505 without reintroducing token
fallback

## Verification
- git diff origin/dev..HEAD is empty
- git merge-base --is-ancestor origin/main HEAD passes
- push hook lint/typecheck passed with existing warnings

---------

Co-authored-by: Eivind Meyer <eiv.meyer@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Eivind Meyer <eivind.meyer@ksat.no>
Co-authored-by: Michael Eichelbeck <141341133+mkceichelbeck@users.noreply.github.com>
Co-authored-by: Michael Eichelbeck <michael.eichelbeck.ext@wtsde.onmicrosoft.de>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-38-53.eu-north-1.compute.internal>
Co-authored-by: Eivind <eivind@covibes.ai>
Co-authored-by: CI Test <ci-test@covibes.ai>
Co-authored-by: Codex <codex@example.com>
…-engine

chore(release): preserve main ancestry for trusted publishing
tomdps added this pull request to the merge queue on Jun 17, 2026
## Summary
- keep the main-only install matrix behavior
- force the matrix job to evaluate after the check job succeeds even
when the PR-source guard is skipped on merge_group events
- unblocks the main merge queue required install-matrix contexts for
release PR #505

## Verification
- git diff --check -- .github/workflows/ci.yml
- npx prettier --check .github/workflows/ci.yml
- pre-commit: typecheck
- pre-commit: validate:templates
- push hook: lint
- push hook: typecheck
tomdps removed this pull request from the merge queue due to a manual request on Jun 17, 2026
tomdps added this pull request to the merge queue on Jun 17, 2026
github-merge-queue (bot) removed this pull request from the merge queue due to invalid changes in the merge commit on Jun 17, 2026
tomdps (Collaborator, Author) commented on Jun 17, 2026

Temporarily closing/reopening to clear a stale merge-queue entry and trigger fresh pull_request checks for the updated dev head.

tomdps closed this on Jun 17, 2026
tomdps reopened this on Jun 17, 2026
tomdps added this pull request to the merge queue on Jun 17, 2026
Merged via the queue into main with commit 3405074 Jun 17, 2026
17 checks passed
@github-actions

🎉 This PR is included in version 6.0.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
