Releases: opena2a-org/hackmyagent
v0.22.0
Highlights
Security: Integrity verifier ACTIVE for the first time
The README claim "Self-securing -- verifies its own binary on startup. Tampered binaries enter QUARANTINE mode (exit code 3)" had been false in every prior published version. The integrity verifier was silently a no-op because of a manifest-path mismatch: loadManifest looked at <root>/.integrity-manifest.json while the build wrote to <root>/dist/.integrity-manifest.json. 0.22.0 activates the gate end-to-end. Tampered installs now exit 3 and print INTEGRITY CHECK FAILED: on stderr for --version, --help, no-args, secure, and check.
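To make the path mismatch concrete, here is a minimal sketch of a startup integrity gate. It is not the project's verifier: the manifest shape (relative path mapped to a SHA-256 hex digest), the function names, and the choice to treat a missing manifest as a failure are all assumptions made for illustration. Only the manifest location under dist/ and exit code 3 come from the notes above.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";
import { join } from "node:path";

// Assumed manifest shape: relative file path -> expected SHA-256 hex digest.
type IntegrityManifest = Record<string, string>;

// Reads a file, or returns null when it cannot be read. Injectable for testing.
type FileReader = (path: string) => Buffer | null;

const readFromDisk: FileReader = (path) => {
  try {
    return readFileSync(path);
  } catch {
    return null;
  }
};

// The build writes the manifest under dist/, so the loader must look there too.
function manifestPath(root: string): string {
  return join(root, "dist", ".integrity-manifest.json");
}

function sha256Hex(data: Buffer): string {
  return createHash("sha256").update(data).digest("hex");
}

// Returns 0 when every listed file matches its digest, 3 otherwise.
// Failing on a missing or unparsable manifest is an assumption of this sketch.
function integrityExitCode(root: string, read: FileReader = readFromDisk): number {
  const raw = read(manifestPath(root));
  if (raw === null) return 3;
  let manifest: IntegrityManifest;
  try {
    manifest = JSON.parse(raw.toString("utf8")) as IntegrityManifest;
  } catch {
    return 3;
  }
  for (const [relPath, expected] of Object.entries(manifest)) {
    const data = read(join(root, relPath));
    if (data === null || sha256Hex(data) !== expected) {
      console.error(`INTEGRITY CHECK FAILED: ${relPath}`);
      return 3;
    }
  }
  return 0;
}
```

The reader is injected so the gate can be exercised without touching disk. A loader that silently returns "nothing to verify" when the manifest is absent is exactly the failure mode that let the old mismatch go unnoticed.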
Verify your install:
npm view hackmyagent@0.22.0 dist.attestations --json
# predicateType: "https://slsa.dev/provenance/v1"
UX: Concept-explainer registry under findings
Findings whose Fix line recommends primitives like harden-soul, opena2a protect, opena2a mcp audit, hash-pinning, or A2A signing now show a curated educational block on first occurrence per scan, collapsing to a one-line back-reference on subsequent same-concept findings. Closes the "what does that even mean" gap on first-time scans.
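The first-occurrence-then-back-reference behaviour can be sketched as follows. The registry contents, the rendered strings, and the substring match on the Fix line are hypothetical; the notes above only state that the full block appears once per concept per scan.

```typescript
interface ExplainableFinding {
  id: string;
  fix: string; // the one-line Fix: text shown under the finding
}

// Hypothetical registry: concept keyword -> curated explainer text.
type ConceptRegistry = Record<string, string>;

// Renders findings, showing each concept's explainer in full only the first
// time it is referenced and a one-line back-reference afterwards.
function renderWithExplainers(findings: ExplainableFinding[], registry: ConceptRegistry): string[] {
  const seen = new Set<string>();
  const lines: string[] = [];
  for (const finding of findings) {
    lines.push(`${finding.id}  Fix: ${finding.fix}`);
    for (const [concept, text] of Object.entries(registry)) {
      if (!finding.fix.includes(concept)) continue;
      if (seen.has(concept)) {
        lines.push(`  (see "${concept}" explained above)`);
      } else {
        seen.add(concept);
        lines.push(`  About ${concept}: ${text}`);
      }
    }
  }
  return lines;
}
```

Because the seen-set is created inside the function, the collapse resets on every scan, which matches the "per scan" wording.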
UX: check skill: / check mcp: / check <local-path> labelled "Quick scan"
The check local-path orchestrator runs the NanoMind semantic matrix only, not the full 209-static-check suite. Score line now renders "Quick scan" with a follow-up pointer to secure for the full audit. Misleading "Path forward: N -> M" recovery-math line suppressed on quick-scan output.
Detection: OWASP LLM01 imperative-override directives now fire
Malicious skills with "IGNORE PRIOR INSTRUCTIONS" in frontmatter or body now emit AST-PROMPT-001 Jailbreak Attack Surface + CRITICAL Prompt Injection Surface from AST-INJECT-001, both with verbatim evidence. Score on the malicious exfil-skill fixture: 39 -> 31.
Severity: Hygiene HIGH severity rework
Skill governance hygiene findings (SKILL-020, SUPPLY-001, SUPPLY-004, AST-PROMPT-003, AST-PROMPT-004) default to MEDIUM and upgrade to HIGH only on malice-signal co-occurrence. Benign clean-skill scan: 0 HIGH findings (was 5).
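A minimal sketch of that severity rule, assuming the caller has already decided whether a malice signal co-occurs in the same scan (how that signal is computed is not described here, so it is passed in as a plain boolean). The function and type names are hypothetical.

```typescript
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

// Check IDs named in the release notes as skill governance hygiene findings.
const HYGIENE_CHECK_IDS = new Set([
  "SKILL-020",
  "SUPPLY-001",
  "SUPPLY-004",
  "AST-PROMPT-003",
  "AST-PROMPT-004",
]);

interface RawFinding {
  checkId: string;
  severity: Severity;
}

// Hygiene findings default to MEDIUM and are upgraded to HIGH only when a
// malice signal co-occurs; all other findings keep their own severity.
function resolveSeverity(finding: RawFinding, hasMaliceSignal: boolean): Severity {
  if (!HYGIENE_CHECK_IDS.has(finding.checkId)) return finding.severity;
  return hasMaliceSignal ? "HIGH" : "MEDIUM";
}
```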
JSON schema: every static-check finding carries attackClass
116 emission paths previously slipped through enrichWithTaxonomy() and shipped with attackClass: null. Threat-matrix counters, OASB attack-class indexing, and NanoMind training labels now fully populated on the wire.
Render: NanoMind threat-analysis cap filter
Confidence-capped CRITICAL findings (capped to HIGH) no longer render in the analyst block. Static AST findings still render via their own path.
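As an illustration, the cap and the render filter could be combined as below. The 0.80 threshold is the one given in the v0.17.11 notes further down this page; the field names (including cappedFrom) and function names are assumptions.

```typescript
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

interface GenerativeFinding {
  title: string;
  severity: Severity;
  confidence: number; // 0..1
  cappedFrom?: Severity; // set only when the cap changed the severity
}

const CRITICAL_CONFIDENCE_FLOOR = 0.8;

// CRITICAL findings below the confidence floor are downgraded to HIGH and
// remember their original severity.
function capSeverity(finding: GenerativeFinding): GenerativeFinding {
  if (finding.severity === "CRITICAL" && finding.confidence < CRITICAL_CONFIDENCE_FLOOR) {
    return { ...finding, severity: "HIGH", cappedFrom: "CRITICAL" };
  }
  return finding;
}

// The analyst block skips findings whose severity was capped.
function analystBlock(findings: GenerativeFinding[]): GenerativeFinding[] {
  return findings.map(capSeverity).filter((f) => f.cappedFrom === undefined);
}
```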
Tier-aware finding sort
secure findings list now sorts by attack-class tier first (active malice -> capability sprawl -> missing defense-in-depth -> hygiene -> chrome), then severity. Top-3 findings are now distinct across benign / buggy / malicious surfaces.
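The two-level ordering can be expressed as a single comparator. This is a sketch: the tier identifiers are slugs invented here for the five tiers listed above, and the severity ranking (CRITICAL first) is the conventional one rather than something these notes spell out.

```typescript
// Tiers in the order given above, most urgent first.
const TIER_ORDER = [
  "active-malice",
  "capability-sprawl",
  "missing-defense-in-depth",
  "hygiene",
  "chrome",
] as const;
type Tier = (typeof TIER_ORDER)[number];

const SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

interface TieredFinding {
  id: string;
  tier: Tier;
  severity: Severity;
}

// Sort by tier first; fall back to severity only when the tiers are equal.
function sortFindings(findings: TieredFinding[]): TieredFinding[] {
  return [...findings].sort(
    (a, b) =>
      TIER_ORDER.indexOf(a.tier) - TIER_ORDER.indexOf(b.tier) ||
      SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity),
  );
}
```

With this ordering a MEDIUM active-malice finding outranks a CRITICAL hygiene finding, which is what keeps the top of the list distinct across benign and malicious targets.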
Trusted Publishing
Published via npm Trusted Publishing (OIDC). No NPM_TOKEN. SLSA v1 provenance attached. Verify:
npm view hackmyagent@0.22.0 dist.attestations --json
Cumulative PR list
#155 #156 #157 #158 #159 in this release dance, plus #143 #144 #145 #146 #149 #150 #153 #154 from earlier in the bundle.
Known issues (filed for follow-up)
- #160: integrity-verifier hardening (baked-in signing key, symlink rejection, accumulate-all-tampered-files)
- #161: check <bare-name> routes to skill-id parser before npm -- misleading error + breaks --json NotFoundOutput contract on bare names
Install
# npm
npm install -g hackmyagent
# Homebrew
brew install opena2a-org/tap/hackmyagent
Full changelog: https://github.com/opena2a-org/hackmyagent/blob/main/CHANGELOG.md
v0.21.1
hackmyagent 0.21.1
Closes the data-layer half of opena2a-parity F3 + F4 (PR #3 + PR #4) by routing all check --json not-found paths through buildNotFoundOutput from @opena2a/check-core.
Fixed
- F3: bare-name reclass on npm 404. Bare names like hackmyagent check totally-nonexistent-pkg --json previously emitted Invalid skill identifier on stderr with no JSON, breaking the --json contract. They now emit the canonical NotFoundOutput shape { name, found: false, error, ecosystem: "npm" } and exit 1. Scoped names (@scope/name) still fall through to the skill-identifier fallback on npm 404; that path is unchanged.
- F4: GitHub 404 errorHint. The checkGitHubRepo --json branch now populates errorHint (Verify the URL: https://github.com/<displayName>) via buildNotFoundOutput. The human-rendered path was already populating it; the JSON branch had drifted.
- PyPI 404 and the npm translateDownloadError (did-you-mean) path also route through buildNotFoundOutput for shape consistency.
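The shape described in these fixes can be sketched as a small typed builder. This is not the signature of buildNotFoundOutput in @opena2a/check-core, which is not shown in these notes: the helper names, the error strings, and the "pypi" and "github" ecosystem labels are assumptions. The field names, ecosystem: "npm", the errorHint text, and exit code 1 come from the entries above.

```typescript
// "npm" appears in the release notes; the other labels are assumptions.
type Ecosystem = "npm" | "pypi" | "github";

interface NotFoundOutput {
  name: string;
  found: false;
  error: string;
  ecosystem: Ecosystem;
  errorHint?: string;
}

// Hypothetical stand-in for a shared builder, so every not-found path
// produces the same JSON shape.
function sketchNotFoundOutput(
  pkgName: string,
  ecosystem: Ecosystem,
  error: string,
  errorHint?: string,
): NotFoundOutput {
  const out: NotFoundOutput = { name: pkgName, found: false, error, ecosystem };
  if (errorHint !== undefined) out.errorHint = errorHint;
  return out;
}

// GitHub 404s carry a hint pointing at the URL to double-check.
function githubNotFound(displayName: string): NotFoundOutput {
  return sketchNotFoundOutput(
    displayName,
    "github",
    "Repository not found", // illustrative message
    `Verify the URL: https://github.com/${displayName}`,
  );
}

// Prints the payload on stdout and returns the exit code the CLI should use.
function emitNotFound(out: NotFoundOutput): number {
  console.log(JSON.stringify(out));
  return 1;
}
```

Routing every ecosystem through one builder is what keeps the JSON branch from drifting away from the human-rendered path again.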
Tests
- 1791 passed, 16 skipped, 10 todo (1817 total). +6 vs 0.21.0 (the new not-found regression tests in __tests__/checker/check-not-found-json.test.ts).
- opena2a-parity local run: PR #3 4/4 fixtures byte-identical green; PR #4 5/5 fixtures byte-identical green.
Trusted Publishing
Published via GitHub Actions OIDC. Verify provenance:
```
npm view hackmyagent@0.21.1 dist.attestations --json
```
Expected: `predicateType: https://slsa.dev/provenance/v1`.
Full Changelog: v0.21.0...v0.21.1
v0.21.0
What's Changed
- docs(testing): add release-smoke walkthrough by @thebenignhacker in #125
- feat(check): rich-block render for skill: / mcp: + cli-ui 0.5.0 consume (0.21.0) by @thebenignhacker in #126
Full Changelog: v0.20.0...v0.21.0
v0.20.0
hackmyagent 0.20.0
Emit PackageNarrative to the OpenA2A registry on secure --publish for skill / mcp artifacts. First release of the rich-context data pipeline behind the new check skill+mcp v1 view.
Added
- secure --publish POSTs a narrative payload to POST /api/v1/trust/narrative for skill or MCP scan targets. Wire shape mirrors @opena2a/check-core@0.2.0's PackageNarrative types and the registry's package_narratives row (migration 223). Failure is non-fatal: the parent publish always succeeds first; narrative emission status is reported under publish.narrative in JSON output, only logged in --verbose text mode.
- src/narrative/ module. Skill+MCP builders, NanoMind v3 graceful-degrade gate, registry HTTP client, single-call wire helper. ~900 LOC, 35 new unit tests.
- Static threat-model questions ship inline with each narrative per [CHIEF-CSR] decision.
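The non-fatal ordering described in the first item can be sketched like this. The two callbacks stand in for the real publish and narrative calls, and the status shape reported under narrative is an assumption; only the ordering and the "failure does not fail the publish" rule come from the notes.

```typescript
interface NarrativeStatus {
  emitted: boolean;
  error?: string;
}

interface PublishResult {
  published: true;
  narrative: NarrativeStatus;
}

// Runs the parent publish first; a failure there propagates to the caller.
// The narrative POST is attempted afterwards, and any error it throws is
// recorded in the result instead of being rethrown.
async function publishWithNarrative(
  publishScan: () => Promise<void>,
  emitNarrative: () => Promise<void>,
): Promise<PublishResult> {
  await publishScan();
  let narrative: NarrativeStatus;
  try {
    await emitNarrative();
    narrative = { emitted: true };
  } catch (err) {
    narrative = { emitted: false, error: err instanceof Error ? err.message : String(err) };
  }
  return { published: true, narrative };
}
```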
Changed
- @opena2a/check-core exact-pinned at 0.2.0 (was 0.1.0).
Engineering notes
- Detection rules (v1): SKILL.md at scan root → skill, projectType==='mcp' → mcp, else skip. Auditable surface; richer multi-artifact-per-scan detection lands when [CHIEF-CA] decides on the convention.
- NanoMind v3 summary path is gated by an input-classifier v3.1 stub. v3 is OOD on comprehension tasks; v1 returns empty strings on every code path so the renderer gracefully degrades to "Comprehension data not yet available."
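The v1 detection rules are small enough to write down directly. In this sketch the function name and the way the scan root is represented (a list of entry names) are assumptions; the precedence follows the order in which the rules are listed.

```typescript
type ArtifactKind = "skill" | "mcp" | null;

// v1 rules: SKILL.md at the scan root means skill; otherwise a project typed
// as "mcp" means mcp; anything else is skipped (null).
function detectArtifactKind(rootEntries: string[], projectType: string | undefined): ArtifactKind {
  if (rootEntries.includes("SKILL.md")) return "skill";
  if (projectType === "mcp") return "mcp";
  return null;
}
```

A null result means no narrative is emitted for that scan target.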
Provenance
Published via npm Trusted Publishing (OIDC). SLSA v1 attestations: npm view hackmyagent@0.20.0 dist.attestations --json
Brief: opena2a-org/briefs/check-rich-context-skills-mcp-v1.md (§4-§7, §8 task 2c-2e)
v0.19.0
What's Changed
- feat(check): consume @opena2a/check-core@0.1.0 (CA-034 M3) by @thebenignhacker in #122
Full Changelog: v0.18.3...v0.19.0
v0.18.3
What's Changed
- feat(check): consume cli-ui 0.3.0 + emit registry fields in --json (F1 close) by @thebenignhacker in #121
Full Changelog: v0.18.2...v0.18.3
v0.18.2
What's Changed
- fix(test): skip E2E-003 network detection on CI by @thebenignhacker in #119
- release: hackmyagent 0.18.2 — skip E2E-003 on CI by @thebenignhacker in #120
Full Changelog: v0.18.1...v0.18.2
v0.17.11
Highlights
- hackmyagent detect Shadow AI audit command (CISO/security-engineer focused: inventory of running AI tools, MCP servers, AI configs, governance gaps)
- --nanomind opt-in flag for AI-powered threat narratives (default off; static analyzers always run)
- Unified output formatter across secure / scan-soul / harden-soul / explain: a single visual language across repo-style commands
- CISO-grade UX: every finding has a one-line Verify: and one-line Fix:. No env-var-shame on credential findings. Capability-abuse uses runnable harden-soul instead of wall-of-names
Quality fixes
- Bug-bounty target descriptors no longer cause analyzer pileups. salesforce-mcp.json-style files used to trigger 6 overlapping findings (governance + capability + scope all misfiring on prose). MCP classifier tightened to a known-basename allowlist + content fallback requiring an actual "mcpServers": key.
- TOCTOU-001 stops flagging legitimate existsSync → readFileSync config-load patterns. Now requires a write or exec between check and read. Adds import(varPath) to exec sinks so dynamic-import abuse is still caught. Eliminates 11 FPs on secretless.
- NanoMind generative findings cap at HIGH when confidence < 0.80. Previously low-confidence findings rendered as CRITICAL with hardcoded 60% confidence stamp. Now CRITICAL only when model is genuinely confident.
- NanoMind max_tokens 512 → 2048: descriptions no longer truncate mid-word.
- Self-scan: 100/100, no findings.
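The tightened MCP classifier from the first item can be sketched as an allowlist check followed by a structural content check. The basenames in the allowlist are illustrative placeholders (the real list is not given here), and parsing the file as JSON to look for a top-level mcpServers key is one possible reading of "an actual key".

```typescript
// Illustrative allowlist; the basenames the real classifier accepts are not
// listed in these notes.
const KNOWN_MCP_BASENAMES = new Set(["mcp.json", ".mcp.json"]);

function fileBasename(path: string): string {
  return path.split(/[\\/]/).pop() ?? path;
}

// A file counts as an MCP config if its basename is allowlisted, or if it
// parses as a JSON object with a top-level "mcpServers" key. A filename or
// prose that merely mentions MCP is not enough.
function isMcpConfig(path: string, content: string): boolean {
  if (KNOWN_MCP_BASENAMES.has(fileBasename(path))) return true;
  try {
    const parsed: unknown = JSON.parse(content);
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      !Array.isArray(parsed) &&
      Object.prototype.hasOwnProperty.call(parsed, "mcpServers")
    );
  } catch {
    return false;
  }
}
```

Under this rule a descriptor named salesforce-mcp.json that only talks about MCP servers in a description field is no longer classified as an MCP config.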
Dynamic counts
CLI text now derives check + category counts from the same taxonomy map check-metadata reads. Eliminates the long-running drift where help text said "60 categories" while actual was 44.
Tests
1723 tests passing (3 new regression tests covering the analyzer-pileup fix, the bug-bounty descriptor classification, and the dedup invariant).
See CHANGELOG.md for the full 0.17.10 + 0.17.11 entry.
`npm install -g hackmyagent@0.17.11`
`brew install opena2a/tap/hackmyagent`
v0.17.10
Oracle P0 Fixes
P0-1: Benign FPR 90.9% → 0% (oracle hard-negative gate)
- Semantic compiler: consensus threshold 0.65 → 0.45 (src/nanomind-core/compiler/semantic-compiler.ts)
- Raises hasHighBenignContext / isExplicitlyRestrictedBenign intent confidence thresholds so well-governed skills with high-risk capabilities (e.g. shell helpers with SOUL.md) are no longer false-positived
P0-2: Scanner bridge skips label.json
- src/nanomind-core/scanner-bridge.ts now ignores label.json files during artifact compilation
- Prevents oracle fixture metadata from being misclassified as agent artifacts
Oracle-Verified Metrics (TME v5, 2026-04-15, 50-fixture oracle)
- Recall: 100%
- Precision: 79.6%
- F1: 88.7%
- Benign FPR: 9.1% (b08 borderline — execute_shell + no SOUL.md legitimately fires GOV-004)
Tests
1599 passing | 0 failures
Also includes (since v0.17.9)
- MCP-004 runtime tool-override pattern (ARP)
- ML-DSA-44 benchmark + noble drift guard (AIComply)
- Dependency security patches (vite, hono)
v0.17.9
Bug fix
Fix kill enforcement on SIGSTOPped processes. EnforcementEngine.kill() sent SIGTERM directly even when the target pid was paused via SIGSTOP. A stopped process cannot handle signals, so SIGTERM stayed queued and the graceful-kill path silently never fired — only the 5s SIGKILL fallback actually terminated it. Now sends SIGCONT first when the target is in the paused set, so graceful termination works as intended.
Adds a focused EnforcementEngine regression test (pause → kill → process dead within 1s).
Unblocks pre-existing OASB CI failure AT-ENF-004: should remove a killed PID from paused list if it was paused.
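The signal ordering behind this fix can be sketched as below. The class is a hypothetical stand-in, not the real EnforcementEngine, and it leaves out the 5s SIGKILL fallback mentioned above. The sender is injectable so the ordering can be checked without real processes; by default it calls Node's process.kill.

```typescript
type Signal = "SIGSTOP" | "SIGCONT" | "SIGTERM";
type SignalSender = (pid: number, signal: Signal) => void;

const sendWithProcessKill: SignalSender = (pid, signal) => {
  process.kill(pid, signal);
};

class PauseAwareKiller {
  private readonly paused = new Set<number>();
  private readonly send: SignalSender;

  constructor(send: SignalSender = sendWithProcessKill) {
    this.send = send;
  }

  pause(pid: number): void {
    this.send(pid, "SIGSTOP");
    this.paused.add(pid);
  }

  // A stopped process leaves SIGTERM pending, so resume it first and drop it
  // from the paused set before asking it to terminate.
  kill(pid: number): void {
    if (this.paused.has(pid)) {
      this.send(pid, "SIGCONT");
      this.paused.delete(pid);
    }
    this.send(pid, "SIGTERM");
  }

  isPaused(pid: number): boolean {
    return this.paused.has(pid);
  }
}
```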