Releases: opena2a-org/hackmyagent

v0.22.0

29 Apr 17:44

Highlights

Security: Integrity verifier ACTIVE for the first time

The README claim "Self-securing -- verifies its own binary on startup. Tampered binaries enter QUARANTINE mode (exit code 3)" had been false in every prior published version. The integrity verifier was silently a no-op because of a manifest-path mismatch: loadManifest looked for <root>/.integrity-manifest.json while the build wrote <root>/dist/.integrity-manifest.json. 0.22.0 activates the gate end-to-end. Tampered installs now exit with code 3 and print INTEGRITY CHECK FAILED: to stderr on --version, --help, no-args, secure, and check.
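To make the gate concrete, here is a minimal sketch of a startup integrity check. Only the exit code 3, the `INTEGRITY CHECK FAILED:` prefix, and the `dist/.integrity-manifest.json` location come from these notes. The manifest shape (path to SHA-256 digest), the function names, and the fail-closed handling of a missing manifest are assumptions for illustration, not the project's actual implementation.

```typescript
// Hypothetical manifest shape: relative file path -> expected SHA-256 hex digest.
type IntegrityManifest = Record<string, string>;

// Exit code the release notes give for QUARANTINE mode.
const QUARANTINE_EXIT_CODE = 3;

interface IntegrityResult {
  exitCode: number;
  stderr: string;
}

// Location the build writes to, per the notes. Before 0.22.0 the loader
// looked directly under <root> and therefore never found the manifest.
function manifestPath(root: string): string {
  return root.replace(/\/+$/, "") + "/dist/.integrity-manifest.json";
}

// Compare expected digests with digests computed from the installed files.
// The digests are passed in so the sketch needs no filesystem access.
function verifyIntegrity(
  manifest: IntegrityManifest | null,
  actualDigests: Record<string, string>
): IntegrityResult {
  if (manifest === null) {
    // Assumption: fail closed so a missing manifest cannot disable the check.
    return { exitCode: QUARANTINE_EXIT_CODE, stderr: "INTEGRITY CHECK FAILED: manifest not found" };
  }
  for (const file of Object.keys(manifest)) {
    if (actualDigests[file] !== manifest[file]) {
      // Stops at the first mismatch; reporting every tampered file is out of scope here.
      return { exitCode: QUARANTINE_EXIT_CODE, stderr: "INTEGRITY CHECK FAILED: " + file };
    }
  }
  return { exitCode: 0, stderr: "" };
}
```

The failure mode described above corresponds to the loader receiving no manifest and treating that as success; the sketch treats it as a failure instead.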

Verify your install:

npm view hackmyagent@0.22.0 dist.attestations --json
# predicateType: "https://slsa.dev/provenance/v1"

UX: Concept-explainer registry under findings

Findings whose Fix line recommends primitives like harden-soul, opena2a protect, opena2a mcp audit, hash-pinning, or A2A signing now show a curated educational block on first occurrence per scan, collapsing to a one-line back-reference on subsequent same-concept findings. Closes the "what does that even mean" gap on first-time scans.
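The first-occurrence and back-reference behaviour can be sketched roughly as follows. The registry contents, the `ConceptFinding` shape, and the output wording are hypothetical placeholders; only the "full block once per scan, one-line reference afterwards" rule is taken from the description above.

```typescript
// Hypothetical registry: concept id -> curated explainer text (placeholders).
const CONCEPT_EXPLAINERS: Record<string, string> = {
  "hash-pinning": "(curated explainer for hash-pinning)",
  "a2a-signing": "(curated explainer for A2A signing)",
};

interface ConceptFinding {
  id: string;
  concept: string; // concept recommended by the finding's Fix line
}

// Emit the full explainer the first time a concept appears in a scan and a
// one-line back-reference for later findings that share the same concept.
function renderExplainers(findings: ConceptFinding[]): string[] {
  const firstSeenIn = new Map<string, string>(); // concept -> id of first finding
  const lines: string[] = [];
  for (const finding of findings) {
    const explainer = CONCEPT_EXPLAINERS[finding.concept];
    if (explainer === undefined) continue; // no curated block for this concept
    const first = firstSeenIn.get(finding.concept);
    if (first === undefined) {
      firstSeenIn.set(finding.concept, finding.id);
      lines.push(finding.id + ": " + explainer);
    } else {
      lines.push(finding.id + ": see " + finding.concept + " explainer under " + first);
    }
  }
  return lines;
}
```

Keeping the seen-set scoped to one call mirrors the "per scan" wording: a new scan starts with every explainer eligible to render in full again.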

UX: check skill: / check mcp: / check <local-path> labelled "Quick scan"

The check local-path orchestrator runs the NanoMind semantic matrix only, not the full 209-static-check suite. Score line now renders "Quick scan" with a follow-up pointer to secure for the full audit. Misleading "Path forward: N -> M" recovery-math line suppressed on quick-scan output.

Detection: OWASP LLM01 imperative-override directives now fire

Malicious skills with "IGNORE PRIOR INSTRUCTIONS" in frontmatter or body now emit AST-PROMPT-001 Jailbreak Attack Surface + CRITICAL Prompt Injection Surface from AST-INJECT-001, both with verbatim evidence. Score on the malicious exfil-skill fixture: 39 -> 31.

Severity: Hygiene HIGH severity rework

Skill governance hygiene findings (SKILL-020, SUPPLY-001, SUPPLY-004, AST-PROMPT-003, AST-PROMPT-004) default to MEDIUM and upgrade to HIGH only on malice-signal co-occurrence. Benign clean-skill scan: 0 HIGH findings (was 5).
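As a rough illustration of the rule, the sketch below defaults the listed checks to MEDIUM and upgrades them only when the scan also carries a malice signal. The check IDs come from this release; the function name and the boolean `scanHasMaliceSignal` input are assumptions, since the notes do not say how co-occurrence is computed.

```typescript
type Severity = "LOW" | "MEDIUM" | "HIGH" | "CRITICAL";

// Check IDs the release notes list as skill governance hygiene findings.
const HYGIENE_CHECKS = new Set<string>([
  "SKILL-020",
  "SUPPLY-001",
  "SUPPLY-004",
  "AST-PROMPT-003",
  "AST-PROMPT-004",
]);

// Hypothetical rule: a hygiene finding is MEDIUM unless the same scan also
// produced a malice signal, in which case it is upgraded to HIGH.
// Findings from other checks keep whatever severity they already had.
function hygieneSeverity(
  checkId: string,
  baseSeverity: Severity,
  scanHasMaliceSignal: boolean
): Severity {
  if (!HYGIENE_CHECKS.has(checkId)) return baseSeverity;
  return scanHasMaliceSignal ? "HIGH" : "MEDIUM";
}
```

Under this rule a benign skill that trips all five checks yields five MEDIUM findings and no HIGH findings, which matches the clean-skill result reported above.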

JSON schema: every static-check finding carries attackClass

116 emission paths previously slipped through enrichWithTaxonomy() and shipped with attackClass: null. Threat-matrix counters, OASB attack-class indexing, and NanoMind training labels now fully populated on the wire.

Render: NanoMind threat-analysis cap filter

Confidence-capped CRITICAL findings (capped to HIGH) no longer render in the analyst block. Static AST findings still render via their own path.
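A minimal sketch of the cap and the render filter together, assuming the 0.80 confidence threshold stated in the v0.17.11 notes. The type names, the `capped` flag, and both function names are hypothetical; the real renderer may track capping differently.

```typescript
type NanoSeverity = "MEDIUM" | "HIGH" | "CRITICAL";

interface RawNanoFinding {
  title: string;
  severity: NanoSeverity;
  confidence: number; // 0..1
}

interface NanoMindFinding extends RawNanoFinding {
  capped: boolean; // true when a CRITICAL was lowered to HIGH
}

// Threshold taken from the v0.17.11 notes ("cap at HIGH when confidence < 0.80").
const CONFIDENCE_FLOOR = 0.8;

// Lower low-confidence CRITICAL findings to HIGH and remember that we did so.
function applyConfidenceCap(raw: RawNanoFinding): NanoMindFinding {
  const shouldCap = raw.severity === "CRITICAL" && raw.confidence < CONFIDENCE_FLOOR;
  return {
    title: raw.title,
    severity: shouldCap ? "HIGH" : raw.severity,
    confidence: raw.confidence,
    capped: shouldCap,
  };
}

// Capped findings are left out of the analyst block. Static AST findings
// are rendered elsewhere and never pass through this filter.
function analystBlock(findings: NanoMindFinding[]): NanoMindFinding[] {
  return findings.filter((f) => !f.capped);
}
```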

Tier-aware finding sort

secure findings list now sorts by attack-class tier first (active malice -> capability sprawl -> missing defense-in-depth -> hygiene -> chrome), then severity. Top-3 findings are now distinct across benign / buggy / malicious surfaces.
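The two-key ordering can be sketched as a comparator that ranks by tier first and severity second. The tier order is the one listed above; the string identifiers, the `TieredFinding` shape, and the function name are illustrative assumptions.

```typescript
// Tier order from the release notes, most urgent first (identifiers are illustrative).
const TIER_ORDER = [
  "active-malice",
  "capability-sprawl",
  "missing-defense-in-depth",
  "hygiene",
  "chrome",
] as const;
type Tier = typeof TIER_ORDER[number];

const SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"] as const;
type SortSeverity = typeof SEVERITY_ORDER[number];

interface TieredFinding {
  id: string;
  tier: Tier;
  severity: SortSeverity;
}

// Sort by attack-class tier first, then by severity within a tier.
// Returns a new array and leaves the input untouched.
function sortFindings(findings: TieredFinding[]): TieredFinding[] {
  return findings.slice().sort((a, b) => {
    const byTier = TIER_ORDER.indexOf(a.tier) - TIER_ORDER.indexOf(b.tier);
    if (byTier !== 0) return byTier;
    return SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity);
  });
}
```

One consequence of putting tier first is that a MEDIUM active-malice finding outranks a CRITICAL hygiene finding, which is how the top of the list comes to differ between benign and malicious targets.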

Trusted Publishing

Published via npm Trusted Publishing (OIDC). No NPM_TOKEN. SLSA v1 provenance attached. Verify:

npm view hackmyagent@0.22.0 dist.attestations --json

Cumulative PR list

#155 #156 #157 #158 #159 in this release dance, plus #143 #144 #145 #146 #149 #150 #153 #154 from earlier in the bundle.

Known issues (filed for follow-up)

  • #160: integrity-verifier hardening (baked-in signing key, symlink rejection, accumulate-all-tampered-files)
  • #161: check <bare-name> routes to skill-id parser before npm -- misleading error + breaks --json NotFoundOutput contract on bare names

Install

# npm
npm install -g hackmyagent

# Homebrew
brew install opena2a-org/tap/hackmyagent

Full changelog: https://github.com/opena2a-org/hackmyagent/blob/main/CHANGELOG.md

v0.21.1

28 Apr 10:46
fca9d75

hackmyagent 0.21.1

Closes the data-layer half of opena2a-parity F3 + F4 (PR #3 + PR #4) by routing all check --json not-found paths through buildNotFoundOutput from @opena2a/check-core.

Fixed

  • F3 — bare-name reclass on npm 404. Bare names like hackmyagent check totally-nonexistent-pkg --json previously emitted Invalid skill identifier on stderr with no JSON, breaking the --json contract. They now emit the canonical NotFoundOutput shape { name, found: false, error, ecosystem: "npm" } and exit 1. Scoped names (@scope/name) still fall through to the skill-identifier fallback on npm 404 — that path is unchanged.
  • F4 — GitHub 404 errorHint. checkGitHubRepo --json branch now populates errorHint (Verify the URL: https://github.com/<displayName>) via buildNotFoundOutput. The human-rendered path was already populating it; the JSON branch had drifted.
  • PyPI 404 and the npm translateDownloadError (did-you-mean) path also route through buildNotFoundOutput for shape consistency.
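For readers consuming `--json` output, the not-found shape described above might be modelled as below. The field names (`name`, `found`, `error`, `ecosystem`, `errorHint`) and exit code 1 come from these notes; the field types, the error text, and both function names are assumptions. `sketchNotFoundOutput` is a stand-in, not the real `buildNotFoundOutput` from `@opena2a/check-core`, whose signature is not shown here.

```typescript
// Fields named in the release notes; the exact types are assumptions.
interface NotFoundOutput {
  name: string;
  found: false;
  error: string;
  ecosystem: string;
  errorHint?: string;
}

// Hypothetical stand-in for buildNotFoundOutput (real signature unknown).
function sketchNotFoundOutput(
  pkgName: string,
  ecosystem: string,
  error: string,
  errorHint?: string
): NotFoundOutput {
  const output: NotFoundOutput = { name: pkgName, found: false, error, ecosystem };
  if (errorHint !== undefined) output.errorHint = errorHint;
  return output;
}

// With --json, stdout carries a single JSON document and the process exits 1.
function renderNotFoundJson(output: NotFoundOutput): { stdout: string; exitCode: number } {
  return { stdout: JSON.stringify(output), exitCode: 1 };
}
```

Routing every not-found path through one builder is what keeps the shape identical across ecosystems, so a consumer only needs to branch on `found === false`.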

Tests

  • 1791 passed, 16 skipped, 10 todo (1817 total). +6 vs 0.21.0 (the new not-found regression tests in __tests__/checker/check-not-found-json.test.ts).
  • opena2a-parity local run: PR #3 4/4 fixtures byte-identical green; PR #4 5/5 fixtures byte-identical green.

Trusted Publishing

Published via GitHub Actions OIDC. Verify provenance:

```
npm view hackmyagent@0.21.1 dist.attestations --json
```

Expected: `predicateType: https://slsa.dev/provenance/v1`.

Full Changelog: v0.21.0...v0.21.1

v0.21.0

28 Apr 02:25
93b1f3b

What's Changed

Full Changelog: v0.20.0...v0.21.0

v0.20.0

27 Apr 14:48
d47b4d6

hackmyagent 0.20.0

Emit PackageNarrative to the OpenA2A registry on secure --publish for skill / mcp artifacts. First release of the rich-context data pipeline behind the new check skill+mcp v1 view.

Added

  • secure --publish POSTs a narrative payload to POST /api/v1/trust/narrative for skill or MCP scan targets. Wire shape mirrors @opena2a/check-core@0.2.0's PackageNarrative types and the registry's package_narratives row (migration 223). Failure is non-fatal — parent publish always succeeds first; narrative emission status is reported under publish.narrative in JSON output, only logged in --verbose text mode.
  • src/narrative/ module. Skill+MCP builders, NanoMind v3 graceful-degrade gate, registry HTTP client, single-call wire helper. ~900 LOC, 35 new unit tests.
  • Static threat-model questions ship inline with each narrative per [CHIEF-CSR] decision.

Changed

  • @opena2a/check-core exact-pinned at 0.2.0 (was 0.1.0).

Engineering notes

  • Detection rules (v1): SKILL.md at scan root → skill, projectType==='mcp' → mcp, else skip. Auditable surface; richer multi-artifact-per-scan detection lands when [CHIEF-CA] decides on the convention.
  • NanoMind v3 summary path is gated by an input-classifier v3.1 stub. v3 is OOD on comprehension tasks; v1 returns empty strings on every code path so the renderer gracefully degrades to "Comprehension data not yet available."
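The v1 detection rules are simple enough to sketch as a single function. The rule order (SKILL.md at scan root, then `projectType === 'mcp'`, else skip) is from the notes above; the `ScanTarget` shape and the function name are hypothetical.

```typescript
type ArtifactKind = "skill" | "mcp";

// Hypothetical view of a scan target; the real scanner's types are not shown in these notes.
interface ScanTarget {
  rootFiles: string[]; // file names directly under the scan root
  projectType: string; // project type as reported by the scanner
}

// v1 rules in order: SKILL.md at the scan root -> skill,
// projectType === "mcp" -> mcp, anything else -> no narrative emitted.
function detectNarrativeKind(target: ScanTarget): ArtifactKind | null {
  if (target.rootFiles.indexOf("SKILL.md") !== -1) return "skill";
  if (target.projectType === "mcp") return "mcp";
  return null;
}
```

Because the skill rule is checked first, a target that has both a root SKILL.md and an MCP project type is classified as a skill in this sketch.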

Provenance

Published via npm Trusted Publishing (OIDC). SLSA v1 attestations: npm view hackmyagent@0.20.0 dist.attestations --json

Brief: opena2a-org/briefs/check-rich-context-skills-mcp-v1.md (§4-§7, §8 task 2c-2e)

v0.19.0

23 Apr 01:43
d0d5c2d

What's Changed

Full Changelog: v0.18.3...v0.19.0

v0.18.3

23 Apr 00:30
29e194d

What's Changed

  • feat(check): consume cli-ui 0.3.0 + emit registry fields in --json (F1 close) by @thebenignhacker in #121

Full Changelog: v0.18.2...v0.18.3

v0.18.2

22 Apr 20:45
c66d4cd

What's Changed

Full Changelog: v0.18.1...v0.18.2

v0.17.11

18 Apr 01:29

Highlights

  • hackmyagent detect Shadow AI audit command (CISO/security-engineer focused: inventory of running AI tools, MCP servers, AI configs, governance gaps)
  • --nanomind opt-in flag for AI-powered threat narratives (default off; static analyzers always run)
  • Unified output formatter across secure / scan-soul / harden-soul / explain — single visual language across repo-style commands
  • CISO-grade UX: every finding has a one-line Verify: and one-line Fix:. No env-var-shame on credential findings. Capability-abuse uses runnable harden-soul instead of wall-of-names

Quality fixes

  • Bug-bounty target descriptors no longer cause analyzer pileups. salesforce-mcp.json-style files used to trigger 6 overlapping findings (governance + capability + scope all misfiring on prose). MCP classifier tightened to a known-basename allowlist + content fallback requiring an actual "mcpServers": key.
  • TOCTOU-001 stops flagging legitimate existsSync → readFileSync config-load patterns. Now requires a write or exec between check and read. Adds import(varPath) to exec sinks so dynamic-import abuse is still caught. Eliminates 11 FPs on secretless.
  • NanoMind generative findings cap at HIGH when confidence < 0.80. Previously low-confidence findings rendered as CRITICAL with hardcoded 60% confidence stamp. Now CRITICAL only when model is genuinely confident.
  • NanoMind max_tokens 512 → 2048: descriptions no longer truncate mid-word.
  • Self-scan: 100/100, no findings.
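The tightened TOCTOU-001 condition in the list above can be illustrated on a linear sequence of operations. This is a simplification under stated assumptions: the real check presumably works on source code rather than a flat operation list, the `FsOp` shape and function name are hypothetical, and treating any intervening write or exec as risky regardless of its path is this sketch's choice. A dynamic `import(varPath)` would be recorded as an `exec` operation here.

```typescript
type FsOpKind = "check" | "read" | "write" | "exec";

interface FsOp {
  kind: FsOpKind;
  path: string;
}

// Flag a check -> read pair on the same path only when a write or exec
// happens between the two. A plain existsSync -> readFileSync config load
// produces check -> read with nothing in between and is not flagged.
function hasToctouRisk(ops: FsOp[]): boolean {
  return ops.some((check, i) => {
    if (check.kind !== "check") return false;
    let sawRiskyOp = false;
    for (const op of ops.slice(i + 1)) {
      if (op.kind === "write" || op.kind === "exec") sawRiskyOp = true;
      // Decide at the first read of the checked path after the check.
      if (op.kind === "read" && op.path === check.path) return sawRiskyOp;
    }
    return false;
  });
}
```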

Dynamic counts

CLI text now derives check + category counts from the same taxonomy map check-metadata reads. Eliminates the long-running drift where help text said "60 categories" while actual was 44.
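Deriving both numbers from one map is what removes the drift. A minimal sketch, assuming a taxonomy map from check ID to category: the map contents, category names, and function names below are made up for illustration, and only the "single source for both counts" idea is taken from the note above.

```typescript
// Hypothetical taxonomy map: check ID -> category (category names are made up).
const TAXONOMY: Record<string, string> = {
  "TOCTOU-001": "filesystem",
  "MCP-004": "mcp",
  "SKILL-020": "skill-governance",
  "SUPPLY-001": "supply-chain",
  "SUPPLY-004": "supply-chain",
};

// Compute both counts from the same map so they cannot disagree.
function taxonomyCounts(taxonomy: Record<string, string>): { checks: number; categories: number } {
  const ids = Object.keys(taxonomy);
  const categories = new Set<string>();
  ids.forEach((id) => {
    categories.add(taxonomy[id]);
  });
  return { checks: ids.length, categories: categories.size };
}

// Help text is built from the computed counts, never from literals.
function helpLine(taxonomy: Record<string, string>): string {
  const counts = taxonomyCounts(taxonomy);
  return counts.checks + " checks across " + counts.categories + " categories";
}
```

Adding or removing an entry in the map changes the help text automatically, so there is no second number to forget to update.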

Tests

1723 tests passing (3 new regression tests covering the analyzer-pileup fix, the bug-bounty descriptor classification, and the dedup invariant).

See CHANGELOG.md for the full 0.17.10 + 0.17.11 entry.


`npm install -g hackmyagent@0.17.11`
`brew install opena2a/tap/hackmyagent`

v0.17.10

15 Apr 17:27

v0.17.10

Oracle P0 Fixes

P0-1: Benign FPR 90.9% → 0% (oracle hard-negative gate)

  • Semantic compiler: consensus threshold 0.65 → 0.45 (src/nanomind-core/compiler/semantic-compiler.ts)
  • Raises hasHighBenignContext / isExplicitlyRestrictedBenign intent confidence thresholds so well-governed skills with high-risk capabilities (e.g. shell helpers with SOUL.md) are no longer false-positived

P0-2: Scanner bridge skips label.json

  • src/nanomind-core/scanner-bridge.ts now ignores label.json files during artifact compilation
  • Prevents oracle fixture metadata from being misclassified as agent artifacts

Oracle-Verified Metrics (TME v5, 2026-04-15, 50-fixture oracle)

  • Recall: 100%
  • Precision: 79.6%
  • F1: 88.7%
  • Benign FPR: 9.1% (b08 borderline — execute_shell + no SOUL.md legitimately fires GOV-004)

Tests

1599 passing | 0 failures

Also includes (since v0.17.9)

  • MCP-004 runtime tool-override pattern (ARP)
  • ML-DSA-44 benchmark + noble drift guard (AIComply)
  • Dependency security patches (vite, hono)

v0.17.9

14 Apr 04:27

Bug fix

Fix kill enforcement on SIGSTOPped processes. EnforcementEngine.kill() sent SIGTERM directly even when the target pid was paused via SIGSTOP. A stopped process cannot handle signals, so SIGTERM stayed queued and the graceful-kill path silently never fired — only the 5s SIGKILL fallback actually terminated it. Now sends SIGCONT first when the target is in the paused set, so graceful termination works as intended.

Adds a focused EnforcementEngine regression test (pause → kill → process dead within 1s).
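The signal ordering in the fix can be sketched with an injected sender so it runs without touching real processes. `KillSketch`, `markPaused`, and the `SendSignal` callback are hypothetical names rather than the actual EnforcementEngine API, and the timed SIGKILL fallback is left out.

```typescript
type SignalName = "SIGCONT" | "SIGTERM" | "SIGKILL";
type SendSignal = (pid: number, signal: SignalName) => void;

// Hypothetical stand-in for the engine's kill path; names are illustrative.
class KillSketch {
  private paused = new Set<number>();
  private send: SendSignal;

  constructor(send: SendSignal) {
    this.send = send;
  }

  markPaused(pid: number): void {
    this.paused.add(pid);
  }

  isPaused(pid: number): boolean {
    return this.paused.has(pid);
  }

  kill(pid: number): void {
    if (this.paused.has(pid)) {
      // A stopped process cannot act on SIGTERM until it is resumed,
      // so resume it first and drop it from the paused set.
      this.send(pid, "SIGCONT");
      this.paused.delete(pid);
    }
    this.send(pid, "SIGTERM");
    // The SIGKILL fallback after a timeout (5s in the notes) is omitted here.
  }
}
```

In real code the sender would wrap something like Node's `process.kill(pid, signal)`; injecting it keeps the ordering logic testable in isolation.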

Unblocks pre-existing OASB CI failure AT-ENF-004: should remove a killed PID from paused list if it was paused.