Expand unicode stego detection, align taxonomy, fix release-test findings #58
…glyphs, and --ci flag
- Expand UNICODE-STEGO-001 to detect zero-width chars (U+200B-200D), mid-file BOM (U+FEFF, skip offset 0), and bidi overrides (U+202A-202E, U+2066-2069). Bidi/variation/tag = critical, zero-width-only = high.
- Add UNICODE-STEGO-005 homoglyph confusable detection for Cyrillic/Greek/Fullwidth characters that look identical to Latin. Skips comment lines.
- Expand scanned file types beyond JS/TS to include .py, .md, .txt, .yaml, .yml, .json, .toml.
- Add --ci flag to secure and scan-soul commands: suppresses interactive prompts, defaults --no-contribute, exits non-zero on any findings.
- Add buildContributionSummary() for transparency preview of contribution data.
- Map UNICODE-STEGO-005 to STEGO-INJECT in attack taxonomy.
- Add 18 new test cases covering all new detection patterns.
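The UNICODE-STEGO-001 ranges listed above can be sketched as a single pass over the text. This is a minimal illustration, not the code in scanner.ts: the names `scanInvisibles`, `classify`, and `StegoHit` are hypothetical, and rating a mid-file BOM as high is an assumption, since the commit message only gives severities for bidi and zero-width characters. Variation selectors and tag characters are left out because their ranges are not given here.

```typescript
// Hypothetical sketch; names and the BOM severity are assumptions, not the PR's code.
type Severity = "critical" | "high";
type Kind = "zero-width" | "bom" | "bidi";

interface StegoHit {
  offset: number;    // UTF-16 code unit index into the input
  codePoint: number;
  kind: Kind;
}

// All ranges below sit in the Basic Multilingual Plane, so comparing
// UTF-16 code units via charCodeAt is sufficient.
function classify(cp: number, offset: number): Kind | null {
  if (cp >= 0x200b && cp <= 0x200d) return "zero-width";
  if (cp === 0xfeff) return offset === 0 ? null : "bom"; // a leading BOM is legitimate
  if ((cp >= 0x202a && cp <= 0x202e) || (cp >= 0x2066 && cp <= 0x2069)) return "bidi";
  return null;
}

function scanInvisibles(text: string): { hits: StegoHit[]; severity: Severity | null } {
  const hits: StegoHit[] = [];
  for (let i = 0; i < text.length; i++) {
    const cp = text.charCodeAt(i);
    const kind = classify(cp, i);
    if (kind !== null) hits.push({ offset: i, codePoint: cp, kind: kind });
  }
  if (hits.length === 0) return { hits: hits, severity: null };
  // Any bidi control escalates to critical; otherwise the finding stays at high.
  const severity: Severity = hits.some((h) => h.kind === "bidi") ? "critical" : "high";
  return { hits: hits, severity: severity };
}
```

Under these assumptions, a file containing only a zero-width space is reported as high, while the same file with an added U+202E is reported as critical.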
HMA taxonomy.ts was using different attack class identifiers than the registry seed data. Aligned all mappings to match registry identifiers:
- SOUL-OVERRIDE -> PHANTOM-SOUL (HB checks) + SOUL-HIJACK (HO checks)
- SOUL-COLLUDE -> SOUL-FORK
- SOUL-TH-005 -> SOUL-IMPERSONATE (was SOUL-INJECT)
- HV-DECEPTION/MANIPULATION/UNSAFE-CODE/RESOURCE-ABUSE -> SOUL-HV-001/002/003/004
- CRED-HARVEST -> RETROACTIVE-PRIV
- STEGO-INJECT -> UNICODE-STEGO
- SOUL-PERSIST -> HEARTBEAT-RCE
- SOUL-EXFIL -> SKILL-EXFIL
- ORG-SKILL-SUPPLY -> ORG-SKILL-SPREAD
- IDENTITY-SPOOF -> AGENT-IMPERSONATE
- DNA-FORGE -> BEHAVIORAL-IMPERSONATE
- SKILL-MEM -> SKILL-MEM-AMP
- SKILL-ADVERSARIAL -> SKILL-FRONTMATTER
…essage
- Fix P1: scan --json was outputting "Scanning..." text before JSON, breaking downstream JSON parsers. Now suppresses text in --json mode.
- Fix P3: rollback error referenced nonexistent "harden --fix" command, corrected to "secure --fix".
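The P1 fix comes down to keeping everything except the JSON document off the output stream when --json is set. A minimal sketch of that pattern follows, assuming a hypothetical `emitScan` function and `ScanResult` shape. Neither name comes from the PR, and the human-readable summary line is invented for illustration.

```typescript
// Hypothetical sketch; emitScan and ScanResult are illustrative names only.
interface ScanResult {
  findings: string[];
}

function emitScan(
  result: ScanResult,
  jsonMode: boolean,
  write: (chunk: string) => void
): void {
  // Progress text is only for humans; in JSON mode it would corrupt the stream.
  if (!jsonMode) write("Scanning...\n");

  if (jsonMode) {
    // Exactly one JSON document and nothing else, so downstream parsers succeed.
    write(JSON.stringify(result) + "\n");
  } else {
    write(result.findings.length + " finding(s)\n");
  }
}
```

Passing the writer as a parameter keeps the sketch testable. A real CLI would most likely pass a function that writes to standard output.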
Registry seed includes ASKILL-002 in SKILL-FRONTMATTER hmaCheckIds. HMA taxonomy was missing this mapping.
The .hackmyagent-backup/ directory (created by --fix) was being scanned by findFilesMatching(), causing phantom CRITICAL findings on backup copies of .env files. This made the scan/fix/rescan loop show no score improvement because the backup itself created new findings. Add .hackmyagent-backup to the skip list in findFilesMatching().
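The skip-list fix can be illustrated with a path filter. This is a sketch under stated assumptions rather than the body of findFilesMatching(): the helper names `isSkipped` and `filterScannable` are hypothetical, and every skip-list entry other than .hackmyagent-backup is an assumed placeholder.

```typescript
// Hypothetical sketch; only ".hackmyagent-backup" is confirmed by the commit message.
const SKIP_DIRS: string[] = ["node_modules", ".git", ".hackmyagent-backup"];

// A path is skipped when any of its segments is a skip-listed directory name.
function isSkipped(relativePath: string): boolean {
  const segments = relativePath.split(/[\\/]/);
  return segments.some((seg) => SKIP_DIRS.indexOf(seg) !== -1);
}

function filterScannable(paths: string[]): string[] {
  return paths.filter((p) => !isSkipped(p));
}
```

With the backup directory excluded, a copy such as .hackmyagent-backup/.env no longer produces a finding of its own, so a rescan after --fix reflects only the live files.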
Security review found that relativePath was interpolated unsanitized into shell commands in fix suggestion strings. A crafted filename with shell metacharacters could be dangerous if a user copy-pastes the suggested command.
- Add shellEscape() helper for xxd/grep suggestions (STEGO-001, -004)
- Use JSON.stringify() for node -e suggestions (STEGO-003, -005)
- Fix incorrect UTF-8 encoding in comment (EF 80 8B -> E2 80 8B)
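For readers unfamiliar with the technique, single-quote escaping for POSIX shells can be sketched as below. The PR's actual shellEscape() body is not shown on this page, so this is an assumed implementation of the standard approach. `xxdSuggestion` and `jsPathLiteral` are hypothetical helpers, and the suggestion text is invented for illustration.

```typescript
// Assumed implementation of the standard POSIX single-quote technique.
// Inside single quotes the shell treats every character literally, so the only
// character needing care is the single quote itself: close the quoted string,
// emit an escaped quote, then reopen it.
function shellEscape(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Hypothetical advisory string for an xxd-based suggestion.
function xxdSuggestion(relativePath: string): string {
  return "xxd " + shellEscape(relativePath) + " | head";
}

// For suggestions that embed the path in JavaScript source (node -e),
// JSON.stringify yields a valid JavaScript string literal.
function jsPathLiteral(relativePath: string): string {
  return JSON.stringify(relativePath);
}
```

A filename such as `x; rm -rf ~` is then rendered inside single quotes in the suggestion, so pasting the command passes it to xxd as one literal argument instead of running a second command.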
Claude Code Review
Security Review: PR #58
VERDICT: APPROVE
SUMMARY:
This PR expands unicode steganography detection, adds homoglyph confusables scanning, fixes critical shell injection in advisory fix commands, resolves backup directory scanning bug, adds CI mode, and aligns taxonomy with OpenA2A Registry. All security findings from previous review have been adequately addressed. The shellEscape() function properly mitigates command injection risks in fix advisory strings. No new critical or high-severity vulnerabilities introduced.
FINDINGS:
None. All previous security issues have been mitigated:
Previous Review Findings — STATUS:
- [FIXED] Shell injection in fix commands: Now using shellEscape() function (lines 6884-6887 in scanner.ts) that wraps paths in single quotes and escapes embedded quotes. Applied to all three advisory fix commands (lines 6951, 7122, 7183).
- [FIXED] Backup directory scanning: .hackmyagent-backup now excluded from directory traversal (line 5464 in scanner.ts), preventing phantom CRITICAL findings after --fix operations.
- [VERIFIED] No ReDoS patterns: Unicode detection uses byte-level comparisons and bounded loops, not regex on untrusted input.
- [VERIFIED] No path traversal: Uses path.relative() for display only, no file operations on user-controlled paths.
- [VERIFIED] Array-based command execution: No shell interpolation risks (spawn/exec not used with user input).
Code Quality Notes (non-blocking):
- shellEscape() implementation is correct (single-quote escaping)
- Homoglyph detection skips comments to reduce false positives
- CI mode properly suppresses prompts and returns non-zero exit codes
- Taxonomy alignment changes are data-only mapping updates
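The comment-skipping behaviour noted for homoglyph detection can be sketched as follows. This is illustrative only: `findHomoglyphs` and `latinLookalike` are hypothetical names, the confusables table is a small hand-picked subset rather than the list in scanner.ts, and treating lines that begin with `//` or `#` as comments is an assumption.

```typescript
// Hypothetical sketch; the table is a small subset chosen for illustration.
const CONFUSABLES: { [codePoint: number]: string } = {
  0x0430: "a", // Cyrillic small a
  0x0435: "e", // Cyrillic small ie
  0x043e: "o", // Cyrillic small o
  0x0440: "p", // Cyrillic small er
  0x0441: "c", // Cyrillic small es
  0x03bf: "o", // Greek small omicron
};

interface HomoglyphHit {
  line: number;     // 1-based line number
  column: number;   // 0-based index within the line
  looksLike: string;
}

function latinLookalike(cp: number): string | null {
  if (CONFUSABLES[cp] !== undefined) return CONFUSABLES[cp];
  // Fullwidth Latin capitals and small letters map to ASCII by a fixed offset.
  if (cp >= 0xff21 && cp <= 0xff3a) return String.fromCharCode(0x41 + (cp - 0xff21));
  if (cp >= 0xff41 && cp <= 0xff5a) return String.fromCharCode(0x61 + (cp - 0xff41));
  return null;
}

function findHomoglyphs(source: string): HomoglyphHit[] {
  const hits: HomoglyphHit[] = [];
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const trimmed = lines[i].replace(/^\s+/, "");
    // Assumption: whole-line comments are skipped to reduce false positives.
    if (trimmed.indexOf("//") === 0 || trimmed.indexOf("#") === 0) continue;
    for (let col = 0; col < lines[i].length; col++) {
      const match = latinLookalike(lines[i].charCodeAt(col));
      if (match !== null) hits.push({ line: i + 1, column: col, looksLike: match });
    }
  }
  return hits;
}
```

Skipping comment lines trades some coverage for fewer false positives, since comments and documentation legitimately contain non-Latin text more often than identifiers do.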
Reviewed 5 files changed (35127 bytes)
Summary
- --ci flag to secure and scan-soul: suppresses prompts, defaults --no-contribute, exits non-zero on any findings
- buildContributionSummary() for transparency preview
Test plan