Architecture
vexes uses a layered detection architecture. Each layer catches different attack patterns, and composite scoring combines signals from all layers:
                     +------------------+
                     |      Signal      |
                     |   Orchestrator   |  Composite scoring
                     |   (signals.js)   |  with context multipliers
                     +--------+---------+
                              |
         +-------------+------+------+-------------+
         |             |             |             |
    +----+-----+  +----+-----+  +----+-----+  +----+-----+
    | Layer 1  |  | Layer 2  |  | Layer 3  |  | Layer 4  |
    | AST      |  | Dep      |  | Behavior |  | Registry |
    | Analysis |  | Graph    |  | Profile  |  | Metadata |
    +----------+  +----------+  +----------+  +----------+
     acorn AST    npm registry    Version       Publish
     JS + Python  metadata        diffing       history
Layer 1 (AST analysis) parses JavaScript source using acorn (vendored -- zero deps) and walks the AST to detect dangerous call patterns. For Python, it uses pattern matching on joined source lines (handling line continuations).
Key design decisions:
- Tracks `require()` and `import` bindings. `const { exec } = require('child_process'); exec('cmd')` is correctly traced. The inspector maintains a binding map so destructured imports are caught.
- Handles both module and script parse modes. Falls back to script mode if module parse fails.
- Error recovery. If the walker crashes mid-traversal, partial findings are preserved.
- Obfuscation detection. Computed property calls (`obj[expr]()`) and string concatenation in `require()` are flagged as evasion attempts.
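The binding map idea can be sketched as follows. This is a minimal illustration, not the inspector's actual code: it assumes an ESTree-shaped AST (the shape acorn produces), covers only `require()` bindings, and the names `DANGEROUS`, `findDangerousCalls`, and `sampleAst` are hypothetical. The sample AST is written by hand so the sketch runs without acorn installed.

```javascript
// Hypothetical list of calls worth flagging, keyed by module name.
const DANGEROUS = new Map([['child_process', new Set(['exec', 'execSync', 'spawn'])]]);

function isRequireCall(node) {
  return Boolean(node) && node.type === 'CallExpression' &&
    node.callee.type === 'Identifier' && node.callee.name === 'require' &&
    node.arguments.length === 1 && node.arguments[0].type === 'Literal';
}

function findDangerousCalls(ast) {
  const bindings = new Map(); // local name -> { module, imported }
  const findings = [];

  function visit(node) {
    if (!node || typeof node.type !== 'string') return;

    if (node.type === 'VariableDeclarator' && isRequireCall(node.init)) {
      const moduleName = node.init.arguments[0].value;
      if (node.id.type === 'ObjectPattern') {
        // const { exec } = require('child_process')
        for (const prop of node.id.properties) {
          if (prop.type === 'Property' && prop.key.type === 'Identifier' &&
              prop.value.type === 'Identifier') {
            bindings.set(prop.value.name, { module: moduleName, imported: prop.key.name });
          }
        }
      } else if (node.id.type === 'Identifier') {
        // const cp = require('child_process')
        bindings.set(node.id.name, { module: moduleName, imported: '*' });
      }
    }

    if (node.type === 'CallExpression') {
      const callee = node.callee;
      let hit = null;
      if (callee.type === 'Identifier' && bindings.has(callee.name)) {
        const b = bindings.get(callee.name);
        if (b.imported !== '*') hit = b; // exec('cmd')
      } else if (callee.type === 'MemberExpression' && !callee.computed &&
                 callee.object.type === 'Identifier' && bindings.has(callee.object.name)) {
        const b = bindings.get(callee.object.name);
        if (b.imported === '*') hit = { module: b.module, imported: callee.property.name }; // cp.exec('cmd')
      }
      const risky = hit ? DANGEROUS.get(hit.module) : undefined;
      if (risky && risky.has(hit.imported)) findings.push(`${hit.module}.${hit.imported}`);
    }

    // Generic descent over every child node or array of nodes.
    for (const key of Object.keys(node)) {
      const child = node[key];
      if (Array.isArray(child)) child.forEach(visit);
      else if (child && typeof child === 'object') visit(child);
    }
  }

  visit(ast);
  return findings;
}

// Hand-written AST for: const { exec } = require('child_process'); exec('cmd');
const sampleAst = {
  type: 'Program',
  body: [
    { type: 'VariableDeclaration', kind: 'const', declarations: [{
      type: 'VariableDeclarator',
      id: { type: 'ObjectPattern', properties: [{
        type: 'Property',
        key: { type: 'Identifier', name: 'exec' },
        value: { type: 'Identifier', name: 'exec' },
      }] },
      init: { type: 'CallExpression',
        callee: { type: 'Identifier', name: 'require' },
        arguments: [{ type: 'Literal', value: 'child_process' }] },
    }] },
    { type: 'ExpressionStatement', expression: {
      type: 'CallExpression',
      callee: { type: 'Identifier', name: 'exec' },
      arguments: [{ type: 'Literal', value: 'cmd' }] } },
  ],
};

console.log(findDangerousCalls(sampleAst));
```

Because the map records the imported name separately from the local name, a renamed destructure such as `const { exec: run } = require('child_process')` would still resolve `run('cmd')` to `child_process.exec` in this sketch.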
Layer 2 (dependency graph) profiles newly added dependencies by fetching their registry metadata:
- Phantom dependencies: Brand-new packages (< 7 days old) with no ecosystem presence
- Circular staging: New dep published by the same account within 48 hours (the axios RAT pattern)
- Typosquatting: Levenshtein distance comparison against popular package databases
- Install scripts on new deps: New dependencies that run code on install
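The typosquatting check can be illustrated with a plain Levenshtein distance. This is a sketch under assumptions: the function names, the `maxDistance` threshold of 2, and the two-entry popular list are hypothetical, and the real check may normalize names or weight edits differently.

```javascript
// Classic two-row dynamic-programming Levenshtein distance.
function levenshtein(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Returns the closest popular package within maxDistance (hypothetical threshold),
// or null when the name is itself popular or nothing is close enough.
function findTyposquatTarget(name, popularNames, maxDistance = 2) {
  if (popularNames.includes(name)) return null;
  let best = null;
  for (const popular of popularNames) {
    const distance = levenshtein(name, popular);
    if (distance <= maxDistance && (best === null || distance < best.distance)) {
      best = { target: popular, distance };
    }
  }
  return best;
}

console.log(findTyposquatTarget('expresss', ['express', 'lodash']));
```

Exact matches are excluded first so that a popular package never flags itself.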
Layer 3 (behavior profile) builds capability profiles per package version and diffs them. The diff is what matters -- a utility library that suddenly gains process_spawn + network + credential_access capabilities is flagged regardless of what those capabilities do individually.
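A capability diff of this kind reduces to a set difference between two versions. In the sketch below, the function names, the `filesystem_read` capability, and the rule "flag when at least two sensitive capabilities are newly added" are assumptions made for illustration; only the three sensitive capability names come from the text above.

```javascript
// Capabilities treated as sensitive in this sketch (names taken from the text above).
const SENSITIVE = new Set(['process_spawn', 'network', 'credential_access']);

// Compare the capability lists of two versions of the same package.
function diffCapabilities(previous, current) {
  const before = new Set(previous);
  const after = new Set(current);
  return {
    added: [...after].filter((c) => !before.has(c)).sort(),
    removed: [...before].filter((c) => !after.has(c)).sort(),
  };
}

// Hypothetical rule: flag when minAdded or more sensitive capabilities appear at once.
function isSuspiciousEscalation(diff, minAdded = 2) {
  return diff.added.filter((c) => SENSITIVE.has(c)).length >= minAdded;
}

const diff = diffCapabilities(
  ['filesystem_read'],
  ['filesystem_read', 'process_spawn', 'network', 'credential_access'],
);
console.log(diff, isSuspiciousEscalation(diff));
```

Only the newly added capabilities are scored, so a package that has always spawned processes is not penalized for continuing to do so.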
Layer 4 (registry metadata) is the fastest layer. It analyzes publish history, maintainer changes, and timing anomalies:
- Maintainer change with time-decay (recent = CRITICAL, old = MODERATE)
- Rapid publish with CI awareness (0s interval with multiple maintainers = legitimate)
- Version anomaly (major jumps, dormancy followed by sudden publish)
- Provenance (Sigstore attestation check via npm attestations API)
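The time-decay on maintainer changes could look roughly like the sketch below. The text only states the endpoints (recent = CRITICAL, old = MODERATE), so the 30-day and 180-day cutoffs, the intermediate HIGH band, and the function name are all assumptions.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical cutoffs: only "recent = CRITICAL, old = MODERATE" comes from the docs.
function maintainerChangeSeverity(changedAtMs, nowMs = Date.now(), recentDays = 30, oldDays = 180) {
  const ageDays = (nowMs - changedAtMs) / DAY_MS;
  if (ageDays <= recentDays) return 'CRITICAL';
  if (ageDays <= oldDays) return 'HIGH'; // assumed middle band
  return 'MODERATE';
}

const now = Date.UTC(2025, 0, 31);
console.log(maintainerChangeSeverity(Date.UTC(2025, 0, 28), now)); // changed 3 days earlier
```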
Base Score = Sum of (signal_weight * context_multipliers)
Context multipliers:
- Package age < 30 days: 2.0x
- Single maintainer: 1.5x
- Known-good package: 0.2x
Combination bonus:
- 3+ unique signals: 1.5x total
- 5+ unique signals: 2.0x total
Risk Levels:
- NONE: 0
- LOW: > 0
- MODERATE: >= 5
- HIGH: >= 15
- CRITICAL: >= 30
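The scoring rules above can be put together in a few lines. This sketch makes two assumptions the text does not settle: context multipliers stack multiplicatively, and the combination bonus uses the highest matching tier rather than compounding. The field names (`weight`, `type`, `packageAgeDays`, `maintainerCount`, `knownGood`) are hypothetical.

```javascript
// Assumed: multipliers that apply are multiplied together.
function contextMultiplier(ctx) {
  let m = 1;
  if (ctx.packageAgeDays < 30) m *= 2.0;
  if (ctx.maintainerCount === 1) m *= 1.5;
  if (ctx.knownGood) m *= 0.2;
  return m;
}

function compositeScore(signals, ctx) {
  const m = contextMultiplier(ctx);
  const base = signals.reduce((sum, s) => sum + s.weight * m, 0);
  const unique = new Set(signals.map((s) => s.type)).size;
  // Assumed: only the highest matching bonus tier applies.
  const bonus = unique >= 5 ? 2.0 : unique >= 3 ? 1.5 : 1.0;
  return base * bonus;
}

function riskLevel(score) {
  if (score >= 30) return 'CRITICAL';
  if (score >= 15) return 'HIGH';
  if (score >= 5) return 'MODERATE';
  if (score > 0) return 'LOW';
  return 'NONE';
}

const score = compositeScore(
  [
    { type: 'install_script', weight: 2 },
    { type: 'maintainer_change', weight: 3 },
    { type: 'typosquat', weight: 4 },
  ],
  { packageAgeDays: 10, maintainerCount: 1, knownGood: false },
);
console.log(score, riskLevel(score)); // 40.5 CRITICAL
```

In this example the base score is (2 + 3 + 4) * 3.0 = 27, and three unique signals raise it by 1.5x to 40.5.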
       Lockfiles/Manifests
                |
                v
           [ Parsers ]     npm.js, pypi.js, cargo.js, brew.js
                |
                v
         Dependency List
                |
         +------+------+
         |             |
         v             v
      [ OSV ]    [ Registry ]
       Batch       Metadata
       Query         Fetch
         |             |
         +------+------+
                |
                v
        [ Signal Engine ]
        4-layer analysis
                |
                v
       [ Composite Score ]
                |
                v
             Output
        (text/json/sarif)
The cache uses the Node.js built-in node:sqlite module (DatabaseSync):
| Table | Key | TTL | Content |
|---|---|---|---|
| `advisories` | ecosystem + name + version | 1 hour | Vulnerability data from OSV |
| `metadata` | ecosystem + name | 24 hours | Registry metadata |
| `signals` | ecosystem + name + version | 24 hours | Analysis results |
Corruption resilience: Corrupted JSON entries are detected, deleted, and treated as cache misses. If the database file is unreadable, vexes falls back to a NoOpCache (all misses, no writes) and continues scanning.
Cache poisoning prevention: Degraded analysis results (metadata unavailable, warnings present) are never cached. A transient network failure cannot poison the cache with a false-clean result.
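The three cache behaviors described here (TTL expiry, corruption handling, and refusing to store degraded results) can be sketched without a database. The class below uses an in-memory `Map` as a stand-in for the SQLite tables; the class name, the key format, and the `degraded` and `warnings` fields are assumptions, not the project's actual schema.

```javascript
const HOUR_MS = 60 * 60 * 1000;
const TTL_MS = { advisories: 1 * HOUR_MS, metadata: 24 * HOUR_MS, signals: 24 * HOUR_MS };

class SketchCache {
  constructor(now = () => Date.now()) {
    this.rows = new Map(); // stand-in for the SQLite tables
    this.now = now;        // injectable clock, handy for tests
  }

  get(table, key) {
    const id = `${table}:${key}`;
    const row = this.rows.get(id);
    if (!row) return null;
    if (this.now() - row.storedAt > TTL_MS[table]) {
      this.rows.delete(id); // expired
      return null;
    }
    try {
      return JSON.parse(row.json);
    } catch {
      this.rows.delete(id); // corrupted entry: delete it and report a miss
      return null;
    }
  }

  set(table, key, value) {
    // Degraded results are never stored, so a transient failure cannot
    // be replayed later as a false-clean answer.
    const degraded = value == null || value.degraded === true ||
      (Array.isArray(value.warnings) && value.warnings.length > 0);
    if (degraded) return false;
    this.rows.set(`${table}:${key}`, { json: JSON.stringify(value), storedAt: this.now() });
    return true;
  }
}

const demo = new SketchCache();
console.log(demo.set('signals', 'npm:example@1.0.0', { score: 0, warnings: ['metadata unavailable'] })); // false
```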
In --deep mode, vexes downloads npm tarballs and PyPI sdists, then inspects the actual source:
- Download with compressed size limit (5MB)
- Gunzip with decompressed size limit (50MB -- gzip bomb protection)
- Parse raw POSIX tar headers (zero-dependency tar reader)
- Extract files matching inspection patterns (index.js, setup.py, etc.)
- Per-file size limit (512KB) and file count limit (10)
- Run through AST inspector
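The gunzip and tar steps above can be sketched with Node's built-in `zlib` and a hand-rolled header reader. This is an illustrative sketch, not the project's reader: the function names are hypothetical, the ustar `prefix` field and checksum verification are skipped for brevity, and files are matched by basename only. The size limits are the ones listed above.

```javascript
const zlib = require('node:zlib');

const MAX_DECOMPRESSED = 50 * 1024 * 1024; // gzip bomb protection
const MAX_FILE_SIZE = 512 * 1024;
const MAX_FILES = 10;
const BLOCK = 512; // tar works in 512-byte blocks

// zlib throws instead of inflating past maxOutputLength.
function gunzipBounded(compressed, limit = MAX_DECOMPRESSED) {
  return zlib.gunzipSync(compressed, { maxOutputLength: limit });
}

// Read a NUL-terminated ASCII field from a tar header.
function readField(header, offset, length) {
  const raw = header.subarray(offset, offset + length);
  const end = raw.indexOf(0);
  return raw.subarray(0, end === -1 ? raw.length : end).toString('utf8');
}

// Walk the archive and keep regular files whose basename is in wantedNames.
function extractMatching(tar, wantedNames) {
  const files = [];
  let offset = 0;
  while (offset + BLOCK <= tar.length && files.length < MAX_FILES) {
    const header = tar.subarray(offset, offset + BLOCK);
    if (header.every((byte) => byte === 0)) break; // end-of-archive marker

    const name = readField(header, 0, 100);
    const size = parseInt(readField(header, 124, 12).trim(), 8) || 0; // octal
    const typeflag = header[156];
    const isRegular = typeflag === 0x30 || typeflag === 0; // '0' or NUL
    const dataStart = offset + BLOCK;

    const basename = name.split('/').pop();
    if (isRegular && size <= MAX_FILE_SIZE && wantedNames.includes(basename)) {
      files.push({ name, content: tar.subarray(dataStart, dataStart + size).toString('utf8') });
    }
    // File data is padded up to the next 512-byte boundary.
    offset = dataStart + Math.ceil(size / BLOCK) * BLOCK;
  }
  return files;
}
```

A caller would chain the two, for example `extractMatching(gunzipBounded(tarballBuffer), ['index.js', 'setup.py'])`, and pass each returned `content` to the AST inspector.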
- External data sanitization: All package names, vulnerability summaries, and other external strings are sanitized before terminal output to prevent terminal injection.
- Config merge protection: `__proto__`, `constructor`, `prototype` keys are rejected.
- Command injection prevention: Guard command uses `execFileSync` (no shell) with package manager allowlist.
- Input validation: Ecosystem names, severity levels, and paths are validated before use.
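Config merge protection of this kind might look like the sketch below. Whether vexes throws or silently skips a rejected key is not stated, so throwing is an assumption here, as are the function names.

```javascript
const FORBIDDEN_KEYS = new Set(['__proto__', 'constructor', 'prototype']);

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// Recursively merge source into target, refusing prototype-pollution keys.
function safeMerge(target, source) {
  for (const key of Object.keys(source)) {
    if (FORBIDDEN_KEYS.has(key)) {
      throw new Error(`Rejected config key: ${key}`); // assumed behavior: fail loudly
    }
    const value = source[key];
    if (isPlainObject(value)) {
      const hasOwn = Object.prototype.hasOwnProperty.call(target, key);
      const base = hasOwn && isPlainObject(target[key]) ? target[key] : {};
      target[key] = safeMerge(base, value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

const merged = safeMerge({ cache: { ttl: 1 } }, { cache: { dir: '/tmp' }, deep: true });
console.log(merged);
```

The check matters for parsed JSON in particular: `JSON.parse('{"__proto__": {...}}')` creates an own property named `__proto__`, which a naive recursive merge would follow into `Object.prototype`.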