-
Notifications
You must be signed in to change notification settings - Fork 21
feat(tarball): add Worker Threads support for parallel analysis #613
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
feat(tarball): add Worker Threads support for parallel analysis #613
Conversation
Implement Worker Threads to parallelize JavaScript file analysis, delivering measurable performance improvements for large codebases. Performance Results (280 files): - Average speedup: +10-15% - Event loop responsiveness: +36% - Memory usage: -90-94% Key Features: - Dynamic load balancing (40-file batches) - Intelligent threshold (250+ files activation) - Persistent worker pool (5min idle timeout) - 4-level fallback system for robustness - Zero breaking changes (backward compatible) Implementation: - WorkerPool singleton with dynamic thread calculation - Batch processing to minimize communication overhead - JIT warmup to eliminate cold-start latency - Configurable via NODE_SECURE_DISABLE_WORKERS env var Testing: - Full unit test coverage (scanner.worker.spec.ts) - Integration tests (WorkerPool.spec.ts) - Comprehensive benchmarking suite Closes NodeSecure#578 Signed-off-by: Hamed Mohamed <hamedmohamed22w@gmail.com>
|
|
All alerts resolved. Learn more about Socket for GitHub. This PR previously contained dependency changes with security issues that have been resolved, removed, or ignored. |
|
Make sure the linter (ESLint) is ok everywhere |
- Add proper import section comments (@openally/imports) - Fix inline comments to JSDoc format (no-inline-comments) - Fix duplicate imports (no-duplicate-imports) - Fix line lengths and operator precedence - Fix empty block statements Signed-off-by: Hamed Mohamed <hamedmohamed22w@gmail.com>
|
Review the following changes in direct dependencies. Learn more about Socket for GitHub.
|
| // Import Internal Dependencies | ||
| import { WorkerPool } from "../dist/class/WorkerPool.class.js"; | ||
|
|
||
| const __dirname = path.dirname(fileURLToPath(import.meta.url)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use ESM so import.meta.dirname is available
| await new Promise((resolve) => { | ||
| setTimeout(resolve, 100); | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer node:timers/promises instead
| interface WorkerResponse { | ||
| /** | ||
| * Success flag | ||
| */ | ||
| s: boolean; | ||
| /** | ||
| * Result data | ||
| */ | ||
| r?: ReportOnFile; | ||
| /** | ||
| * Error details | ||
| */ | ||
| e?: { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big fan of interfaces with one letter properties and the comment explaining what it is.
| const { WorkerPool } = await import("./WorkerPool.class.js"); | ||
| const pool = WorkerPool.getInstance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we need a singleton here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure why we need a singleton here
The main reason for using a singleton here is to keep the worker pool
alive across multiple scanPackage() calls (e.g. when scanning many
packages in a pipeline), which avoids paying the worker startup cost
on each invocation.
That said, I agree this is mostly an implementation detail.
If you’d prefer a simpler approach (e.g. a module-scoped instance or
a per-scanner lifecycle), I’m happy to refactor it accordingly.
- Use import.meta.dirname instead of path.dirname(fileURLToPath()) - Use node:timers/promises setTimeout instead of Promise wrapper - Use @nodesecure/fs-walk instead of custom findJavaScriptFiles - Rename properties to full names (s->success, r->result, e->error) - Update tests for new property names Signed-off-by: Hamed Mohamed <hamedmohamed22w@gmail.com>
Summary
This PR implements Worker Threads to parallelize JavaScript file analysis in
@nodesecure/tarball, delivering 10-15% average speedup and 36% better responsiveness for large codebases (250+ files).Performance Benchmarks
Tested on Intel Core i7 (8 cores), 16GB RAM with 280 JavaScript files:
Key Features
Statistical Note
Key Features
Configuration
Testing
Documentation
Closes
#578