Problem
Currently, npm run index runs the main CLI via ts-node, but both full and incremental indexing spawn their worker threads from dist/utils/producer_worker.js.
This creates two user-facing issues:
- README / Quick Start ambiguity
  - A user can reasonably expect npm install + npm run index ... to work.
  - In practice, indexing fails unless dist/ exists (i.e., you must run npm run build first), because worker threads are hard-coded to load from dist/.
- Stale artifact / mixed-version hazard
  - If dist/ exists but is stale, a single run can execute new code in the main process (ts-node) while executing old code in the worker (dist JS).
  - This is subtle and hard to diagnose.
Evidence in code (paths are repo-relative):
src/commands/full_index_producer.ts and src/commands/incremental_index_command.ts both build the worker path via:
path.join(process.cwd(), 'dist', 'utils', 'producer_worker.js')
Why this matters
- Consumers want deterministic behavior: one run should not mix sources of truth.
- Docs shouldn’t imply npm run build is optional if runtime depends on dist/.
Proposed direction: explicit execution modes (avoid auto-detect)
Recommend enforcing a strict invariant: within one run, main + workers come from the same source.
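One way to make this invariant checkable is a small guard that classifies each entry file by its top-level directory and refuses mixed sets. This is only a sketch: classifySource and assertSingleSource are hypothetical names, not existing code in the repo, and the guard assumes every entry file lives under either dist/ or src/.

```typescript
import * as path from "node:path";

// Hypothetical type: which tree a file was loaded from.
export type SourceKind = "dist" | "src";

// Hypothetical helper: classify a file by its top-level directory under `root`.
export function classifySource(root: string, file: string): SourceKind {
  const top = path.relative(root, file).split(path.sep)[0];
  if (top === "dist" || top === "src") {
    return top;
  }
  throw new Error(`${file} is not under dist/ or src/`);
}

// Hypothetical guard: throws if the given entry files come from different trees.
export function assertSingleSource(root: string, files: string[]): SourceKind {
  const kinds = files.map((file) => classifySource(root, file));
  const first = kinds[0];
  if (first === undefined) {
    throw new Error("No entry files given");
  }
  if (kinds.some((kind) => kind !== first)) {
    throw new Error(`Mixed sources in one run: ${kinds.join(", ")}`);
  }
  return first;
}
```

Calling it once at startup with the main entry and the worker path would turn the mixed-version case into an immediate error rather than a subtle runtime difference.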
Option 1 (consumer-first / production default): dist-only
- Make the default consumer entrypoint run compiled output end-to-end.
- Change scripts so npm run index runs node dist/index.js index ... (or a wrapper that requires npm run build).
- Workers continue to load from dist/.
- Add a separate dev command for contributors.
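If the wrapper route is chosen, a preflight check could fail fast with a readable message instead of letting the worker fail to load later. A minimal sketch, assuming the two compiled files named in this issue are the ones that matter; missingBuildArtifacts and assertBuilt are hypothetical names:

```typescript
import { existsSync } from "node:fs";
import * as path from "node:path";

// Hypothetical helper: list the compiled files that are missing under `root`.
export function missingBuildArtifacts(root: string): string[] {
  const required = [
    path.join(root, "dist", "index.js"),
    path.join(root, "dist", "utils", "producer_worker.js"),
  ];
  return required.filter((file) => !existsSync(file));
}

// Hypothetical preflight: throw a clear error when the build output is absent.
export function assertBuilt(root: string): void {
  const missing = missingBuildArtifacts(root);
  if (missing.length > 0) {
    throw new Error(
      `Missing build output:\n  ${missing.join("\n  ")}\nRun "npm run build" first.`
    );
  }
}
```

Note that this only checks for presence. It deliberately does not try to detect a stale dist/, since that would reintroduce the auto-detect behavior this proposal wants to avoid.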
Option 2 (dev convenience): explicit dev/source mode
- Add npm run index:dev (or similar) that runs main via ts-node AND runs workers via TS as well.
- Crucially: dev mode should never load from dist/ (even if it exists), to avoid the stale artifact hazard.
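To illustrate the two explicit modes, the hard-coded path.join could be replaced by a resolver keyed on a mode that the entrypoint sets, with no fallback between modes. This is a sketch under assumptions: ExecutionMode, WorkerSpec, and resolveWorkerSpec are hypothetical names, the TS worker source is assumed to live at src/utils/producer_worker.ts (mirroring the dist path), and loading ts-node in the worker via --require ts-node/register is an assumption that should be verified against the Node and ts-node versions in use.

```typescript
import * as path from "node:path";

// Hypothetical: the mode is chosen explicitly by the npm script, never auto-detected.
export type ExecutionMode = "dist" | "dev";

// Hypothetical: what the caller needs in order to construct a worker thread.
export interface WorkerSpec {
  file: string; // worker entry file
  execArgv: string[]; // extra Node flags for the worker thread
}

export function resolveWorkerSpec(mode: ExecutionMode, root: string): WorkerSpec {
  switch (mode) {
    case "dist":
      // Compiled output end-to-end; same path the code builds today.
      return {
        file: path.join(root, "dist", "utils", "producer_worker.js"),
        execArgv: [],
      };
    case "dev":
      // Assumed source location and assumed ts-node registration; never touches dist/.
      return {
        file: path.join(root, "src", "utils", "producer_worker.ts"),
        execArgv: ["--require", "ts-node/register"],
      };
  }
}
```

The call sites in full_index_producer.ts and incremental_index_command.ts could then pass spec.file and spec.execArgv to the worker_threads Worker constructor. Because there is no "if dist/ exists" branch, a stale dist/ cannot be picked up in dev mode.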
Suggested script split
npm run build -> tsc
npm run index -> node dist/index.js index (requires build)
npm run index:dev -> ts-node src/index.ts index (and worker threads also use TS)
Doc follow-up
- Update README Quick Start to reflect the chosen default (likely dist-only), and optionally mention the dev command for contributors.
Acceptance criteria
- Running the tool from source does not accidentally consume stale dist/.
- Running the tool in production (Docker/systemd/cron) uses compiled JS consistently.
- README Quick Start matches actual runtime requirements.