
fix(test): prevent batch-run scheduler from serializing after first wave #6

Open
crypticsaiyan wants to merge 1 commit into TestSprite:main from crypticsaiyan:fix/batch-run-concurrency


@crypticsaiyan


Summary

The create-batch --run scheduler (runBatchRun in src/commands/test.ts) correctly launches the initial concurrencyLimit jobs in parallel. However, the steady-state loop inadvertently awaits each subsequent job's full completion (both the trigger and the --wait poll) before launching the next one.

As a result, the effective concurrency drops to 1 after the first wave completes, ignoring the --max-concurrency flag for the remainder of the run.
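For a rough sense of the impact, here is a back-of-the-envelope model. It is a sketch rather than a measurement: it assumes every job takes the same time, that `limit` is at least 1, and that the tail runs strictly one job at a time as described above. Both function names are made up for illustration.

```typescript
// Idealized model only; not measured from the real CLI.
// Assumes limit >= 1 and that every job takes exactly `jobMs`.

// Wall-clock time when the scheduler keeps `limit` jobs in flight throughout.
function idealMakespanMs(jobs: number, limit: number, jobMs: number): number {
  return Math.ceil(jobs / limit) * jobMs;
}

// Wall-clock time when only the first wave is parallel and every later job
// runs on its own.
function serializedTailMakespanMs(jobs: number, limit: number, jobMs: number): number {
  const firstWave = Math.min(jobs, limit);
  const tail = jobs - firstWave;
  return (firstWave > 0 ? jobMs : 0) + tail * jobMs;
}
```

Under these assumptions, 9 jobs at a limit of 3 with one-minute jobs come out to 3 minutes in the ideal case and 7 minutes with a serialized tail.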

Note: Three of the four fan-outs in this file already use the correct launch-then-relaunch pattern; this scheduler was the outlier.

The Fix

Replaced the blocking drainOne() semaphore logic with a non-blocking startNext() callback pattern.

This removes the blocking await that was stalling the Promise.race queue. It mirrors the implementation used by its sibling pollFreshAccepted, ensuring we launch up to the concurrency limit, immediately relaunch a new job into a slot as soon as one completes, and never block the scheduler on a single job's completion.
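The general shape of the pattern can be sketched as follows. This is a minimal standalone illustration, not the code in this PR: `Job`, `Outcome`, and `runWithLimit` are hypothetical names, and the real `startNext()` in `src/commands/test.ts` may differ in how it tracks results and errors.

```typescript
// Illustrative only: `Job`, `Outcome`, and `runWithLimit` are hypothetical names,
// not the identifiers used in src/commands/test.ts.
type Job<T> = () => Promise<T>;
type Outcome<T> = { ok: true; value: T } | { ok: false; error: unknown };

function runWithLimit<T>(jobs: Job<T>[], limit: number): Promise<Outcome<T>[]> {
  const outcomes = new Array<Outcome<T>>(jobs.length);
  // Treat a limit below 1 as 1 so the pool always makes progress.
  const slots = Math.min(Math.max(1, limit), jobs.length);

  return new Promise<Outcome<T>[]>((resolve) => {
    if (jobs.length === 0) {
      resolve(outcomes);
      return;
    }

    let nextIndex = 0;
    let settled = 0;

    const startNext = (): void => {
      if (nextIndex >= jobs.length) return; // nothing left to launch
      const index = nextIndex++;
      const job = jobs[index];

      // Launch without awaiting. Wrapping the call in `new Promise` also
      // turns a synchronous throw from `job()` into a rejection.
      new Promise<T>((resolveJob) => resolveJob(job()))
        .then(
          (value): Outcome<T> => ({ ok: true, value }),
          (error: unknown): Outcome<T> => ({ ok: false, error }),
        )
        .then((outcome) => {
          outcomes[index] = outcome;
          settled++;
          if (settled === jobs.length) resolve(outcomes);
          else startNext(); // refill the slot that just freed up
        });
    };

    // Fill every slot up front; completions drive all later launches.
    for (let i = 0; i < slots; i++) startNext();
  });
}
```

The key property is that no code path awaits a whole job inline: each completion callback either launches the next job or, once everything has settled, resolves the outer promise.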

Testing & Validation

Regression Test: Added a new test case in src/commands/test.test.ts using equal-delay mock responses.

  • Before fix: Fails (active job count drops to 1 for tail jobs).
  • After fix: Passes (active job count correctly remains at 3).
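The measurement behind a test like this can be sketched without any test framework. The probe below is an assumption-laden illustration, not the test added in `src/commands/test.test.ts`: `Scheduler`, `ConcurrencyReport`, and `measureConcurrency` are hypothetical names, and the real test mocks trigger responses rather than using bare timers.

```typescript
// Hypothetical probe; `Scheduler` stands in for whatever entry point the
// real test drives.
type Scheduler = (jobs: Array<() => Promise<void>>, limit: number) => Promise<unknown>;

interface ConcurrencyReport {
  peak: number;             // highest number of jobs in flight at once
  activeAtLaunch: number[]; // in-flight count observed as each job starts
}

async function measureConcurrency(
  schedule: Scheduler,
  jobCount: number,
  limit: number,
  delayMs = 5,
): Promise<ConcurrencyReport> {
  let active = 0;
  const report: ConcurrencyReport = { peak: 0, activeAtLaunch: [] };

  const jobs = Array.from({ length: jobCount }, () => async (): Promise<void> => {
    active++;
    report.activeAtLaunch.push(active);
    report.peak = Math.max(report.peak, active);
    // Same delay for every job, so each wave becomes due at the same time.
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    active--;
  });

  await schedule(jobs, limit);
  return report;
}
```

A regression test built on this would assert that `peak` equals the limit and that the entries of `activeAtLaunch` after the first wave do not fall back to 1.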

CI Gates: All local checks pass.

  • npm run lint and npm run format:check
  • npm run typecheck
  • npm test (1448 tests passing)
  • npm run test:coverage (≥80% threshold met)

Files Changed:

  • src/commands/test.ts
  • src/commands/test.test.ts

create-batch --run launched its first concurrencyLimit triggers in
parallel, but the steady-state loop awaited each subsequent job to
fully finish (trigger + full --wait poll) before launching the next
one. Effective concurrency dropped to 1 after the initial wave
regardless of --max-concurrency.

Switch to the launch-then-race pattern already used by the other
three fan-outs in this file: launch up to the limit, relaunch on
each completion via startNext(), never await a whole job inline.

Add a regression test using equal-delay trigger responses so the
first wave settles in the same microtask batch, which is the exact
condition that exposed the bug.