fix(interpreter): squad-shared mutable state has lost-update race #128

@humancto

Description

Background

The unit test interpreter::tests::spawn_task_still_shares_lambda_closure_with_parent (added in PR #114) was designed to verify that PR #110's closure-deep-clone change did NOT break spawn_task's shared-mutable-closure semantics. It runs:

let mut count = 0
let bump = fn() {
    count = count + 1
    return count
}
squad {
    spawn { bump() }
    spawn { bump() }
    spawn { bump() }
}

The test then asserted count == 3.

CI on Linux observed count == 2, while local runs on (faster) macOS reliably see 3. The cause is a race: count = count + 1 is not atomic across spawns, because each Forge expression takes the closure scope mutex separately:

  1. Get count → take lock, read 0, release.
  2. Compute count + 1 → arithmetic outside the lock.
  3. Set count → take lock, write 1, release.

Two threads can interleave: T1 reads 0, T2 reads 0, T1 writes 1, T2 writes 1. One increment is lost. With three spawns, losing a single update this way yields the count == 2 seen in CI.
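The interleaving above can be reproduced deterministically outside the interpreter. The following is a minimal Rust sketch, not the interpreter's actual code: racy_increment is a hypothetical helper, and the Barrier exists only to force every task to finish its read before any task writes, which is the worst-case schedule.

```rust
use std::sync::{Arc, Barrier, Mutex};
use std::thread;

/// Hypothetical helper: runs `tasks` threads that each perform a
/// non-atomic `count = count + 1` on a mutex-guarded counter and
/// returns the final value.
fn racy_increment(tasks: usize) -> i64 {
    let count = Arc::new(Mutex::new(0_i64));
    // Assumption for the demo: the barrier forces all reads to happen
    // before any write, making the lost update reproducible.
    let barrier = Arc::new(Barrier::new(tasks));

    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let count = Arc::clone(&count);
            let barrier = Arc::clone(&barrier);
            thread::spawn(move || {
                // Step 1: Get count -> take lock, read, release.
                let seen = *count.lock().unwrap();
                barrier.wait();
                // Step 2: arithmetic outside the lock.
                let next = seen + 1;
                // Step 3: Set count -> take lock, write, release.
                *count.lock().unwrap() = next;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let final_value = *count.lock().unwrap();
    final_value
}
```

Under this forced schedule every task reads 0 and writes 1, so racy_increment(3) returns 1 rather than 3. Real schedules are less adversarial, which is why CI lost only one of the three updates.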

PR #124 worked around this by relaxing the assert to count >= 1 (proves sharing happened, even if updates were lost).

The real fix

squad's shared mutable state has no atomic read-modify-write primitive today. Three options:

  1. Atomic builtin -- count.atomic_add(1) that holds the scope lock across read+write.
  2. shared { } block syntax -- explicit cross-task state with intrinsic atomicity. (Already on the roadmap.)
  3. Document and warn -- squad-shared mutables are shared but not atomic: a read-modify-write such as count = count + 1 can lose updates. If you need atomicity, pass values between tasks over a channel.
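As a rough illustration of option 1, here is a minimal Rust sketch of a read-modify-write that holds the scope lock across both the read and the write. The Scope type, atomic_add, and run_squad are hypothetical names for this example and do not reflect the interpreter's real data structures.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical stand-in for a closure scope shared between tasks.
type Scope = Arc<Mutex<HashMap<String, i64>>>;

/// Adds `delta` to `name` and returns the new value.
/// The lock is held for the whole read-modify-write, so no other
/// task can observe or overwrite the intermediate state.
fn atomic_add(scope: &Scope, name: &str, delta: i64) -> i64 {
    let mut vars = scope.lock().unwrap();
    let slot = vars.entry(name.to_string()).or_insert(0);
    *slot += delta;
    *slot
}

/// Spawns `tasks` threads that each bump "count" once and
/// returns the final value after all of them have finished.
fn run_squad(tasks: usize) -> i64 {
    let scope: Scope = Arc::new(Mutex::new(HashMap::new()));
    scope.lock().unwrap().insert("count".to_string(), 0);

    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let scope = Arc::clone(&scope);
            thread::spawn(move || {
                atomic_add(&scope, "count", 1);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let total = *scope.lock().unwrap().get("count").unwrap();
    total
}
```

With the lock spanning the read and the write, run_squad(3) yields 3 regardless of how the threads are scheduled.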

Option 2 is the long-term answer. In the meantime, option 3 should land in CLAUDE.md so users don't write counter-increment patterns expecting them to work under squad.
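For the documentation, the channel-based alternative could be illustrated along these lines. This issue does not show Forge's channel API, so the sketch uses Rust's std::sync::mpsc purely to show the pattern; count_via_channel is a hypothetical name.

```rust
use std::sync::mpsc;
use std::thread;

/// Each task sends its increment over a channel instead of mutating
/// shared state; a single owner sums the messages, so there is no
/// concurrent read-modify-write to race on.
fn count_via_channel(tasks: usize) -> i64 {
    let (tx, rx) = mpsc::channel::<i64>();

    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let tx = tx.clone();
            thread::spawn(move || {
                tx.send(1).unwrap();
            })
        })
        .collect();

    // Drop the original sender so the receiver's iterator ends once
    // every task's clone has been dropped.
    drop(tx);
    for handle in handles {
        handle.join().unwrap();
    }

    let total: i64 = rx.iter().sum();
    total
}
```

Because only the receiving side ever touches the running total, count_via_channel(3) returns 3 on any schedule.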

Origin: discovered while debugging CI flake on PR #124.
