fix(join): declare loop-spawned closure locals to avoid soft-scope race#71
Merged
Conversation
In `_join(::Symbol, ::Any, ::DTable)` the `process_one_chunk` closure
assigned to `inner_l`, `inner_r`, and `outer_l`. Those same identifiers
are also assigned later in the enclosing `_join` body (the
`intersect`/`Vector{UInt}()` initialization for the outer-row merge).
Under Julia's soft-scope rules that turns them into captured bindings
shared between the enclosing function and every concurrent
`Dagger.@spawn`ed invocation of the closure.
With N r-chunks the closure runs N times in parallel, all writing to
the same three binding cells. The tuple destructure
`inner_l, inner_r = match_inner_indices(...)` is two stores, so a
context switch between them lets one task observe another task's
`inner_l` paired with its own `inner_r` — mismatched lengths trigger a
`BoundsError` in `build_joined_table`, and even when lengths happen to
match, `find_outer_indices` runs against the wrong `inner_l` and the
final merge ends up with extra rows.
Declaring `local inner_l, inner_r, outer_l` in the closure body forces
fresh per-invocation bindings and resolves the race. Re-enables the
previously commented-out `lj10`/`ij10` `DTable`-on-`DTable` join tests.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
3d21233 to
80ecd3a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
_join(::Symbol, ::Any, ::DTable)'sprocess_one_chunkclosure assigns toinner_l,inner_r, andouter_l— names that are also assigned later in the enclosing_joinbody (theintersect/Vector{UInt}()initialization for the outer-row merge). Julia's soft-scope rules therefore make those identifiers captured bindings shared between the enclosing function and everyDagger.@spawned invocation of the closure.With N r-chunks the closure runs N times in parallel, all reading/writing the same three binding cells. The tuple destructure
is two stores; a context switch between them lets one task observe another task's
inner_lpaired with its owninner_r. Mismatched lengths trigger aBoundsErrorinbuild_joined_table, and even when lengths happen to match,find_outer_indices(l, inner_l)runs against the wronginner_land the final merge ends up with extra rows. The bug manifested asDTable-on-DTableleftjoin/innerjoinreturning the wrong row count under-t>1.The fix is one line inside the closure:
local inner_l, inner_r, outer_lThat forces fresh per-invocation bindings and resolves the race. No other
Dagger.@spawnsite insrc/has the same trap — they all use_-prefixed parameter names, have inner locals that don't collide with the enclosing function, or are only spawned once.This also re-enables the previously commented-out
lj10/ij10assertions intest/table.jlthat exerciseDTable-on-DTablejoins.Test plan
368/368underjulia -t8lj10/ij10DTable-on-DTablejoin asserts re-enabled and passleftjoin+innerjointrials on-t8—0/50mismatches (vs reliably wrong onmain)determine_schemacache per iteration —0/30mismatches🤖 Generated with Claude Code