fix: don't mutate shared matrix AST when resolving ref: rows#2894
Open
amitmishra11 wants to merge 2 commits into
Open
fix: don't mutate shared matrix AST when resolving ref: rows#2894amitmishra11 wants to merge 2 commits into
ref: rows#2894amitmishra11 wants to merge 2 commits into
Conversation
resolveMatrixRefs mutated the *ast.Matrix passed to it in place. That Matrix is part of the cached, shared Task AST (the same *ast.For/*ast.Matrix is reused for every call to a task), so concurrent invocations of a task with a `for: matrix: ref:` loop - e.g. via parallel deps that call the same internal task with different vars - raced on the row.Value assignment. The result was silent cross-contamination: one caller's resolved ref value could leak into another caller's matrix expansion, non-deterministically. Matrix.DeepCopy() already existed but didn't actually deep-copy anything: deepcopy.OrderedMap only deep-copies a value if it implements the Copier interface, and *MatrixRow did not, so the "copy" still shared the same *MatrixRow pointers as the original. Fix: - Add MatrixRow.DeepCopy(), so Matrix.DeepCopy() produces independent rows. - resolveMatrixRefs now deep-copies the matrix before resolving refs into it, and returns the copy instead of mutating its input. Added a regression test that drives resolveMatrixRefs concurrently from many goroutines against the same shared matrix with different ref values, and asserts both that each caller gets its own value back and that the original shared matrix is never mutated. Reverting the fix reliably reproduces the exact cross-contamination described in the issue. Fixes go-task#2890
There was a problem hiding this comment.
Pull request overview
Fixes a concurrency bug where resolveMatrixRefs mutated the shared/cached Task AST matrix in-place, allowing ref: row values to leak across concurrent invocations of the same task.
Changes:
- Make matrix
ref:resolution operate on a deep-copied matrix and return the resolved copy to callers. - Implement
MatrixRow.DeepCopy()soMatrix.DeepCopy()produces truly independent row objects. - Add a regression test that drives concurrent
resolveMatrixRefscalls against the same shared*ast.Matrixand asserts no cross-contamination / no input mutation.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| variables.go | Updates itemsFromFor to use a resolved matrix copy and changes resolveMatrixRefs to return a deep-copied resolved matrix. |
| variables_internal_test.go | Adds a concurrent regression test ensuring matrix ref resolution doesn’t mutate shared AST and doesn’t leak values across goroutines. |
| taskfile/ast/matrix.go | Adds MatrixRow.DeepCopy() so Matrix.DeepCopy() actually deep-copies rows (avoiding shared row pointers). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
- Skip the deep-copy in resolveMatrixRefs when the matrix has no ref: rows, avoiding unnecessary allocation on every invocation of a static matrix loop. - Check cache.Err() right after templater.ResolveRef, matching the existing pattern in resolveEnumRefs, so a real templating error isn't masked as "must resolve to a list". - In the regression test, record an explicit error instead of silently leaving a result empty when a goroutine's row lookup fails, so such failures aren't misreported as cross-contamination.
Author
|
Addressed all three review comments in 4641a56:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #2890
Bug
resolveMatrixRefsmutated the*ast.Matrixpassed to it in place. ThatMatrixlives on the cached, shared*ast.Task(the same*ast.For/*ast.Matrixobject is reused on every invocation of a task), so when a task with afor: matrix: ref:loop is invoked concurrently — e.g. by two paralleldepsthat call the same internal task with different vars — both calls race on therow.Value = valueassignment inresolveMatrixRefs.Reproduction from the issue:
Intermittently,
ARCH_VAR/BINS_VARresolved for one caller leak into the other caller's matrix expansion, silently producing wrong output with no error.Root cause (two layers)
resolveMatrixRefsresolvedref:rows directly on the matrix it was given, instead of a copy.Matrix.DeepCopy()already existed, but didn't actually deep-copy the rows:deepcopy.OrderedMaponly deep-copies a value if it implements theCopierinterface (DeepCopy() T), and*MatrixRowdidn't implement it. SoMatrix.DeepCopy()'s "copy" still pointed at the exact same*MatrixRowobjects as the original — using it as-is would not have fixed the race.Fix
MatrixRow.DeepCopy()soMatrix.DeepCopy()produces independent rows.resolveMatrixRefsnow deep-copies the matrix first, resolves refs into the copy, and returns the copy. The caller (itemsFromFor) uses the returned, resolved matrix instead of the original.Tests
Added
TestResolveMatrixRefsDoesNotMutateSharedMatrix(variables_internal_test.go), which:resolveMatrixRefsfrom 200 goroutines against the same shared*ast.Matrix, each with a different ref value, and asserts every caller gets back its own value (no cross-contamination).RefandValueunchanged after all concurrent calls).I verified the test actually catches the regression: temporarily reverted the fix (mutate-in-place + no deep copy) and reran the test — it failed reliably and reproduced cross-contamination with the exact symptom from the issue (e.g. "got ARCH=amd64, want arm64"), plus a direct assertion that the shared matrix had been mutated. Restoring the fix makes it pass consistently across repeated runs.
Note: I could not run this under
go test -racein my environment (no 64-bit cgo-capable C compiler available), so I can't show the race detector flagging it directly — the regression test instead proves the bug via direct mutation/cross-contamination assertions, which catch the issue deterministically regardless of goroutine scheduling.Verification
go build ./...— cleango vet ./...— cleango test ./go test ./taskfile/...— all pass (one pre-existing, unrelated failure inTestEvaluateSymlinksInPathsdue to no symlink permissions on Windows; confirmed it fails identically on unmodifiedmain)TestForCmds/TestForDepsmatrix-ref test cases (loop-matrix,loop-matrix-ref,loop-matrix-ref-error) still pass unchanged