Increase test coverage#6
Open
mt-caret wants to merge 5 commits into
Open
Conversation
The in-bounds test `dim < dims_length || dims_length + dim >= 0` accepted every integer: a positive out-of-range dim satisfies the second clause and a negative out-of-range dim the first. Out-of-range reduction dims thus passed shape inference, which returned a wrong (unchanged) output shape instead of an error. Switch to `&&` so the range is actually enforced. Add test/test_op.ml exercising Op.infer_shape across sum (bounds, duplicates, keep_dims, bool), broadcast (rank padding + dim stretching), reshape (-1 inference), matmul (rank/contraction/type), transpose (rank-2 only), and elementwise rank-0/dim-agreement/bool typing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nt checks Reverse-mode grad of `matmul x v` with the matrix `x` as the differentiated variable raised "unsupported transpose dimensions": the transpose rule did `matmul cotangent (transpose v)`, but `transpose` only supports rank 2, so a rank-1 `v` failed. When `v` is a vector, form the outer product `cotangent[n,1] @ v[1,m]` instead. Add test/test_gradcheck.ml: a central-difference gradient-check harness and a battery over every op plus compositions, validating the vjp transpose rules independently of the forward-mode jvp rules (includes the matmul-vector case). Record findings so far in PLAN.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extend the eager-vs-XLA differential tests (forward and gradient) from single ops to random multi-op programs (op_nums 1..4), and let the Sum generator emit keep_dims=false as well as true. Add a periodic Gc.full_major to keep XLA's per-compile thread leak from exhausting pthread_create over the larger trial count. Two test-harness fixes the broadening surfaced (not library bugs): - The Sum/Matmul generator arms could build float-only ops over the bool shapes that now appear as intermediates (from eq/gt/lt); filter candidate operand shapes through Op.infer_shape, as the other arms already do. - Tensor.allclose compares with an absolute 1e-7 tolerance (Float.robustly_compare), which large-magnitude reductions like sum(exp x) legitimately exceed by ~1 ULP due to summation order differing between backends. Compare with a relative tolerance in the differential tests; leave Tensor.allclose untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Op.infer_shape ran its duplicate-dimension check on the raw reduction dims, before normalizing negative indices. So `sum ~dims:(`Just [0; -3])` on a rank-3 tensor (0 and -3 are the same axis) passed with a wrong output shape, while the literal duplicate `[0; 0]` was correctly rejected. Downstream the eager backend double-reduced the axis (tripping the internal shape-consistency check) and XLA raised "Duplicate reduction dimension". Normalize before the dedup check. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…o limits Two JIT bugs found by the multi-agent sweep: - A single failed XLA compilation poisoned all later ones: fox_jit used one process-global builder, and an XLA builder replays its first error from every later build. Create a fresh builder per compilation. (HLO op-ids are now numbered per-compile; the fox_jit HLO expect tests were re-promoted, same HLO.) - reshape with a -1 placeholder aborted the process (C++ CHECK -> SIGABRT) because raw dims were passed to Xla.Op.reshape; resolve -1 via infer_shape first, as eager already does. Two limitations documented with CR notes + tracking tests (test_robustness.ml), as both fixes are larger/riskier: - bool constants are unsupported under JIT (const params hard-coded to F64, and the literal path is float-only); - grad/vjp raises when an output is independent of the input (empty return_vals vs Expr's Nonempty_list; forward mode already yields a zero tangent). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
No description provided.