From 8751bb9ba4e8de07512b55203e1ccf4090ef5757 Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Wed, 10 Jun 2026 23:06:18 -0700 Subject: [PATCH 01/11] =?UTF-8?q?each=5Fkv(table)=20=E2=80=94=20kv-pair=20?= =?UTF-8?q?iteration=20as=20named=20tuples;=20fix=20generator=20zip=20lowe?= =?UTF-8?q?ring=20over=20an=20empty=20source?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit each_kv(tab) yields (key, value) named tuples — read-only copies, strict can_copy gate on the value type (no clone fallback; matches insert's own gate). Explicit reject overloads for table void values ("iterate keys() instead") and dim-array values, which otherwise mis-bind and cascade. Pure daslib: a generator zipping the keys/values builtin slot-walk iterators. PR1 of the LINQ table-source arc (plan: benchmarks/sql/LINQ_TO_TABLE.md). Also fixes a pre-existing generator-lowering bug exposed by the empty-table test: the yield-for lowering emitted `loop &&= _builtin_iterator_first(...)` per source, short-circuiting first() on later sources when an earlier one came up empty — but end_loop closes ALL sources, and closing a never-opened container iterator unlocks a container whose lock magic was already cleared ("table/array magic mismatch on unlock"). Reachable on master by any generator zipping two lockable containers with the first one empty. Now emits `loop = first(...) && loop`, matching SimNode_ForWithIterator's always-evaluate-first semantics. Regression: tests/language/generator_zip_empty.das (written first, failed, now green). Validation: full INTERP suite 10891/0 fail; AOT tests/language 1054 + tests/linq 1893; JIT lane green on new files; lint (MCP + CI) clean; das2rst no stubs/Uncategorized; Sphinx clean. 
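Illustration only (not part of the change): a minimal C++ sketch of why operand order matters in the lowering fix described above. `Source`, `open_short_circuit`, `open_always` and `count_opened` are hypothetical stand-ins invented for this note; they are not daslang runtime types and do not model the real lock magic. The sketch assumes only that `first()` is what "opens" a source, and that the close path later touches every source regardless.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one loop source (assumption, not a real daslang type).
struct Source {
    bool has_items = false;  // whether first() would find an element
    bool opened = false;     // set once first() has actually run
    bool first() { opened = true; return has_items; }
};

// Old shape, `loop &&= first(src)`: once loop is false, first() is skipped,
// so later sources are never opened.
inline bool open_short_circuit(std::vector<Source>& sources) {
    bool loop = true;
    for (auto& s : sources) loop = loop && s.first();
    return loop;
}

// New shape, `loop = first(src) && loop`: first() is the LEFT operand,
// so it runs for every source even after an earlier one came up empty.
inline bool open_always(std::vector<Source>& sources) {
    bool loop = true;
    for (auto& s : sources) loop = s.first() && loop;
    return loop;
}

// Counts how many sources were actually opened by one of the helpers above.
inline std::size_t count_opened(const std::vector<Source>& sources) {
    std::size_t n = 0;
    for (const auto& s : sources) if (s.opened) ++n;
    return n;
}
```

With two sources where the first is empty, the old shape leaves the second source unopened while the new shape opens both; in either case the loop body does not run. That unopened second source is the state the close path then trips over.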
Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 107 ++++++++++++++++++ daslib/builtin.das | 46 ++++++++ doc/reflections/das2rst.das | 2 +- ...ion-builtin-each_kv-0xdb81e5ca7a0e3baa.rst | 1 + src/ast/ast_generate.cpp | 11 +- tests/language/failed_each_kv.das | 23 ++++ tests/language/generator_zip_empty.das | 65 +++++++++++ tests/language/table_each_kv.das | 84 ++++++++++++++ tests/linq/test_linq_table_source.das | 32 ++++++ 9 files changed, 367 insertions(+), 4 deletions(-) create mode 100644 benchmarks/sql/LINQ_TO_TABLE.md create mode 100644 doc/source/stdlib/handmade/function-builtin-each_kv-0xdb81e5ca7a0e3baa.rst create mode 100644 tests/language/failed_each_kv.das create mode 100644 tests/language/generator_zip_empty.das create mode 100644 tests/language/table_each_kv.das create mode 100644 tests/linq/test_linq_table_source.das diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md new file mode 100644 index 000000000..579356b8b --- /dev/null +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -0,0 +1,107 @@ +# LINQ → TABLE — arc plan + +Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of record for +`table` / `table` as the 6th `_fold` source, plus the `to_table` sink. +Edited in-place as PRs land. + +Status: **PR1 in flight** (`each_kv` builtin). + +PR1 findings: +- **Pre-existing generator-lowering bug, fixed in PR1**: the yield-for lowering emitted + `loop &&= _builtin_iterator_first(...)` per source — short-circuiting `first()` on later + sources when an earlier one came up empty, while the end-of-loop path closes ALL sources. + Closing a never-opened container iterator unlocks a container whose lock magic was already + cleared → "table/array magic mismatch on unlock". Reachable before each_kv (any generator + zipping two lockable containers, first one empty). Fix: `loop = first(...) && loop` + (ast_generate.cpp), matching SimNode_ForWithIterator. 
Regression: + `tests/language/generator_zip_empty.das`. +- `each_kv` needs explicit reject overloads for `table` (void values → "iterate keys() + instead") and dim-array values — the bare generic otherwise mis-binds (valT drops the dim) + and cascades confusing errors from inside builtin.das. +- Tier-2 chain heads need `unsafe(each_kv(tab))` — same `[unsafe_outside_of_for]` contract as + `each(arr)`; fused chains (PR2) rewrite the head before inference so the wrap disappears. +- builtin module documents via handmade RST (it is a `get_module` C++-flow module in das2rst), + so each_kv has both `//!` in-source docs and a filled handmade file. + +## Settled decisions + +- **kv surface** = `kv.key` / `kv.value` named tuple, **read-only** (a by-value tuple has no + write-through; in-place mutation stays the domain of `for (k, v in keys(t), values(t))`). +- **Pipe head** = `each_kv(tab)`; `keys(tab)` / `values(tab)` are recognized as table sources too. +- **`each_kv` is pure daslib** with a strict `can_copy` gate on the value type — no clone + fallback, ever (a hidden per-element `clone` of an `array<…>` value is the exact sadness the + gate bans). Matches existing language ergonomics: plain `insert` already concept-asserts + `can_copy` on values ([builtin.das ~921](../../daslib/builtin.das)), so non-copyable-valued + tables only arise via `insert_clone` / `tab[k] <- v`. +- **Uniform gate enforcement falls out free**: for non-copyable values `extract_table_source` + returns null → the chain defers to tier-2 → the real `each_kv` instantiates → `concept_assert` + fires (error 31400). One error source; deferral never silently changes semantics. +- Shape (probe-validated): two const/var overloads mirroring `keys`, + `generator -const> capture(<- kit, <- vit)` zipping the two builtin + slot-walk iterators. Multi-source `for` + `yield` works in *generators*; iterator + *comprehensions* reject it ("can't yield from inside the block") — hence the generator form. 
+- No profiling pre-PR; straight to m7 bench lanes. Scan lanes before the join probe. Sink in + this arc. + +## PR sequence + +1. **`each_kv` in builtin.das** — the validated shape next to `keys`/`values` + (`[unsafe_outside_of_for, nodiscard]`); das2rst "Containers" group; tests + (`tests/language/table_each_kv.das` + `failed_` can_copy compile-fail); INTERP/AOT/JIT. + Standalone value: a kv iterator for plain `for` loops. +2. **`TableAdapter` core (`daslib/linq_fold_table.das`) + m7.** `extract_table_source` + name-matches `each_kv`/`keys`/`values` at the spine head, **type-gated on the arg being a + table** (names too generic to trust bare). Three lanes: keys (by value), values (by ref), + kv — `wrap_source_loop` emits `for (k, v in keys(t), values(t))`, `RowFieldFlattener` + rewrites the field reads, and **usage pruning** drops to a keys-only / values-only + single-iterator walk when the body touches one side (the table analog of XML field-pruning). + Capabilities: `can_reserve_by_length` / `supports_direct_return` = true, `count_shortcut` → + `length(tab)`, any/empty → `!empty(tab)`, `distinct` on keys/kv → identity (keys unique by + construction; values-lane distinct stays real). `can_group_by`/`can_join` = false → tier-2. + New `benchmarks/sql/table.das` with `_m7` runners (fixture `table`; + expected values order-insensitive — slot order ≠ insertion order), results.md re-sweep, + linq_fold_patterns.rst rows, linq_fold.md module-layout update, fused-vs-tier-2 agreement + tests. +3. **`%linq!` `from_in` arm.** `from kv in tab` → `each_kv(tab)` (table-typed value dispatch, + no annotation needed — like arrays); set form `from k in s` over `table` → `keys(s)`. + linq_das.rst update. +4. **Point-lookup folds.** `where(kv.key == X)` + terminator, X loop-invariant: + `any`/`contains` → `key_exists`, `first`/`first_or_default` (± trailing + `select(kv.value…)`) → `tab?[X]` probe, `count` → `key_exists ? 1 : 0`; set-form + `contains(x)` → `key_exists`. 
The table analog of the JSON const-key fold. m7 point-lookup + bench lane vs linear scan. +5. **Join probe.** `emit_join_hook`: when srcB is `each_kv(tab)`/`keys(tab)` and the b-key + selector is bare `kv.key`, probe the user's table instead of building the join's internal + `table`. Semantics are exactly inner-equi-join with unique B keys — which a das table + guarantees. Bench vs the build-side baseline. +6. **`to_table` / `to_table_move` terminators.** Chain of `tuple` (incl. kv elements) → + `table`; chain of bare hashable K → `table` set. Selector-free — key/value shaping + composes via a preceding `select(k => v)`, matching the existing `to_table` vocabulary over + tuple arrays ([builtin.das ~1664](../../daslib/builtin.das)). Tier-2 generic in linq.das + + fused insert-loop emit (reserve when count is known). Duplicate-key policy: das `insert` + semantics (last-wins), documented — not C#'s throw. + +End of arc: `skills/linq.md` + linq docs mention the table source. + +## Risks / watch items + +- **Mangler ICE 50609** (iterator element-const collision) — `each_kv` yields `-const` non-ref + tuples; the known footgun lives in iterator-typed generic params on the tier-2 side; + mitigation (const-qualify) is known. +- **Lock semantics unchanged**: fused loops use the same builtin iterators as hand code — + mutating the table mid-chain panics exactly as today. +- `values()` on `table` already concept-asserts, so set-form `each_kv` errors cleanly for + free. + +## Deferred edges (named, not built) + +- **Key-as-handle deferred materialization**: for `order_by` over kv with large (copyable) + values, buffer `(orderKey, key)` surrogates and materialize survivors via `tab?[key]` — K + probes instead of N value copies. The table handle is its key; clean fit for the existing + 4-hook surface. Revisit once m7 numbers show whether it matters. +- Set-ops probe (`except`/`intersect` where the *other* side is a `table`) — rides the + engine-wide set-ops edge. 
+- Fused-kv-over-non-copyable values (loosening the uniform gate) — only if a real use case + begs. +- Dim-array-valued tables (`table`) in `each_kv` — `keys`/`values` carry dedicated + overloads; add an `each_kv` one only on demand. diff --git a/daslib/builtin.das b/daslib/builtin.das index 0ae5b349c..389955c11 100644 --- a/daslib/builtin.das +++ b/daslib/builtin.das @@ -1392,6 +1392,52 @@ def values(var a : table ==const | #) : iterator ==const | #) { + concept_assert(false, "can't each_kv a table<...; void> — iterate keys() instead") +} + +def each_kv(var a : table ==const | #) { + concept_assert(false, "can't each_kv a table<...; void> — iterate keys() instead") +} + +def each_kv(a : table ==const | #) { + concept_assert(false, "each_kv of a table with dim-array values is not supported") +} + +def each_kv(var a : table ==const | #) { + concept_assert(false, "each_kv of a table with dim-array values is not supported") +} + +[unsafe_outside_of_for, nodiscard] +def each_kv(a : table ==const | #) : iterator -const> { + //! Iterates over a table as `(key, value)` named tuples. Both fields are copies (read-only view); + //! requires a copyable value type — non-copyable values (arrays, tables) are rejected at compile time. + concept_assert(typeinfo can_copy(type), "each_kv requires a copyable value type") + var kit <- unsafe(keys(a)) + var vit <- unsafe(values(a)) + return <- generator -const> capture(<- kit, <- vit) { + for (k, v in kit, vit) { + yield (key = k, value = v) + } + return false + } +} + +[unsafe_outside_of_for, nodiscard] +def each_kv(var a : table ==const | #) : iterator -const> { + //! Iterates over a table as `(key, value)` named tuples. Both fields are copies (read-only view); + //! requires a copyable value type — non-copyable values (arrays, tables) are rejected at compile time. 
+ concept_assert(typeinfo can_copy(type), "each_kv requires a copyable value type") + var kit <- unsafe(keys(a)) + var vit <- unsafe(values(a)) + return <- generator -const> capture(<- kit, <- vit) { + for (k, v in kit, vit) { + yield (key = k, value = v) + } + return false + } +} + def get_key(a : table ==const; value) { concept_assert(false, "can't get_key of a table<...; void>") } diff --git a/doc/reflections/das2rst.das b/doc/reflections/das2rst.das index 8c2e94975..80400e330 100644 --- a/doc/reflections/das2rst.das +++ b/doc/reflections/das2rst.das @@ -161,7 +161,7 @@ def document_module_builtin(_root : string) { hide_group(group_by_regex("Internal pointer arithmetics", mod, %regex~i_das_%%)), hide_group(group_by_regex("Internal clone infrastructure", mod, %regex~clone%%)), hide_group(group_by_regex("Internal finalize infrastructure", mod, %regex~finalize%%)), - group_by_regex("Containers", mod, %regex~(capacity|clear|length|resize|resize_no_init|reserve|each|emplace|emplace_from|erase|find| + group_by_regex("Containers", mod, %regex~(capacity|clear|length|resize|resize_no_init|reserve|each|each_kv|emplace|emplace_from|erase|find| find_for_edit|find_if_exists|find_index|find_index_if|has_value|key_exists|keys|values|get_key|lock|each_enum|each_ref| find_for_edit_if_exists|lock_forever|next|nothing|pop|push|push_from|push_clone|push_clone_from|back|sort|stable_sort|to_array|to_table|to_array_move| to_table_move|empty|subarray|insert|move_to_ref|copy_to_local|move_to_local|get|remove_value|erase_if|resize_and_init| diff --git a/doc/source/stdlib/handmade/function-builtin-each_kv-0xdb81e5ca7a0e3baa.rst b/doc/source/stdlib/handmade/function-builtin-each_kv-0xdb81e5ca7a0e3baa.rst new file mode 100644 index 000000000..bc34a40cd --- /dev/null +++ b/doc/source/stdlib/handmade/function-builtin-each_kv-0xdb81e5ca7a0e3baa.rst @@ -0,0 +1 @@ +Iterates over a table as ``(key, value)`` named tuples. 
Both fields are copies — a read-only view in unspecified (slot) order; the value type must be copyable, and non-copyable values (arrays, tables) are rejected at compile time. diff --git a/src/ast/ast_generate.cpp b/src/ast/ast_generate.cpp index 9325f3f03..55c934521 100644 --- a/src/ast/ast_generate.cpp +++ b/src/ast/ast_generate.cpp @@ -1631,13 +1631,18 @@ namespace das { vvar->init = rein; veqt->variables.push_back(vvar); blk->list.push_back(veqt); - // loop &= _builtin_iterator_first(it0,pvar0) + // loop = _builtin_iterator_first(it0,pvar0) && loop + // first() on the LEFT so it runs for EVERY source even when an earlier one came up + // empty (matches SimNode_ForWithIterator) — end_loop closes all sources, and closing + // a never-opened container iterator unlocks a container whose lock was never taken. auto cbif = new ExprCall(expr->at, "_builtin_iterator_first"); cbif->generated = true; cbif->arguments.push_back(new ExprVar(expr->at, srcName)); cbif->arguments.push_back(new ExprVar(expr->at, pVarName)); - auto lande = new ExprOp2(expr->at,"&&=", - new ExprVar(expr->at,loopVar),cbif); + auto land = new ExprOp2(expr->at,"&&", + cbif,new ExprVar(expr->at,loopVar)); + auto lande = new ExprCopy(expr->at, + new ExprVar(expr->at,loopVar),land); blk->list.push_back(lande); } auto bll = new ExprLabel(expr->at, begin_loop_label, diff --git a/tests/language/failed_each_kv.das b/tests/language/failed_each_kv.das new file mode 100644 index 000000000..f151fe0b7 --- /dev/null +++ b/tests/language/failed_each_kv.das @@ -0,0 +1,23 @@ +// each_kv compile-time rejections: non-copyable value type, void values (set form — use keys()), +// and dim-array values (no dedicated overload). One statement per reject; the void/dim arms +// cascade the standard "void iteration" pair (30192/30107) behind the concept_assert. 
+options gen2 + +expect 31400:3, 30192:2, 30107:2 + +[export] +def main() { + var nonCopyable : table> + var n = 0 + for (kv in each_kv(nonCopyable)) { + n++ + } + var voidValues : table + for (kv in each_kv(voidValues)) { + n++ + } + var dimValues : table + for (kv in each_kv(dimValues)) { + n++ + } +} diff --git a/tests/language/generator_zip_empty.das b/tests/language/generator_zip_empty.das new file mode 100644 index 000000000..6db1ca3f2 --- /dev/null +++ b/tests/language/generator_zip_empty.das @@ -0,0 +1,65 @@ +options gen2 + +require dastest/testing_boost public + +// Regression: the generator for-loop lowering emitted `loop &&= _builtin_iterator_first(...)`, +// short-circuiting first() on later sources when an earlier one came up empty — but the +// end-of-loop path closes ALL sources, and closing a never-opened container iterator unlocks +// a container whose lock magic was already cleared ("magic mismatch on unlock"). +// first() must run for every source, matching SimNode_ForWithIterator. + +[test] +def test_generator_zip_empty_source(t : T?) { + t |> run("two array iterators, first empty") @(t : T?) { + let a : array + var b : array + b |> push(1) + var ait <- unsafe(each(a)) + var bit <- unsafe(each(b)) + var g <- generator capture(<- ait, <- bit) { + for (x, y in ait, bit) { + yield x + y + } + return false + } + var n = 0 + for (_v in g) { + n++ + } + t |> equal(n, 0) + } + t |> run("two array iterators, second empty") @(t : T?) { + var a : array + let b : array + a |> push(1) + var ait <- unsafe(each(a)) + var bit <- unsafe(each(b)) + var g <- generator capture(<- ait, <- bit) { + for (x, y in ait, bit) { + yield x + y + } + return false + } + var n = 0 + for (_v in g) { + n++ + } + t |> equal(n, 0) + } + t |> run("two table iterators, table empty") @(t : T?) 
{ + let tab : table + var kit <- unsafe(keys(tab)) + var vit <- unsafe(values(tab)) + var g <- generator -const> capture(<- kit, <- vit) { + for (k, v in kit, vit) { + yield (k = k, v = v) + } + return false + } + var n = 0 + for (_kv in g) { + n++ + } + t |> equal(n, 0) + } +} diff --git a/tests/language/table_each_kv.das b/tests/language/table_each_kv.das new file mode 100644 index 000000000..5e7529130 --- /dev/null +++ b/tests/language/table_each_kv.das @@ -0,0 +1,84 @@ +options gen2 + +require dastest/testing_boost public + +struct Pt { + x : int + y : int +} + +[test] +def test_each_kv(t : T?) { + t |> run("int keys, int values") @(t : T?) { + var tab : table + for (i in range(10)) { + tab |> insert(i, i * 10) + } + var n = 0 + var ksum = 0 + var vsum = 0 + for (kv in each_kv(tab)) { + n++ + ksum += kv.key + vsum += kv.value + } + t |> equal(n, 10) + t |> equal(ksum, 45) + t |> equal(vsum, 450) + } + t |> run("string keys, float values") @(t : T?) { + var tab : table + tab |> insert("a", 1.5) + tab |> insert("b", 2.5) + var s = 0.0 + for (kv in each_kv(tab)) { + s += kv.value + } + t |> equal(s, 4.0) + } + t |> run("struct values") @(t : T?) { + var tab : table + tab |> insert(1, Pt(x = 10, y = 20)) + tab |> insert(2, Pt(x = 30, y = 40)) + var xs = 0 + var ys = 0 + for (kv in each_kv(tab)) { + xs += kv.value.x + ys += kv.value.y + } + t |> equal(xs, 40) + t |> equal(ys, 60) + } + t |> run("empty table yields nothing") @(t : T?) { + let tab : table + var n = 0 + for (_kv in each_kv(tab)) { + n++ + } + t |> equal(n, 0) + } + t |> run("agrees with zipped keys/values") @(t : T?) { + var tab : table + tab |> insert(1, "one") + tab |> insert(2, "two") + tab |> insert(3, "three") + var roundTrip : table + for (kv in each_kv(tab)) { + roundTrip |> insert(kv.key, kv.value) + } + t |> equal(length(roundTrip), length(tab)) + for (k, v in keys(tab), values(tab)) { + t |> success(key_exists(roundTrip, k)) + t |> equal(roundTrip?[k] ?? 
"", v) + } + } + t |> run("element is a copy — table unchanged") @(t : T?) { + var tab : table + tab |> insert(1, Pt(x = 10, y = 20)) + for (kv in each_kv(tab)) { + var local = kv + local.value.x = 999 + } + t |> equal((tab?[1] ?? Pt()).x, 10) + } +} diff --git a/tests/linq/test_linq_table_source.das b/tests/linq/test_linq_table_source.das new file mode 100644 index 000000000..e85407216 --- /dev/null +++ b/tests/linq/test_linq_table_source.das @@ -0,0 +1,32 @@ +options gen2 + +require dastest/testing_boost public +require daslib/linq_boost + +// Tier-2 LINQ over a table source via each_kv (the fused TableAdapter lands separately). +// each_kv is [unsafe_outside_of_for], so a chain head needs the explicit unsafe(...) wrap — +// same contract as each(arr) outside a fused chain. + +[test] +def test_each_kv_tier2(t : T?) { + t |> run("where/select/to_array over each_kv") @(t : T?) { + var tab : table + tab |> insert("a", 1) + tab |> insert("b", 2) + tab |> insert("c", 3) + var vals <- unsafe(each_kv(tab)) |> _where(_.value > 1) |> _select(_.value) |> to_array() + vals |> sort() // slot order is unspecified + t |> equal(length(vals), 2) + t |> equal(vals[0], 2) + t |> equal(vals[1], 3) + delete vals + } + t |> run("keys participate in the chain") @(t : T?) 
{ + var tab : table + for (i in range(6)) { + tab |> insert(i, i * i) + } + let n = unsafe(each_kv(tab)) |> _where(_.key % 2 == 0) |> count() + t |> equal(n, 3) + } +} From 7b93056d7552cd3a614e5f988f169dc92f81ffce Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Wed, 10 Jun 2026 23:11:25 -0700 Subject: [PATCH 02/11] linq-table arc: whole story stays on this branch; PR after the fixed-array rework merges in Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index 579356b8b..7206bda75 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -4,7 +4,14 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco `table` / `table` as the 6th `_fold` source, plus the `to_table` sink. Edited in-place as PRs land. -Status: **PR1 in flight** (`each_kv` builtin). +Status: **stage 1 committed** (`each_kv` builtin, 8751bb9ba). + +**Branch strategy (Boris, 2026-06-10):** the ENTIRE arc stays on `bbatkin/linq-table-each-kv` +as stacked stage commits — no per-stage PRs. A major fixed-array rework is in flight on master; +merging that INTO this branch once (after it lands) beats making every rework merge fight this +work. Cut the PR only after the rework has landed and been merged in here. At that merge, +re-validate the `each_kv` dim-array-value reject overload and `auto(valT)[]` matching — fixed +arrays are exactly what is being reworked. PR1 findings: - **Pre-existing generator-lowering bug, fixed in PR1**: the yield-for lowering emitted @@ -43,7 +50,7 @@ PR1 findings: - No profiling pre-PR; straight to m7 bench lanes. Scan lanes before the join probe. Sink in this arc. -## PR sequence +## Stage sequence (commits on this branch) 1. 
**`each_kv` in builtin.das** — the validated shape next to `keys`/`values` (`[unsafe_outside_of_for, nodiscard]`); das2rst "Containers" group; tests From 571fe879e5a4d487d885fbea92987e1a6e14b266 Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Wed, 10 Jun 2026 23:58:44 -0700 Subject: [PATCH 03/11] =?UTF-8?q?linq=5Ffold:=20TableAdapter=20=E2=80=94?= =?UTF-8?q?=20table/table=20as=20the=206th=20source,=20m7=20bench?= =?UTF-8?q?=20lane?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit daslib/linq_fold_table.das: TableAdapter + extract_table_source. each_kv/keys/values chain heads (name + table-typed-arg match) emit fused slot walks inside a single-param invoke binding the table. The kv lane usage-prunes the walk from the body's it.key/it.value reads: one side touched -> single-iterator keys()/values() walk (half the slot-skip work), both -> zipped two-iterator for, whole-pair escape -> named-tuple bind (copyable values only; non-copyable falls through and the surviving each_kv instantiation concept-asserts). Bare count/long_count folds to O(1) length(tab); plain distinct over raw keys/kv elements is dropped (keys unique by construction; uniqueness-preserving prefixes only). group_by/join/reverse defer to tier-2 (staged: point-lookup folds, join probe — see benchmarks/sql/LINQ_TO_TABLE.md). Notable mechanics: the qmacro grammar allows $i() only in the FIRST iterator slot of a multi-source for, so the kv zip header uses literal loop-var names (ZipAdapter's itA/itB trade); keys() yields non-const elements, so the engine-visible bind is a let-rebind (workhorse copy, free); the dispatcher clears removeConstant on cloned element types so the -const iterator spelling doesn't leak into buffer types and break push_clone unification. benchmarks/sql/table.das: m7 lane (45 families, kv-form chains, order-insensitive guards) + fixture_table in _common + m7 column in _update_results + results.md re-sweep (2026-06-10). 
INTERP profile: pruned scans sit between array and XML (sum_aggregate 13.4 ns/elem vs array 2.1 / XML 54.3 / JSON 146.7; contains_match 6.6 keys-pruned); deferred markers groupby ~160-190 / join ~195-230 / reverse_take 58.7 flag the staged tier-2 cells. tests/linq/test_linq_table_source.das: 24 fused-vs-hand-loop agreement tests across all lanes (count shortcuts, accumulators, early-exit, to_array slot-order, order/distinct/take, dropped- distinct correctness, values-distinct stays real, iterator-typed result, set form, tier-2 heads). Docs: linq_fold_patterns.rst source row, linq_fold.md layout, LINQ_TO_TABLE.md findings. Validation: full INTERP suite 10912/0 fail; AOT tests/linq 1914; JIT lane green; MCP + CI lint clean; Sphinx clean; full 6-lane bench sweep regenerated. Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 18 +- benchmarks/sql/_common.das | 18 + benchmarks/sql/_update_results.das | 4 +- benchmarks/sql/results.md | 328 ++++++----- benchmarks/sql/table.das | 620 ++++++++++++++++++++ daslib/linq_fold.das | 23 + daslib/linq_fold.md | 2 +- daslib/linq_fold_table.das | 212 +++++++ doc/source/reference/linq_fold_patterns.rst | 3 + tests/linq/test_linq_table_source.das | 207 ++++++- 10 files changed, 1266 insertions(+), 169 deletions(-) create mode 100644 benchmarks/sql/table.das create mode 100644 daslib/linq_fold_table.das diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index 7206bda75..72acd2be6 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -4,7 +4,23 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco `table` / `table` as the 6th `_fold` source, plus the `to_table` sink. Edited in-place as PRs land. -Status: **stage 1 committed** (`each_kv` builtin, 8751bb9ba). +Status: **stage 2 committed** (TableAdapter + m7; stage 1 = `each_kv` builtin, 8751bb9ba). 
+ +Stage 2 findings: +- m7 INTERP profile (2026-06-10 sweep): pruned scans sit between array and XML — `sum_aggregate` + 13.4 ns/elem (array 2.1, XML 54.3, JSON 146.7), `contains_match` 6.6 via the keys-pruned walk, + pure-select `count` hits the O(1) shortcut (0.0). Deferred markers: `groupby_count` 162.6 / + `groupby_sum` 192.8 / `join_count` 195.0 / `join_where_count` 229.1 / `reverse_take` 58.7 — + the tier-2 cells stages 4–5 erase. +- The qmacro grammar only allows `$i()` in the FIRST iterator slot of a multi-source `for` — the + kv zip header uses literal `_tab_kv_key_` / `_tab_kv_value_` names (ZipAdapter's itA/itB trade). +- `keys()` yields NON-const elements (writable temp copies) — the engine-visible bind is a `let` + rebind (workhorse copy, free); push_clone's `==const` composition needs it. +- `keys`/`each_kv` spell their element `-const` (iterator variance); the dispatcher clears + `removeConstant` on the cloned types or `array -const>` buffer spellings break + push_clone unification. +- Bare `.to_array()` is not a recognized chain for ANY source (only suffix variants like + `where_to_array` exist) — a keys-snapshot needs an op in the chain. Shared engine edge. **Branch strategy (Boris, 2026-06-10):** the ENTIRE arc stays on `bbatkin/linq-table-each-kv` as stacked stage commits — no per-stage PRs. A major fixed-array rework is in flight on master; diff --git a/benchmarks/sql/_common.das b/benchmarks/sql/_common.das index 587c47ec6..657469dc8 100644 --- a/benchmarks/sql/_common.das +++ b/benchmarks/sql/_common.das @@ -92,6 +92,24 @@ def public fixture_json(n : int) : JsonValue? { return JV([for (c in fixture_array(n)); JV(c)]) } +// Table fold lane (m7): same Car schema keyed by id in a table. Same deterministic row +// generator as fixture_array so the table lane is directly comparable to the array (m3f) lane; table +// slot order is unspecified, so m7 expectations stay order-insensitive (aggregates / counts). 
+def public fixture_table(n : int) : table { + var t <- { + for (i in range(n)); + i + 1 => Car( + id = i + 1, + name = "Car{i}", + price = (i * 37) % 1000, + brand = i % BRAND_COUNT, + year = 2010 + (i * 7) % 16, + dealer_id = (i % DEALER_COUNT) + 1 + ) + } + return <- t +} + def public fixture_dealers_array() : array { var arr : array arr |> resize(DEALER_COUNT) diff --git a/benchmarks/sql/_update_results.das b/benchmarks/sql/_update_results.das index b6cb2569e..4017c43d4 100644 --- a/benchmarks/sql/_update_results.das +++ b/benchmarks/sql/_update_results.das @@ -46,8 +46,8 @@ struct Config { help : bool } -let LANES = ["m1", "m3f", "m4", "m5f", "m6f"] -let HEADERS = ["SQL (m1)", "Array (m3f)", "Decs (m4)", "XML fold (m5f)", "JSON fold (m6f)"] +let LANES = ["m1", "m3f", "m4", "m5f", "m6f", "m7"] +let HEADERS = ["SQL (m1)", "Array (m3f)", "Decs (m4)", "XML fold (m5f)", "JSON fold (m6f)", "Table fold (m7)"] let BEGIN_MARKER = "" let END_MARKER = "" diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md index 73614dec5..519254b0b 100644 --- a/benchmarks/sql/results.md +++ b/benchmarks/sql/results.md @@ -1,19 +1,22 @@ -# Benchmarks — SQL / Array / Decs / XML / JSON comparison +# Benchmarks — SQL / Array / Decs / XML / JSON / Table comparison -Five lanes run the same query families over one `Car` schema (n = 100 000 cars, 100 dealers, +Six lanes run the same query families over one `Car` schema (n = 100 000 cars, 100 dealers, 5 brands); cells are ns/op, `—` = intentionally absent lane (see "Missing lanes"). The tables between the `BENCH:TABLES` markers are machine-generated (see "How to re-run"); all other text is hand-edited. -Each lane lives in its own file (`array.das` / `decs.das` / `xml.das` / `json.das` / `sql.das`) with -the source fixture built once in `[init]`; the sweep runs one process per file, so a lane is never -contaminated by another lane's code in the same process (this is why JIT cells are stable now). 
+Each lane lives in its own file (`array.das` / `decs.das` / `xml.das` / `json.das` / `sql.das` / +`table.das`) with the source fixture built once in `[init]`; the sweep runs one process per file, +so a lane is never contaminated by another lane's code in the same process (this is why JIT cells +are stable now). - **m1 SQL** — `_fold(db |> select_from(type) |> …)` over in-memory SQLite; `_fold` passes the chain to `_sql`. - **m3f Array** — `_fold` over `each(array)`. - **m4 Decs** — `_fold` over `from_decs_template(type)` (per-archetype walk). - **m5f XML** — `_fold` over `from_xml_node(root, type)` (`XmlAdapter` fuses + field-prunes). - **m6f JSON** — `_fold` over `from_json(jv, type)` (`JsonAdapter`, same machinery, array walk). +- **m7 Table** — `_fold` over `each_kv(table)` (`TableAdapter`; kv usage-pruning picks keys-only / + values-only / zipped slot walks; group_by / join / reverse defer to tier-2 until their stages land). `0.00` = early-exit terminator below timer resolution ("free"). Chain shapes are in `benchmarks/README.md`; the splice arms each fires are in `doc/source/reference/linq_fold_patterns.rst`. @@ -22,169 +25,169 @@ contaminated by another lane's code in the same process (this is why JIT cells a signal, JIT deltas as indicative.** -*Generated 2026-06-06 by `benchmarks/sql/_update_results.das` — ns/op; `—` = absent lane. Edit the prose around the markers, not the tables.* +*Generated 2026-06-10 by `benchmarks/sql/_update_results.das` — ns/op; `—` = absent lane. 
Edit the prose around the markers, not the tables.* ## INTERP -| Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | -|---|---:|---:|---:|---:|---:| -| `aggregate_match` | 35.1 | 5.9 | 5.9 | 60.9 | 159.9 | -| `all_match` | 28.0 | 3.5 | 3.4 | 56.6 | 154.0 | -| `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.7 | 5.9 | 8.8 | 60.4 | 164.6 | -| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | -| `bare_order_where` | 283.1 | 116.2 | 128.0 | 304.3 | 288.6 | -| `chained_select_collapse` | — | 17.8 | 17.8 | 70.4 | 162.7 | -| `chained_where` | 37.0 | 6.6 | 7.1 | 104.8 | 185.2 | -| `contains_match` | 0.0 | 2.3 | 1.5 | 29.1 | 73.0 | -| `count_aggregate` | 29.9 | 4.1 | 4.2 | 64.1 | 154.6 | -| `cross_join` | 12610.5 | 3738.5 | — | 4039.6 | 4042.5 | -| `decs_count_bare_pred` | — | — | 4.2 | — | — | -| `distinct_by_count` | 41.7 | 16.1 | 16.0 | 70.7 | 163.5 | -| `distinct_by_order_take` | 240.1 | 22.0 | 23.3 | 123.9 | 161.9 | -| `distinct_by_order_to_array` | 242.0 | 22.3 | 23.4 | 124.4 | 162.7 | -| `distinct_count` | 41.6 | 15.8 | 15.8 | 71.2 | 161.8 | -| `distinct_count_pred` | 252.2 | 15.8 | 15.9 | 112.2 | 178.7 | -| `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | -| `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 173.2 | 29.5 | 29.4 | 122.6 | 195.2 | -| `groupby_count` | 142.1 | 19.2 | 19.2 | 74.5 | 167.8 | -| `groupby_first` | 252.4 | 19.1 | 19.8 | 71.7 | 163.3 | -| `groupby_having_count` | 141.9 | 19.1 | 19.2 | 74.2 | 167.7 | -| `groupby_having_hidden_sum` | 176.5 | 22.6 | 22.8 | 118.4 | 192.1 | -| `groupby_having_post_where` | 171.4 | 19.6 | 19.2 | 114.6 | 188.3 | -| `groupby_max` | 174.1 | 25.1 | 25.0 | 120.0 | 193.1 | -| `groupby_min` | 173.7 | 25.2 | 24.9 | 120.1 | 193.1 | -| `groupby_multi_reducer` | 190.3 | 30.9 | 30.6 | 124.8 | 196.7 | -| `groupby_select_order` | 171.8 | 19.2 | 19.1 | 
115.1 | 189.8 | -| `groupby_select_sum` | 201.8 | 38.5 | 37.9 | 102.8 | 195.0 | -| `groupby_sum` | 172.7 | 19.1 | 19.1 | 115.3 | 188.2 | -| `groupby_where_count` | 76.3 | 13.9 | 14.2 | 116.0 | 186.3 | -| `groupby_where_sum` | 87.7 | 14.0 | 14.5 | 116.3 | 186.7 | -| `join_count` | 38.7 | 51.5 | 64.2 | 113.9 | 183.3 | -| `join_groupby_count` | 156.9 | 77.3 | 89.9 | 178.0 | 230.6 | -| `join_groupby_to_array` | 190.3 | 78.4 | 90.8 | 215.4 | 212.7 | -| `join_select` | 150.2 | 72.6 | 85.0 | 189.2 | 215.2 | -| `join_where_count` | 40.1 | 61.4 | 76.7 | 162.3 | 199.0 | -| `last_match` | 0.0 | 5.8 | 13.8 | 65.5 | 160.3 | -| `long_count_aggregate` | 29.9 | 4.2 | 4.1 | 64.0 | 154.6 | -| `max_aggregate` | 31.5 | 6.0 | 6.7 | 58.9 | 163.7 | -| `min_aggregate` | 31.2 | 6.0 | 6.8 | 59.2 | 162.9 | -| `order_by_multi_key` | 345.5 | 281.8 | 285.5 | 460.6 | 445.5 | -| `order_distinct_take` | 138.4 | 15.7 | 100.4 | 72.9 | 163.4 | -| `order_reverse_normalized` | 38.6 | 16.3 | 20.0 | 70.2 | 170.3 | -| `order_take_desc` | 38.5 | 16.2 | 20.0 | 70.1 | 171.7 | -| `reverse_distinct_by` | 296.5 | 21.3 | 27.6 | 70.9 | 162.9 | -| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.0 | -| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.1 | -| `select_count` | 0.1 | 0.0 | 2.2 | 69.7 | 2.2 | -| `select_many` | — | 191.3 | — | — | — | -| `select_where` | 196.2 | 11.2 | 19.4 | 196.7 | 183.3 | -| `select_where_count` | 33.0 | 5.8 | 7.4 | 65.0 | 157.8 | -| `select_where_order_take` | 37.3 | 12.3 | 14.9 | 72.8 | 167.2 | -| `select_where_sum` | 37.3 | 7.4 | 7.4 | 66.8 | 163.2 | -| `single_match` | 0.0 | 2.8 | 5.5 | 58.6 | 155.5 | -| `skip_take` | 0.5 | 0.1 | 0.2 | 3.0 | 2.8 | -| `skip_while_match` | 3.5 | 5.3 | 5.3 | 60.7 | 153.9 | -| `sort_first` | 38.5 | 11.1 | 13.3 | 64.8 | 168.2 | -| `sort_take` | 38.7 | 16.3 | 20.2 | 70.9 | 170.8 | -| `sort_take_select` | 38.5 | 16.3 | 20.7 | 71.3 | 171.0 | -| `sum_aggregate` | 30.3 | 2.1 | 2.1 | 54.4 | 153.3 | -| `sum_where` | 33.1 | 4.4 | 4.3 | 64.2 | 154.7 | -| 
`take_count` | 3.7 | 0.2 | 0.4 | 2.9 | 2.7 | -| `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.3 | 1.1 | -| `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | -| `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | -| `take_while_match` | 7.9 | 2.4 | 2.4 | 30.3 | 77.7 | -| `to_array_filter` | 69.8 | 11.7 | 12.1 | 71.9 | 165.4 | -| `where_join_count` | 39.7 | 29.2 | 41.8 | 133.1 | 168.8 | -| `zip_count_pred` | 39.3 | 15.8 | — | 315.5 | 317.1 | -| `zip_dot_product` | 46.9 | 12.7 | 10.7 | 317.9 | 314.0 | -| `zip_dot_product_3arg` | 46.7 | 12.6 | — | 310.7 | 314.2 | -| `zip_reverse_to_array` | — | 31.8 | — | 344.1 | 349.6 | +| Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | +|---|---:|---:|---:|---:|---:|---:| +| `aggregate_match` | 34.7 | 5.9 | 5.8 | 60.1 | 152.3 | 19.0 | +| `all_match` | 27.3 | 3.5 | 3.4 | 55.6 | 147.0 | 15.8 | +| `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `average_aggregate` | 29.8 | 5.9 | 8.8 | 58.3 | 156.2 | 17.2 | +| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 29.2 | +| `bare_order_where` | 277.1 | 118.1 | 126.8 | 300.9 | 292.2 | 166.4 | +| `chained_select_collapse` | — | 17.7 | 17.4 | 70.1 | 155.4 | 27.8 | +| `chained_where` | 35.8 | 6.6 | 7.1 | 104.2 | 174.7 | 24.1 | +| `contains_match` | 0.0 | 2.2 | 1.4 | 27.5 | 68.5 | 6.6 | +| `count_aggregate` | 29.2 | 4.1 | 4.1 | 63.4 | 147.5 | 20.2 | +| `cross_join` | 13122.7 | 3685.9 | — | 3995.6 | 4066.2 | — | +| `decs_count_bare_pred` | — | — | 4.1 | — | — | — | +| `distinct_by_count` | 40.8 | 15.6 | 15.6 | 70.2 | 154.0 | 26.4 | +| `distinct_by_order_take` | 240.7 | 22.1 | 23.4 | 122.7 | 161.6 | 48.5 | +| `distinct_by_order_to_array` | 239.2 | 22.2 | 23.5 | 123.6 | 161.7 | 48.4 | +| `distinct_count` | 40.7 | 15.9 | 15.7 | 70.5 | 155.8 | 26.9 | +| `distinct_count_pred` | 251.0 | 16.1 | 15.8 | 111.5 | 178.0 | 26.3 | +| `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | 0.0 | +| `first_match` 
| 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `groupby_average` | 173.3 | 29.3 | 29.3 | 122.9 | 190.0 | — | +| `groupby_count` | 143.5 | 19.4 | 19.4 | 75.4 | 161.0 | 162.6 | +| `groupby_first` | 251.7 | 19.5 | 20.1 | 72.1 | 156.9 | — | +| `groupby_having_count` | 140.7 | 19.5 | 19.5 | 74.7 | 161.2 | — | +| `groupby_having_hidden_sum` | 176.1 | 22.5 | 22.6 | 118.0 | 183.5 | — | +| `groupby_having_post_where` | 172.8 | 20.8 | 20.8 | 114.1 | 180.4 | — | +| `groupby_max` | 173.5 | 24.8 | 25.3 | 119.7 | 185.2 | — | +| `groupby_min` | 173.8 | 25.2 | 25.1 | 119.8 | 184.7 | — | +| `groupby_multi_reducer` | 189.5 | 30.5 | 30.6 | 124.3 | 188.4 | — | +| `groupby_select_order` | 169.9 | 20.8 | 20.8 | 114.3 | 180.9 | — | +| `groupby_select_sum` | 196.9 | 38.6 | 38.1 | 101.6 | 186.6 | — | +| `groupby_sum` | 170.5 | 21.2 | 20.8 | 114.4 | 180.2 | 192.8 | +| `groupby_where_count` | 75.6 | 14.1 | 14.3 | 115.2 | 177.8 | — | +| `groupby_where_sum` | 86.4 | 14.1 | 14.6 | 116.2 | 178.1 | — | +| `join_count` | 38.0 | 51.2 | 64.2 | 112.7 | 176.9 | 195.0 | +| `join_groupby_count` | 157.7 | 86.1 | 88.2 | 177.4 | 221.8 | — | +| `join_groupby_to_array` | 194.9 | 80.3 | 91.7 | 214.8 | 212.1 | — | +| `join_select` | 150.3 | 72.4 | 84.4 | 187.8 | 209.0 | — | +| `join_where_count` | 39.0 | 61.6 | 76.7 | 159.8 | 193.6 | 229.1 | +| `last_match` | 0.0 | 5.9 | 13.9 | 64.9 | 152.3 | 31.0 | +| `long_count_aggregate` | 28.7 | 4.1 | 4.1 | 63.3 | 147.5 | 20.3 | +| `max_aggregate` | 30.6 | 6.0 | 6.8 | 58.4 | 156.1 | 17.0 | +| `min_aggregate` | 30.5 | 6.0 | 6.8 | 58.4 | 155.1 | 17.0 | +| `order_by_multi_key` | 338.7 | 272.3 | 286.1 | 457.7 | 448.2 | 333.0 | +| `order_distinct_take` | 138.4 | 15.9 | 99.2 | 72.4 | 156.5 | 31.0 | +| `order_reverse_normalized` | 37.9 | 16.3 | 20.0 | 70.4 | 162.9 | — | +| `order_take_desc` | 37.8 | 16.3 | 20.3 | 69.8 | 163.3 | 33.2 | +| `reverse_distinct_by` | 294.1 | 21.2 | 28.0 | 70.8 | 155.4 | — | +| `reverse_take` 
| 0.1 | 0.0 | 0.2 | 0.0 | 26.1 | 58.7 | +| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.1 | — | +| `select_count` | 0.1 | 0.0 | 2.2 | 64.8 | 2.2 | 0.0 | +| `select_many` | — | 191.0 | — | — | — | — | +| `select_where` | 194.7 | 11.5 | 19.3 | 195.9 | 185.7 | 37.5 | +| `select_where_count` | 32.3 | 5.1 | 7.4 | 64.6 | 150.7 | 21.8 | +| `select_where_order_take` | 36.2 | 12.2 | 15.0 | 72.3 | 158.5 | 34.4 | +| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.3 | 160.5 | 23.2 | +| `single_match` | 0.0 | 2.9 | 5.5 | 56.9 | 151.1 | 22.8 | +| `skip_take` | 0.5 | 0.1 | 0.2 | 3.0 | 2.8 | 0.3 | +| `skip_while_match` | 3.5 | 5.3 | 5.3 | 57.3 | 146.6 | 18.2 | +| `sort_first` | 37.6 | 11.1 | 13.3 | 64.6 | 159.5 | 31.7 | +| `sort_take` | 38.0 | 16.2 | 20.9 | 70.2 | 161.9 | 33.0 | +| `sort_take_select` | 37.6 | 16.3 | 20.9 | 70.8 | 162.7 | 33.3 | +| `sum_aggregate` | 29.7 | 2.1 | 2.1 | 54.3 | 146.7 | 13.4 | +| `sum_where` | 31.9 | 4.3 | 4.3 | 63.6 | 148.1 | 20.5 | +| `take_count` | 3.6 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | +| `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.3 | 1.1 | 0.3 | +| `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | 0.1 | +| `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | 0.2 | +| `take_while_match` | 7.8 | 2.4 | 2.4 | 28.8 | 71.4 | 16.8 | +| `to_array_filter` | 70.3 | 11.8 | 11.7 | 71.1 | 157.4 | 28.8 | +| `where_join_count` | 41.0 | 29.0 | 41.5 | 133.0 | 163.1 | — | +| `zip_count_pred` | 39.0 | 15.8 | — | 313.5 | 319.6 | — | +| `zip_dot_product` | 46.1 | 12.6 | 10.5 | 308.6 | 317.2 | — | +| `zip_dot_product_3arg` | 46.1 | 12.8 | — | 308.7 | 316.5 | — | +| `zip_reverse_to_array` | — | 31.6 | — | 343.1 | 351.0 | — | ## JIT -| Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | -|---|---:|---:|---:|---:|---:| -| `aggregate_match` | 35.9 | 0.3 | 0.7 | 16.7 | 26.7 | -| `all_match` | 27.8 | 0.3 | 0.2 | 16.6 | 25.7 | -| `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.7 | 1.0 | 3.6 | 16.6 | 25.3 | -| `bare_last` 
| — | 0.4 | 0.0 | 0.0 | 0.0 | -| `bare_order_where` | 187.8 | 34.6 | 35.7 | 105.3 | 53.0 | -| `chained_select_collapse` | — | 1.1 | 1.1 | 21.4 | 34.0 | -| `chained_where` | 37.0 | 0.6 | 0.8 | 33.9 | 31.4 | -| `contains_match` | 0.0 | 0.2 | 0.1 | 17.3 | 9.3 | -| `count_aggregate` | 29.6 | 0.3 | 0.6 | 16.7 | 25.9 | -| `cross_join` | 5984.0 | 751.9 | — | 833.5 | 768.9 | -| `decs_count_bare_pred` | — | — | 0.6 | — | — | -| `distinct_by_count` | 42.0 | 1.1 | 1.1 | 21.4 | 34.3 | -| `distinct_by_order_take` | 238.6 | 1.7 | 2.9 | 45.9 | 40.7 | -| `distinct_by_order_to_array` | 241.0 | 1.7 | 2.7 | 46.1 | 40.3 | -| `distinct_count` | 41.5 | 1.1 | 1.1 | 21.4 | 33.0 | -| `distinct_count_pred` | 253.3 | 1.1 | 1.3 | 38.6 | 45.2 | -| `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | -| `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 170.2 | 1.6 | 1.9 | 36.5 | 45.6 | -| `groupby_count` | 141.7 | 1.4 | 1.5 | 21.8 | 34.1 | -| `groupby_first` | 253.3 | 1.3 | 2.3 | 21.8 | 34.8 | -| `groupby_having_count` | 140.7 | 1.3 | 1.5 | 21.5 | 34.2 | -| `groupby_having_hidden_sum` | 175.1 | 1.5 | 1.7 | 36.4 | 45.3 | -| `groupby_having_post_where` | 169.9 | 1.4 | 1.9 | 36.3 | 44.5 | -| `groupby_max` | 173.3 | 1.5 | 1.9 | 36.4 | 45.9 | -| `groupby_min` | 173.0 | 1.5 | 1.8 | 36.4 | 46.0 | -| `groupby_multi_reducer` | 189.8 | 1.6 | 2.0 | 36.7 | 46.1 | -| `groupby_select_order` | 170.2 | 1.4 | 1.9 | 36.3 | 44.8 | -| `groupby_select_sum` | 198.8 | 2.8 | 3.2 | 31.3 | 40.2 | -| `groupby_sum` | 170.7 | 1.4 | 1.6 | 36.3 | 44.2 | -| `groupby_where_count` | 75.7 | 0.9 | 1.3 | 36.6 | 42.3 | -| `groupby_where_sum` | 87.0 | 0.9 | 1.3 | 36.6 | 43.7 | -| `join_count` | 39.0 | 10.8 | 11.9 | 42.9 | 75.7 | -| `join_groupby_count` | 156.8 | 17.2 | 19.2 | 69.8 | 95.1 | -| `join_groupby_to_array` | 190.7 | 18.3 | 20.1 | 80.7 | 37.6 | -| `join_select` | 93.3 | 20.0 | 21.8 | 75.3 | 100.1 | 
-| `join_where_count` | 39.8 | 19.0 | 20.7 | 63.1 | 81.0 | -| `last_match` | 0.0 | 0.5 | 1.4 | 17.5 | 26.2 | -| `long_count_aggregate` | 29.7 | 0.3 | 0.6 | 16.7 | 25.9 | -| `max_aggregate` | 31.2 | 0.3 | 0.5 | 16.7 | 27.3 | -| `min_aggregate` | 31.2 | 0.3 | 0.5 | 16.9 | 27.4 | -| `order_by_multi_key` | 250.8 | 54.1 | 54.9 | 124.3 | 72.9 | -| `order_distinct_take` | 140.2 | 1.1 | 75.2 | 21.6 | 36.8 | -| `order_reverse_normalized` | 38.5 | 0.7 | 1.4 | 21.7 | 28.1 | -| `order_take_desc` | 38.5 | 0.7 | 1.3 | 21.7 | 28.2 | -| `reverse_distinct_by` | 297.5 | 1.6 | 3.2 | 21.8 | 35.5 | -| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | -| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | -| `select_count` | 0.1 | 0.0 | 0.0 | 66.8 | 0.0 | -| `select_many` | — | 63.2 | — | — | — | -| `select_where` | 110.6 | 4.2 | 5.3 | 75.2 | 22.6 | -| `select_where_count` | 32.8 | 0.3 | 0.6 | 16.9 | 26.8 | -| `select_where_order_take` | 37.1 | 0.7 | 1.4 | 17.6 | 27.6 | -| `select_where_sum` | 38.3 | 0.4 | 0.6 | 16.6 | 25.6 | -| `single_match` | 0.0 | 0.4 | 1.1 | 46.5 | 22.4 | -| `skip_take` | 0.3 | 0.0 | 0.0 | 1.2 | 0.2 | -| `skip_while_match` | 3.5 | 0.4 | 0.4 | 46.5 | 22.2 | -| `sort_first` | 38.4 | 0.4 | 1.3 | 16.7 | 27.0 | -| `sort_take` | 38.9 | 0.7 | 1.4 | 21.7 | 28.0 | -| `sort_take_select` | 38.5 | 0.7 | 1.3 | 21.8 | 27.6 | -| `sum_aggregate` | 30.2 | 0.3 | 0.1 | 16.9 | 24.6 | -| `sum_where` | 33.2 | 0.3 | 0.6 | 16.6 | 26.4 | -| `take_count` | 1.9 | 0.1 | 0.1 | 1.2 | 0.2 | -| `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.4 | 0.1 | -| `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | -| `take_where_count` | 0.9 | 0.0 | 0.0 | 0.2 | 0.0 | -| `take_while_match` | 7.8 | 0.2 | 0.3 | 17.0 | 9.1 | -| `to_array_filter` | 49.0 | 3.3 | 3.3 | 20.1 | 35.9 | -| `where_join_count` | 40.0 | 5.9 | 6.7 | 47.9 | 44.9 | -| `zip_count_pred` | 39.4 | 0.1 | — | 115.0 | 34.0 | -| `zip_dot_product` | 46.9 | 0.1 | 0.1 | 117.9 | 33.9 | -| `zip_dot_product_3arg` | 47.1 | 0.1 | — | 115.0 | 34.0 | -| 
`zip_reverse_to_array` | — | 4.7 | — | 126.6 | 51.1 | +| Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | +|---|---:|---:|---:|---:|---:|---:| +| `aggregate_match` | 35.0 | 0.3 | 0.6 | 21.7 | 27.3 | 13.6 | +| `all_match` | 27.8 | 0.3 | 0.2 | 18.1 | 25.9 | 13.5 | +| `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `average_aggregate` | 29.9 | 1.0 | 3.6 | 18.0 | 24.4 | 13.4 | +| `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.1 | +| `bare_order_where` | 186.2 | 34.0 | 35.3 | 106.3 | 52.4 | 78.7 | +| `chained_select_collapse` | — | 1.1 | 1.1 | 20.4 | 33.0 | 14.0 | +| `chained_where` | 35.9 | 0.6 | 0.8 | 35.5 | 31.5 | 17.6 | +| `contains_match` | 0.0 | 0.2 | 0.1 | 14.8 | 9.2 | 4.7 | +| `count_aggregate` | 29.5 | 0.3 | 0.6 | 20.4 | 25.1 | 13.4 | +| `cross_join` | 5964.4 | 734.4 | — | 834.2 | 772.7 | — | +| `decs_count_bare_pred` | — | — | 0.6 | — | — | — | +| `distinct_by_count` | 41.0 | 1.1 | 1.1 | 20.4 | 32.0 | 14.0 | +| `distinct_by_order_take` | 237.4 | 1.7 | 2.6 | 48.4 | 37.1 | 29.9 | +| `distinct_by_order_to_array` | 237.2 | 1.7 | 2.7 | 47.5 | 36.8 | 30.0 | +| `distinct_count` | 40.8 | 1.1 | 1.1 | 20.5 | 31.9 | 14.0 | +| `distinct_count_pred` | 249.8 | 1.1 | 1.3 | 37.6 | 41.7 | 14.0 | +| `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | +| `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | +| `groupby_average` | 170.1 | 1.5 | 1.9 | 35.7 | 43.0 | — | +| `groupby_count` | 141.1 | 1.3 | 1.5 | 20.5 | 32.2 | 43.0 | +| `groupby_first` | 251.0 | 1.3 | 2.3 | 20.5 | 32.9 | — | +| `groupby_having_count` | 141.1 | 1.3 | 1.5 | 20.5 | 32.1 | — | +| `groupby_having_hidden_sum` | 173.9 | 1.5 | 1.7 | 35.8 | 42.7 | — | +| `groupby_having_post_where` | 170.2 | 1.4 | 1.9 | 35.8 | 41.8 | — | +| `groupby_max` | 172.3 | 1.5 | 1.9 | 35.9 | 43.6 | — | +| `groupby_min` | 173.0 | 1.5 | 1.8 | 35.8 | 43.6 | — 
| +| `groupby_multi_reducer` | 191.8 | 1.6 | 1.9 | 36.1 | 43.7 | — | +| `groupby_select_order` | 170.5 | 1.4 | 1.9 | 35.8 | 42.0 | — | +| `groupby_select_sum` | 195.5 | 2.8 | 3.2 | 32.3 | 37.6 | — | +| `groupby_sum` | 169.8 | 1.4 | 1.6 | 35.8 | 42.0 | 51.2 | +| `groupby_where_count` | 75.7 | 0.9 | 1.3 | 35.9 | 39.7 | — | +| `groupby_where_sum` | 86.4 | 0.9 | 1.3 | 35.9 | 39.6 | — | +| `join_count` | 37.9 | 11.0 | 11.7 | 43.4 | 68.3 | 62.9 | +| `join_groupby_count` | 156.2 | 18.2 | 20.0 | 68.3 | 86.7 | — | +| `join_groupby_to_array` | 189.2 | 17.5 | 19.4 | 80.2 | 36.1 | — | +| `join_select` | 92.8 | 19.6 | 21.6 | 74.4 | 94.1 | — | +| `join_where_count` | 39.1 | 18.9 | 20.6 | 64.5 | 77.9 | 80.0 | +| `last_match` | 0.0 | 0.5 | 1.4 | 18.6 | 25.9 | 22.9 | +| `long_count_aggregate` | 28.7 | 0.3 | 0.6 | 20.4 | 26.6 | 13.4 | +| `max_aggregate` | 30.6 | 0.3 | 0.5 | 18.1 | 26.7 | 13.4 | +| `min_aggregate` | 30.6 | 0.3 | 0.5 | 18.2 | 26.3 | 13.4 | +| `order_by_multi_key` | 247.0 | 53.4 | 54.8 | 125.3 | 70.3 | 128.9 | +| `order_distinct_take` | 137.9 | 1.1 | 75.6 | 20.9 | 34.1 | 14.0 | +| `order_reverse_normalized` | 37.8 | 0.7 | 1.3 | 24.6 | 27.0 | — | +| `order_take_desc` | 38.0 | 0.7 | 1.3 | 24.5 | 26.9 | 17.7 | +| `reverse_distinct_by` | 295.4 | 1.5 | 3.2 | 20.4 | 32.7 | — | +| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 26.8 | +| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | — | +| `select_count` | 0.1 | 0.0 | 0.0 | 63.4 | 0.0 | 0.0 | +| `select_many` | — | 61.5 | — | — | — | — | +| `select_where` | 110.5 | 4.3 | 5.3 | 76.1 | 22.1 | 27.9 | +| `select_where_count` | 32.1 | 0.3 | 0.6 | 18.4 | 25.9 | 13.3 | +| `select_where_order_take` | 36.3 | 0.7 | 1.4 | 18.9 | 26.6 | 22.9 | +| `select_where_sum` | 37.0 | 0.4 | 0.6 | 17.9 | 24.9 | 13.3 | +| `single_match` | 0.0 | 0.4 | 1.1 | 43.4 | 22.2 | 17.2 | +| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.1 | +| `skip_while_match` | 3.5 | 0.4 | 0.4 | 43.5 | 21.8 | 13.2 | +| `sort_first` | 37.7 | 0.4 | 1.4 | 17.9 | 26.1 | 
17.1 | +| `sort_take` | 38.0 | 0.7 | 1.5 | 24.5 | 26.8 | 17.7 | +| `sort_take_select` | 37.8 | 0.7 | 1.3 | 24.5 | 26.9 | 17.7 | +| `sum_aggregate` | 29.6 | 0.3 | 0.1 | 23.3 | 24.3 | 13.4 | +| `sum_where` | 32.1 | 0.3 | 0.6 | 18.4 | 25.9 | 13.3 | +| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.3 | 0.4 | +| `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.4 | 0.1 | 0.2 | +| `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | +| `take_where_count` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | +| `take_while_match` | 7.8 | 0.2 | 0.3 | 14.7 | 9.0 | 13.3 | +| `to_array_filter` | 47.1 | 3.3 | 3.3 | 21.3 | 33.6 | 20.0 | +| `where_join_count` | 39.0 | 5.8 | 6.7 | 49.5 | 40.6 | — | +| `zip_count_pred` | 39.1 | 0.1 | — | 116.7 | 33.5 | — | +| `zip_dot_product` | 46.3 | 0.1 | 0.1 | 116.6 | 33.4 | — | +| `zip_dot_product_3arg` | 46.1 | 0.1 | — | 116.5 | 33.4 | — | +| `zip_reverse_to_array` | — | 4.6 | — | 127.7 | 50.0 | — | ## Missing lanes (the `—` cells) @@ -200,6 +203,7 @@ Each empty cell's reason is also in the bench `.das` file's comment; SQL gaps ar - **`reverse_distinct_by` m4 / m5f** — array uses the backward-index walk; non-array sources fuse the forward keep-last splice (decs 27.6/5.0, XML 74.5/22.2); SQL uses MAX(pk). - **`order_distinct_take` m4 vs m3f** — `unique_key` hashes workhorse keys directly (array `int`) but string-interpolates structs (decs `DecsBrand`); the gap is per-element string hashing, not decs-walk. `distinct_by_count` is the key-based variant (m4 parity). - **`zip_reverse_to_array` / `zip_*` SQL / Decs** — `reverse` has no SQL order key; zip is not relational / not expressible over one archetype walk. By design. (XML/JSON zip lanes are lit, partially fused.) 
+- **m7 absent families** — `zip_*` / `cross_join` (lockstep over an unordered slot walk is meaningless), `select_many` (flat fixture, no nested array field), `order_reverse_normalized` / `reverse_take_select` / `reverse_distinct_by` (no backward slot walk; `reverse_take` is kept as the single deferral marker), the group-by tail beyond `groupby_count`/`groupby_sum` and joins beyond `join_count`/`join_where_count` (table group_by/join fusion is staged — see `LINQ_TO_TABLE.md`; the four marker cells track the tier-2 cost until then), `decs_count_bare_pred` (decs-only). ## Accepted floors diff --git a/benchmarks/sql/table.das b/benchmarks/sql/table.das new file mode 100644 index 000000000..66564b963 --- /dev/null +++ b/benchmarks/sql/table.das @@ -0,0 +1,620 @@ +options gen2 +options persistent_heap + +require _common public + +// Per-source table benchmark lane (m7): the same Car rows keyed by id in a table, chains in +// each_kv form (`_.key` / `_.value.`) so the kv usage-pruner picks the cheapest iterator set. +// Table slot order is unspecified — guards stay order-insensitive. Functions stay named _m7. + +let N = 100000 + +typedef CarKV = tuple<key : int; value : Car> + +var g_t : table<int; Car> +var g_dealers : array<Dealer> + +[init] +def table_bench_init { + g_t <- fixture_table(N) + g_dealers <- fixture_dealers_array() +} + +[finalize] +def table_bench_fini { + delete g_t + delete g_dealers +} + +[benchmark] +def aggregate_match_m7(b : B?) { + b |> run("aggregate_match", N) { + let total = _fold(unsafe(each_kv(g_t))._where(_.value.price > 200) + .aggregate(0, $(acc : int, c : CarKV) => acc + c.value.price)) + b |> accept(total) + if (total == 0) { + b->failNow() + } + } +} + +[benchmark] +def all_match_m7(b : B?) { + b |> run("all_match", N) { + let yes = _fold(unsafe(each_kv(g_t))._all(_.value.price < 9999)) + b |> accept(yes) + if (!yes) { + b->failNow() + } + } +} + +[benchmark] +def any_match_m7(b : B?) 
{ + b |> run("any_match", N) { + let yes = _fold(unsafe(each_kv(g_t))._any(_.value.price > 500)) + b |> accept(yes) + if (!yes) { + b->failNow() + } + } +} + +[benchmark] +def average_aggregate_m7(b : B?) { + b |> run("average_aggregate", N) { + let a = _fold(unsafe(each_kv(g_t))._select(double(_.value.price)).average()) + b |> accept(a) + if (a == 0.0lf) { + b->failNow() + } + } +} + +[benchmark] +def bare_last_m7(b : B?) { + b |> run("bare_last", N) { + let row = _fold(unsafe(each_kv(g_t)).last()) + b |> accept(row) + if (row.key == 0) { + b->failNow() + } + } +} + +[benchmark] +def bare_order_where_m7(b : B?) { + b |> run("bare_order_where", N) { + let rows <- _fold(unsafe(each_kv(g_t))._where(_.value.price > 500) + ._order_by(_.value.price) + .to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +[benchmark] +def chained_select_collapse_m7(b : B?) { + b |> run("chained_select_collapse", N) { + let c = _fold(unsafe(each_kv(g_t)) |> _select(_.value.brand) |> _select(_ + 1) |> distinct() |> count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def chained_where_m7(b : B?) { + b |> run("chained_where", N) { + let c = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500) + ._where(_.value.year >= 2015) + .count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def contains_match_m7(b : B?) { + b |> run("contains_match", N) { + let yes = _fold(unsafe(each_kv(g_t))._select(_.key).contains(50000)) + b |> accept(yes) + if (!yes) { + b->failNow() + } + } +} + +[benchmark] +def count_aggregate_m7(b : B?) { + b |> run("count_aggregate", N) { + let c = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def distinct_by_count_m7(b : B?) 
{ + b |> run("distinct_by_count", N) { + let c = _fold(unsafe(each_kv(g_t))._distinct_by(_.value.brand).count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def distinct_by_order_take_m7(b : B?) { + b |> run("distinct_by_order_take", N) { + unsafe { + let rows <- _fold(each_kv(g_t)._distinct_by(_.value.dealer_id)._order_by(_.value.price).take(10).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + +[benchmark] +def distinct_by_order_to_array_m7(b : B?) { + b |> run("distinct_by_order_to_array", N) { + unsafe { + let rows <- _fold(each_kv(g_t)._distinct_by(_.value.dealer_id)._order_by(_.value.price).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + +[benchmark] +def distinct_count_m7(b : B?) { + b |> run("distinct_count", N) { + let rows <- _fold(unsafe(each_kv(g_t))._select(_.value.brand).distinct().to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +[benchmark] +def distinct_count_pred_m7(b : B?) { + b |> run("distinct_count_pred", N) { + unsafe { + let c = _fold(each_kv(g_t) |> _distinct_by(_.value.brand) |> count($(c) => c.value.year > 2009)) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } + } +} + +[benchmark] +def distinct_take_m7(b : B?) { + b |> run("distinct_take", N) { + unsafe { + let rows <- _fold(each_kv(g_t)._select(_.value.brand).distinct().take(3).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + +[benchmark] +def element_at_match_m7(b : B?) { + b |> run("element_at_match", N) { + let row = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).element_at(100)) + b |> accept(row) + if (row.key == 0) { + b->failNow() + } + } +} + +[benchmark] +def first_match_m7(b : B?) 
{ + b |> run("first_match", N) { + let row = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).first()) + b |> accept(row) + if (row.value.price <= 500) { + b->failNow() + } + } +} + +[benchmark] +def first_or_default_match_m7(b : B?) { + let sentinel : CarKV = (key = -1, value = Car(id = -1, name = "none", price = 0, brand = 0, year = 0, dealer_id = 0)) + b |> run("first_or_default_match", N) { + let row = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).first_or_default(sentinel)) + b |> accept(row) + if (row.key == -1) { + b->failNow() + } + } +} + +[benchmark] +def groupby_count_m7(b : B?) { + b |> run("groupby_count", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, N = _._1 |> length)) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_sum_m7(b : B?) { + b |> run("groupby_sum", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + TotalPrice = _._1 |> select($(c : CarKV) => c.value.price) |> sum())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def join_count_m7(b : B?) { + b |> run("join_count", N) { + let c = _fold(unsafe(each_kv(g_t)) |> _join(g_dealers, + $(c : CarKV, d : Dealer) => c.value.dealer_id == d.id, + $(c : CarKV, d : Dealer) => (c.value.name, d.name)) + |> count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def join_where_count_m7(b : B?) { + b |> run("join_where_count", N) { + let c = _fold(unsafe(each_kv(g_t)) |> _join(g_dealers, + $(c : CarKV, d : Dealer) => c.value.dealer_id == d.id, + $(c : CarKV, d : Dealer) => (CarPrice = c.value.price, DealerId = d.id)) + |> _where(_.CarPrice > 500) + |> count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def last_match_m7(b : B?) 
{ + b |> run("last_match", N) { + let row = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).last()) + b |> accept(row) + if (row.value.price <= 500) { + b->failNow() + } + } +} + +[benchmark] +def long_count_aggregate_m7(b : B?) { + b |> run("long_count_aggregate", N) { + let c = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).long_count()) + b |> accept(c) + if (c == 0l) { + b->failNow() + } + } +} + +[benchmark] +def max_aggregate_m7(b : B?) { + b |> run("max_aggregate", N) { + let m = _fold(unsafe(each_kv(g_t))._select(_.value.price).max()) + b |> accept(m) + if (m == 0) { + b->failNow() + } + } +} + +[benchmark] +def min_aggregate_m7(b : B?) { + b |> run("min_aggregate", N) { + let m = _fold(unsafe(each_kv(g_t))._select(_.value.price).min()) + b |> accept(m) + if (m > 999) { + b->failNow() + } + } +} + +[benchmark] +def order_by_multi_key_m7(b : B?) { + b |> run("order_by_multi_key", N) { + let rows <- _fold(unsafe(each_kv(g_t))._where(_.value.price > 500) + ._order_by_keys((_.value.brand, _.value.price), 0u) + .to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +[benchmark] +def order_distinct_take_m7(b : B?) { + b |> run("order_distinct_take", N) { + unsafe { + let rows <- _fold(each_kv(g_t)._select(_.value.brand)._order_by(_).distinct().take(5).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + +[benchmark] +def order_take_desc_m7(b : B?) { + b |> run("order_take_desc", N) { + let rows <- _fold(unsafe(each_kv(g_t))._order_by_descending(_.value.price).take(10).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +[benchmark] +def reverse_take_m7(b : B?) { + b |> run("reverse_take", N) { + unsafe { + let rows <- _fold(each_kv(g_t).reverse().take(10).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + +[benchmark] +def select_count_m7(b : B?) 
{ + b |> run("select_count", N) { + let c = _fold(unsafe(each_kv(g_t))._select(_.value.price * 2).count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def select_where_m7(b : B?) { + b |> run("select_where", N) { + let rows <- _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +[benchmark] +def select_where_count_m7(b : B?) { + b |> run("select_where_count", N) { + let c = _fold(unsafe(each_kv(g_t))._select(_.value.price * 2)._where(_ > 1000).count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def select_where_order_take_m7(b : B?) { + b |> run("select_where_order_take", N) { + let rows <- _fold(unsafe(each_kv(g_t))._where(_.value.price > 500) + ._order_by(_.value.price) + .take(10) + .to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +[benchmark] +def select_where_sum_m7(b : B?) { + b |> run("select_where_sum", N) { + let s = _fold(unsafe(each_kv(g_t))._select(_.value.price * 2)._where(_ > 1000).sum()) + b |> accept(s) + if (s == 0) { + b->failNow() + } + } +} + +[benchmark] +def single_match_m7(b : B?) { + b |> run("single_match", N) { + let row = _fold(unsafe(each_kv(g_t))._where(_.key == 42).single()) + b |> accept(row) + if (row.key != 42) { + b->failNow() + } + } +} + +[benchmark] +def skip_take_m7(b : B?) { + b |> run("skip_take", N) { + let rows <- _fold(unsafe(each_kv(g_t)).skip(1000).take(100).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +// skip_while / take_while measure the gated walk: slot order is unspecified, so the predicates are +// chosen to be uniformly false (skip nothing) / uniformly true (take everything) — full deterministic walks. +[benchmark] +def skip_while_match_m7(b : B?) 
{ + b |> run("skip_while_match", N) { + let total = _fold(unsafe(each_kv(g_t))._skip_while(_.key < 0).count()) + b |> accept(total) + if (total == 0) { + b->failNow() + } + } +} + +[benchmark] +def sort_first_m7(b : B?) { + b |> run("sort_first", N) { + let row = _fold(unsafe(each_kv(g_t))._order_by(_.value.price).first()) + b |> accept(row) + if (row.key == 0) { + b->failNow() + } + } +} + +[benchmark] +def sort_take_m7(b : B?) { + b |> run("sort_take", N) { + unsafe { + let rows <- _fold(each_kv(g_t)._order_by(_.value.price).take(10).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + +[benchmark] +def sort_take_select_m7(b : B?) { + b |> run("sort_take_select", N) { + unsafe { + let rows <- _fold(each_kv(g_t)._order_by(_.value.price).take(10)._select(_.value.name).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + +[benchmark] +def sum_aggregate_m7(b : B?) { + b |> run("sum_aggregate", N) { + let s = _fold(unsafe(each_kv(g_t))._select(_.value.price).sum()) + b |> accept(s) + if (s == 0) { + b->failNow() + } + } +} + +[benchmark] +def sum_where_m7(b : B?) { + b |> run("sum_where", N) { + let s = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500)._select(_.value.price).sum()) + b |> accept(s) + if (s == 0) { + b->failNow() + } + } +} + +[benchmark] +def take_count_m7(b : B?) { + b |> run("take_count", N) { + let rows <- _fold(unsafe(each_kv(g_t)).take(1000).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + +[benchmark] +def take_count_filtered_m7(b : B?) { + b |> run("take_count_filtered", N) { + let c = _fold(unsafe(each_kv(g_t))._where(_.value.price > 500).take(1000).count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def take_sum_aggregate_m7(b : B?) 
{ + b |> run("take_sum_aggregate", N) { + let s = _fold(unsafe(each_kv(g_t))._select(_.value.price).take(1000).sum()) + b |> accept(s) + if (s == 0) { + b->failNow() + } + } +} + +[benchmark] +def take_where_count_m7(b : B?) { + b |> run("take_where_count", N) { + let c = _fold(unsafe(each_kv(g_t)).take(1000)._where(_.value.price > 500).count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def take_while_match_m7(b : B?) { + b |> run("take_while_match", N) { + let total = _fold(unsafe(each_kv(g_t))._take_while(_.key > 0).count()) + b |> accept(total) + if (total == 0) { + b->failNow() + } + } +} + +[benchmark] +def to_array_filter_m7(b : B?) { + b |> run("to_array_filter", N) { + let prices <- _fold(unsafe(each_kv(g_t))._where(_.value.price > 500)._select(_.value.price).to_array()) + b |> accept(prices) + if (empty(prices)) { + b->failNow() + } + } +} diff --git a/daslib/linq_fold.das b/daslib/linq_fold.das index 71cc33a1d..d09f9f027 100644 --- a/daslib/linq_fold.das +++ b/daslib/linq_fold.das @@ -36,6 +36,7 @@ require daslib/linq_fold_common public require daslib/linq_fold_array public require daslib/linq_fold_decs public require daslib/linq_fold_json public // in-tree JSON source adapter — emits by name, pulls in no json dep +require daslib/linq_fold_table public // in-tree table source adapter — each_kv/keys/values chain heads require ?pugixml pugixml/linq_fold_xml // optional XML source adapter — loaded only when pugixml is linked require ?sqlite sqlite/linq_fold_sql // optional SQL source pass-through — loaded only when sqlite is linked @@ -211,6 +212,28 @@ def private try_splice_patterns(prog : ProgramPtr; var expr : Expression?) : Exp new JsonAdapter(jsonExpr = clone_expression(jsonCall.arguments[0]), srcName = qn("jsrc", at), elemType = clone_type(jsonCall._type.firstType)), exprIsIter, at) } + // Table adapter (in-tree, no static_if — extract_table_source name-matches each_kv/keys/values with a table-typed arg). 
The kv lane fuses only copyable values: a non-copyable-valued each_kv falls through to the array tier, where the surviving each_kv instantiation concept-asserts with the user-facing message. + var tabCall = extract_table_source(top) + if (tabCall != null) { + let tabName = get_call_short_name(tabCall) + let valT = tabCall.arguments[0]._type.secondType + if (tabName != "each_kv" || (valT != null && valT.canCopy)) { + let lane = tabName == "each_kv" ? TableLane.KV : (tabName == "keys" ? TableLane.KEYS : TableLane.VALUES) + if (lane != TableLane.VALUES) { + drop_redundant_distinct(calls) // keys are unique by construction; values can repeat + } + var ttopClone = clone_expression(top) + // keys/each_kv spell their element `-const` (iterator-variance concern); that flag must not leak into emitted var/buffer type spellings (`array -const>` breaks push_clone unification). + if (ttopClone._type != null && ttopClone._type.firstType != null) { + ttopClone._type.firstType.flags.removeConstant = false + } + var elemT = clone_type(tabCall._type.firstType) + elemT.flags.removeConstant = false + return run_splice_adapter(calls, ttopClone, ttopClone, + new TableAdapter(tabExpr = clone_expression(tabCall.arguments[0]), srcName = qn("tsrc", at), + elemType = elemT, lane = lane), exprIsIter, at) + } + } top = peel_each(top) var topClone = clone_expression(top) return run_splice_adapter(calls, top, topClone, diff --git a/daslib/linq_fold.md b/daslib/linq_fold.md index eda6a38c1..8f313f4b4 100644 --- a/daslib/linq_fold.md +++ b/daslib/linq_fold.md @@ -69,7 +69,7 @@ The adapter is an abstract `class SourceAdapter` (`[macro_interface]`, so every Emit fns hold a `SourceAdapter?` (via `EmitCtx.src` or an `adapter` local) and call these virtually. **daslang classes have no `is`/`as` downcast** (variant-only), so source-specific data is never pulled off a base pointer by downcasting — it goes through virtual methods. 
Beyond the 4 dispatch methods the base also declares 6 default-null **per-operation hook methods** (`emit_loop_or_count` / `emit_reverse_skip_into_tail` / `emit_reverse_last_backward` / `emit_distinct_take_loop` / `build_group_by_adapter` / `emit_join_hook`) that the owning source overrides; the generic lane falls back to its inline (array) body when the hook returns null. (`XmlAdapter` overrides the two reverse hooks with a **backward DOM walk** — `last_child`/`previous_sibling`, both O(1) in pugixml: `emit_reverse_skip_into_tail` collects only the last N children for `reverse |> take(N)` (m5f `reverse_take` 88.9 → 0.0 ns/op), and `emit_reverse_last_backward` returns the last element in one step for a no-predicate `last()` / `reverse |> first`. Predicated `[where] |> last` stays on the forward walk — reverse DOM traversal is ~2× cache-hostile per node, profiled — and the named 3-arg `from_xml_node` form falls back to the buffer path since pugixml has no last-named-child primitive.) (`emit_join_hook` is the standalone-join dispatch: the single `join_general` pattern's thin `emit_join` routes to it, so each source supplies its own join body — array `for`+2-param invoke, decs `for_each_archetype`, XML field-pruned DOM walk — with no parallel per-source join pattern.) It also declares **capability methods** the source answers about itself — `can_group_by` / `can_join` / `can_reserve_by_length` / `has_own_loop_or_count_lane` (bool, default false) and `name_prefix` (string) — which replaced the old `kind() : AdapterKind` enum + per-site switches, so a new source only implements the methods (no central enum to extend). The `can_group_by` / `can_join` capabilities are queried from the `can_group_by_source` / `can_join_source` `RequiresPredicate`s (which thread the adapter), so the single `group_by` / `join_general` pattern admits any capable source and the adapter's `build_group_by_adapter` / `emit_join_hook` does the source/srcb-shape gating (null → tier-2). 
Two transitional getters remain — `arrayTop()`/`arraySrcName()` (default null/"" on base, overridden by `ArrayAdapter`); the decs-specific getters were removed in G2a so the base (and thus `linq_fold_common`) is free of `DecsAdapter`/ECS coupling. One decorator subclass lives in `linq_fold_common`: `ProjectedSourceAdapter` wraps any inner adapter to absorb a leading `_select(f)` source projection (the `srcsel` slot) — it binds `projName = f(rawElem)` atop the per-element body and delegates `wrap_source_loop`/`wrap_invoke`/`name_prefix` to the inner adapter, leaving the base no-op `arrayTop`/`arraySrcName`/`can_reserve_by_length` so source-direct fast paths (which would bypass the projection) stay disabled. This lets order/distinct splices fuse over `source |> _select(f) |> …` for any source. -**Realized module layout (post-G3d):** `linq_fold_common` (kernel + abstract base + adapter-pure generic lanes — terminator/fold-array plus the source-generic loop_or_count / counter / accumulator / early-exit lanes, with `LoopDispatch` + the per-op `!supports_direct_return` state path that lets nested-callback sources ride the early-exit lane — + `splice_patterns` + `DecsBridgeShape`/`extract_decs_bridge`) ← `linq_fold_array` (Array/Zip/ArrayJoin adapters + the zip/join emit `emit_zip`/`emit_array_join` + array row-builders) and `linq_fold_decs` (Decs/DecsJoin adapters + decs-bridge visitors + the decs dispatcher `emit_loop_or_count_lane_decs` + the decs-specific hooks `emit_decs_count_archsize`/`emit_decs_reverse_skip_into_tail`/`emit_decs_join_impl`/`emit_decs_min_max_by` — the parallel terminator scaffold is gone, decs rides the generic lanes via `DecsAdapter`); the engine `linq_fold` requires all three and holds only the dispatcher + the `LinqFold` macro + the single `register_all_linq_fold_rows`. Adding a source = a new `linq_fold_.das` subclass module + one `require` + one `build__rows()` call in the engine registrar. 
+**Realized module layout (post-G3d):** `linq_fold_common` (kernel + abstract base + adapter-pure generic lanes — terminator/fold-array plus the source-generic loop_or_count / counter / accumulator / early-exit lanes, with `LoopDispatch` + the per-op `!supports_direct_return` state path that lets nested-callback sources ride the early-exit lane — + `splice_patterns` + `DecsBridgeShape`/`extract_decs_bridge`) ← `linq_fold_array` (Array/Zip/ArrayJoin adapters + the zip/join emit `emit_zip`/`emit_array_join` + array row-builders) and `linq_fold_decs` (Decs/DecsJoin adapters + decs-bridge visitors + the decs dispatcher `emit_loop_or_count_lane_decs` + the decs-specific hooks `emit_decs_count_archsize`/`emit_decs_reverse_skip_into_tail`/`emit_decs_join_impl`/`emit_decs_min_max_by` — the parallel terminator scaffold is gone, decs rides the generic lanes via `DecsAdapter`); the engine `linq_fold` requires all three and holds only the dispatcher + the `LinqFold` macro + the single `register_all_linq_fold_rows`. Adding a source = a new `linq_fold_.das` subclass module + one `require` + one `build__rows()` call in the engine registrar. Later sources follow that recipe: `linq_fold_json` (`JsonAdapter`/`JsonJoinAdapter`), `pugixml/linq_fold_xml` (`XmlAdapter`, optional), `sqlite/linq_fold_sql` (pass-through detector), and `linq_fold_table` (`TableAdapter` over `each_kv`/`keys`/`values` heads — kv usage-pruned slot walks, no new rows; arc plan in `benchmarks/sql/LINQ_TO_TABLE.md`). ## Goal diff --git a/daslib/linq_fold_table.das b/daslib/linq_fold_table.das new file mode 100644 index 000000000..b089bb0b6 --- /dev/null +++ b/daslib/linq_fold_table.das @@ -0,0 +1,212 @@ +options gen2 +options indenting = 4 +options no_unused_block_arguments = false +options no_unused_function_arguments = false +options _comment_hygiene = true + +module linq_fold_table shared public + +//! linq_fold table source adapter: ``TableAdapter`` + ``extract_table_source``. Lets ``_fold`` over an +//! 
``each_kv(tab)`` / ``keys(tab)`` / ``values(tab)`` chain emit a fused slot-walk loop over the table +//! instead of riding the generic iterator tier. The kv lane scans which sides of the pair the chain +//! touches and prunes the walk to a keys-only / values-only single iterator (the table analog of XML +//! field-pruning); bare ``count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv +//! elements is dropped (keys are unique by construction). In-tree companion to ``daslib/linq_fold`` +//! (required unconditionally; the matcher returns null for non-table chains). Emits ``keys`` / ``values`` +//! / the kv zip BY NAME at the user's splice site. See benchmarks/sql/LINQ_TO_TABLE.md. + +require daslib/ast_boost +require daslib/ast_match +require daslib/templates_boost +require daslib/linq_fold_common public + +enum TableLane { + KV + KEYS + VALUES +} + +// Per-element table walk; the KV lane prunes to the cheapest iterator set from the body's +// `it.key` / `it.value` usage (the table analog of XML field-pruning, see LINQ_TO_TABLE.md). +[macro_function] +def private build_table_walk(lane : TableLane; srcName, bindName : string; var body : Expression?; + var breakGuard : Expression?; at : LineInfo) : Expression? { + var inner : array + if (breakGuard != null) { + inner |> push <| qmacro_expr() { + break if ($e(breakGuard)) + } + } + // Literal loop-var names below: the qmacro grammar only allows $i() in the FIRST iterator slot of a multi-source for, so the zip header uses fixed names (same trade ZipAdapter makes with itA/itB) — they live only inside the generated invoke. keys() deliberately yields NON-const elements (writable temp copies), so the engine-visible bind is a `let` rebind of the loop var — keys are workhorse types, the copy is free, and downstream ==const composition (push_clone of a bare projected key) needs the const. 
+ if (lane == TableLane.KEYS) { + inner |> push <| qmacro_expr() { + let $i(bindName) = _tab_kv_key_ + } + inner |> push(body) + return <- qmacro_block() { + for (_tab_kv_key_ in keys($i(srcName))) { + $b(inner) + } + } + } + if (lane == TableLane.VALUES) { + // values over the const table param yield `V& const` — bind directly, no rebind copy + inner |> push(body) + return <- qmacro_block() { + for ($i(bindName) in values($i(srcName))) { + $b(inner) + } + } + } + // KV lane + let vName = "_tab_kv_value_" + var (allUsed, usedFields) = collect_row_usage(body, bindName) + if (allUsed) { + inner |> push <| qmacro_expr() { + let $i(bindName) = (key = _tab_kv_key_, value = _tab_kv_value_) + } + inner |> push(body) + return <- qmacro_block() { + for (_tab_kv_key_, _tab_kv_value_ in keys($i(srcName)), values($i(srcName))) { // nolint:LINT002 parallel-for must bind both vars + $b(inner) + } + } + } + let useKey = usedFields |> has_value("key") + let useValue = usedFields |> has_value("value") + let kLocal = qn("tkey", at) + var fieldToLocal <- { "key" => kLocal, "value" => vName } + body = flatten_row_to_locals(body, bindName, fieldToLocal) + if (useKey) { + inner |> push <| qmacro_expr() { + let $i(kLocal) = _tab_kv_key_ + } + } + inner |> push(body) + if (useValue && !useKey) { + return <- qmacro_block() { + for ($i(vName) in values($i(srcName))) { + $b(inner) + } + } + } + if (useKey && useValue) { + return <- qmacro_block() { + for (_tab_kv_key_, _tab_kv_value_ in keys($i(srcName)), values($i(srcName))) { // nolint:LINT002 parallel-for must bind both vars + $b(inner) + } + } + } + // key-only AND no-field-touched (e.g. 
bare count walk) both ride the cheaper keys iterator + return <- qmacro_block() { + for (_tab_kv_key_ in keys($i(srcName))) { // nolint:LINT002 body may not read the key (bare-count walk) + $b(inner) + } + } +} + +// Single flat for-loop over the table inside a 1-param invoke binding the table, like ArrayAdapter over +// an array — except the loop iterator(s) derive from the table param at the splice site. +class TableAdapter : SourceAdapter { + tabExpr : Expression? // the table expression (argument of each_kv/keys/values) + srcName : string // invoke param name binding the table + elemType : TypeDeclPtr // KV: tuple; KEYS: K; VALUES: V + lane : TableLane + def override name_prefix() : string { + return "tab_" + } + def override supports_direct_return() : bool { + return true // single flat for-loop inside the invoke; a mid-loop `return` exits the invoke + } + def override can_reserve_by_length() : bool { + return true // length(tab) is O(1); the shared reserve hint reads arrayTop/arraySrcName + } + def override arrayTop() : Expression? { + // Feeds the reserve hint (type_has_length covers tables). The backward-index reverse lanes that + // also read arrayTop gate on array_source, which is false here — matchTop stays iterator-typed. + return tabExpr + } + def override arraySrcName() : string { + return srcName + } + def override bind_name(at : LineInfo) : string { + return qn("it", at) + } + def override element_type() : TypeDeclPtr { + return clone_type(elemType) + } + def override count_shortcut(opName : string; at : LineInfo) : Expression? { + return emit_length_shortcut(opName, tabExpr, srcName, at) + } + def override wrap_source_loop(loopShape : LoopDispatch; var body : Expression?; at : LineInfo) : Expression? { + return build_table_walk(lane, srcName, bind_name(at), body, null, at) + } + def override emit_distinct_take_loop(bindName : string; takenName : string; takeLimName : string; var perElement : Expression?; at : LineInfo) : Expression? 
{ + var breakGuard = qmacro($i(takenName) >= $i(takeLimName)) + return build_table_walk(lane, srcName, bindName, perElement, breakGuard, at) + } + def override wrap_invoke(var stmts : array; retType : TypeDeclPtr; wrapIter : bool; at : LineInfo) : Expression? { + // Const-accepting param: the source table is often a `let`, and a non-const source adds-const cleanly. + var tabType = strip_const_ref(clone_type(tabExpr._type)) + tabType.flags.constant = true + var tabClone = clone_expression(tabExpr) + tabClone.genFlags.alwaysSafe = true + let sn = srcName + var emission : Expression? + if (retType != null) { + emission = qmacro(invoke($($i(sn) : $t(tabType)) : $t(retType) { + $b(stmts) + }, $e(tabClone))) + } else { + emission = qmacro(invoke($($i(sn) : $t(tabType)) { + $b(stmts) + }, $e(tabClone))) + } + emission = finalize_invoke(emission, at) + if (wrapIter) { + emission = qmacro($e(emission).to_sequence_move()) + emission.force_generated(true) + } + return emission + } +} + +// Recognize an `each_kv(tab)` / `keys(tab)` / `values(tab)` chain top. Returns the call (caller reads +// arguments[0] = the table, `_type.firstType` = element); null otherwise. Name + table-typed-arg match, +// like extract_json_source — the strong arg-type gate keeps an unrelated user `keys` from firing this. +[macro_function] +def extract_table_source(var top : Expression?) : ExprCall? { + if (top == null || !(top is ExprCall)) return null + var c = top as ExprCall + let name = get_call_short_name(c) + if ((name != "each_kv" && name != "keys" && name != "values") + || c._type == null || !c._type.isIterator || c._type.firstType == null + || (c.arguments |> length) != 1) { + return null + } + let srcT = c.arguments[0]._type + if (srcT == null || !srcT.isGoodTableType) return null + return c +} + +// Drop plain `distinct` over raw keys/kv elements — table keys are unique by construction, so the whole +// dedup-set machinery is a no-op. 
Only when every call BEFORE the distinct preserves element uniqueness +// (filters/ranges/reorders — never a `select`, which reshapes elements). `distinct_by` keeps its own key. +[macro_function] +def drop_redundant_distinct(var calls : array>) { + var dropIdx = -1 + for (i in range(length(calls))) { + let name = calls[i]._1.name + if (name == "distinct") { + dropIdx = i + break + } + // uniqueness-preserving prefix ops only + if (name != "where_" && name != "skip" && name != "take" + && name != "skip_while" && name != "take_while" && name != "reverse") { + return + } + } + if (dropIdx < 0 || length(calls) <= 1) return // keep a bare `distinct`-only chain for the generic lanes + calls |> erase(dropIdx) +} diff --git a/doc/source/reference/linq_fold_patterns.rst b/doc/source/reference/linq_fold_patterns.rst index eb0db3dda..2d5bdc524 100644 --- a/doc/source/reference/linq_fold_patterns.rst +++ b/doc/source/reference/linq_fold_patterns.rst @@ -148,6 +148,9 @@ Source-side entry points * - ``unsafe(from_xml_node(node[, name], type))`` - ``extract_xml_source`` (``XmlAdapter``, ``modules/dasPUGIXML/daslib/linq_fold_xml.das``) - Optional source — only when the ``pugixml`` module is linked (``require ?pugixml`` + ``static_if (typeinfo builtin_module_exists(pugixml))``). Emits an inlined DOM child-element walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): the chain body is scanned for the ``Row`` fields it reads, and only those attributes are read via ``read_xml_field`` into scalar locals — unread fields (notably ``string`` fields, whose ``clone_string`` is the alloc cost) are never touched, so a float-only chain runs alloc-free and JIT beats the equivalent SQLite query. A whole-row escape (``to_array`` / identity ``_select(_)`` / pass-to-fn) routes to the full ``build_xml_row`` instead. 
The ``XmlAdapter`` **rides every pattern row** (``try_splice_patterns`` runs with no ``onlyRow`` restriction); per-row ``requires`` predicates and the adapter's capability hooks (``can_join`` / ``can_group_by`` / ``defers_materialization`` / the ``non_array_source`` gate) decide what fuses, and a shape it can't fuse cascades to tier-2 — see :ref:`linq_fold_xml_patterns` for the full fuse/defer breakdown. ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``) and the node is passed by value (``var root`` — ``_fold``'s macro-arg inference skips the const&→value copy). + * - ``unsafe(each_kv(tab))`` / ``keys(tab)`` / ``values(tab)`` + - ``extract_table_source`` (``TableAdapter``, ``daslib/linq_fold_table.das``) + - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. 
``can_join`` / ``can_group_by`` are off and reverse has no backward slot walk — those shapes cascade to tier-2 (the join probe and key-lookup folds are staged: see ``benchmarks/sql/LINQ_TO_TABLE.md``). ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. * - ``unsafe(from_json(jv, type))`` - ``extract_json_source`` (``JsonAdapter``, ``daslib/linq_fold_json.das``) - In-tree source — the adapter is compiled in unconditionally (no ``static_if`` gate, unlike XML's pugixml one), but a program only pulls JSON into scope by requiring ``json`` / ``json_boost`` itself. ``extract_json_source`` matches a ``from_json`` whose first argument is a ``json::JsonValue?``, so a JSON-less program returns null and the chain falls to the array tier. The adapter pulls in **no** json dependency — it emits ``from_json`` / ``read_json_field`` by name (resolved at the user's splice site, like ``linq_fold_decs`` emits ``for_each_archetype``; ``from_JV`` is emitted only for a non-struct element type). Emits an inlined ``for (e in jv.value as _array)`` walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): only the keys the chain reads are pulled via ``read_json_field`` by name — unread keys (notably ``string`` fields whose materialization clones) are never touched, so a scalar-only chain skips ~all of the full per-row build (3.6× over the full materialize — see ``benchmarks/micro/json_source_shapes.das``). A whole-row escape reads **every** top-level field by name (``emit_full_row_by_name``), so a custom whole-row ``from_JV(Row)`` override is **not** honored (Option B — this is a flat query source, not a deserializer; materialize the array with an explicit ``from_JV`` first for that). ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``). 
Deferred materialization mirrors XML: order/distinct/take buffer a cheap ``(orderKey, JsonValue?)`` surrogate and materialize only the K survivors — by name (``emit_full_row_by_name``), so a struct survivor reads each field by key; only a non-struct ``Row`` falls back to ``outBind <- from_JV(handle, type)``. The ``JsonAdapter`` also fuses ``join`` / ``join |> group_by`` (``emit_join_hook`` + ``JsonJoinAdapter`` off ``build_group_by_adapter``'s upstream-join arm), reusing the array-join machinery (``build_join_standalone_pieces`` / ``build_join_adapter_pieces``): srcB is collected into a ``table>`` and the field-pruned array walk is the probe side, so the join key reads only its own field per element (e.g. ``read_json_field(jcur, "brand", …)``). Standalone ``group_join`` and a trailing ``where`` / ``select`` / ``count`` over group-join rows defer to tier-2, mirroring XML. diff --git a/tests/linq/test_linq_table_source.das b/tests/linq/test_linq_table_source.das index e85407216..bcbfef726 100644 --- a/tests/linq/test_linq_table_source.das +++ b/tests/linq/test_linq_table_source.das @@ -2,10 +2,211 @@ options gen2 require dastest/testing_boost public require daslib/linq_boost +require strings -// Tier-2 LINQ over a table source via each_kv (the fused TableAdapter lands separately). -// each_kv is [unsafe_outside_of_for], so a chain head needs the explicit unsafe(...) wrap — -// same contract as each(arr) outside a fused chain. +// Table source (each_kv / keys / values) through _fold — fused TableAdapter lanes must agree with +// hand loops over the same table. Slot order is unspecified but stable per table instance, so +// order-sensitive expectations compare against a keys()/values() walk of the same table. 
+ +struct Pt { + x : int + y : int +} + +def make_int_table(n : int) : table { + var t <- { for (i in range(n)); i => i * 10 } + return <- t +} + +def make_pt_table : table { + var t : table + t |> insert("a", Pt(x = 1, y = 10)) + t |> insert("b", Pt(x = 2, y = 20)) + t |> insert("c", Pt(x = 3, y = 30)) + t |> insert("d", Pt(x = 4, y = 40)) + return <- t +} + +[test] +def test_table_fold_count_shortcuts(t : T?) { + t |> run("bare count is length") @(t : T?) { + var tab <- make_int_table(10) + t |> equal(_fold(each_kv(tab).count()), 10) + t |> equal(_fold(keys(tab).count()), 10) + t |> equal(_fold(values(tab).count()), 10) + t |> equal(_fold(each_kv(tab).long_count()), 10l) + delete tab + } + t |> run("count with predicate walks") @(t : T?) { + var tab <- make_int_table(10) + t |> equal(_fold(each_kv(tab)._where(_.key % 2 == 0).count()), 5) + t |> equal(_fold(each_kv(tab)._where(_.value > 50).count()), 4) + delete tab + } + t |> run("empty table") @(t : T?) { + let e : table + t |> equal(_fold(each_kv(e).count()), 0) + t |> equal(_fold(each_kv(e).any()), false) + t |> equal(_fold(each_kv(e)._where(_.key > 0).count()), 0) + } +} + +[test] +def test_table_fold_accumulators(t : T?) { + t |> run("sum/min/max over values and keys") @(t : T?) { + var tab <- make_int_table(10) + t |> equal(_fold(values(tab).sum()), 450) + t |> equal(_fold(keys(tab).sum()), 45) + t |> equal(_fold(each_kv(tab)._select(_.value).sum()), 450) + t |> equal(_fold(each_kv(tab)._select(_.key).min()), 0) + t |> equal(_fold(each_kv(tab)._select(_.value).max()), 90) + // body touches both sides — zipped walk + t |> equal(_fold(each_kv(tab)._select(_.key + _.value).sum()), 495) + delete tab + } + t |> run("early exit: any/all/contains") @(t : T?) 
{ + var tab <- make_int_table(10) + t |> equal(_fold(each_kv(tab).any()), true) + t |> equal(_fold(each_kv(tab)._where(_.value > 80).any()), true) + t |> equal(_fold(each_kv(tab)._where(_.value > 90).any()), false) + t |> equal(_fold(each_kv(tab)._select(_.key)._all(_ >= 0)), true) + t |> equal(_fold(values(tab).contains(40)), true) + t |> equal(_fold(values(tab).contains(41)), false) + delete tab + } +} + +[test] +def test_table_fold_to_array_agreement(t : T?) { + t |> run("kv where+select agrees with hand loop, in slot order") @(t : T?) { + var tab <- make_pt_table() + var expected : array + for (_k, v in keys(tab), values(tab)) { + if (v.x > 1) { + expected |> push(v.y) + } + } + var got <- _fold(each_kv(tab)._where(_.value.x > 1)._select(_.value.y).to_array()) + t |> equal(length(got), length(expected)) + for (i in range(length(expected))) { + t |> equal(got[i], expected[i]) + } + delete got + delete expected + delete tab + } + t |> run("keys to_array in slot order") @(t : T?) { + var tab <- make_pt_table() + var expected : array + for (k in keys(tab)) { + expected |> push(k) + } + // bare `.to_array()` is not a recognized chain (any source) — keep a where on it + var got <- _fold(keys(tab)._where(_ != "zzz").to_array()) + t |> equal(length(got), length(expected)) + for (i in range(length(expected))) { + t |> equal(got[i], expected[i]) + } + delete got + delete expected + delete tab + } + t |> run("whole-kv escape: identity to_array") @(t : T?) { + var tab <- make_pt_table() + var got <- _fold(each_kv(tab)._where(_.value.x >= 3).to_array()) + t |> equal(length(got), 2) + for (kv in got) { + let expectedPt = tab?[kv.key] ?? Pt() + t |> equal(expectedPt.x, kv.value.x) + t |> equal(expectedPt.y, kv.value.y) + } + delete got + delete tab + } +} + +[test] +def test_table_fold_order_distinct_take(t : T?) { + t |> run("order_by key descending") @(t : T?) 
{ + var tab <- make_int_table(10) + var got <- _fold(each_kv(tab)._select(_.key).order_by_descending(@(k : int) => k).to_array()) + t |> equal(length(got), 10) + for (i in range(10)) { + t |> equal(got[i], 9 - i) + } + delete got + delete tab + } + t |> run("redundant distinct over keys/kv is dropped but correct") @(t : T?) { + var tab <- make_int_table(10) + t |> equal(_fold(each_kv(tab).distinct().count()), 10) + t |> equal(_fold(each_kv(tab)._where(_.key > 4).distinct().count()), 5) + t |> equal(_fold(keys(tab).distinct().count()), 10) + delete tab + } + t |> run("values distinct stays real") @(t : T?) { + var dup : table + dup |> insert(1, 7) + dup |> insert(2, 7) + dup |> insert(3, 8) + t |> equal(_fold(values(dup).distinct().count()), 2) + t |> equal(_fold(each_kv(dup)._select(_.value).distinct().count()), 2) + delete dup + } + t |> run("take/skip ride the walk") @(t : T?) { + var tab <- make_int_table(10) + t |> equal(_fold(each_kv(tab)._select(_.value).take(3).count()), 3) + t |> equal(_fold(each_kv(tab)._select(_.key).skip(4).count()), 6) + var firstTwo <- _fold(keys(tab).take(2).to_array()) + var expected : array + for (k in keys(tab)) { + if (length(expected) < 2) { + expected |> push(k) + } + } + t |> equal(length(firstTwo), 2) + t |> equal(firstTwo[0], expected[0]) + t |> equal(firstTwo[1], expected[1]) + delete firstTwo + delete expected + delete tab + } + t |> run("first / first_or_default") @(t : T?) { + var tab <- make_int_table(10) + t |> equal(_fold(each_kv(tab)._where(_.key == 7)._select(_.value).first()), 70) + t |> equal(_fold(each_kv(tab)._where(_.key == 99)._select(_.value).first_or_default(-1)), -1) + delete tab + } +} + +[test] +def test_table_fold_iterator_result(t : T?) { + t |> run("chain consumed as iterator") @(t : T?) 
{ + var tab <- make_int_table(6) + var s = 0 + for (v in _fold(each_kv(tab)._where(_.key % 2 == 0)._select(_.value))) { + s += v + } + t |> equal(s, 0 + 20 + 40) + delete tab + } +} + +[test] +def test_table_fold_set_form(t : T?) { + t |> run("keys over table set") @(t : T?) { + var s : table + s |> insert("alpha") + s |> insert("beta") + s |> insert("gamma") + t |> equal(_fold(keys(s).count()), 3) + t |> equal(_fold(keys(s)._where(_ |> length() > 4).count()), 2) + delete s + } +} + +// Tier-2 over the raw each_kv iterator (no _fold) — the [unsafe_outside_of_for] contract requires the +// explicit unsafe(...) wrap at a bare chain head; fused chains rewrite the head before inference. [test] def test_each_kv_tier2(t : T?) { From c00f655c29c16fd00a1b5e8fc40e7295ed7ebff3 Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Thu, 11 Jun 2026 00:10:11 -0700 Subject: [PATCH 04/11] linq-table arc: link #3096 (qmacro multi-source for $i limitation) in the plan doc Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index 72acd2be6..92483eae5 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -14,6 +14,8 @@ Stage 2 findings: the tier-2 cells stages 4–5 erase. - The qmacro grammar only allows `$i()` in the FIRST iterator slot of a multi-source `for` — the kv zip header uses literal `_tab_kv_key_` / `_tab_kv_value_` names (ZipAdapter's itA/itB trade). + Filed as [#3096](https://github.com/GaijinEntertainment/daScript/issues/3096) (grammar fix + and/or a templates_boost loop-builder helper). - `keys()` yields NON-const elements (writable temp copies) — the engine-visible bind is a `let` rebind (workhorse copy, free); push_clone's `==const` composition needs it. 
- `keys`/`each_kv` spell their element `-const` (iterator variance); the dispatcher clears From 29d23baf6dada5ce73c4638048eb799b961d014f Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Thu, 11 Jun 2026 00:54:21 -0700 Subject: [PATCH 05/11] =?UTF-8?q?linq=5Fdas:=20tables=20as=20%linq!=20sour?= =?UTF-8?q?ces=20=E2=80=94=20untyped=20`from`=20dispatches=20via=201-arg?= =?UTF-8?q?=20from=5Fin?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit `from kv in tab` over table → each_kv (kv.key/kv.value), table set → keys, anything else → each (arrays unchanged, ast-verified identical emission). The reader can't tell an array from a table, so every untyped fused source now emits `from_in(src)` and FromInMacro dispatches by the inferred value type. FromInMacro rejects switch from `return call` to macro_error + return null (the _sql idiom) — returning the call report-ast-changes every pass and churns to the 50-pass infer cap (30507). The not-inferred arm also gates on isAutoOrAlias and doubles as the defer for local sources whose type settles a pass later. Joins over tables already work on either side at tier-2 (tested both ways); cross/SelectMany over tables stays a named deferred edge in LINQ_TO_TABLE.md. Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 29 ++++++- daslib/linq_das.das | 81 +++++++++++++------ doc/source/reference/linq_das.rst | 54 +++++++++---- tests/linq/failed_linq_das_table.das | 29 +++++++ tests/linq/test_linq_das.das | 115 +++++++++++++++++++++++++++ 5 files changed, 267 insertions(+), 41 deletions(-) create mode 100644 tests/linq/failed_linq_das_table.das diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index 92483eae5..10a62d34b 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -4,7 +4,28 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). 
Plan of reco `table` / `table` as the 6th `_fold` source, plus the `to_table` sink. Edited in-place as PRs land. -Status: **stage 2 committed** (TableAdapter + m7; stage 1 = `each_kv` builtin, 8751bb9ba). +Status: **stage 3 committed** (`%linq!` table sources; stage 2 = TableAdapter + m7, 571fe879e; +stage 1 = `each_kv` builtin, 8751bb9ba). + +Stage 3 findings: +- The untyped `from c in ` now emits the **1-arg `from_in(src)`** for every source (the reader + can't tell an array from a table); FromInMacro dispatches at infer time — `table` → + `each_kv`, `table` set → `keys`, anything else → `each` (arrays land on the identical fused + emission as before, ast-verified). The `unsafe($c(...))` qmacro form puts `alwaysSafe` on the call + itself (templates_boost `carry_tag_safe_flags`), so extractors/peel_each still see a bare ExprCall. +- **Call-macro reject/defer idiom**: macro_error + `return null` (the `_sql` idiom). Returning + `call` after an error report-ast-changes every pass and churns to the 50-pass infer cap (30507). + All FromInMacro rejects switched to null; the "source type is not inferred" arm (now also gated on + `isAutoOrAlias`) doubles as the DEFER — errors clear per pass while other inference progresses + (a local `var tab <- {...}` source reaches the visit before its own type settles), and only stick + if the source genuinely never infers. +- A rejecting `from_in` leaves the chain head unresolved, so `_fold`'s "expecting linq expression" + verify lands on the same generated line with the same cerr — the error report collapses the pair + to ONE 50503 (`+1 more on this line`); failed-test `expect` counts are post-collapse. +- **Joins over tables already work on either side** (tier-2; the kv pair is that side's row) — + tested both directions. Stage 5's probe will optimize the table-as-srcB case. +- The non-copyable-value gate composes through the reader unchanged: fused dispatch declines, + tier-2 instantiates the real `each_kv`, one clean 31400. 
Stage 2 findings: - m7 INTERP profile (2026-06-10 sweep): pruned scans sit between array and XML — `sum_aggregate` @@ -120,6 +141,12 @@ End of arc: `skills/linq.md` + linq docs mention the table source. ## Deferred edges (named, not built) +- **Multiple-`from` (cross / SelectMany) over tables**: the unfused `_cross_join` arm passes the + bare source text so the array×array overload resolves without an `each` unsafe trip; a table + there has no overload (confusing 30303 cascade). `cross_join` has iterator overloads, so routing + the unfused untyped sources through `from_in` would work — but it changes overload selection for + every existing untyped array cross query. Documented as unsupported (join a table instead); + revisit on demand. - **Key-as-handle deferred materialization**: for `order_by` over kv with large (copyable) values, buffer `(orderKey, key)` surrogates and materialize survivors via `tab?[key]` — K probes instead of N value copies. The table handle is its key; clean fit for the existing diff --git a/daslib/linq_das.das b/daslib/linq_das.das index b0fa73a79..4162b6f2e 100644 --- a/daslib/linq_das.das +++ b/daslib/linq_das.das @@ -14,17 +14,21 @@ require daslib/linq_boost public // The `%linq! … %%` inline reader macro rewrites a C#-like query into a `_fold(...)` chain: // // %linq! from c in cars where c.price > 100 orderby c.price select c.name %% -// → ( _fold( each(cars) |> _where($(c) => c.price > 100) |> _order_by($(c) => c.price) |> _select($(c) => c.name) |> to_array() ) ) +// → ( _fold( from_in(cars) |> _where($(c) => c.price > 100) |> _order_by($(c) => c.price) |> _select($(c) => c.name) |> to_array() ) ) // %linq! from c in cars group c by c.brand %% -// → ( _fold( each(cars) |> _group_by_lazy($(c) => c.brand) |> to_array() ) ) +// → ( _fold( from_in(cars) |> _group_by_lazy($(c) => c.brand) |> to_array() ) ) // %linq! 
from c in cars join d in dealers on c.brand equals d.brand select (N = c.name, City = d.city) %% -// → ( _fold( each(cars) |> _join(each(dealers), $(c, d) => c.brand == d.brand, $(c, d) => (N = c.name, City = d.city)) |> to_array() ) ) +// → ( _fold( from_in(cars) |> _join(from_in(dealers), $(c, d) => c.brand == d.brand, $(c, d) => (N = c.name, City = d.city)) |> to_array() ) ) // %linq! from c in cars from d in dealers select (N = c.name, City = d.city) %% // → ( _fold( (cars) |> _cross_join((dealers), $(c, d) => (N = c.name, City = d.city)) |> to_array() ) ) // -// An untyped `from c in ` is an array source. A typed `from c : Row in ` selects a non-array -// source: `decs` (a keyword marker → `from_decs_template`), or any value whose type the `from_in` call -// macro dispatches (a SQL runner → `select_from`, an XML node → `from_xml_node`). The range variable is +// (`from_in(arr)` resolves during inference to `each(arr)` — see FromInMacro at the bottom.) +// +// An untyped `from c in ` is an array or table source — the 1-arg `from_in` call macro dispatches +// by the value type (array → `each`, `table` → `each_kv` with `kv.key`/`kv.value`, `table` set → +// `keys`). A typed `from c : Row in ` selects a row-typed source: `decs` (a keyword marker → +// `from_decs_template`), or any value whose type the 2-arg `from_in` dispatches (a SQL runner → +// `select_from`, an XML node → `from_xml_node`). The range variable is // spliced verbatim as the block parameter. `orderby [descending]` is a single sort key. `group c // by ` is a terminal yielding `tuple>` per bucket — IGrouping, in-memory sources // only (over SQL it errors: SQL needs an aggregating projection). 
A trailing `iterator` keyword yields an @@ -234,13 +238,14 @@ def private strip_trailing_keyword(text : string; kw : string) : tuple)" return "from_in({srcText}, type<{rowType}>)" } @@ -1230,20 +1235,45 @@ def private transpile_query(query : string; prog : ProgramPtr; at : LineInfo) : [call_macro(name="from_in")] class FromInMacro : AstCallMacro { - //! Typed-source dispatcher for the C# query form `from c : Row in `. Rewrites - //! `from_in(src, type)` to the concrete `_fold` source builder by `src`'s type — a SQL runner - //! → `select_from`, an XML node → `from_xml_node`, a JSON value → `from_json`. Must be a call macro: - //! a plain function would leave `from_in(...)` at the chain head, which `_fold`'s name-based source + //! Source dispatcher for the C# query `from` clause. The typed form `from c : Row in ` arrives + //! as `from_in(src, type)` and rewrites to the concrete `_fold` source builder by `src`'s type — + //! a SQL runner → `select_from`, an XML node → `from_xml_node`, a JSON value → `from_json`. The + //! UNTYPED form `from c in ` arrives as `from_in(src)` — a table → `each_kv` (`keys` for the + //! `table` set form), anything else → `each` (the array path). Must be a call macro: a plain + //! function would leave `from_in(...)` at the chain head, which `_fold`'s name-based source //! detection cannot route. def override visit(prog : ProgramPtr; mod : Module?; var call : ExprCallMacro?) : ExpressionPtr { - if (length(call.arguments) != 2) { - macro_error(prog, call.at, "from_in(src, type): expected 2 arguments") - return call + // Every reject below is macro_error + return null (the `_sql` idiom): infer stabilizes and the + // error sticks. Returning `call` instead would report-ast-changed every pass and churn to the + // 50-pass cap (30507). The not-inferred arm doubles as the DEFER: errors clear per pass, so while + // other inference still progresses (e.g. 
a local `var tab <- {...}` source settling) the error is + // discarded and the macro re-runs; it only sticks if the source genuinely never infers. + if (length(call.arguments) != 1 && length(call.arguments) != 2) { + macro_error(prog, call.at, "from_in(src[, type]): expected 1 or 2 arguments") + return null } let srcT = call.arguments[0]._type - if (srcT == null) { + if (srcT == null || srcT.isAutoOrAlias) { macro_error(prog, call.at, "from_in: source type is not inferred") - return call + return null + } + if (length(call.arguments) == 1) { + // Untyped `from c in ` — table sources dispatch on the value type; anything else keeps + // the historical array emit (`each` carries its own diagnostics for non-iterable sources). + // `unsafe(...)` over a $c tag lands as alwaysSafe on the call itself ([unsafe_outside_of_for] + // heads are fine fused or unfused), and extractors/peel_each see the bare ExprCall. + if (srcT.isGoodTableType) { + if (srcT.secondType == null || srcT.secondType.baseType == Type.tVoid) + return qmacro(unsafe($c("keys")($e(call.arguments[0])))) + return qmacro(unsafe($c("each_kv")($e(call.arguments[0])))) + } + return qmacro(unsafe($c("each")($e(call.arguments[0])))) + } + // A table source carries its row shape (tuple / the key type) — the annotation has + // nothing to add and the typed builders below would all mis-fire. Reject early with the fix. 
+ if (srcT.isGoodTableType) { + macro_error(prog, call.at, "linq: a table source takes no row-type annotation — write `from kv in ` (kv.key / kv.value)") + return null } // SQL: db is a sqlite_boost::SqlRunner → select_from(db, type) if (srcT.structType != null && srcT.structType.name == "SqlRunner" @@ -1260,8 +1290,8 @@ class FromInMacro : AstCallMacro { && srcT.firstType.structType.name == "JsonValue" && srcT.firstType.structType._module != null && srcT.firstType.structType._module.name == "json") return qmacro(unsafe($c("from_json")($e(call.arguments[0]), $e(call.arguments[1])))) - macro_error(prog, call.at, "linq: unsupported source for `from c : Row in ` — expected a SQL runner, an XML node, or a JSON value (use `in decs` for decs, or an array for the untyped `from c in ` form)") - return call + macro_error(prog, call.at, "linq: unsupported source for `from c : Row in ` — expected a SQL runner, an XML node, or a JSON value (use `in decs` for decs; arrays and tables use the untyped `from c in ` form)") + return null } } @@ -1269,8 +1299,9 @@ class FromInMacro : AstCallMacro { class LinqDasReader : AstReaderMacro { //! C#-style LINQ query reader macro. //! ``%linq! from c [: Row] in src [where ] [ join d [: RowB] in B on equals | from d [: RowB] in B ] [orderby [descending]] ( select | group c by ) [iterator] %%`` - //! rewrites to a ``_fold(...)`` chain. Sources: array (untyped `from c in arr`), or a typed - //! ``from c : Row in src`` over decs / SQL / XML / JSON. ``orderby`` is a single sort key; ``group c by `` + //! rewrites to a ``_fold(...)`` chain. Sources: array or table (untyped `from c in src`, dispatched + //! by value type — a ``table`` binds ``kv.key``/``kv.value`` pairs, a ``table`` set binds + //! keys), or a typed ``from c : Row in src`` over decs / SQL / XML / JSON. ``orderby`` is a single sort key; ``group c by `` //! is a terminal yielding ``tuple>`` buckets (in-memory sources only). A second range //! 
variable comes from either a ``join`` (single inner equi-join) or a second ``from`` (SelectMany): //! uncorrelated (``from d in B``) is the cross product, emitting ``_cross_join`` (pushes down to a SQL diff --git a/doc/source/reference/linq_das.rst b/doc/source/reference/linq_das.rst index 183053e30..6d699c139 100644 --- a/doc/source/reference/linq_das.rst +++ b/doc/source/reference/linq_das.rst @@ -20,7 +20,10 @@ rewrites to, and is re-parsed in place as: .. code-block:: das - var names <- ( _fold( each(cars) |> _where($(c) => c.price > 100) |> _select($(c) => c.name) |> to_array() ) ) + var names <- ( _fold( from_in(cars) |> _where($(c) => c.price > 100) |> _select($(c) => c.name) |> to_array() ) ) + +where ``from_in(cars)`` resolves during inference to ``each(cars)`` (see +`Sources`_ — for a table source it resolves to ``each_kv`` / ``keys`` instead). The macro lives in the lexer's inline reader-macro slot (``%name!``), so a query is an ordinary expression — it can be assigned, passed as an argument, or @@ -45,7 +48,9 @@ between body clauses** — it is inlined away before the rest is parsed (see :ref:`linq_das_let`): - ``from in `` — the element bind ```` names the per-row - value. With no type annotation, ```` is an ``array``. + value. With no type annotation, ```` is an ``array`` or a table — + a ``table`` binds read-only ``(key, value)`` pairs (``kv.key`` / + ``kv.value``), a ``table`` set binds its keys. - ``let = `` — optional, repeatable, and free to appear between any body clauses; binds a computed value reused in the clauses that follow it (see :ref:`linq_das_let`). @@ -78,16 +83,23 @@ Clauses may span multiple lines inside the ``%linq! … %%`` body. Sources ------- -An **untyped** ``from c in `` is an array source. 
A **typed** range -variable ``from c : Row in `` selects a non-array source — the row type -``Row`` is supplied on the range variable (C#-faithful ``from Type c in src``) -because the source value alone does not carry it: +An **untyped** ``from c in `` is an array or table source, dispatched by +the source value's type. A **typed** range variable ``from c : Row in `` +selects a row-typed source — the row type ``Row`` is supplied on the range +variable (C#-faithful ``from Type c in src``) because the source value alone +does not carry it: .. code-block:: das // array (untyped) — `each(arr)` var a <- %linq! from c in cars where c.price > 100 select c.name %% + // table (untyped) — `each_kv(tab)`: read-only (key, value) pairs, fused by the TableAdapter + var t <- %linq! from kv in carsById where kv.value.price > 100 select kv.value.name %% + + // table set (untyped) — `keys(s)` + var k <- %linq! from id in soldIds where id > 100 select id %% + // decs — the `decs` keyword marker → `from_decs_template(type)` var d <- %linq! from c : CarComp in decs where c.price > 100 select c.name %% @@ -100,12 +112,18 @@ because the source value alone does not carry it: // JSON — a JsonValue? array → `from_json`, fused by the JsonAdapter var j <- %linq! from c : Car in carsJson where c.price > 100 select c.name %% -For value sources (SQL, XML, JSON) the reader emits -``from_in(, type)``; the ``from_in`` call macro dispatches on the -source value's type to the concrete builder (so a new backend is a new -``from_in`` branch, never a parser change). ``decs`` has no source value, so it -is emitted directly as ``from_decs_template`` and never goes through -``from_in``. 
The row type's required annotation depends on the source — +Untyped sources go through the 1-arg ``from_in()``, typed value sources +(SQL, XML, JSON) through ``from_in(, type)``; the ``from_in`` call +macro dispatches on the source value's type to the concrete builder (so a new +backend is a new ``from_in`` branch, never a parser change). ``decs`` has no +source value, so it is emitted directly as ``from_decs_template`` and never +goes through ``from_in``. A table element is the ``each_kv`` named tuple +``(key, value)`` — both fields are **copies** (read-only view), the value type +must be copyable (a ``table>`` source rejects at compile time — +see :ref:`the table source row in linq_fold_patterns `), +and slot order is unspecified, so add an ``orderby`` when the result order +matters. A table source takes no row-type annotation (its element shape comes +from the table type itself). The row type's required annotation depends on the source — ``[decs_template]`` for decs, ``[sql_table]`` / ``[sql_view]`` for SQL, a plain struct for XML and JSON. The JSON source is a ``JsonValue?`` holding a JSON **array** of objects (``from c : Car in jv["cars"]`` descends into a nested @@ -142,7 +160,7 @@ own ``_where`` filter, AND-folded in source order: // two predicates — both apply var names <- %linq! from c in cars where c.price > 100 where c.brand == "eco" select c.name %% - // expands to: _fold( each(cars) |> _where($(c) => c.price > 100) |> _where($(c) => c.brand == "eco") |> _select($(c) => c.name) |> to_array() ) + // expands to: _fold( from_in(cars) |> _where($(c) => c.price > 100) |> _where($(c) => c.brand == "eco") |> _select($(c) => c.name) |> to_array() ) Over a **SQL** source the predicates push down as one ANDed ``WHERE`` (a single statement, no intermediate materialize). On a two-source query (``join`` / second @@ -339,8 +357,10 @@ Join ``join [ : ] in on equals `` adds a single **inner equi-join** — one new range variable, one equality key. 
The second -source is built exactly like the first (untyped → array, typed → the -``from_in`` dispatch), so it may be a different kind of source than the left. +source is built exactly like the first (untyped → array/table, typed → the +``from_in`` dispatch), so it may be a different kind of source than the left — +a table works on either side (its kv pair is that side's row, e.g. +``on c.brand equals p.key``); note a table left source walks in slot order. The reader picks one of two emit shapes from the **post-join** clauses (it transpiles before type inference and cannot see the source, so it decides @@ -460,6 +480,10 @@ subset. Both slots are repeatable (see :ref:`linq_das_filtering`): terminal carries ``(c, b)`` as a pair, in-memory only (same SQL boundary as ``join``). +Table sources are **not supported** in a multiple-``from`` query (the +cross/flatten arms are array-shaped) — ``join`` a table instead, or +materialize it first. + Correlated ``from`` (flatten) ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/tests/linq/failed_linq_das_table.das b/tests/linq/failed_linq_das_table.das new file mode 100644 index 000000000..9ba635ec2 --- /dev/null +++ b/tests/linq/failed_linq_das_table.das @@ -0,0 +1,29 @@ +options gen2 + +// linq_das table-source rejections — INFER-level (FromInMacro), unlike the reader-level rejects in +// failed_linq_das.das (so inference DOES run here, and `_fold`'s own "expecting linq expression" +// verify lands next to a rejecting from_in's message — same line, same cerr, so the report collapses +// the pair to ONE 50503 with a `+1 more on this line` suffix). A rejecting from_in arm is macro_error + +// return null: infer stabilizes and the errors stick, with no 50-pass churn. 
+expect 50503:1 // typed annotation over a table source (`from kv : Car in tab`), collapsed with the _fold verify +expect 31400:1 // non-copyable value type: untyped from over table> → each_kv concept_assert + +require daslib/linq_das +require daslib/linq_fold + +struct Car { + name : string + price : int +} + +def trigger_table_typed_annotation() { + var tab : table + let x <- %linq! from kv : Car in tab select kv %% +} + +def trigger_table_non_copyable_value() { + // the uniform can_copy gate: fused dispatch declines, tier-2 instantiates the real each_kv, + // its concept_assert carries the user-facing message + var tab : table> + let x <- %linq! from kv in tab select kv.key %% +} diff --git a/tests/linq/test_linq_das.das b/tests/linq/test_linq_das.das index 695173869..bdfce8c2b 100644 --- a/tests/linq/test_linq_das.das +++ b/tests/linq/test_linq_das.das @@ -195,6 +195,121 @@ def test_decs_iterator_output(t : T?) { t |> equal(length(names), 2, "decs iterator output") } +// ===== table source (untyped `from kv in tab` — 1-arg from_in dispatches table → each_kv, +// ===== table set → keys; kv.key / kv.value pair surface; slot order unspecified → order-insensitive checks) ===== + +def private mk_car_tab() : table { + return <- { + 1 => Car(name = "cheap", price = 50, brand = "eco"), + 2 => Car(name = "mid", price = 150, brand = "eco"), + 3 => Car(name = "lux", price = 300, brand = "lux") + } +} + +[test] +def test_table_kv_where_select(t : T?) { + let tab <- mk_car_tab() + let prices <- %linq! from kv in tab where kv.value.price > 100 select kv.value.price %% + t |> equal(length(prices), 2, "kv.value predicate filters, projection rides the values-pruned walk") + t |> equal(prices[0] + prices[1], 450, "mid + lux, slot-order-insensitive") +} + +[test] +def test_table_kv_key_only(t : T?) { + let tab <- mk_car_tab() + let ks <- %linq! 
from kv in tab where kv.key != 2 select kv.key %% + t |> equal(length(ks), 2, "key-only chain rides the keys-pruned walk") + t |> equal(ks[0] + ks[1], 4, "keys 1 + 3") +} + +[test] +def test_table_kv_identity_select(t : T?) { + let tab <- mk_car_tab() + let rows <- %linq! from kv in tab where kv.value.brand == "eco" select kv %% + t |> equal(length(rows), 2, "identity select returns (key, value) tuples") + t |> equal(rows[0].key + rows[1].key, 3, "eco keys 1 + 2") + t |> equal(rows[0].value.price + rows[1].value.price, 200, "eco prices 50 + 150") +} + +[test] +def test_table_kv_orderby(t : T?) { + let tab <- mk_car_tab() + // orderby makes the unspecified slot order deterministic + let names <- %linq! from kv in tab orderby kv.value.price descending select kv.value.name %% + t |> equal(length(names), 3) + t |> equal(names[0], "lux") + t |> equal(names[2], "cheap") +} + +[test] +def test_table_kv_group_by(t : T?) { + let tab <- mk_car_tab() + let buckets <- %linq! from kv in tab group kv by kv.value.brand %% + t |> equal(length(buckets), 2, "two brands") + for (b in buckets) { + t |> equal(length(b._1), b._0 == "eco" ? 2 : 1, "eco bucket has 2 cars, lux has 1") + } +} + +[test] +def test_table_kv_let(t : T?) { + let tab <- mk_car_tab() + let doubled <- %linq! from kv in tab let p = kv.value.price * 2 where p >= 300 select p %% + t |> equal(length(doubled), 2, "let binding inlines over the kv pair") + t |> equal(doubled[0] + doubled[1], 900, "2*150 + 2*300") +} + +[test] +def test_table_kv_iterator_output(t : T?) { + let tab <- mk_car_tab() + let got <- [for (nm in %linq! from kv in tab orderby kv.value.price select kv.value.name iterator %%); nm] + t |> equal(length(got), 3) + t |> equal(got[0], "cheap") +} + +[test] +def test_table_set_form(t : T?) { + var s : table <- { 5, 7, 9 } + let big <- %linq! 
from k in s where k > 6 select k %% + t |> equal(length(big), 2, "table set source rides keys()") + t |> equal(big[0] + big[1], 16, "7 + 9") +} + +[test] +def test_table_arbitrary_range_var_name(t : T?) { + let tab <- mk_car_tab() + // the kv pair name is the range variable — any identifier works + let names <- %linq! from entry in tab where entry.value.price > 200 select entry.value.name %% + t |> equal(length(names), 1) + t |> equal(names[0], "lux") +} + +[test] +def test_table_as_join_right_source(t : T?) { + // a table works on either side of a join (tier-2; the kv pair is that side's row). + // left side is the array → result follows array order, deterministic + let cars <- mk_cars() + let prio <- { "eco" => 10, "lux" => 99 } + let rows <- %linq! from c in cars join p in prio on c.brand equals p.key select "{c.name}={p.value}" %% + t |> equal(length(rows), 3) + t |> equal(rows[0], "cheap=10") + t |> equal(rows[1], "mid=10") + t |> equal(rows[2], "lux=99") +} + +[test] +def test_table_as_join_left_source(t : T?) { + let cars <- mk_cars() + let prio <- { "eco" => 10, "lux" => 99 } + // left side is the table → slot order, so sort before asserting + var rows <- %linq! 
from p in prio join c in cars on p.key equals c.brand select "{c.name}={p.value}" %% + rows |> sort() + t |> equal(length(rows), 3) + t |> equal(rows[0], "cheap=10") + t |> equal(rows[1], "lux=99") + t |> equal(rows[2], "mid=10") +} + // ===== orderby (single key, optional `descending`) ===== [test] From ac441c4a0e7b97951b6e9cd3057201de3384e8e8 Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Thu, 11 Jun 2026 01:27:14 -0700 Subject: [PATCH 06/11] =?UTF-8?q?linq=5Ffold:=20table=20point-lookup=20fol?= =?UTF-8?q?ds=20=E2=80=94=20where(kv.key=20=3D=3D=20X)=20+=20terminator=20?= =?UTF-8?q?=E2=86=92=20O(1)=20probe?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit try_table_point_lookup runs ahead of pattern dispatch in the table arm: any / keys-lane contains → key_exists, count → key_exists ? 1 : 0, first / first_or_default (± one trailing select) → an unsafe(tab?[X]) probe with the scan's exact semantics (panic on missing first, eagerly-bound default). Predicate-form any(p)/count(p) and either operand order match too. X must be loop-invariant AND side-effect free — the scan evaluates X per element, a probe once; a regression test pins per-element evaluation for an impure X. Compound && predicates (incl. collapsed multi-where) decline the probe; conjunct extraction is a named deferred edge in LINQ_TO_TABLE.md. m7 INTERP: point_lookup 0.0 ns/elem vs point_lookup_scan 8.4 (the same query forced through the walk); results.md re-swept. 
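
Why the probe requires a side-effect-free X can be illustrated with a small model. The following is a Python sketch under stated assumptions, not the generated daslang: `first_by_scan`, `first_by_probe`, and `CountingKey` are hypothetical names, and a `dict` stands in for the table. It shows that a scan evaluates the key expression once per visited element while a probe evaluates it once, so an impure X would change observable behavior if the fold were applied.

```python
class CountingKey:
    """An impure key expression: every evaluation bumps a counter."""

    def __init__(self, key):
        self.key = key
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.key


def first_by_scan(table, make_key):
    """Scan form: the key expression is re-evaluated for each visited element."""
    for k, v in table.items():
        if k == make_key():
            return v
    raise LookupError("sequence contains no elements")


def first_by_probe(table, key):
    """Probe form: the key was evaluated once by the caller; one hash lookup follows."""
    if key in table:
        return table[key]
    raise LookupError("sequence contains no elements")


table = {1: "cheap", 2: "mid", 3: "lux"}

scan_key = CountingKey(3)
scan_result = first_by_scan(table, scan_key)

probe_key = CountingKey(3)
probe_result = first_by_probe(table, probe_key())
```

Both forms return the same element and fail the same way on a missing key, but `scan_key.calls` ends up equal to the number of elements visited while `probe_key.calls` is one. That difference is why a matcher of this kind has to decline when X has side effects.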
Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 25 +- benchmarks/sql/results.md | 275 ++++++++++---------- benchmarks/sql/table.das | 26 ++ daslib/linq_fold.das | 17 +- daslib/linq_fold_table.das | 163 ++++++++++++ doc/source/reference/linq_fold_patterns.rst | 2 +- tests/linq/test_linq_table_source.das | 73 ++++++ 7 files changed, 438 insertions(+), 143 deletions(-) diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index 10a62d34b..77d3efdf1 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -4,8 +4,26 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco `table` / `table` as the 6th `_fold` source, plus the `to_table` sink. Edited in-place as PRs land. -Status: **stage 3 committed** (`%linq!` table sources; stage 2 = TableAdapter + m7, 571fe879e; -stage 1 = `each_kv` builtin, 8751bb9ba). +Status: **stage 4 committed** (point-lookup folds; stage 3 = `%linq!` table sources, 29d23baf6; +stage 2 = TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba). + +Stage 4 findings: +- `try_table_point_lookup` (linq_fold_table.das) runs in the dispatcher arm BEFORE pattern dispatch; + shapes per plan — where(key==X)+any/count/first/first_or_default(±select), predicate-form + any(p)/count(p), keys-lane contains — all emit through `TableAdapter.wrap_invoke` (probe inside + the same 1-param const-table invoke as the walks). +- **Invariance alone is not enough**: X must also be side-effect free (`has_sideeffects`) — the scan + evaluates X per element, a probe once; an impure X (e.g. a counter bump) would change observable + behavior. Covered by a regression test asserting per-element evaluation is preserved. +- Table safe-index `tab?[k]` is **unsafe** (31034 — the pointer dangles on rehash); the generated + probe wraps it (the invoke never mutates the table). Deref after the null check is plain `*p`. 
+- Scan-semantics mirroring: `first` panics "sequence contains no elements"; `first_or_default` + binds its default eagerly before the probe (same order as the early-exit lane / linq.das). +- `collapse_chained_wheres` runs before dispatch, so `where(key==X)|>where(p)` arrives as one + `&&` body → correctly declined (compound predicates keep the scan). Conjunct extraction + (probe + residual predicate on the probed element) is a named deferred edge below. +- m7 INTERP (2026-06-11 sweep): `point_lookup` 0.0 ns/elem (O(1) probe) vs `point_lookup_scan` + (the same query forced through the walk via a second always-true where) at full scan cost. Stage 3 findings: - The untyped `from c in ` now emits the **1-arg `from_in(src)`** for every source (the reader @@ -141,6 +159,9 @@ End of arc: `skills/linq.md` + linq docs mention the table source. ## Deferred edges (named, not built) +- **Point-lookup conjunct extraction**: `where(kv.key == X && )` (incl. the collapsed + multi-where form) could probe and evaluate the residual on the probed element only. The matcher + currently declines compound predicates; add when a real chain wants it. - **Multiple-`from` (cross / SelectMany) over tables**: the unfused `_cross_join` arm passes the bare source text so the array×array overload resolves without an `each` unsafe trip; a table there has no overload (confusing 30303 cascade). `cross_join` has iterator overloads, so routing diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md index 519254b0b..aede8f85a 100644 --- a/benchmarks/sql/results.md +++ b/benchmarks/sql/results.md @@ -16,7 +16,9 @@ are stable now). - **m5f XML** — `_fold` over `from_xml_node(root, type)` (`XmlAdapter` fuses + field-prunes). - **m6f JSON** — `_fold` over `from_json(jv, type)` (`JsonAdapter`, same machinery, array walk). 
- **m7 Table** — `_fold` over `each_kv(table)` (`TableAdapter`; kv usage-pruning picks keys-only / - values-only / zipped slot walks; group_by / join / reverse defer to tier-2 until their stages land). + values-only / zipped slot walks; key-equality `where` + terminator folds to an O(1) probe — the + `point_lookup` / `point_lookup_scan` pair measures it; group_by / join / reverse defer to tier-2 + until their stages land). `0.00` = early-exit terminator below timer resolution ("free"). Chain shapes are in `benchmarks/README.md`; the splice arms each fires are in `doc/source/reference/linq_fold_patterns.rst`. @@ -25,169 +27,173 @@ are stable now). signal, JIT deltas as indicative.** -*Generated 2026-06-10 by `benchmarks/sql/_update_results.das` — ns/op; `—` = absent lane. Edit the prose around the markers, not the tables.* +*Generated 2026-06-11 by `benchmarks/sql/_update_results.das` — ns/op; `—` = absent lane. Edit the prose around the markers, not the tables.* ## INTERP | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 34.7 | 5.9 | 5.8 | 60.1 | 152.3 | 19.0 | -| `all_match` | 27.3 | 3.5 | 3.4 | 55.6 | 147.0 | 15.8 | +| `aggregate_match` | 34.9 | 5.9 | 5.8 | 60.7 | 160.3 | 19.1 | +| `all_match` | 27.5 | 3.5 | 3.4 | 55.9 | 154.1 | 15.8 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 29.8 | 5.9 | 8.8 | 58.3 | 156.2 | 17.2 | +| `average_aggregate` | 30.5 | 5.9 | 8.8 | 60.2 | 163.1 | 17.3 | | `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 29.2 | -| `bare_order_where` | 277.1 | 118.1 | 126.8 | 300.9 | 292.2 | 166.4 | -| `chained_select_collapse` | — | 17.7 | 17.4 | 70.1 | 155.4 | 27.8 | -| `chained_where` | 35.8 | 6.6 | 7.1 | 104.2 | 174.7 | 24.1 | -| `contains_match` | 0.0 | 2.2 | 1.4 | 27.5 | 68.5 | 6.6 | -| `count_aggregate` | 29.2 | 4.1 | 4.1 | 63.4 | 147.5 | 20.2 | -| `cross_join` | 13122.7 | 3685.9 | — | 3995.6 | 4066.2 | — | 
+| `bare_order_where` | 278.2 | 117.7 | 126.7 | 299.6 | 292.7 | 166.4 | +| `chained_select_collapse` | — | 17.7 | 17.4 | 70.4 | 168.3 | 27.8 | +| `chained_where` | 35.9 | 6.6 | 7.1 | 104.9 | 184.0 | 24.1 | +| `contains_match` | 0.0 | 2.3 | 1.5 | 29.1 | 72.4 | 6.6 | +| `count_aggregate` | 30.0 | 4.1 | 4.2 | 63.7 | 155.2 | 20.2 | +| `cross_join` | 12604.3 | 3685.2 | — | 4006.6 | 4040.5 | — | | `decs_count_bare_pred` | — | — | 4.1 | — | — | — | -| `distinct_by_count` | 40.8 | 15.6 | 15.6 | 70.2 | 154.0 | 26.4 | -| `distinct_by_order_take` | 240.7 | 22.1 | 23.4 | 122.7 | 161.6 | 48.5 | -| `distinct_by_order_to_array` | 239.2 | 22.2 | 23.5 | 123.6 | 161.7 | 48.4 | -| `distinct_count` | 40.7 | 15.9 | 15.7 | 70.5 | 155.8 | 26.9 | -| `distinct_count_pred` | 251.0 | 16.1 | 15.8 | 111.5 | 178.0 | 26.3 | +| `distinct_by_count` | 40.9 | 15.6 | 15.6 | 70.6 | 162.2 | 26.3 | +| `distinct_by_order_take` | 239.3 | 22.1 | 23.4 | 123.3 | 162.4 | 48.6 | +| `distinct_by_order_to_array` | 237.8 | 22.1 | 23.5 | 124.1 | 163.3 | 48.6 | +| `distinct_count` | 41.2 | 15.8 | 15.7 | 70.8 | 163.6 | 26.9 | +| `distinct_count_pred` | 252.2 | 15.7 | 15.9 | 112.1 | 178.4 | 26.3 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 173.3 | 29.3 | 29.3 | 122.9 | 190.0 | — | -| `groupby_count` | 143.5 | 19.4 | 19.4 | 75.4 | 161.0 | 162.6 | -| `groupby_first` | 251.7 | 19.5 | 20.1 | 72.1 | 156.9 | — | -| `groupby_having_count` | 140.7 | 19.5 | 19.5 | 74.7 | 161.2 | — | -| `groupby_having_hidden_sum` | 176.1 | 22.5 | 22.6 | 118.0 | 183.5 | — | -| `groupby_having_post_where` | 172.8 | 20.8 | 20.8 | 114.1 | 180.4 | — | -| `groupby_max` | 173.5 | 24.8 | 25.3 | 119.7 | 185.2 | — | -| `groupby_min` | 173.8 | 25.2 | 25.1 | 119.8 | 184.7 | — | -| `groupby_multi_reducer` | 189.5 | 30.5 | 30.6 | 124.3 | 188.4 | 
— | -| `groupby_select_order` | 169.9 | 20.8 | 20.8 | 114.3 | 180.9 | — | -| `groupby_select_sum` | 196.9 | 38.6 | 38.1 | 101.6 | 186.6 | — | -| `groupby_sum` | 170.5 | 21.2 | 20.8 | 114.4 | 180.2 | 192.8 | -| `groupby_where_count` | 75.6 | 14.1 | 14.3 | 115.2 | 177.8 | — | -| `groupby_where_sum` | 86.4 | 14.1 | 14.6 | 116.2 | 178.1 | — | -| `join_count` | 38.0 | 51.2 | 64.2 | 112.7 | 176.9 | 195.0 | -| `join_groupby_count` | 157.7 | 86.1 | 88.2 | 177.4 | 221.8 | — | -| `join_groupby_to_array` | 194.9 | 80.3 | 91.7 | 214.8 | 212.1 | — | -| `join_select` | 150.3 | 72.4 | 84.4 | 187.8 | 209.0 | — | -| `join_where_count` | 39.0 | 61.6 | 76.7 | 159.8 | 193.6 | 229.1 | -| `last_match` | 0.0 | 5.9 | 13.9 | 64.9 | 152.3 | 31.0 | -| `long_count_aggregate` | 28.7 | 4.1 | 4.1 | 63.3 | 147.5 | 20.3 | -| `max_aggregate` | 30.6 | 6.0 | 6.8 | 58.4 | 156.1 | 17.0 | -| `min_aggregate` | 30.5 | 6.0 | 6.8 | 58.4 | 155.1 | 17.0 | -| `order_by_multi_key` | 338.7 | 272.3 | 286.1 | 457.7 | 448.2 | 333.0 | -| `order_distinct_take` | 138.4 | 15.9 | 99.2 | 72.4 | 156.5 | 31.0 | -| `order_reverse_normalized` | 37.9 | 16.3 | 20.0 | 70.4 | 162.9 | — | -| `order_take_desc` | 37.8 | 16.3 | 20.3 | 69.8 | 163.3 | 33.2 | -| `reverse_distinct_by` | 294.1 | 21.2 | 28.0 | 70.8 | 155.4 | — | -| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.1 | 58.7 | -| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.1 | — | -| `select_count` | 0.1 | 0.0 | 2.2 | 64.8 | 2.2 | 0.0 | -| `select_many` | — | 191.0 | — | — | — | — | -| `select_where` | 194.7 | 11.5 | 19.3 | 195.9 | 185.7 | 37.5 | -| `select_where_count` | 32.3 | 5.1 | 7.4 | 64.6 | 150.7 | 21.8 | -| `select_where_order_take` | 36.2 | 12.2 | 15.0 | 72.3 | 158.5 | 34.4 | -| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.3 | 160.5 | 23.2 | -| `single_match` | 0.0 | 2.9 | 5.5 | 56.9 | 151.1 | 22.8 | +| `groupby_average` | 170.5 | 29.3 | 29.3 | 122.7 | 197.8 | — | +| `groupby_count` | 141.6 | 19.5 | 19.4 | 74.7 | 169.0 | 163.3 | +| `groupby_first` | 252.3 | 19.4 | 
20.1 | 71.8 | 163.5 | — | +| `groupby_having_count` | 141.3 | 19.5 | 19.5 | 74.3 | 169.3 | — | +| `groupby_having_hidden_sum` | 175.7 | 22.4 | 22.6 | 118.5 | 192.1 | — | +| `groupby_having_post_where` | 171.6 | 20.8 | 21.6 | 114.8 | 188.9 | — | +| `groupby_max` | 174.1 | 24.7 | 25.6 | 119.8 | 192.6 | — | +| `groupby_min` | 173.5 | 25.1 | 26.2 | 119.9 | 193.4 | — | +| `groupby_multi_reducer` | 189.9 | 30.2 | 30.6 | 125.1 | 196.0 | — | +| `groupby_select_order` | 174.3 | 20.8 | 20.8 | 114.6 | 189.8 | — | +| `groupby_select_sum` | 197.9 | 38.5 | 40.7 | 101.5 | 196.1 | — | +| `groupby_sum` | 171.2 | 20.7 | 20.8 | 115.0 | 190.5 | 192.9 | +| `groupby_where_count` | 75.7 | 14.0 | 14.3 | 115.5 | 187.7 | — | +| `groupby_where_sum` | 86.5 | 14.1 | 14.7 | 116.3 | 186.7 | — | +| `join_count` | 38.3 | 51.2 | 64.3 | 113.1 | 184.5 | 194.6 | +| `join_groupby_count` | 157.7 | 79.1 | 88.6 | 177.7 | 232.0 | — | +| `join_groupby_to_array` | 189.0 | 78.1 | 90.1 | 215.3 | 215.6 | — | +| `join_select` | 151.5 | 72.6 | 85.0 | 188.5 | 215.8 | — | +| `join_where_count` | 48.8 | 61.5 | 76.7 | 160.0 | 201.9 | 229.1 | +| `last_match` | 0.0 | 5.9 | 13.9 | 65.1 | 159.0 | 30.9 | +| `long_count_aggregate` | 28.9 | 4.1 | 4.2 | 63.3 | 154.6 | 20.3 | +| `max_aggregate` | 30.7 | 6.0 | 6.9 | 58.7 | 163.1 | 17.0 | +| `min_aggregate` | 30.6 | 6.0 | 6.9 | 58.6 | 163.3 | 17.1 | +| `order_by_multi_key` | 339.9 | 271.4 | 283.6 | 458.8 | 446.1 | 334.3 | +| `order_distinct_take` | 137.9 | 15.9 | 100.3 | 72.5 | 164.1 | 31.1 | +| `order_reverse_normalized` | 38.3 | 16.2 | 20.3 | 70.7 | 170.9 | — | +| `order_take_desc` | 38.2 | 16.2 | 20.6 | 70.1 | 170.2 | 33.3 | +| `point_lookup` | — | — | — | — | — | 0.0 | +| `point_lookup_scan` | — | — | — | — | — | 8.4 | +| `reverse_distinct_by` | 294.0 | 21.1 | 28.1 | 71.1 | 162.6 | — | +| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.1 | 58.9 | +| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.2 | — | +| `select_count` | 0.1 | 0.0 | 2.2 | 68.3 | 2.2 | 0.0 | +| 
`select_many` | — | 191.5 | — | — | — | — | +| `select_where` | 197.5 | 11.2 | 19.4 | 195.6 | 183.7 | 37.5 | +| `select_where_count` | 32.2 | 5.1 | 7.5 | 64.8 | 157.1 | 21.9 | +| `select_where_order_take` | 36.2 | 12.2 | 15.1 | 72.5 | 165.1 | 34.5 | +| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.4 | 162.2 | 23.3 | +| `single_match` | 0.0 | 2.9 | 5.5 | 58.5 | 151.1 | 22.8 | | `skip_take` | 0.5 | 0.1 | 0.2 | 3.0 | 2.8 | 0.3 | -| `skip_while_match` | 3.5 | 5.3 | 5.3 | 57.3 | 146.6 | 18.2 | -| `sort_first` | 37.6 | 11.1 | 13.3 | 64.6 | 159.5 | 31.7 | -| `sort_take` | 38.0 | 16.2 | 20.9 | 70.2 | 161.9 | 33.0 | -| `sort_take_select` | 37.6 | 16.3 | 20.9 | 70.8 | 162.7 | 33.3 | -| `sum_aggregate` | 29.7 | 2.1 | 2.1 | 54.3 | 146.7 | 13.4 | -| `sum_where` | 31.9 | 4.3 | 4.3 | 63.6 | 148.1 | 20.5 | +| `skip_while_match` | 3.5 | 5.3 | 5.3 | 60.2 | 153.8 | 18.3 | +| `sort_first` | 38.0 | 11.1 | 13.3 | 65.0 | 167.1 | 31.7 | +| `sort_take` | 38.2 | 16.3 | 21.1 | 70.2 | 170.7 | 33.2 | +| `sort_take_select` | 38.1 | 16.3 | 21.8 | 71.1 | 170.6 | 33.3 | +| `sum_aggregate` | 30.6 | 2.1 | 2.1 | 54.8 | 152.8 | 13.5 | +| `sum_where` | 32.9 | 4.4 | 4.3 | 63.4 | 154.2 | 20.6 | | `take_count` | 3.6 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | | `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.3 | 1.1 | 0.3 | | `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | 0.1 | | `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | 0.2 | -| `take_while_match` | 7.8 | 2.4 | 2.4 | 28.8 | 71.4 | 16.8 | -| `to_array_filter` | 70.3 | 11.8 | 11.7 | 71.1 | 157.4 | 28.8 | -| `where_join_count` | 41.0 | 29.0 | 41.5 | 133.0 | 163.1 | — | -| `zip_count_pred` | 39.0 | 15.8 | — | 313.5 | 319.6 | — | -| `zip_dot_product` | 46.1 | 12.6 | 10.5 | 308.6 | 317.2 | — | -| `zip_dot_product_3arg` | 46.1 | 12.8 | — | 308.7 | 316.5 | — | -| `zip_reverse_to_array` | — | 31.6 | — | 343.1 | 351.0 | — | +| `take_while_match` | 7.8 | 2.4 | 2.4 | 30.3 | 75.4 | 16.5 | +| `to_array_filter` | 70.0 | 11.8 | 11.7 | 71.3 | 164.9 | 28.7 | +| 
`where_join_count` | 41.2 | 29.0 | 42.0 | 132.1 | 168.9 | — | +| `zip_count_pred` | 39.2 | 15.9 | — | 313.8 | 322.0 | — | +| `zip_dot_product` | 46.1 | 12.6 | 10.6 | 308.6 | 319.3 | — | +| `zip_dot_product_3arg` | 46.1 | 12.8 | — | 309.7 | 319.0 | — | +| `zip_reverse_to_array` | — | 31.6 | — | 343.4 | 353.5 | — | ## JIT | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 35.0 | 0.3 | 0.6 | 21.7 | 27.3 | 13.6 | -| `all_match` | 27.8 | 0.3 | 0.2 | 18.1 | 25.9 | 13.5 | +| `aggregate_match` | 35.1 | 0.3 | 0.6 | 21.8 | 26.0 | 13.4 | +| `all_match` | 27.8 | 0.3 | 0.2 | 18.1 | 25.2 | 13.5 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 29.9 | 1.0 | 3.6 | 18.0 | 24.4 | 13.4 | +| `average_aggregate` | 30.1 | 1.0 | 3.6 | 18.1 | 24.6 | 13.5 | | `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.1 | -| `bare_order_where` | 186.2 | 34.0 | 35.3 | 106.3 | 52.4 | 78.7 | -| `chained_select_collapse` | — | 1.1 | 1.1 | 20.4 | 33.0 | 14.0 | -| `chained_where` | 35.9 | 0.6 | 0.8 | 35.5 | 31.5 | 17.6 | -| `contains_match` | 0.0 | 0.2 | 0.1 | 14.8 | 9.2 | 4.7 | -| `count_aggregate` | 29.5 | 0.3 | 0.6 | 20.4 | 25.1 | 13.4 | -| `cross_join` | 5964.4 | 734.4 | — | 834.2 | 772.7 | — | +| `bare_order_where` | 185.6 | 34.0 | 35.2 | 106.5 | 53.5 | 78.9 | +| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 33.4 | 14.0 | +| `chained_where` | 36.2 | 0.6 | 0.8 | 35.6 | 31.4 | 17.7 | +| `contains_match` | 0.0 | 0.2 | 0.1 | 17.5 | 9.0 | 4.7 | +| `count_aggregate` | 29.3 | 0.3 | 0.6 | 20.5 | 25.3 | 13.5 | +| `cross_join` | 5962.8 | 733.1 | — | 836.0 | 773.4 | — | | `decs_count_bare_pred` | — | — | 0.6 | — | — | — | -| `distinct_by_count` | 41.0 | 1.1 | 1.1 | 20.4 | 32.0 | 14.0 | -| `distinct_by_order_take` | 237.4 | 1.7 | 2.6 | 48.4 | 37.1 | 29.9 | -| `distinct_by_order_to_array` | 237.2 | 1.7 | 2.7 | 47.5 | 36.8 | 30.0 | -| `distinct_count` | 40.8 | 1.1 | 1.1 | 
20.5 | 31.9 | 14.0 | -| `distinct_count_pred` | 249.8 | 1.1 | 1.3 | 37.6 | 41.7 | 14.0 | +| `distinct_by_count` | 41.2 | 1.1 | 1.1 | 20.6 | 33.3 | 14.0 | +| `distinct_by_order_take` | 237.1 | 1.7 | 2.6 | 47.4 | 39.1 | 30.3 | +| `distinct_by_order_to_array` | 242.4 | 1.8 | 2.6 | 47.4 | 38.7 | 30.3 | +| `distinct_count` | 40.9 | 1.1 | 1.1 | 20.6 | 33.3 | 14.0 | +| `distinct_count_pred` | 250.6 | 1.1 | 1.3 | 37.7 | 43.5 | 14.0 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | +| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 170.1 | 1.5 | 1.9 | 35.7 | 43.0 | — | -| `groupby_count` | 141.1 | 1.3 | 1.5 | 20.5 | 32.2 | 43.0 | -| `groupby_first` | 251.0 | 1.3 | 2.3 | 20.5 | 32.9 | — | -| `groupby_having_count` | 141.1 | 1.3 | 1.5 | 20.5 | 32.1 | — | -| `groupby_having_hidden_sum` | 173.9 | 1.5 | 1.7 | 35.8 | 42.7 | — | -| `groupby_having_post_where` | 170.2 | 1.4 | 1.9 | 35.8 | 41.8 | — | -| `groupby_max` | 172.3 | 1.5 | 1.9 | 35.9 | 43.6 | — | -| `groupby_min` | 173.0 | 1.5 | 1.8 | 35.8 | 43.6 | — | -| `groupby_multi_reducer` | 191.8 | 1.6 | 1.9 | 36.1 | 43.7 | — | -| `groupby_select_order` | 170.5 | 1.4 | 1.9 | 35.8 | 42.0 | — | -| `groupby_select_sum` | 195.5 | 2.8 | 3.2 | 32.3 | 37.6 | — | -| `groupby_sum` | 169.8 | 1.4 | 1.6 | 35.8 | 42.0 | 51.2 | -| `groupby_where_count` | 75.7 | 0.9 | 1.3 | 35.9 | 39.7 | — | -| `groupby_where_sum` | 86.4 | 0.9 | 1.3 | 35.9 | 39.6 | — | -| `join_count` | 37.9 | 11.0 | 11.7 | 43.4 | 68.3 | 62.9 | -| `join_groupby_count` | 156.2 | 18.2 | 20.0 | 68.3 | 86.7 | — | -| `join_groupby_to_array` | 189.2 | 17.5 | 19.4 | 80.2 | 36.1 | — | -| `join_select` | 92.8 | 19.6 | 21.6 | 74.4 | 94.1 | — | -| `join_where_count` | 39.1 | 18.9 | 20.6 | 64.5 | 77.9 | 80.0 | -| `last_match` | 0.0 | 0.5 | 1.4 | 18.6 | 25.9 | 22.9 | -| 
`long_count_aggregate` | 28.7 | 0.3 | 0.6 | 20.4 | 26.6 | 13.4 | -| `max_aggregate` | 30.6 | 0.3 | 0.5 | 18.1 | 26.7 | 13.4 | -| `min_aggregate` | 30.6 | 0.3 | 0.5 | 18.2 | 26.3 | 13.4 | -| `order_by_multi_key` | 247.0 | 53.4 | 54.8 | 125.3 | 70.3 | 128.9 | -| `order_distinct_take` | 137.9 | 1.1 | 75.6 | 20.9 | 34.1 | 14.0 | -| `order_reverse_normalized` | 37.8 | 0.7 | 1.3 | 24.6 | 27.0 | — | -| `order_take_desc` | 38.0 | 0.7 | 1.3 | 24.5 | 26.9 | 17.7 | -| `reverse_distinct_by` | 295.4 | 1.5 | 3.2 | 20.4 | 32.7 | — | -| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 26.8 | -| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | — | -| `select_count` | 0.1 | 0.0 | 0.0 | 63.4 | 0.0 | 0.0 | -| `select_many` | — | 61.5 | — | — | — | — | -| `select_where` | 110.5 | 4.3 | 5.3 | 76.1 | 22.1 | 27.9 | -| `select_where_count` | 32.1 | 0.3 | 0.6 | 18.4 | 25.9 | 13.3 | -| `select_where_order_take` | 36.3 | 0.7 | 1.4 | 18.9 | 26.6 | 22.9 | -| `select_where_sum` | 37.0 | 0.4 | 0.6 | 17.9 | 24.9 | 13.3 | -| `single_match` | 0.0 | 0.4 | 1.1 | 43.4 | 22.2 | 17.2 | -| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.1 | -| `skip_while_match` | 3.5 | 0.4 | 0.4 | 43.5 | 21.8 | 13.2 | -| `sort_first` | 37.7 | 0.4 | 1.4 | 17.9 | 26.1 | 17.1 | -| `sort_take` | 38.0 | 0.7 | 1.5 | 24.5 | 26.8 | 17.7 | -| `sort_take_select` | 37.8 | 0.7 | 1.3 | 24.5 | 26.9 | 17.7 | -| `sum_aggregate` | 29.6 | 0.3 | 0.1 | 23.3 | 24.3 | 13.4 | -| `sum_where` | 32.1 | 0.3 | 0.6 | 18.4 | 25.9 | 13.3 | -| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.3 | 0.4 | +| `groupby_average` | 171.1 | 1.6 | 1.9 | 35.9 | 45.7 | — | +| `groupby_count` | 141.4 | 1.3 | 1.5 | 20.6 | 33.9 | 45.7 | +| `groupby_first` | 250.7 | 1.3 | 2.3 | 20.6 | 34.3 | — | +| `groupby_having_count` | 141.5 | 1.3 | 1.5 | 20.6 | 33.8 | — | +| `groupby_having_hidden_sum` | 174.4 | 1.5 | 1.7 | 35.9 | 45.3 | — | +| `groupby_having_post_where` | 170.1 | 1.4 | 2.0 | 35.8 | 44.2 | — | +| `groupby_max` | 175.6 | 1.5 | 2.0 | 36.0 | 46.0 | — | +| `groupby_min` | 
172.4 | 1.5 | 1.8 | 36.0 | 46.0 | — | +| `groupby_multi_reducer` | 189.6 | 1.6 | 2.0 | 36.1 | 46.1 | — | +| `groupby_select_order` | 170.1 | 1.4 | 1.9 | 35.9 | 44.3 | — | +| `groupby_select_sum` | 197.0 | 2.8 | 3.2 | 32.2 | 40.0 | — | +| `groupby_sum` | 170.5 | 1.4 | 1.6 | 35.9 | 43.4 | 54.2 | +| `groupby_where_count` | 75.6 | 0.9 | 1.3 | 36.0 | 41.7 | — | +| `groupby_where_sum` | 86.2 | 0.9 | 1.3 | 35.9 | 41.7 | — | +| `join_count` | 38.2 | 11.0 | 11.7 | 43.6 | 71.4 | 63.1 | +| `join_groupby_count` | 156.8 | 18.0 | 20.1 | 68.5 | 90.1 | — | +| `join_groupby_to_array` | 189.5 | 17.4 | 19.4 | 80.5 | 36.0 | — | +| `join_select` | 93.2 | 19.6 | 21.7 | 74.8 | 94.5 | — | +| `join_where_count` | 48.3 | 19.0 | 20.7 | 64.5 | 78.3 | 80.0 | +| `last_match` | 0.0 | 0.5 | 1.4 | 18.8 | 25.9 | 22.9 | +| `long_count_aggregate` | 28.8 | 0.3 | 0.6 | 20.6 | 25.4 | 13.5 | +| `max_aggregate` | 30.5 | 0.3 | 0.5 | 18.3 | 26.7 | 13.4 | +| `min_aggregate` | 30.6 | 0.3 | 0.5 | 18.3 | 26.6 | 13.5 | +| `order_by_multi_key` | 249.4 | 53.4 | 54.8 | 125.6 | 71.1 | 129.8 | +| `order_distinct_take` | 138.1 | 1.1 | 75.6 | 20.9 | 35.8 | 14.0 | +| `order_reverse_normalized` | 38.0 | 0.7 | 1.4 | 24.6 | 27.6 | — | +| `order_take_desc` | 37.9 | 0.7 | 1.3 | 24.6 | 27.9 | 17.8 | +| `point_lookup` | — | — | — | — | — | 0.0 | +| `point_lookup_scan` | — | — | — | — | — | 6.1 | +| `reverse_distinct_by` | 295.6 | 1.6 | 3.2 | 20.6 | 34.3 | — | +| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 26.9 | +| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | — | +| `select_count` | 0.1 | 0.0 | 0.0 | 68.7 | 0.0 | 0.0 | +| `select_many` | — | 64.0 | — | — | — | — | +| `select_where` | 110.6 | 4.2 | 5.3 | 76.5 | 22.0 | 28.1 | +| `select_where_count` | 32.3 | 0.3 | 0.6 | 18.6 | 26.7 | 13.5 | +| `select_where_order_take` | 37.1 | 0.7 | 1.4 | 19.1 | 27.4 | 23.0 | +| `select_where_sum` | 36.9 | 0.4 | 0.6 | 18.2 | 25.2 | 13.4 | +| `single_match` | 0.0 | 0.4 | 1.1 | 46.3 | 22.2 | 17.3 | +| `skip_take` | 0.3 | 0.0 | 0.0 | 
1.3 | 0.2 | 0.2 | +| `skip_while_match` | 3.5 | 0.4 | 0.4 | 46.7 | 21.7 | 13.3 | +| `sort_first` | 38.3 | 0.4 | 1.3 | 18.2 | 26.7 | 17.3 | +| `sort_take` | 38.2 | 0.7 | 1.4 | 24.7 | 27.8 | 17.8 | +| `sort_take_select` | 37.6 | 0.7 | 1.4 | 24.7 | 27.8 | 17.8 | +| `sum_aggregate` | 29.3 | 0.3 | 0.1 | 23.4 | 24.6 | 13.5 | +| `sum_where` | 31.8 | 0.3 | 0.6 | 18.6 | 26.4 | 13.4 | +| `take_count` | 1.9 | 0.1 | 0.1 | 1.2 | 0.2 | 0.2 | | `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.4 | 0.1 | 0.2 | | `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | -| `take_where_count` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | -| `take_while_match` | 7.8 | 0.2 | 0.3 | 14.7 | 9.0 | 13.3 | -| `to_array_filter` | 47.1 | 3.3 | 3.3 | 21.3 | 33.6 | 20.0 | -| `where_join_count` | 39.0 | 5.8 | 6.7 | 49.5 | 40.6 | — | -| `zip_count_pred` | 39.1 | 0.1 | — | 116.7 | 33.5 | — | -| `zip_dot_product` | 46.3 | 0.1 | 0.1 | 116.6 | 33.4 | — | -| `zip_dot_product_3arg` | 46.1 | 0.1 | — | 116.5 | 33.4 | — | -| `zip_reverse_to_array` | — | 4.6 | — | 127.7 | 50.0 | — | +| `take_where_count` | 0.9 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | +| `take_while_match` | 7.7 | 0.2 | 0.3 | 17.3 | 9.0 | 13.4 | +| `to_array_filter` | 48.2 | 3.2 | 3.3 | 21.6 | 35.0 | 20.4 | +| `where_join_count` | 41.2 | 5.8 | 6.7 | 49.6 | 41.9 | — | +| `zip_count_pred` | 38.6 | 0.1 | — | 117.0 | 33.9 | — | +| `zip_dot_product` | 46.0 | 0.1 | 0.1 | 116.8 | 33.8 | — | +| `zip_dot_product_3arg` | 45.9 | 0.1 | — | 116.8 | 33.7 | — | +| `zip_reverse_to_array` | — | 4.6 | — | 128.3 | 51.4 | — | ## Missing lanes (the `—` cells) @@ -204,6 +210,7 @@ Each empty cell's reason is also in the bench `.das` file's comment; SQL gaps ar - **`order_distinct_take` m4 vs m3f** — `unique_key` hashes workhorse keys directly (array `int`) but string-interpolates structs (decs `DecsBrand`); the gap is per-element string hashing, not decs-walk. `distinct_by_count` is the key-based variant (m4 parity). 
- **`zip_reverse_to_array` / `zip_*` SQL / Decs** — `reverse` has no SQL order key; zip is not relational / not expressible over one archetype walk. By design. (XML/JSON zip lanes are lit, partially fused.) - **m7 absent families** — `zip_*` / `cross_join` (lockstep over an unordered slot walk is meaningless), `select_many` (flat fixture, no nested array field), `order_reverse_normalized` / `reverse_take_select` / `reverse_distinct_by` (no backward slot walk; `reverse_take` is kept as the single deferral marker), the group-by tail beyond `groupby_count`/`groupby_sum` and joins beyond `join_count`/`join_where_count` (table group_by/join fusion is staged — see `LINQ_TO_TABLE.md`; the four marker cells track the tier-2 cost until then), `decs_count_bare_pred` (decs-only). +- **`point_lookup` / `point_lookup_scan` non-m7** — m7-only pair: only a table source has a key to probe (`where(kv.key == X)` + terminator → `key_exists` / `tab?[X]`, O(1)); the `_scan` twin forces the same query through the walk (compound `&&` predicate declines the probe) to show the gap. Other sources have no analog by design. ## Accepted floors diff --git a/benchmarks/sql/table.das b/benchmarks/sql/table.das index 66564b963..2b49e1c31 100644 --- a/benchmarks/sql/table.das +++ b/benchmarks/sql/table.das @@ -388,6 +388,32 @@ def order_take_desc_m7(b : B?) { } } +// Point-lookup pair: the fused probe (key-equality where + first_or_default → `g_t?[k]`, O(1) total — +// per-element ns reads ~0) vs the same query forced onto the linear scan via a second always-true +// `where` (collapses to a compound `&&` predicate, which the probe matcher correctly declines). +[benchmark] +def point_lookup_m7(b : B?) { + b |> run("point_lookup", N) { + let row = _fold(unsafe(each_kv(g_t))._where(_.key == N / 2).first_or_default(default)) + b |> accept(row) + if (row.key == 0) { + b->failNow() + } + } +} + +[benchmark] +def point_lookup_scan_m7(b : B?) 
{ + b |> run("point_lookup_scan", N) { + let row = _fold(unsafe(each_kv(g_t))._where(_.key == N / 2)._where(_.value.price >= 0) + .first_or_default(default)) + b |> accept(row) + if (row.key == 0) { + b->failNow() + } + } +} + [benchmark] def reverse_take_m7(b : B?) { b |> run("reverse_take", N) { diff --git a/daslib/linq_fold.das b/daslib/linq_fold.das index d09f9f027..3f81673c7 100644 --- a/daslib/linq_fold.das +++ b/daslib/linq_fold.das @@ -219,9 +219,6 @@ def private try_splice_patterns(prog : ProgramPtr; var expr : Expression?) : Exp let valT = tabCall.arguments[0]._type.secondType if (tabName != "each_kv" || (valT != null && valT.canCopy)) { let lane = tabName == "each_kv" ? TableLane.KV : (tabName == "keys" ? TableLane.KEYS : TableLane.VALUES) - if (lane != TableLane.VALUES) { - drop_redundant_distinct(calls) // keys are unique by construction; values can repeat - } var ttopClone = clone_expression(top) // keys/each_kv spell their element `-const` (iterator-variance concern); that flag must not leak into emitted var/buffer type spellings (`array -const>` breaks push_clone unification). if (ttopClone._type != null && ttopClone._type.firstType != null) { @@ -229,9 +226,17 @@ def private try_splice_patterns(prog : ProgramPtr; var expr : Expression?) 
: Exp } var elemT = clone_type(tabCall._type.firstType) elemT.flags.removeConstant = false - return run_splice_adapter(calls, ttopClone, ttopClone, - new TableAdapter(tabExpr = clone_expression(tabCall.arguments[0]), srcName = qn("tsrc", at), - elemType = elemT, lane = lane), exprIsIter, at) + var tadapter = new TableAdapter(tabExpr = clone_expression(tabCall.arguments[0]), srcName = qn("tsrc", at), + elemType = elemT, lane = lane) + if (!exprIsIter) { + // `where(kv.key == X)` + terminator → O(1) key probe instead of the walk + var probe = try_table_point_lookup(calls, tadapter, at) + if (probe != null) return probe + } + if (lane != TableLane.VALUES) { + drop_redundant_distinct(calls) // keys are unique by construction; values can repeat + } + return run_splice_adapter(calls, ttopClone, ttopClone, tadapter, exprIsIter, at) } } top = peel_each(top) diff --git a/daslib/linq_fold_table.das b/daslib/linq_fold_table.das index b089bb0b6..c9fc01181 100644 --- a/daslib/linq_fold_table.das +++ b/daslib/linq_fold_table.das @@ -18,6 +18,7 @@ module linq_fold_table shared public require daslib/ast_boost require daslib/ast_match require daslib/templates_boost +require daslib/macro_boost require daslib/linq_fold_common public enum TableLane { @@ -171,6 +172,168 @@ class TableAdapter : SourceAdapter { } } +// ===== Point-lookup folds — `where(kv.key == X)` + terminator → O(1) key probe ===== +// any/contains → key_exists, count → key_exists?1:0, first[_or_default] (± select) → tab?[X] probe, +// with the scan's exact semantics. Full shape/decline table: linq_fold_patterns.rst (table source row). + +[macro_function] +def private match_key_probe_side(var keySide, otherSide : Expression?; lane : TableLane; bindName : string) : Expression? 
{ + var k = keySide + if (k != null && k is ExprRef2Value) { + k = (k as ExprRef2Value).subexpr + } + if (lane == TableLane.KV) { + if (k == null || !(k is ExprField)) return null + var f = k as ExprField + if (f.name != "key") return null + var base = f.value + if (base != null && base is ExprRef2Value) { + base = (base as ExprRef2Value).subexpr + } + if (base == null || !(base is ExprVar) || (base as ExprVar).name != bindName) return null + } else { + if (k == null || !(k is ExprVar) || (k as ExprVar).name != bindName) return null + } + // X must be loop-invariant AND side-effect free — the scan evaluates X per element, a probe once + if (expr_uses_var(otherSide, bindName) || has_sideeffects(otherSide)) return null + return clone_expression(otherSide) +} + +// Decompose a peeled predicate body (binder renamed to bindName) as ` == X`. Returns cloned X. +[macro_function] +def private extract_key_probe(var pred : Expression?; lane : TableLane; bindName : string) : Expression? { + if (pred == null || !(pred is ExprOp2)) return null + var op2 = pred as ExprOp2 + if (op2.op != "==") return null + var probe = match_key_probe_side(op2.left, op2.right, lane, bindName) + if (probe == null) { + probe = match_key_probe_side(op2.right, op2.left, lane, bindName) + } + return probe +} + +[macro_function] +def try_table_point_lookup(var calls : array>; var adapter : TableAdapter?; at : LineInfo) : Expression? { + if (adapter.lane == TableLane.VALUES) return null + let n = length(calls) + if (n < 1 || n > 3) return null + var termCall = calls[n - 1]._0 + let termName = calls[n - 1]._1.name + let termArgs = length(termCall.arguments) + var selCall : ExprCall? + var predArg : Expression? + var keyX : Expression? 
+ let bindName = qn("plk_it", at) + if (n == 1) { + // predicate-form terminators, and the keys-lane contains + if ((termName == "any" || termName == "count") && termArgs == 2) { + predArg = termCall.arguments[1] + } elif (termName == "contains" && termArgs == 2 && adapter.lane == TableLane.KEYS) { + // element evaluated exactly once on both paths — no invariance/purity gate needed + keyX = clone_expression(termCall.arguments[1]) + } else { + return null + } + } else { + if (calls[0]._1.name != "where_" || length(calls[0]._0.arguments) != 2) return null + predArg = calls[0]._0.arguments[1] + if (n == 3) { + if (calls[1]._1.name != "select" || length(calls[1]._0.arguments) != 2 + || (termName != "first" && termName != "first_or_default")) { + return null + } + selCall = calls[1]._0 + } elif (!((termName == "any" || termName == "count" || termName == "first") && termArgs == 1) + && !(termName == "first_or_default" && termArgs == 2)) { + return null + } + } + if (predArg != null) { + var predBody = peel_lambda_rename_var(predArg, bindName) + keyX = extract_key_probe(predBody, adapter.lane, bindName) + } + if (keyX == null) return null + let sn = adapter.srcName + // boolean / counting probes + if (termName == "any" || termName == "contains") { + var anyStmts <- qmacro_block_to_array() { + return key_exists($i(sn), $e(keyX)) + } + return adapter->wrap_invoke(anyStmts, null, false, at) + } + if (termName == "count") { + var cntStmts <- qmacro_block_to_array() { + return key_exists($i(sn), $e(keyX)) ? 
1 : 0 + } + return adapter->wrap_invoke(cntStmts, null, false, at) + } + // element probes: first / first_or_default, ± trailing select + var retT = strip_const_ref(clone_type(termCall._type)) + retT.flags.removeConstant = false + let kName = qn("plk_k", at) + let dName = qn("plk_d", at) + var stmts : array + if (termName == "first_or_default") { + // eager default bind, matching linq.das argument evaluation order + stmts |> push <| qmacro_expr() { + let $i(dName) = $e(termCall.arguments[1]) + } + } + stmts |> push <| qmacro_expr() { + let $i(kName) = $e(keyX) + } + var missTail : Expression? + if (termName == "first_or_default") { + missTail = qmacro_expr() { + return $i(dName) + } + } else { + missTail = qmacro_expr() { + panic("sequence contains no elements") + } + } + if (adapter.lane == TableLane.KEYS) { + stmts |> push <| qmacro_expr() { + if (!key_exists($i(sn), $i(kName))) { + $e(missTail) + } + } + if (selCall != null) { + var proj = peel_lambda_rename_var(selCall.arguments[1], kName) + stmts |> push <| qmacro_expr() { + return $e(proj) + } + } else { + stmts |> push <| qmacro_expr() { + return $i(kName) + } + } + return adapter->wrap_invoke(stmts, retT, false, at) + } + // KV lane: probe the value pointer, materialize the (key, value) pair on hit. Table safe-index is + // unsafe (the pointer dangles on rehash) — fine here, the generated invoke never mutates the table. 
+ let pName = qn("plk_p", at) + stmts |> push_from <| qmacro_block_to_array() { + let $i(pName) = unsafe($i(sn)?[$i(kName)]) + if ($i(pName) == null) { + $e(missTail) + } + } + if (selCall != null) { + let bName = qn("plk_kv", at) + var proj = peel_lambda_rename_var(selCall.arguments[1], bName) + stmts |> push_from <| qmacro_block_to_array() { + let $i(bName) = (key = $i(kName), value = *$i(pName)) + return $e(proj) + } + } else { + stmts |> push <| qmacro_expr() { + return (key = $i(kName), value = *$i(pName)) + } + } + return adapter->wrap_invoke(stmts, retT, false, at) +} + // Recognize an `each_kv(tab)` / `keys(tab)` / `values(tab)` chain top. Returns the call (caller reads // arguments[0] = the table, `_type.firstType` = element); null otherwise. Name + table-typed-arg match, // like extract_json_source — the strong arg-type gate keeps an unrelated user `keys` from firing this. diff --git a/doc/source/reference/linq_fold_patterns.rst b/doc/source/reference/linq_fold_patterns.rst index 2d5bdc524..3ced024a9 100644 --- a/doc/source/reference/linq_fold_patterns.rst +++ b/doc/source/reference/linq_fold_patterns.rst @@ -150,7 +150,7 @@ Source-side entry points - Optional source — only when the ``pugixml`` module is linked (``require ?pugixml`` + ``static_if (typeinfo builtin_module_exists(pugixml))``). Emits an inlined DOM child-element walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): the chain body is scanned for the ``Row`` fields it reads, and only those attributes are read via ``read_xml_field`` into scalar locals — unread fields (notably ``string`` fields, whose ``clone_string`` is the alloc cost) are never touched, so a float-only chain runs alloc-free and JIT beats the equivalent SQLite query. A whole-row escape (``to_array`` / identity ``_select(_)`` / pass-to-fn) routes to the full ``build_xml_row`` instead. 
The ``XmlAdapter`` **rides every pattern row** (``try_splice_patterns`` runs with no ``onlyRow`` restriction); per-row ``requires`` predicates and the adapter's capability hooks (``can_join`` / ``can_group_by`` / ``defers_materialization`` / the ``non_array_source`` gate) decide what fuses, and a shape it can't fuse cascades to tier-2 — see :ref:`linq_fold_xml_patterns` for the full fuse/defer breakdown. ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``) and the node is passed by value (``var root`` — ``_fold``'s macro-arg inference skips the const&→value copy). * - ``unsafe(each_kv(tab))`` / ``keys(tab)`` / ``values(tab)`` - ``extract_table_source`` (``TableAdapter``, ``daslib/linq_fold_table.das``) - - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. 
``can_join`` / ``can_group_by`` are off and reverse has no backward slot walk — those shapes cascade to tier-2 (the join probe and key-lookup folds are staged: see ``benchmarks/sql/LINQ_TO_TABLE.md``). ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. + - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). 
Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. ``can_join`` / ``can_group_by`` are off and reverse has no backward slot walk — those shapes cascade to tier-2 (the join probe is staged: see ``benchmarks/sql/LINQ_TO_TABLE.md``). ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. * - ``unsafe(from_json(jv, type))`` - ``extract_json_source`` (``JsonAdapter``, ``daslib/linq_fold_json.das``) - In-tree source — the adapter is compiled in unconditionally (no ``static_if`` gate, unlike XML's pugixml one), but a program only pulls JSON into scope by requiring ``json`` / ``json_boost`` itself. ``extract_json_source`` matches a ``from_json`` whose first argument is a ``json::JsonValue?``, so a JSON-less program returns null and the chain falls to the array tier. The adapter pulls in **no** json dependency — it emits ``from_json`` / ``read_json_field`` by name (resolved at the user's splice site, like ``linq_fold_decs`` emits ``for_each_archetype``; ``from_JV`` is emitted only for a non-struct element type). Emits an inlined ``for (e in jv.value as _array)`` walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): only the keys the chain reads are pulled via ``read_json_field`` by name — unread keys (notably ``string`` fields whose materialization clones) are never touched, so a scalar-only chain skips ~all of the full per-row build (3.6× over the full materialize — see ``benchmarks/micro/json_source_shapes.das``). 
A whole-row escape reads **every** top-level field by name (``emit_full_row_by_name``), so a custom whole-row ``from_JV(Row)`` override is **not** honored (Option B — this is a flat query source, not a deserializer; materialize the array with an explicit ``from_JV`` first for that). ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``). Deferred materialization mirrors XML: order/distinct/take buffer a cheap ``(orderKey, JsonValue?)`` surrogate and materialize only the K survivors — by name (``emit_full_row_by_name``), so a struct survivor reads each field by key; only a non-struct ``Row`` falls back to ``outBind <- from_JV(handle, type)``. The ``JsonAdapter`` also fuses ``join`` / ``join |> group_by`` (``emit_join_hook`` + ``JsonJoinAdapter`` off ``build_group_by_adapter``'s upstream-join arm), reusing the array-join machinery (``build_join_standalone_pieces`` / ``build_join_adapter_pieces``): srcB is collected into a ``table>`` and the field-pruned array walk is the probe side, so the join key reads only its own field per element (e.g. ``read_json_field(jcur, "brand", …)``). Standalone ``group_join`` and a trailing ``where`` / ``select`` / ``count`` over group-join rows defer to tier-2, mirroring XML. diff --git a/tests/linq/test_linq_table_source.das b/tests/linq/test_linq_table_source.das index bcbfef726..630960d18 100644 --- a/tests/linq/test_linq_table_source.das +++ b/tests/linq/test_linq_table_source.das @@ -205,6 +205,79 @@ def test_table_fold_set_form(t : T?) { } } +def private bump_key(var c : int&) : int { + c++ + return 2 +} + +// Point-lookup folds: `where(kv.key == X)` + terminator → O(1) key probe (key_exists / `tab?[X]`). +// Probes must agree with the scan on hit AND miss; non-probe shapes must keep riding the scan. + +[test] +def test_table_point_lookup(t : T?) { + t |> run("kv any/count/first probes, hit and miss") @(t : T?) 
{
+        var tab <- make_int_table(10)
+        let k = 7
+        t |> equal(_fold(each_kv(tab)._where(_.key == k).any()), true)
+        t |> equal(_fold(each_kv(tab)._where(_.key == 99).any()), false)
+        t |> equal(_fold(each_kv(tab)._any(_.key == k)), true)
+        t |> equal(_fold(each_kv(tab)._where(_.key == k).count()), 1)
+        t |> equal(_fold(each_kv(tab)._where(_.key == 99).count()), 0)
+        t |> equal(_fold(each_kv(tab)._count(_.key == 3)), 1)
+        let f = _fold(each_kv(tab)._where(k == _.key).first()) // flipped operand side
+        t |> equal(f.key, 7)
+        t |> equal(f.value, 70)
+        let m = _fold(each_kv(tab)._where(_.key == 99).first_or_default(default>))
+        t |> equal(m.key, 0)
+        delete tab
+    }
+    t |> run("probe + trailing select projects the probed element") @(t : T?) {
+        var tab <- make_int_table(10)
+        t |> equal(_fold(each_kv(tab)._where(_.key == 4)._select(_.value).first()), 40)
+        t |> equal(_fold(each_kv(tab)._where(_.key == 99)._select(_.value).first_or_default(-1)), -1)
+        t |> equal(_fold(each_kv(tab)._where(_.key == 4)._select("{_.key}:{_.value}").first()), "4:40")
+        delete tab
+    }
+    t |> run("keys lane probes + set form contains") @(t : T?) {
+        var tab <- make_int_table(10)
+        t |> equal(_fold(keys(tab)._where(_ == 5).any()), true)
+        t |> equal(_fold(keys(tab)._where(_ == 5).first()), 5)
+        t |> equal(_fold(keys(tab)._where(_ == 5)._select(_ * 100).first()), 500)
+        t |> equal(_fold(keys(tab).contains(5)), true)
+        t |> equal(_fold(keys(tab).contains(55)), false)
+        delete tab
+        var s : table <- { "x", "y" }
+        t |> equal(_fold(keys(s).contains("y")), true)
+        t |> equal(_fold(keys(s).contains("z")), false)
+        delete s
+    }
+    t |> run("first probe panics on a missing key, like the scan") @(t : T?) {
+        var tab <- make_int_table(4)
+        var panicked = false
+        try {
+            let _r = _fold(each_kv(tab)._where(_.key == 99).first())
+        } recover {
+            panicked = true
+        }
+        t |> equal(panicked, true)
+        delete tab
+    }
+    t |> run("non-probe shapes stay scans and stay correct") @(t : T?)
{
+        var tab <- make_int_table(10)
+        t |> equal(_fold(each_kv(tab)._where(_.key != 5).count()), 9) // wrong operator
+        t |> equal(_fold(each_kv(tab)._where(_.key == _.value / 10).count()), 10) // X references the binder
+        t |> equal(_fold(each_kv(tab)._where(_.key == 5)._where(_.value > 0).any()), true) // collapses to a compound && predicate
+        delete tab
+    }
+    t |> run("impure X stays a scan — per-element evaluation preserved") @(t : T?) {
+        var tab <- make_int_table(4)
+        var evals = 0
+        t |> equal(_fold(each_kv(tab)._where(_.key == bump_key(evals)).count()), 1)
+        t |> equal(evals, 4, "side-effectful X must evaluate per element, not once")
+        delete tab
+    }
+}
+
 // Tier-2 over the raw each_kv iterator (no _fold) — the [unsafe_outside_of_for] contract requires the
 // explicit unsafe(...) wrap at a bare chain head; fused chains rewrite the head before inference.
From 2742f6db2fe386541bbc9952a08ac97327907368 Mon Sep 17 00:00:00 2001
From: Boris Batkin
Date: Thu, 11 Jun 2026 02:32:18 -0700
Subject: [PATCH 07/11] =?UTF-8?q?linq=5Ffold:=20table=20joins=20=E2=80=94?=
 =?UTF-8?q?=20adapter-generalized=20emit=5Farray=5Fjoin=20+=20table-srcB?=
 =?UTF-8?q?=20key=20probe?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Stage 5 of the table arc (benchmarks/sql/LINQ_TO_TABLE.md). Two halves:

1. Lead generalization: emit_array_join takes its lead loop, bind name,
and lead invoke-param spelling from the adapter (wrap_source_loop /
bind_name / new SourceAdapter.invoke_param_type), so TableAdapter sets
can_join=true and routes emit_join_hook to the same emitter — table-lead
joins walk the kv usage-pruned slot iterators (a join touching only
c.value.* walks values(tab) alone), group joins stay outer over every slot.

2.
Table-srcB probe: a join whose srcb is each_kv(tab)/keys(set) joined on its
bare key skips the internal table> + build loop — srcB binds the user's
table and the per-A probe is a key lookup, usage-pruned like the
point-lookup fold (count/key-only -> key_exists, value shapes -> by-ref
bind off tab?[k], whole-pair -> kv tuple). Unique table keys make
probe == hash semantics exactly; non-bare keybs and group joins keep the
hashed build.

Per-pair statements factored into build_join_pair_core, shared by
build_join_standalone_pieces (group-join arm + bucket wrap unchanged for
the decs/xml/json callers) and the new build_join_probe_pieces.

m7 sweep: join_count 195.0 -> 65.6 ns/elem INTERP, join_where_count
229.1 -> 81.4; new join_probe 47.3 vs join_probe_build 79.1 (probe ~1.7x
on identical rows).

Tests: fused-vs-hand-loop agreement both leads, probe shapes, declines
(non-bare keyb, group join), %linq! set-srcB + into forms. INTERP 10947/0,
AOT+JIT linq 1949/1949, Sphinx -W clean.

Co-Authored-By: Claude Fable 5
---
 benchmarks/sql/LINQ_TO_TABLE.md             |  37 +-
 benchmarks/sql/results.md                   | 280 ++++++-------
 benchmarks/sql/table.das                    |  43 ++
 daslib/linq_fold.md                         |   1 +
 daslib/linq_fold_common.das                 | 414 +++++++++++++++-----
 daslib/linq_fold_table.das                  |  12 +-
 doc/source/reference/linq_das.rst           |   5 +
 doc/source/reference/linq_fold_patterns.rst |  47 ++-
 tests/linq/test_linq_das.das                |  28 +-
 tests/linq/test_linq_table_source.das       | 190 +++++++++
 10 files changed, 799 insertions(+), 258 deletions(-)

diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md
index 77d3efdf1..6cd55c533 100644
--- a/benchmarks/sql/LINQ_TO_TABLE.md
+++ b/benchmarks/sql/LINQ_TO_TABLE.md
@@ -4,8 +4,35 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco
 `table` / `table` as the 6th `_fold` source, plus the `to_table` sink.
 Edited in-place as PRs land.
-Status: **stage 4 committed** (point-lookup folds; stage 3 = `%linq!` table sources, 29d23baf6;
-stage 2 = TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba).
+Status: **stage 5 committed** (join probe + table-lead joins; stage 4 = point-lookup folds,
+ac441c4a0; stage 3 = `%linq!` table sources, 29d23baf6; stage 2 = TableAdapter + m7, 571fe879e;
+stage 1 = `each_kv` builtin, 8751bb9ba).
+
+Stage 5 findings:
+- **`emit_array_join` generalized instead of a parallel `emit_table_join`**: the lead loop, bind
+  name, and lead invoke-param spelling now come from the adapter (`wrap_source_loop` /
+  `bind_name` / new `invoke_param_type` capability), so `TableAdapter.emit_join_hook` just routes
+  to `emit_array_join` and the kv usage-pruner sees the whole probe body for free — a table-lead
+  join touching only `c.value.*` walks `values(tab)` alone. Any future direct-return loop source
+  joins the same way; decs/xml/json keep their own hooks (nested-callback walks).
+- **srcB probe**: `join_srcb_table_call` (each_kv/keys over a table in the srcb slot) +
+  `join_keyb_is_bare_key` (peeled keyb is bare `d.key` / bare set element) switch the emitter to
+  `build_join_probe_pieces` — srcB binds the user's table (const param), no internal
+  `table>`, no build loop; the per-A probe usage-prunes like the point lookup
+  (count-no-where / key-only → `key_exists`, value shapes → by-ref bind off `tab?[k]`, whole-pair
+  → kv tuple bind). Skipping keyb's per-B evaluation is unobservable (a bare field read is pure
+  by construction — no `has_sideeffects` gate needed, unlike stage 4's X).
+- **Shared per-pair core**: `build_join_pair_core` factored out of `build_join_standalone_pieces`
+  (which keeps the group-join arm + bucket wrap); both builders emit identical per-pair
+  statements, so hash-mode AST is unchanged for the decs/xml/json callers of the standalone
+  builder. Group joins never probe — their result consumes the whole bucket.
+- The `_join` predicate splitter is **position-based** (` == `); a flipped + `d.key == a` fails to compile for any source (pre-existing). The probe matcher therefore only + sees keyb on the b-side. +- m7 (2026-06-11 sweep): table-lead joins leave tier-2 — `join_count` 195.0 → 65.6 ns/elem + INTERP (33.1 JIT), `join_where_count` 229.1 → 81.4 (37.9 JIT). The probe A/B pair: + `join_probe` 47.3 vs `join_probe_build` 79.1 INTERP (24.2 vs 38.1 JIT) — skipping the + internal hash is ~1.7× on identical rows. Stage 4 findings: - `try_table_point_lookup` (linq_fold_table.das) runs in the dispatcher arm BEFORE pattern dispatch; @@ -172,6 +199,12 @@ End of arc: `skills/linq.md` + linq docs mention the table source. values, buffer `(orderKey, key)` surrogates and materialize survivors via `tab?[key]` — K probes instead of N value copies. The table handle is its key; clean fit for the existing 4-hook surface. Revisit once m7 numbers show whether it matters. +- **decs/xml/json lead × table srcB probe**: those leads keep their own `emit_join_hook` + (nested-callback walks) and hash a table srcB like any iterator. Correct, just unprobed — + port `build_join_probe_pieces` into their hooks if a real chain wants it. +- **Group-join probe**: a table srcB group join could bind a 0/1-element bucket from the probe + instead of hashing; the result lambda consumes `array`, so it needs a synthesized + one-element array per hit. Hashed build is correct; revisit on demand. - Set-ops probe (`except`/`intersect` where the *other* side is a `table`) — rides the engine-wide set-ops edge. - Fused-kv-over-non-copyable values (loosening the uniform gate) — only if a real use case diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md index aede8f85a..4a9015f62 100644 --- a/benchmarks/sql/results.md +++ b/benchmarks/sql/results.md @@ -17,8 +17,9 @@ are stable now). - **m6f JSON** — `_fold` over `from_json(jv, type)` (`JsonAdapter`, same machinery, array walk). 
- **m7 Table** — `_fold` over `each_kv(table)` (`TableAdapter`; kv usage-pruning picks keys-only / values-only / zipped slot walks; key-equality `where` + terminator folds to an O(1) probe — the - `point_lookup` / `point_lookup_scan` pair measures it; group_by / join / reverse defer to tier-2 - until their stages land). + `point_lookup` / `point_lookup_scan` pair measures it; joins fuse on either side, and a table srcB + joined on its bare key probes the table instead of building the join hash — the `join_probe` / + `join_probe_build` pair measures it; group_by / reverse defer to tier-2 until their stages land). `0.00` = early-exit terminator below timer resolution ("free"). Chain shapes are in `benchmarks/README.md`; the splice arms each fires are in `doc/source/reference/linq_fold_patterns.rst`. @@ -33,167 +34,171 @@ signal, JIT deltas as indicative.** | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 34.9 | 5.9 | 5.8 | 60.7 | 160.3 | 19.1 | -| `all_match` | 27.5 | 3.5 | 3.4 | 55.9 | 154.1 | 15.8 | +| `aggregate_match` | 34.8 | 5.9 | 5.8 | 60.6 | 159.5 | 19.2 | +| `all_match` | 27.5 | 3.5 | 3.4 | 56.1 | 154.1 | 16.4 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.5 | 5.9 | 8.8 | 60.2 | 163.1 | 17.3 | -| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 29.2 | -| `bare_order_where` | 278.2 | 117.7 | 126.7 | 299.6 | 292.7 | 166.4 | -| `chained_select_collapse` | — | 17.7 | 17.4 | 70.4 | 168.3 | 27.8 | -| `chained_where` | 35.9 | 6.6 | 7.1 | 104.9 | 184.0 | 24.1 | -| `contains_match` | 0.0 | 2.3 | 1.5 | 29.1 | 72.4 | 6.6 | -| `count_aggregate` | 30.0 | 4.1 | 4.2 | 63.7 | 155.2 | 20.2 | -| `cross_join` | 12604.3 | 3685.2 | — | 4006.6 | 4040.5 | — | +| `average_aggregate` | 30.6 | 5.9 | 8.8 | 58.4 | 164.3 | 17.3 | +| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 30.6 | +| `bare_order_where` | 284.5 | 117.8 | 126.7 | 300.9 | 291.5 | 
163.8 | +| `chained_select_collapse` | — | 18.3 | 17.5 | 70.4 | 162.2 | 28.0 | +| `chained_where` | 36.1 | 6.6 | 7.1 | 104.9 | 183.8 | 24.1 | +| `contains_match` | 0.0 | 2.2 | 1.4 | 29.1 | 72.0 | 6.6 | +| `count_aggregate` | 29.8 | 4.1 | 4.1 | 63.7 | 155.9 | 20.3 | +| `cross_join` | 12556.2 | 3697.8 | — | 4012.8 | 4069.8 | — | | `decs_count_bare_pred` | — | — | 4.1 | — | — | — | -| `distinct_by_count` | 40.9 | 15.6 | 15.6 | 70.6 | 162.2 | 26.3 | -| `distinct_by_order_take` | 239.3 | 22.1 | 23.4 | 123.3 | 162.4 | 48.6 | -| `distinct_by_order_to_array` | 237.8 | 22.1 | 23.5 | 124.1 | 163.3 | 48.6 | -| `distinct_count` | 41.2 | 15.8 | 15.7 | 70.8 | 163.6 | 26.9 | -| `distinct_count_pred` | 252.2 | 15.7 | 15.9 | 112.1 | 178.4 | 26.3 | +| `distinct_by_count` | 41.0 | 15.7 | 15.6 | 70.6 | 160.7 | 26.6 | +| `distinct_by_order_take` | 239.3 | 22.1 | 23.4 | 123.7 | 163.1 | 48.5 | +| `distinct_by_order_to_array` | 238.9 | 22.1 | 23.5 | 124.2 | 163.1 | 48.8 | +| `distinct_count` | 41.0 | 15.8 | 15.8 | 70.8 | 162.4 | 27.0 | +| `distinct_count_pred` | 254.3 | 15.8 | 15.9 | 112.2 | 177.8 | 26.8 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 170.5 | 29.3 | 29.3 | 122.7 | 197.8 | — | -| `groupby_count` | 141.6 | 19.5 | 19.4 | 74.7 | 169.0 | 163.3 | -| `groupby_first` | 252.3 | 19.4 | 20.1 | 71.8 | 163.5 | — | -| `groupby_having_count` | 141.3 | 19.5 | 19.5 | 74.3 | 169.3 | — | -| `groupby_having_hidden_sum` | 175.7 | 22.4 | 22.6 | 118.5 | 192.1 | — | -| `groupby_having_post_where` | 171.6 | 20.8 | 21.6 | 114.8 | 188.9 | — | -| `groupby_max` | 174.1 | 24.7 | 25.6 | 119.8 | 192.6 | — | -| `groupby_min` | 173.5 | 25.1 | 26.2 | 119.9 | 193.4 | — | -| `groupby_multi_reducer` | 189.9 | 30.2 | 30.6 | 125.1 | 196.0 | — | -| `groupby_select_order` | 174.3 | 20.8 | 20.8 | 114.6 | 
189.8 | — | -| `groupby_select_sum` | 197.9 | 38.5 | 40.7 | 101.5 | 196.1 | — | -| `groupby_sum` | 171.2 | 20.7 | 20.8 | 115.0 | 190.5 | 192.9 | -| `groupby_where_count` | 75.7 | 14.0 | 14.3 | 115.5 | 187.7 | — | -| `groupby_where_sum` | 86.5 | 14.1 | 14.7 | 116.3 | 186.7 | — | -| `join_count` | 38.3 | 51.2 | 64.3 | 113.1 | 184.5 | 194.6 | -| `join_groupby_count` | 157.7 | 79.1 | 88.6 | 177.7 | 232.0 | — | -| `join_groupby_to_array` | 189.0 | 78.1 | 90.1 | 215.3 | 215.6 | — | -| `join_select` | 151.5 | 72.6 | 85.0 | 188.5 | 215.8 | — | -| `join_where_count` | 48.8 | 61.5 | 76.7 | 160.0 | 201.9 | 229.1 | -| `last_match` | 0.0 | 5.9 | 13.9 | 65.1 | 159.0 | 30.9 | -| `long_count_aggregate` | 28.9 | 4.1 | 4.2 | 63.3 | 154.6 | 20.3 | -| `max_aggregate` | 30.7 | 6.0 | 6.9 | 58.7 | 163.1 | 17.0 | -| `min_aggregate` | 30.6 | 6.0 | 6.9 | 58.6 | 163.3 | 17.1 | -| `order_by_multi_key` | 339.9 | 271.4 | 283.6 | 458.8 | 446.1 | 334.3 | -| `order_distinct_take` | 137.9 | 15.9 | 100.3 | 72.5 | 164.1 | 31.1 | -| `order_reverse_normalized` | 38.3 | 16.2 | 20.3 | 70.7 | 170.9 | — | -| `order_take_desc` | 38.2 | 16.2 | 20.6 | 70.1 | 170.2 | 33.3 | +| `groupby_average` | 171.8 | 29.2 | 29.3 | 123.7 | 197.4 | — | +| `groupby_count` | 141.9 | 19.5 | 19.5 | 75.0 | 167.5 | 162.7 | +| `groupby_first` | 252.6 | 19.5 | 20.2 | 72.2 | 162.7 | — | +| `groupby_having_count` | 141.8 | 19.5 | 19.5 | 74.8 | 169.1 | — | +| `groupby_having_hidden_sum` | 175.7 | 23.3 | 22.6 | 118.8 | 192.7 | — | +| `groupby_having_post_where` | 171.2 | 20.8 | 20.8 | 114.6 | 189.2 | — | +| `groupby_max` | 173.9 | 24.9 | 25.4 | 120.5 | 193.1 | — | +| `groupby_min` | 173.7 | 25.0 | 25.1 | 120.0 | 192.9 | — | +| `groupby_multi_reducer` | 190.8 | 30.2 | 30.6 | 124.9 | 196.2 | — | +| `groupby_select_order` | 170.9 | 20.8 | 20.8 | 114.8 | 188.6 | — | +| `groupby_select_sum` | 198.9 | 38.6 | 38.2 | 101.7 | 195.2 | — | +| `groupby_sum` | 170.8 | 20.8 | 20.8 | 114.9 | 188.4 | 192.8 | +| `groupby_where_count` | 76.0 | 14.1 | 
14.3 | 116.6 | 186.3 | — | +| `groupby_where_sum` | 86.7 | 14.1 | 14.7 | 116.4 | 186.4 | — | +| `join_count` | 38.3 | 51.3 | 64.6 | 113.1 | 183.4 | 65.6 | +| `join_groupby_count` | 157.6 | 77.4 | 88.8 | 177.7 | 230.9 | — | +| `join_groupby_to_array` | 189.1 | 78.0 | 90.6 | 215.4 | 213.5 | — | +| `join_probe` | — | — | — | — | — | 47.3 | +| `join_probe_build` | — | — | — | — | — | 79.1 | +| `join_select` | 152.6 | 72.5 | 84.7 | 188.7 | 214.4 | — | +| `join_where_count` | 48.6 | 61.6 | 76.8 | 160.4 | 199.8 | 81.4 | +| `last_match` | 0.0 | 6.1 | 13.9 | 65.1 | 159.7 | 31.0 | +| `long_count_aggregate` | 29.1 | 4.1 | 4.1 | 63.4 | 154.3 | 21.2 | +| `max_aggregate` | 30.7 | 6.0 | 6.8 | 58.6 | 163.1 | 17.0 | +| `min_aggregate` | 31.2 | 6.0 | 6.9 | 58.7 | 163.6 | 17.0 | +| `order_by_multi_key` | 348.8 | 272.2 | 282.9 | 458.7 | 449.2 | 334.0 | +| `order_distinct_take` | 137.8 | 15.9 | 99.3 | 72.5 | 162.8 | 31.3 | +| `order_reverse_normalized` | 38.1 | 16.3 | 20.0 | 70.7 | 170.6 | — | +| `order_take_desc` | 38.5 | 16.2 | 20.4 | 70.1 | 170.4 | 33.3 | | `point_lookup` | — | — | — | — | — | 0.0 | | `point_lookup_scan` | — | — | — | — | — | 8.4 | -| `reverse_distinct_by` | 294.0 | 21.1 | 28.1 | 71.1 | 162.6 | — | -| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.1 | 58.9 | +| `reverse_distinct_by` | 295.5 | 21.3 | 28.0 | 70.9 | 162.2 | — | +| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.2 | 58.8 | | `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.2 | — | -| `select_count` | 0.1 | 0.0 | 2.2 | 68.3 | 2.2 | 0.0 | -| `select_many` | — | 191.5 | — | — | — | — | -| `select_where` | 197.5 | 11.2 | 19.4 | 195.6 | 183.7 | 37.5 | -| `select_where_count` | 32.2 | 5.1 | 7.5 | 64.8 | 157.1 | 21.9 | -| `select_where_order_take` | 36.2 | 12.2 | 15.1 | 72.5 | 165.1 | 34.5 | -| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.4 | 162.2 | 23.3 | -| `single_match` | 0.0 | 2.9 | 5.5 | 58.5 | 151.1 | 22.8 | +| `select_count` | 0.1 | 0.0 | 2.2 | 69.3 | 2.2 | 0.0 | +| `select_many` | — | 190.7 | — | — | — | — 
| +| `select_where` | 207.9 | 11.2 | 19.5 | 195.5 | 188.7 | 37.6 | +| `select_where_count` | 32.4 | 5.1 | 7.4 | 64.6 | 158.7 | 21.7 | +| `select_where_order_take` | 36.3 | 12.3 | 15.1 | 72.7 | 164.5 | 34.5 | +| `select_where_sum` | 37.2 | 7.5 | 7.5 | 66.5 | 164.6 | 23.3 | +| `single_match` | 0.0 | 2.9 | 5.5 | 58.4 | 151.5 | 22.6 | | `skip_take` | 0.5 | 0.1 | 0.2 | 3.0 | 2.8 | 0.3 | -| `skip_while_match` | 3.5 | 5.3 | 5.3 | 60.2 | 153.8 | 18.3 | -| `sort_first` | 38.0 | 11.1 | 13.3 | 65.0 | 167.1 | 31.7 | -| `sort_take` | 38.2 | 16.3 | 21.1 | 70.2 | 170.7 | 33.2 | -| `sort_take_select` | 38.1 | 16.3 | 21.8 | 71.1 | 170.6 | 33.3 | -| `sum_aggregate` | 30.6 | 2.1 | 2.1 | 54.8 | 152.8 | 13.5 | -| `sum_where` | 32.9 | 4.4 | 4.3 | 63.4 | 154.2 | 20.6 | -| `take_count` | 3.6 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | +| `skip_while_match` | 3.5 | 5.3 | 5.3 | 59.9 | 153.1 | 18.3 | +| `sort_first` | 37.9 | 11.0 | 13.3 | 64.9 | 167.0 | 32.0 | +| `sort_take` | 38.4 | 16.3 | 20.9 | 70.5 | 171.5 | 33.3 | +| `sort_take_select` | 38.2 | 16.3 | 20.9 | 71.0 | 170.8 | 33.2 | +| `sum_aggregate` | 29.6 | 2.1 | 2.1 | 54.4 | 153.0 | 13.5 | +| `sum_where` | 32.1 | 4.4 | 11.5 | 63.8 | 154.6 | 21.3 | +| `take_count` | 3.9 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | | `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.3 | 1.1 | 0.3 | | `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | 0.1 | | `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | 0.2 | -| `take_while_match` | 7.8 | 2.4 | 2.4 | 30.3 | 75.4 | 16.5 | -| `to_array_filter` | 70.0 | 11.8 | 11.7 | 71.3 | 164.9 | 28.7 | -| `where_join_count` | 41.2 | 29.0 | 42.0 | 132.1 | 168.9 | — | -| `zip_count_pred` | 39.2 | 15.9 | — | 313.8 | 322.0 | — | -| `zip_dot_product` | 46.1 | 12.6 | 10.6 | 308.6 | 319.3 | — | -| `zip_dot_product_3arg` | 46.1 | 12.8 | — | 309.7 | 319.0 | — | -| `zip_reverse_to_array` | — | 31.6 | — | 343.4 | 353.5 | — | +| `take_while_match` | 7.8 | 2.4 | 2.4 | 30.2 | 75.6 | 16.4 | +| `to_array_filter` | 70.2 | 11.8 | 11.8 | 71.5 | 165.1 | 29.0 | +| 
`where_join_count` | 41.2 | 29.1 | 41.7 | 132.7 | 168.6 | — | +| `zip_count_pred` | 39.3 | 15.9 | — | 315.0 | 321.2 | — | +| `zip_dot_product` | 46.2 | 12.6 | 10.6 | 309.2 | 319.0 | — | +| `zip_dot_product_3arg` | 46.2 | 12.8 | — | 309.4 | 320.7 | — | +| `zip_reverse_to_array` | — | 31.7 | — | 345.0 | 353.4 | — | ## JIT | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 35.1 | 0.3 | 0.6 | 21.8 | 26.0 | 13.4 | -| `all_match` | 27.8 | 0.3 | 0.2 | 18.1 | 25.2 | 13.5 | +| `aggregate_match` | 35.0 | 0.3 | 0.6 | 21.7 | 27.1 | 13.5 | +| `all_match` | 27.9 | 0.3 | 0.2 | 18.1 | 26.2 | 13.5 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.1 | 1.0 | 3.6 | 18.1 | 24.6 | 13.5 | -| `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.1 | -| `bare_order_where` | 185.6 | 34.0 | 35.2 | 106.5 | 53.5 | 78.9 | -| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 33.4 | 14.0 | -| `chained_where` | 36.2 | 0.6 | 0.8 | 35.6 | 31.4 | 17.7 | -| `contains_match` | 0.0 | 0.2 | 0.1 | 17.5 | 9.0 | 4.7 | -| `count_aggregate` | 29.3 | 0.3 | 0.6 | 20.5 | 25.3 | 13.5 | -| `cross_join` | 5962.8 | 733.1 | — | 836.0 | 773.4 | — | +| `average_aggregate` | 30.5 | 1.0 | 3.6 | 18.1 | 25.7 | 13.5 | +| `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.2 | +| `bare_order_where` | 188.1 | 35.3 | 35.5 | 106.7 | 53.3 | 79.0 | +| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 33.5 | 14.1 | +| `chained_where` | 36.1 | 0.6 | 0.8 | 35.7 | 32.0 | 17.7 | +| `contains_match` | 0.0 | 0.2 | 0.1 | 17.5 | 9.2 | 4.7 | +| `count_aggregate` | 29.6 | 0.3 | 0.6 | 20.6 | 26.4 | 13.5 | +| `cross_join` | 5976.1 | 733.7 | — | 837.5 | 767.7 | — | | `decs_count_bare_pred` | — | — | 0.6 | — | — | — | -| `distinct_by_count` | 41.2 | 1.1 | 1.1 | 20.6 | 33.3 | 14.0 | -| `distinct_by_order_take` | 237.1 | 1.7 | 2.6 | 47.4 | 39.1 | 30.3 | -| `distinct_by_order_to_array` | 242.4 | 1.8 | 2.6 | 47.4 | 38.7 
| 30.3 | -| `distinct_count` | 40.9 | 1.1 | 1.1 | 20.6 | 33.3 | 14.0 | -| `distinct_count_pred` | 250.6 | 1.1 | 1.3 | 37.7 | 43.5 | 14.0 | +| `distinct_by_count` | 41.2 | 1.1 | 1.1 | 20.6 | 33.6 | 14.1 | +| `distinct_by_order_take` | 239.4 | 1.7 | 2.6 | 47.4 | 39.2 | 30.1 | +| `distinct_by_order_to_array` | 239.3 | 1.7 | 2.7 | 47.4 | 38.9 | 30.1 | +| `distinct_count` | 41.3 | 1.1 | 1.1 | 20.5 | 33.7 | 14.1 | +| `distinct_count_pred` | 252.4 | 1.1 | 1.3 | 37.4 | 43.4 | 14.1 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 | +| `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 171.1 | 1.6 | 1.9 | 35.9 | 45.7 | — | -| `groupby_count` | 141.4 | 1.3 | 1.5 | 20.6 | 33.9 | 45.7 | -| `groupby_first` | 250.7 | 1.3 | 2.3 | 20.6 | 34.3 | — | -| `groupby_having_count` | 141.5 | 1.3 | 1.5 | 20.6 | 33.8 | — | -| `groupby_having_hidden_sum` | 174.4 | 1.5 | 1.7 | 35.9 | 45.3 | — | -| `groupby_having_post_where` | 170.1 | 1.4 | 2.0 | 35.8 | 44.2 | — | -| `groupby_max` | 175.6 | 1.5 | 2.0 | 36.0 | 46.0 | — | -| `groupby_min` | 172.4 | 1.5 | 1.8 | 36.0 | 46.0 | — | -| `groupby_multi_reducer` | 189.6 | 1.6 | 2.0 | 36.1 | 46.1 | — | -| `groupby_select_order` | 170.1 | 1.4 | 1.9 | 35.9 | 44.3 | — | -| `groupby_select_sum` | 197.0 | 2.8 | 3.2 | 32.2 | 40.0 | — | -| `groupby_sum` | 170.5 | 1.4 | 1.6 | 35.9 | 43.4 | 54.2 | -| `groupby_where_count` | 75.6 | 0.9 | 1.3 | 36.0 | 41.7 | — | -| `groupby_where_sum` | 86.2 | 0.9 | 1.3 | 35.9 | 41.7 | — | -| `join_count` | 38.2 | 11.0 | 11.7 | 43.6 | 71.4 | 63.1 | -| `join_groupby_count` | 156.8 | 18.0 | 20.1 | 68.5 | 90.1 | — | -| `join_groupby_to_array` | 189.5 | 17.4 | 19.4 | 80.5 | 36.0 | — | -| `join_select` | 93.2 | 19.6 | 21.7 | 74.8 | 94.5 | — | -| `join_where_count` | 48.3 | 19.0 | 20.7 | 64.5 | 78.3 | 80.0 | -| `last_match` | 
0.0 | 0.5 | 1.4 | 18.8 | 25.9 | 22.9 | -| `long_count_aggregate` | 28.8 | 0.3 | 0.6 | 20.6 | 25.4 | 13.5 | -| `max_aggregate` | 30.5 | 0.3 | 0.5 | 18.3 | 26.7 | 13.4 | -| `min_aggregate` | 30.6 | 0.3 | 0.5 | 18.3 | 26.6 | 13.5 | -| `order_by_multi_key` | 249.4 | 53.4 | 54.8 | 125.6 | 71.1 | 129.8 | -| `order_distinct_take` | 138.1 | 1.1 | 75.6 | 20.9 | 35.8 | 14.0 | -| `order_reverse_normalized` | 38.0 | 0.7 | 1.4 | 24.6 | 27.6 | — | -| `order_take_desc` | 37.9 | 0.7 | 1.3 | 24.6 | 27.9 | 17.8 | +| `groupby_average` | 170.7 | 1.6 | 1.9 | 35.9 | 44.3 | — | +| `groupby_count` | 141.5 | 1.3 | 1.5 | 20.6 | 32.7 | 42.9 | +| `groupby_first` | 252.2 | 1.3 | 2.3 | 20.6 | 33.3 | — | +| `groupby_having_count` | 141.3 | 1.3 | 1.5 | 20.6 | 33.3 | — | +| `groupby_having_hidden_sum` | 175.6 | 1.5 | 1.7 | 36.0 | 45.2 | — | +| `groupby_having_post_where` | 171.9 | 1.6 | 2.0 | 35.9 | 44.3 | — | +| `groupby_max` | 172.8 | 1.5 | 1.9 | 36.0 | 45.9 | — | +| `groupby_min` | 173.4 | 1.5 | 1.8 | 35.9 | 45.9 | — | +| `groupby_multi_reducer` | 190.6 | 1.6 | 2.0 | 36.2 | 46.1 | — | +| `groupby_select_order` | 170.6 | 1.4 | 1.9 | 35.7 | 44.2 | — | +| `groupby_select_sum` | 198.6 | 2.8 | 3.2 | 32.2 | 39.7 | — | +| `groupby_sum` | 170.3 | 1.4 | 1.7 | 35.8 | 44.2 | 51.5 | +| `groupby_where_count` | 76.0 | 0.9 | 1.3 | 36.1 | 41.8 | — | +| `groupby_where_sum` | 86.7 | 0.9 | 1.3 | 36.0 | 41.7 | — | +| `join_count` | 38.3 | 10.9 | 11.7 | 43.5 | 71.4 | 33.1 | +| `join_groupby_count` | 157.6 | 18.2 | 20.1 | 68.5 | 89.9 | — | +| `join_groupby_to_array` | 189.7 | 17.6 | 19.5 | 80.3 | 36.2 | — | +| `join_probe` | — | — | — | — | — | 24.2 | +| `join_probe_build` | — | — | — | — | — | 38.1 | +| `join_select` | 95.4 | 19.7 | 21.7 | 75.0 | 94.3 | — | +| `join_where_count` | 39.4 | 18.9 | 20.8 | 64.4 | 78.4 | 37.9 | +| `last_match` | 0.0 | 0.5 | 1.4 | 18.9 | 26.8 | 22.9 | +| `long_count_aggregate` | 29.0 | 0.3 | 0.6 | 20.5 | 26.4 | 13.5 | +| `max_aggregate` | 30.7 | 0.3 | 0.5 | 18.4 | 27.7 | 13.5 | +| 
`min_aggregate` | 30.7 | 0.3 | 0.5 | 18.4 | 27.7 | 13.5 | +| `order_by_multi_key` | 252.6 | 53.4 | 55.0 | 125.4 | 71.9 | 129.1 | +| `order_distinct_take` | 137.9 | 1.1 | 75.7 | 20.9 | 36.0 | 14.0 | +| `order_reverse_normalized` | 38.2 | 0.7 | 1.4 | 24.6 | 28.5 | — | +| `order_take_desc` | 38.1 | 0.7 | 1.4 | 24.6 | 28.4 | 17.7 | | `point_lookup` | — | — | — | — | — | 0.0 | -| `point_lookup_scan` | — | — | — | — | — | 6.1 | -| `reverse_distinct_by` | 295.6 | 1.6 | 3.2 | 20.6 | 34.3 | — | +| `point_lookup_scan` | — | — | — | — | — | 6.0 | +| `reverse_distinct_by` | 295.4 | 1.5 | 3.2 | 20.6 | 34.6 | — | | `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 26.9 | -| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | — | -| `select_count` | 0.1 | 0.0 | 0.0 | 68.7 | 0.0 | 0.0 | -| `select_many` | — | 64.0 | — | — | — | — | -| `select_where` | 110.6 | 4.2 | 5.3 | 76.5 | 22.0 | 28.1 | -| `select_where_count` | 32.3 | 0.3 | 0.6 | 18.6 | 26.7 | 13.5 | -| `select_where_order_take` | 37.1 | 0.7 | 1.4 | 19.1 | 27.4 | 23.0 | -| `select_where_sum` | 36.9 | 0.4 | 0.6 | 18.2 | 25.2 | 13.4 | -| `single_match` | 0.0 | 0.4 | 1.1 | 46.3 | 22.2 | 17.3 | -| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.2 | -| `skip_while_match` | 3.5 | 0.4 | 0.4 | 46.7 | 21.7 | 13.3 | -| `sort_first` | 38.3 | 0.4 | 1.3 | 18.2 | 26.7 | 17.3 | -| `sort_take` | 38.2 | 0.7 | 1.4 | 24.7 | 27.8 | 17.8 | -| `sort_take_select` | 37.6 | 0.7 | 1.4 | 24.7 | 27.8 | 17.8 | -| `sum_aggregate` | 29.3 | 0.3 | 0.1 | 23.4 | 24.6 | 13.5 | -| `sum_where` | 31.8 | 0.3 | 0.6 | 18.6 | 26.4 | 13.4 | -| `take_count` | 1.9 | 0.1 | 0.1 | 1.2 | 0.2 | 0.2 | -| `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.4 | 0.1 | 0.2 | +| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.9 | — | +| `select_count` | 0.1 | 0.0 | 0.0 | 66.0 | 0.0 | 0.0 | +| `select_many` | — | 62.7 | — | — | — | — | +| `select_where` | 109.1 | 4.1 | 5.3 | 76.2 | 23.0 | 28.1 | +| `select_where_count` | 32.3 | 0.3 | 0.6 | 18.5 | 27.2 | 13.4 | +| `select_where_order_take` | 
36.5 | 0.7 | 1.4 | 19.0 | 27.9 | 23.0 | +| `select_where_sum` | 37.1 | 0.4 | 0.6 | 18.0 | 26.3 | 13.4 | +| `single_match` | 0.0 | 0.4 | 1.1 | 46.3 | 23.2 | 17.4 | +| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.1 | +| `skip_while_match` | 3.5 | 0.4 | 0.4 | 45.8 | 22.7 | 13.3 | +| `sort_first` | 37.9 | 0.4 | 1.3 | 18.1 | 27.5 | 17.3 | +| `sort_take` | 37.9 | 0.7 | 1.4 | 24.6 | 28.3 | 17.8 | +| `sort_take_select` | 37.8 | 0.7 | 1.4 | 24.6 | 28.4 | 17.8 | +| `sum_aggregate` | 29.9 | 0.3 | 0.1 | 23.2 | 25.6 | 13.5 | +| `sum_where` | 32.1 | 0.3 | 0.6 | 18.5 | 27.2 | 13.4 | +| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.3 | 0.2 | +| `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.4 | 0.1 | 0.1 | | `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | | `take_where_count` | 0.9 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | -| `take_while_match` | 7.7 | 0.2 | 0.3 | 17.3 | 9.0 | 13.4 | -| `to_array_filter` | 48.2 | 3.2 | 3.3 | 21.6 | 35.0 | 20.4 | -| `where_join_count` | 41.2 | 5.8 | 6.7 | 49.6 | 41.9 | — | -| `zip_count_pred` | 38.6 | 0.1 | — | 117.0 | 33.9 | — | -| `zip_dot_product` | 46.0 | 0.1 | 0.1 | 116.8 | 33.8 | — | -| `zip_dot_product_3arg` | 45.9 | 0.1 | — | 116.8 | 33.7 | — | -| `zip_reverse_to_array` | — | 4.6 | — | 128.3 | 51.4 | — | +| `take_while_match` | 7.8 | 0.2 | 0.3 | 17.1 | 9.3 | 13.5 | +| `to_array_filter` | 47.4 | 3.3 | 3.3 | 21.5 | 35.1 | 20.2 | +| `where_join_count` | 39.4 | 5.8 | 6.8 | 49.7 | 42.3 | — | +| `zip_count_pred` | 39.4 | 0.1 | — | 117.0 | 33.9 | — | +| `zip_dot_product` | 46.5 | 0.1 | 0.1 | 117.1 | 33.8 | — | +| `zip_dot_product_3arg` | 46.4 | 0.1 | — | 116.9 | 33.7 | — | +| `zip_reverse_to_array` | — | 4.5 | — | 128.4 | 51.3 | — | ## Missing lanes (the `—` cells) @@ -209,8 +214,9 @@ Each empty cell's reason is also in the bench `.das` file's comment; SQL gaps ar - **`reverse_distinct_by` m4 / m5f** — array uses the backward-index walk; non-array sources fuse the forward keep-last splice (decs 27.6/5.0, XML 74.5/22.2); SQL uses MAX(pk). 
- **`order_distinct_take` m4 vs m3f** — `unique_key` hashes workhorse keys directly (array `int`) but string-interpolates structs (decs `DecsBrand`); the gap is per-element string hashing, not decs-walk. `distinct_by_count` is the key-based variant (m4 parity). - **`zip_reverse_to_array` / `zip_*` SQL / Decs** — `reverse` has no SQL order key; zip is not relational / not expressible over one archetype walk. By design. (XML/JSON zip lanes are lit, partially fused.) -- **m7 absent families** — `zip_*` / `cross_join` (lockstep over an unordered slot walk is meaningless), `select_many` (flat fixture, no nested array field), `order_reverse_normalized` / `reverse_take_select` / `reverse_distinct_by` (no backward slot walk; `reverse_take` is kept as the single deferral marker), the group-by tail beyond `groupby_count`/`groupby_sum` and joins beyond `join_count`/`join_where_count` (table group_by/join fusion is staged — see `LINQ_TO_TABLE.md`; the four marker cells track the tier-2 cost until then), `decs_count_bare_pred` (decs-only). +- **m7 absent families** — `zip_*` / `cross_join` (lockstep over an unordered slot walk is meaningless), `select_many` (flat fixture, no nested array field), `order_reverse_normalized` / `reverse_take_select` / `reverse_distinct_by` (no backward slot walk; `reverse_take` is kept as the single deferral marker), the group-by tail beyond `groupby_count`/`groupby_sum` (table group_by fusion is staged — see `LINQ_TO_TABLE.md`; the two marker cells track the tier-2 cost until then) plus the join-composition lanes (`join_select` / `where_join_count` would fuse today but aren't instantiated; `join_groupby_*` needs the staged group_by), `decs_count_bare_pred` (decs-only). 
- **`point_lookup` / `point_lookup_scan` non-m7** — m7-only pair: only a table source has a key to probe (`where(kv.key == X)` + terminator → `key_exists` / `tab?[X]`, O(1)); the `_scan` twin forces the same query through the walk (compound `&&` predicate declines the probe) to show the gap. Other sources have no analog by design. +- **`join_probe` / `join_probe_build` non-m7** — m7-only A/B pair: a table srcB joined on its bare key probes the user's table per lead row (no internal join hash, no build loop); the `_build` twin feeds the identical rows pre-materialized to a kv array, forcing the hashed build. Other sources have no keyed-srcB analog by design. ## Accepted floors diff --git a/benchmarks/sql/table.das b/benchmarks/sql/table.das index 2b49e1c31..e33e7ee64 100644 --- a/benchmarks/sql/table.das +++ b/benchmarks/sql/table.das @@ -10,20 +10,31 @@ require _common public let N = 100000 typedef CarKV = tuple +typedef DKV = tuple var g_t : table var g_dealers : array +var g_dealer_t : table // dealers keyed by id — the join_probe srcB +var g_dealer_kv : array // same rows pre-materialized in slot order — the build-side baseline [init] def table_bench_init { g_t <- fixture_table(N) g_dealers <- fixture_dealers_array() + for (d in g_dealers) { + g_dealer_t |> insert(d.id, d) + } + for (k, v in keys(g_dealer_t), values(g_dealer_t)) { + g_dealer_kv |> push((key = k, value = v)) + } } [finalize] def table_bench_fini { delete g_t delete g_dealers + delete g_dealer_t + delete g_dealer_kv } [benchmark] @@ -292,6 +303,38 @@ def join_count_m7(b : B?) { } } +[benchmark] +def join_probe_m7(b : B?) 
{ + // srcB is a table joined on its bare key → fused key probe, no internal join hash + b |> run("join_probe", N) { + let c = _fold(unsafe(each_kv(g_t)) |> _join(unsafe(each_kv(g_dealer_t)), + $(c : CarKV, d : DKV) => c.value.dealer_id == d.key, + $(c : CarKV, d : DKV) => (CarPrice = c.value.price, DealerName = d.value.name)) + |> _where(_.CarPrice > 500) + |> count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + +[benchmark] +def join_probe_build_m7(b : B?) { + // build-side baseline: identical rows from the pre-materialized kv array — the join hashes srcB + b |> run("join_probe_build", N) { + let c = _fold(unsafe(each_kv(g_t)) |> _join(g_dealer_kv, + $(c : CarKV, d : DKV) => c.value.dealer_id == d.key, + $(c : CarKV, d : DKV) => (CarPrice = c.value.price, DealerName = d.value.name)) + |> _where(_.CarPrice > 500) + |> count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} + [benchmark] def join_where_count_m7(b : B?) { b |> run("join_where_count", N) { diff --git a/daslib/linq_fold.md b/daslib/linq_fold.md index 8f313f4b4..67388d8db 100644 --- a/daslib/linq_fold.md +++ b/daslib/linq_fold.md @@ -660,6 +660,7 @@ The imperative code has a few subtle co-occurrence rules that may not map cleanl - **2026-05-31 (deferred materialization — handle-buffering for buffered reducers)** — the buffered reducers (`order_by`/`sort`/`reverse` + `take`/`first`, `distinct_by |> order_by`) materialized the full `Car` (its `name` clone) for *every* source element before the reducer kept only K — `from_xml_node` builds all N. Fix: the reducer buffers a cheap **surrogate** — `(orderKey, xml_node)` for the order emits, a bare `xml_node` for reverse (no key) — and `build_xml_row` runs only for the K survivors. The comparator is the fixed `_::less(a._0, b._0)` on the precomputed key; where/distinct are consumed during the walk (cheap field reads gate which elements get a surrogate), so they never enter the surrogate. 
**The abstraction is source-generic** (an "element handle"): the surrogate machinery + materialize-survivors tail live in `linq_fold_common` (`build_surrogate_type` / `build_surrogate_cmp` / `build_surrogate_materialize_loop`); each source supplies `defers_materialization()` + `handle_type()` + `current_handle_expr()` + `materialize_handle()`. Only `XmlAdapter` overrides them this PR (a future `linq_json` is just those 4 hooks); `array`/`decs` inherit the no-defer default and stay **byte-identical** (their backing store is pre-materialized, so their reducers already clone only ~K heap-entrants — confirmed by `benchmarks/micro/sort_distinct_take_shapes.das`, where the `array` pointer form is slower or tied). Wired into `emit_bounded_heap` (take), `emit_fused_prefilter` (distinct-only no-take arm — the pure-where case is already materialize-under-guard'd), `emit_streaming_min` (first), and `emit_reverse_buffer_inplace` (reverse + take). **Design validated by hand-coded micro-bench first** (`benchmarks/micro/sort_distinct_take_shapes.das`). Wins (m5f INTERP / JIT): `sort_take` 338 → 69 / 17, `order_take_desc` 343 → 69, `distinct_by_order_take` 354 → 126 / 46, `select_where_order_take` 228 → 71, `distinct_by_order_to_array` 356 → 131 / 46, `sort_first` 336 → 64 / 17, `reverse_take` 360 → 90 / 70 — string clones 100 000 → K everywhere. Not deferred (inherent floor / out of scope): `bare_order_where` (already at the under-guard survivor floor), `order_reverse_normalized` (order+reverse → all rows out), `reverse_distinct_by` (tier-2, no fused emit), `groupby_first` (group_by path). - **2026-05-31 (deferred materialization — `last` + group-by `first`)** — extends the element-handle deferral to the two remaining survivors-≪-N reducers: the full-walk `last`/`last_or_default` terminator (in `emit_early_exit_lane`) and `first`-per-group inside `plan_group_by_core`. 
`last` cloned the whole `Car` (`lst := it`) on *every* match and kept only the final one; over a deferring source it now stores the node **handle** per match and runs `materialize_handle` once, for the single survivor. `group_by(brand) |> select((key, first per group))` pinned the whole row (`slot := it`) in `mk_reducer_first`, forcing `wrap_source_loop` to build every element; a new `mk_reducer_first_deferred` materializes from the handle *inside the table miss-branch*, so the walk field-prunes to just the group key and `build_xml_row` runs only once per distinct group. Both ride the same four `SourceAdapter` hooks — only `XmlAdapter` defers; `array`/`decs` pass `null`/no-defer and stay byte-identical (the `emit_reducer_branches` adapter param defaults to `null`; the group-by gate also requires the bind be the raw element — `itName == bind_name`, i.e. no upstream `_select` rebinds it — since the handle yields the raw row). **Design validated by hand-coded micro-bench first** (the `last_match` / `groupby_first` lanes in `benchmarks/micro/sort_distinct_take_shapes.das`). Wins (m5f INTERP / JIT, string clones 100 000 → K): `last_match` 219 → 65 / 21 (K=1), `groupby_first` 339 → 72 / 22 (K=#brands). Closes `groupby_first` (the last item on the prior entry's floor list). Still not deferred: `bare_order_where` / `order_reverse_normalized` (all rows out), `reverse_distinct_by` (tier-2, no fused emit). - **2026-05-31 (forward keep-last — `reverse |> distinct[_by]` over forward sources)** — the only buffered shape still falling to tier-2 over a forward source. `reverse() |> distinct_by(K) |> to_array()` means "keep the LAST forward row per key, output in reverse-discovery order." The sole fused emit was `emit_reverse_backward_walk_dset_gate` — a backward **index** walk (`src[len-1-k]`) gated `array_source`, so XML / decs / plain iterators (forward-only, no random access) cascaded: `reverse()` materialized all N, then `distinct_by` walked. 
New `emit_reverse_distinct_forward_keeplast` (R-2b, gated by the exact complement `non_array_source`) does a single forward pass instead — `table`, **OVERWRITE** the slot per element (so it ends at the last forward occurrence + its seq), then sort survivors by **descending seq** (`build_surrogate_cmp(true)`) and emit. Output-identical to the backward walk (descending forward-index of each last occurrence), proven by parity vs both `m3f` (array backward walk) and the tier-2 cascade. It rides `emit_terminator_lane` → `wrap_source_loop`, so it's source-generic: **XML defers** (the table holds `(seq, xml_node)` and `build_xml_row` runs only for the K survivors — field-pruned to the key); **decs / iterator** store the full element (no handle), winning single-pass over the cascade's reverse-buffer + second walk. `ctx.top` is `null` for decs (bridge-driven), so `elemType` falls back to `ctx.src->element_type()`; arrays still match the backward-walk row first (registered earlier), so they're byte-identical. **Design validated by hand-coded micro-bench first** (the `reverse_distinct_by` lane in `benchmarks/micro/sort_distinct_take_shapes.das`: INTERP 405.8 → 88.6, JIT 162.6 → 37.0, string clones 100 000 → #keys). Wins: `reverse_distinct_by` m5f **429 → 74 INTERP / 166.6 → 22 JIT** (clones 100 000 → 5), and the previously-`—` decs **m4 lights up at 27.7 / 5.0** (near the array fast path). Closes `reverse_distinct_by` — the last forward-source buffered floor. +- **2026-06-11 (table joins — adapter-generalized `emit_array_join` + table-srcB probe)** — table-arc stage 5 (branch `bbatkin/linq-table-each-kv`; plan: `benchmarks/sql/LINQ_TO_TABLE.md`). Two halves. 
(1) **Lead generalization**: `emit_array_join` no longer hand-rolls its `for (tup_a in srcA)` — the lead loop, bind name, and lead invoke-param spelling come from the adapter (`wrap_source_loop(LoopDispatch(Each=null))` / `bind_name(at)` / new `SourceAdapter.invoke_param_type()` capability, default `invoke_src_param_type(arrayTop())`), so `TableAdapter` just sets `can_join() = true` and routes `emit_join_hook` to the same emitter: a table-lead join walks the kv usage-pruned slot iterator(s) — a join body touching only `c.value.*` walks `values(tab)` alone — and group joins stay outer over every slot. decs/xml/json hooks untouched (nested-callback walks). (2) **Table-srcB probe**: when the join's srcb is `each_kv(tab)` / `keys(set)` joined on its **bare key** (`join_srcb_table_call` + `join_keyb_is_bare_key` on the peeled keyb), the emitter skips the internal `table>` + build loop entirely — srcB binds the user's table (const param) and the per-A probe is a key lookup, usage-pruned like the point-lookup fold (count-no-where / key-only → `key_exists`, value shapes → by-ref bind off `unsafe(tab?[k])`, whole-pair → kv-tuple bind). Unique table keys ⇒ probe ≡ hash semantics exactly; a bare field read is pure by construction so skipping keyb's per-B evaluation is unobservable; non-bare keybs and `group_join` (result consumes the whole bucket) keep the hashed build. Plumbing: per-pair statements factored into `build_join_pair_core` (`JoinPairCore`), shared by `build_join_standalone_pieces` (keeps the group-join arm + `get`-bucket wrap — hash-mode AST unchanged for the decs/xml/json callers) and the new `build_join_probe_pieces`. m7: `join_count` / `join_where_count` (table lead) leave tier-2; new `join_probe` vs `join_probe_build` A/B lanes. 
## Open questions diff --git a/daslib/linq_fold_common.das b/daslib/linq_fold_common.das index f08036349..3037749f2 100644 --- a/daslib/linq_fold_common.das +++ b/daslib/linq_fold_common.das @@ -126,6 +126,10 @@ class SourceAdapter { def arraySrcName() : string { return "" } + def invoke_param_type() : TypeDeclPtr { // invoke-param spelling for the source argument (join lane's lead param) + var top = arrayTop() + return top != null ? invoke_src_param_type(top) : null + } } // Decorator adapter — absorbs a leading `_select(f)` source projection: binds `projName = f(innerBind)` atop @@ -163,11 +167,9 @@ class ProjectedSourceAdapter : SourceAdapter { } } -// ===== Field-pruning row-usage scanner (shared by the XML / JSON source adapters) ===== -// Scan a chain body for which Row fields it reads, so a deferred source's per-element materialization -// reads only those fields. Pure AST (bind name → field reads); each adapter keeps its own per-field read -// / full-row build. A whole-`it` ref (to_array push_clone, identity select, pass-to-fn) sets -// allFieldsUsed → the caller falls back to full materialization. +// ===== Field-pruning row-usage scanner (shared by the XML / JSON / table source adapters) ===== +// Scans a chain body for which Row fields it reads so per-element materialization fetches only those; +// a whole-`it` ref (to_array push_clone, identity select, pass-to-fn) sets allFieldsUsed → full row. class RowUsageScanner : AstVisitor { bindName : string @@ -5541,6 +5543,120 @@ def extract_join_lead_where(var c : Captures) : Expression? { return call != null ? call.arguments[1] : null } +// JoinPairCore — per-matched-(a,b)-pair statements shared by the hashed-join builder (wraps them in the +// bucket walk) and the table-srcB probe builder (runs them once per key hit). pairStmts reference +// tupAName + bElemName. Group joins are NOT built here (their result consumes the whole bucket). 
+struct JoinPairCore { + preludeStmts : array + pairStmts : array + returnStmt : Expression? + invokeRetType : TypeDeclPtr + cntName : string // count terminator's accumulator; "" otherwise +} + +[macro_function] +def build_join_pair_core( + var joinCall : ExprCall?; + var whereLam : Expression?; + var selectLam : Expression?; + countOnly : bool; + tupAName : string; + bElemName : string; + namePrefix : string; + at : LineInfo + ) : JoinPairCore? { + let cntName = qn(namePrefix + "_cnt", at) + let bufName = qn(namePrefix + "_buf", at) + let resBindName = qn(namePrefix + "_res", at) + var preludeStmts : array + var pairStmts : array + var returnStmt : Expression? + var invokeRetType : TypeDeclPtr + if (countOnly) { + invokeRetType = new TypeDecl(baseType = Type.tInt, at = at) + preludeStmts |> push <| qmacro_expr() { + var $i(cntName) : int = 0 + } + if (whereLam == null) { + pairStmts |> push <| qmacro_expr() { + $i(cntName) ++ + } + } else { + // HAVING-shape: bind result, evaluate predicate, conditional incr. 
+ var resultLam = joinCall.arguments[4] + if (resultLam == null || resultLam._type == null || resultLam._type.firstType == null) return null + var resultBody = peel_lambda_rename_2vars(resultLam, tupAName, bElemName) + if (resultBody == null) return null + let joinResultType = strip_const_ref(clone_type(resultLam._type.firstType)) + var wherePred = peel_lambda_replace_var(whereLam, qmacro($i(resBindName))) + pairStmts |> push_from <| qmacro_block_to_array() { + let $i(resBindName) : $t(joinResultType) = $e(resultBody) + if ($e(wherePred)) { + $i(cntName) ++ + } + } + } + returnStmt = qmacro_expr() { + return $i(cntName) + } + return <- new JoinPairCore(preludeStmts <- preludeStmts, pairStmts <- pairStmts, + returnStmt = returnStmt, invokeRetType = invokeRetType, cntName = cntName) + } + var resultLam = joinCall.arguments[4] + if (resultLam == null || resultLam._type == null || resultLam._type.firstType == null) return null + var resultBody = peel_lambda_rename_2vars(resultLam, tupAName, bElemName) + if (resultBody == null) return null + // Buffer element type = after-select projection when select is trailing, else result-lam return type. Use peel_lambda_single_return universally: selCall._type.firstType may stay as unresolved typedecl(result_selector(type)) when chain doesn't end with to_array() (array-overload select returns array directly so no enclosing wrap forces resolution). Lambda-body's _type is always resolved post inner-first expansion. + var resultType : TypeDeclPtr + if (selectLam != null) { + var selBody = peel_lambda_single_return(selectLam) + if (selBody == null || selBody._type == null) return null + resultType = strip_const_ref(clone_type(selBody._type)) + } else { + // strip_const_ref: a scalar/field result (`$(c,d)=>c.name`, `string const&`) would make the buffer + // `array` and push_clone fail (error[30913]). A named-tuple result is already a + // fresh non-const value (no-op there) — which is why only bare-scalar join projections tripped it. 
+ resultType = strip_const_ref(clone_type(resultLam._type.firstType)) + } + if (resultType == null) return null + invokeRetType = new TypeDecl(baseType = Type.tArray, firstType = clone_type(resultType), at = at) + preludeStmts |> push <| qmacro_expr() { + var $i(bufName) : array<$t(resultType)> + } + let needBind = selectLam != null || whereLam != null + if (needBind) { + let joinResultType = strip_const_ref(clone_type(resultLam._type.firstType)) + var pushExpr : Expression? + if (selectLam != null) { + var projBody = peel_lambda_replace_var(selectLam, qmacro($i(resBindName))) + pushExpr = qmacro($i(bufName) |> push_clone($e(projBody))) + } else { + pushExpr = qmacro($i(bufName) |> push_clone($i(resBindName))) + } + if (whereLam != null) { + var wherePred = peel_lambda_replace_var(whereLam, qmacro($i(resBindName))) + pairStmts |> push_from <| qmacro_block_to_array() { + let $i(resBindName) : $t(joinResultType) = $e(resultBody) + if ($e(wherePred)) { + $e(pushExpr) + } + } + } else { + pairStmts |> push_from <| qmacro_block_to_array() { + let $i(resBindName) : $t(joinResultType) = $e(resultBody) + $e(pushExpr) + } + } + } else { + pairStmts |> push <| qmacro($i(bufName) |> push_clone($e(resultBody))) + } + returnStmt = qmacro_expr() { + return <- $i(bufName) + } + return <- new JoinPairCore(preludeStmts <- preludeStmts, pairStmts <- pairStmts, + returnStmt = returnStmt, invokeRetType = invokeRetType, cntName = "") +} + [macro_function] def build_join_standalone_pieces( var joinCall : ExprCall?; @@ -5558,46 +5674,34 @@ def build_join_standalone_pieces( ) : JoinStandalonePieces? 
{ let bElemName = qn(namePrefix + "_b", at) let arrName = qn(namePrefix + "_arr", at) - let cntName = qn(namePrefix + "_cnt", at) let bufName = qn(namePrefix + "_buf", at) - let resBindName = qn(namePrefix + "_res", at) let emptyArrName = qn(namePrefix + "_empty", at) // group-join: empty bucket for an unmatched left row var preludeStmts : array var probeInnerStmts : array var returnStmt : Expression? var invokeRetType : TypeDeclPtr var groupEmptyExpr : Expression? // group-join only: the `buf |> push_clone(result(a, ))` for the outer branch - if (countOnly) { - invokeRetType = new TypeDecl(baseType = Type.tInt, at = at) - preludeStmts |> push <| qmacro_expr() { - var $i(cntName) : int = 0 - } - if (whereLam == null) { + if (!isGroupJoin) { + var core = build_join_pair_core(joinCall, whereLam, selectLam, countOnly, tupAName, bElemName, namePrefix, at) + if (core == null) return null + preludeStmts <- core.preludeStmts + returnStmt = core.returnStmt + invokeRetType = core.invokeRetType + if (countOnly && whereLam == null) { // Fast path (PR D2-A guarantee): bucket-length sum at bucket granularity, never enters per-pair loop. + let cntName = core.cntName probeInnerStmts |> push <| qmacro_expr() { $i(cntName) += length($i(arrName)) } } else { - // HAVING-shape: bind result, evaluate predicate, conditional incr. 
- var resultLam = joinCall.arguments[4] - if (resultLam == null || resultLam._type == null || resultLam._type.firstType == null) return null - var resultBody = peel_lambda_rename_2vars(resultLam, tupAName, bElemName) - if (resultBody == null) return null - let joinResultType = strip_const_ref(clone_type(resultLam._type.firstType)) - var wherePred = peel_lambda_replace_var(whereLam, qmacro($i(resBindName))) + var pairStmts <- core.pairStmts probeInnerStmts |> push <| qmacro_expr() { for ($i(bElemName) in $i(arrName)) { - let $i(resBindName) : $t(joinResultType) = $e(resultBody) - if ($e(wherePred)) { - $i(cntName) ++ - } + $b(pairStmts) } } } - returnStmt = qmacro_expr() { - return $i(cntName) - } - } elif (isGroupJoin) { + } else { // group join: the result's 2nd param is the WHOLE bucket (the group), so we emit one buffer row per // left row (no per-match inner loop). emit_*_join guarantees no trailing where/select/count reaches here. var resultLam = joinCall.arguments[4] @@ -5621,66 +5725,6 @@ def build_join_standalone_pieces( returnStmt = qmacro_expr() { return <- $i(bufName) } - } else { - var resultLam = joinCall.arguments[4] - if (resultLam == null || resultLam._type == null || resultLam._type.firstType == null) return null - var resultBody = peel_lambda_rename_2vars(resultLam, tupAName, bElemName) - if (resultBody == null) return null - // Buffer element type = after-select projection when select is trailing, else result-lam return type. Use peel_lambda_single_return universally: selCall._type.firstType may stay as unresolved typedecl(result_selector(type)) when chain doesn't end with to_array() (array-overload select returns array directly so no enclosing wrap forces resolution). Lambda-body's _type is always resolved post inner-first expansion. 
- var resultType : TypeDeclPtr - if (selectLam != null) { - var selBody = peel_lambda_single_return(selectLam) - if (selBody == null || selBody._type == null) return null - resultType = strip_const_ref(clone_type(selBody._type)) - } else { - // strip_const_ref: a scalar/field result (`$(c,d)=>c.name`, `string const&`) would make the buffer - // `array` and push_clone fail (error[30913]). A named-tuple result is already a - // fresh non-const value (no-op there) — which is why only bare-scalar join projections tripped it. - resultType = strip_const_ref(clone_type(resultLam._type.firstType)) - } - if (resultType == null) return null - invokeRetType = new TypeDecl(baseType = Type.tArray, firstType = clone_type(resultType), at = at) - preludeStmts |> push <| qmacro_expr() { - var $i(bufName) : array<$t(resultType)> - } - let needBind = selectLam != null || whereLam != null - if (needBind) { - let joinResultType = strip_const_ref(clone_type(resultLam._type.firstType)) - var pushExpr : Expression? 
- if (selectLam != null) { - var projBody = peel_lambda_replace_var(selectLam, qmacro($i(resBindName))) - pushExpr = qmacro($i(bufName) |> push_clone($e(projBody))) - } else { - pushExpr = qmacro($i(bufName) |> push_clone($i(resBindName))) - } - if (whereLam != null) { - var wherePred = peel_lambda_replace_var(whereLam, qmacro($i(resBindName))) - probeInnerStmts |> push <| qmacro_expr() { - for ($i(bElemName) in $i(arrName)) { - let $i(resBindName) : $t(joinResultType) = $e(resultBody) - if ($e(wherePred)) { - $e(pushExpr) - } - } - } - } else { - probeInnerStmts |> push <| qmacro_expr() { - for ($i(bElemName) in $i(arrName)) { - let $i(resBindName) : $t(joinResultType) = $e(resultBody) - $e(pushExpr) - } - } - } - } else { - probeInnerStmts |> push <| qmacro_expr() { - for ($i(bElemName) in $i(arrName)) { - $i(bufName) |> push_clone($e(resultBody)) - } - } - } - returnStmt = qmacro_expr() { - return <- $i(bufName) - } } // Probe-bucket wrap: get(hash, keya, $(var arr) { probeInnerStmts }) — matches table.get's mutating overload. var probeOuter <- qmacro_expr() { @@ -5721,6 +5765,145 @@ def build_join_standalone_pieces( ) } +// srcb (arguments[1] of `join(...)`) spelled as `each_kv(tab)` / `keys(tab)` — the table-probe candidate. +// Name + table-typed-arg match like extract_table_source (values() carries no key, so it stays hashed). +[macro_function] +def join_srcb_table_call(var joinCall : ExprCall?) : ExprCall? 
{ + if (joinCall == null || (joinCall.arguments |> length) < 2) return null + var srcb = joinCall.arguments[1] + if (srcb == null || !(srcb is ExprCall)) return null + var call = srcb as ExprCall + let name = get_call_short_name(call) + if ((name != "each_kv" && name != "keys") + || call._type == null || !call._type.isIterator || call._type.firstType == null + || (call.arguments |> length) != 1) { + return null + } + let srcT = call.arguments[0]._type + if (srcT == null || !srcT.isGoodTableType) return null + return call +} + +// keyb (peeled, binder renamed to bindName) selects the table key itself: bare `kv.key` (kv lane) or the +// bare element (keys lane). Then the join key IS the table key and a lookup replaces the bucket walk. +[macro_function] +def join_keyb_is_bare_key(var keybBody : Expression?; bindName : string; kvLane : bool) : bool { + var k = keybBody + if (k != null && k is ExprRef2Value) { + k = (k as ExprRef2Value).subexpr + } + if (!kvLane) { + return k != null && k is ExprVar && (k as ExprVar).name == bindName + } + if (k == null || !(k is ExprField)) return false + var f = k as ExprField + if (f.name != "key") return false + var base = f.value + if (base != null && base is ExprRef2Value) { + base = (base as ExprRef2Value).subexpr + } + return base != null && base is ExprVar && (base as ExprVar).name == bindName +} + +// Table-srcB twin of build_join_standalone_pieces: unique table keys ⇒ bucket size ≤ 1, so there is no +// internal hash — probeOuter looks the lead key up in the table bound at tabSrcName. Group joins never +// come here (caller gates); the count-no-where fast path probes key_exists only. +[macro_function] +def build_join_probe_pieces( + var joinCall : ExprCall?; + var whereLam : Expression?; + var selectLam : Expression?; + var leadWhereLam : Expression?; + countOnly : bool; + var keyaBody : Expression?; + tupAName : string; + tabSrcName : string; + srcbKv : bool; + namePrefix : string; + at : LineInfo + ) : JoinStandalonePieces? 
{ + let bElemName = qn(namePrefix + "_b", at) + let kName = qn(namePrefix + "_k", at) + let pName = qn(namePrefix + "_p", at) + var core = build_join_pair_core(joinCall, whereLam, selectLam, countOnly, tupAName, bElemName, namePrefix, at) + if (core == null) return null + var pairStmts <- core.pairStmts + var probeOuter : Expression? + if (countOnly && whereLam == null) { + // membership is enough — pairStmts is the bare `cnt++` (the bucket-length analog for a unique key) + probeOuter = qmacro_expr() { + if (key_exists($i(tabSrcName), $e(keyaBody))) { + $b(pairStmts) + } + } + } elif (srcbKv) { + // kv lane, pruned to what the pair statements read from b: key-only shapes stay on key_exists, + // value shapes bind by reference from the probed pointer (no copy), whole-pair use binds the kv + // tuple. Safe-index is unsafe (rehash-dangling ptr) — fine, the generated invoke never mutates. + let vName = qn(namePrefix + "_v", at) + var pairBlock = qmacro_block() { + $b(pairStmts) + } + var (allUsed, usedFields) = collect_row_usage(pairBlock, bElemName) + if (allUsed) { + probeOuter = qmacro_block() { + let $i(kName) = $e(keyaBody) + let $i(pName) = unsafe($i(tabSrcName)?[$i(kName)]) + if ($i(pName) != null) { + let $i(bElemName) = (key = $i(kName), value = *$i(pName)) + $e(pairBlock) + } + } + } else { + var fieldToLocal <- { "key" => kName, "value" => vName } + var flatBlock = flatten_row_to_locals(pairBlock, bElemName, fieldToLocal) + if (usedFields |> has_value("value")) { + probeOuter = qmacro_block() { + let $i(kName) = $e(keyaBody) + let $i(pName) = unsafe($i(tabSrcName)?[$i(kName)]) + if ($i(pName) != null) { + let $i(vName) & = unsafe(*$i(pName)) + $e(flatBlock) + } + } + } else { + probeOuter = qmacro_block() { + let $i(kName) = $e(keyaBody) + if (key_exists($i(tabSrcName), $i(kName))) { + $e(flatBlock) + } + } + } + } + } else { + // keys lane: the element IS the key + probeOuter = qmacro_block() { + let $i(kName) = $e(keyaBody) + if (key_exists($i(tabSrcName), 
$i(kName))) { + let $i(bElemName) = $i(kName) + $b(pairStmts) + } + } + } + // Leading `where` — same fusion as the hashed builder: filter srcA inside the per-A probe. + if (leadWhereLam != null) { + var leadPred = peel_lambda_rename_var(leadWhereLam, tupAName) + if (leadPred == null) return null + var probeInner = probeOuter + probeOuter = qmacro_expr() { + if ($e(leadPred)) { + $e(probeInner) + } + } + } + return <- new JoinStandalonePieces( + preludeStmts <- core.preludeStmts, + probeOuter <- probeOuter, + returnStmt = core.returnStmt, + invokeRetType = core.invokeRetType + ) +} + // ── join splice (one pattern for every source) ─────── // `can_join()` admits the adapter; `emit_join_hook()` supplies the source-specific emit + srcb gate // (null → tier-2). decs / array / xml each override emit_join_hook — no parallel per-source pattern. @@ -5750,7 +5933,9 @@ def emit_join(var c : Captures; var ctx : EmitCtx; at : LineInfo) : Expression? [macro_function] def emit_array_join(var c : Captures; var ctx : EmitCtx; at : LineInfo) : Expression? { // nolint:STYLE014 — emits a hashed equi-join as a 2-source invoke (mirrors Zip's 2-source wrap): - // build a hash from srcB, probe it with srcA. Generated code (canonical no-count/no-where/no-select shape): + // build a hash from srcB, probe it with srcA. The lead loop and param spelling come from the adapter + // (wrap_source_loop / invoke_param_type), so any direct-return loop source rides this — array `for`, + // table slot walk. Generated code (canonical no-count/no-where/no-select shape, array lead): // invoke( // (srcA, srcB) { // var buf : array @@ -5770,6 +5955,9 @@ def emit_array_join(var c : Captures; var ctx : EmitCtx; at : LineInfo) : Expres // count() → `var cnt = 0`; probe body is `cnt += length(bucket)` (bucket-length fast path, no per-pair loop); `return cnt` // where(p) → per pair: `let res = result(a, b); if (p(res)) { ...push/incr... 
}` // select(f) → push `f(res)` instead of `res` (buffer element type = f's return type) + // Table-srcB probe: when srcB is `each_kv(tab)` / `keys(tab)` joined on its bare key (inner join only), + // there is no hash/build loop — srcB binds the table itself and the per-A probe is a key lookup + // (build_join_probe_pieces). Unique table keys make probe ≡ hash semantics exactly. var joinCall = c.single["join"] if (!srcb_is_array_shaped(c, "join")) return null // srcb must be array/iterator-shaped (formerly the array_join_srcb_is_array requires gate, moved here so the unified pattern asks the adapter) let isGroupJoin = c.single_name |> key_exists("join") && c.single_name["join"] == "group_join" @@ -5801,41 +5989,55 @@ def emit_array_join(var c : Captures; var ctx : EmitCtx; at : LineInfo) : Expres if (keyaLam == null || keybLam == null || keyaLam._type == null) return null let keyType = strip_const_ref(clone_type(keyaLam._type.firstType)) if (!is_primitive_join_key_type(keyType)) return null - let srcAName = qn("ajoin_srcA", at) + let srcAName = ctx.src->arraySrcName() let srcBName = qn("ajoin_srcB", at) - let tupAName = qn("ajoin_tup_a", at) + let tupAName = ctx.src->bind_name(at) let tupBName = qn("ajoin_tup_b", at) let hashName = qn("ajoin_hash", at) var keyaBody = peel_lambda_rename_var(keyaLam, tupAName) var keybBody = peel_lambda_rename_var(keybLam, tupBName) if (keyaBody == null || keybBody == null) return null let tupBType = strip_const_ref(clone_type(srcbSrc._type.firstType)) - var pieces = build_join_standalone_pieces(joinCall, whereLam, selectLam, leadWhereLam, countOnly, keyaBody, tupAName, hashName, tupBType, isGroupJoin, "ajoin", at) + var srcbTab = join_srcb_table_call(joinCall) + let srcbKv = srcbTab != null && get_call_short_name(srcbTab) == "each_kv" + let probeMode = srcbTab != null && !isGroupJoin && join_keyb_is_bare_key(keybBody, tupBName, srcbKv) + var pieces : JoinStandalonePieces? 
+ if (probeMode) { + pieces = build_join_probe_pieces(joinCall, whereLam, selectLam, leadWhereLam, countOnly, keyaBody, tupAName, srcBName, srcbKv, "ajoin", at) + } else { + pieces = build_join_standalone_pieces(joinCall, whereLam, selectLam, leadWhereLam, countOnly, keyaBody, tupAName, hashName, tupBType, isGroupJoin, "ajoin", at) + } if (pieces == null) return null var allStmts : array allStmts |> push_from(pieces.preludeStmts) - allStmts |> push <| qmacro_expr() { - var $i(hashName) : table<$t(keyType); array<$t(tupBType)>> - } - allStmts |> push <| qmacro_expr() { - for ($i(tupBName) in $i(srcBName)) { - // nolint:PERF006 per-key bucket size unknown ahead of time - $i(hashName)[$e(keybBody)] |> push_clone($i(tupBName)) + if (!probeMode) { + allStmts |> push <| qmacro_expr() { + var $i(hashName) : table<$t(keyType); array<$t(tupBType)>> } - } - allStmts |> push <| qmacro_expr() { - for ($i(tupAName) in $i(srcAName)) { - $e(pieces.probeOuter) + allStmts |> push <| qmacro_expr() { + for ($i(tupBName) in $i(srcBName)) { + // nolint:PERF006 per-key bucket size unknown ahead of time + $i(hashName)[$e(keybBody)] |> push_clone($i(tupBName)) + } } } + allStmts |> push <| ctx.src->wrap_source_loop(LoopDispatch(Each = null), pieces.probeOuter, at) allStmts |> push(pieces.returnStmt) var invokeRetType = pieces.invokeRetType var topClone = clone_expression(topSrc) topClone.genFlags.alwaysSafe = true - var srcbClone = clone_expression(srcbSrc) + var srcbArg = probeMode ? 
srcbTab.arguments[0] : srcbSrc + var srcbClone = clone_expression(srcbArg) srcbClone.genFlags.alwaysSafe = true - let srcAParamType = invoke_src_param_type(topSrc) - let srcBParamType = invoke_src_param_type(srcbSrc) + let srcAParamType = ctx.src->invoke_param_type() + var srcBParamType : TypeDeclPtr + if (probeMode) { + // bind the user's table itself, const-accepting (a non-const source adds-const cleanly) + srcBParamType = strip_const_ref(clone_type(srcbArg._type)) + srcBParamType.flags.constant = true + } else { + srcBParamType = invoke_src_param_type(srcbSrc) + } var emission = qmacro(invoke( $($i(srcAName) : $t(srcAParamType), $i(srcBName) : $t(srcBParamType)) : $t(invokeRetType) { $b(allStmts) diff --git a/daslib/linq_fold_table.das b/daslib/linq_fold_table.das index c9fc01181..89b20b630 100644 --- a/daslib/linq_fold_table.das +++ b/daslib/linq_fold_table.das @@ -122,6 +122,12 @@ class TableAdapter : SourceAdapter { def override can_reserve_by_length() : bool { return true // length(tab) is O(1); the shared reserve hint reads arrayTop/arraySrcName } + def override const can_join() : bool { + return true // rides emit_array_join: direct-return lead loop via wrap_source_loop + } + def override emit_join_hook(var c : Captures; var ctx : EmitCtx; at : LineInfo) : Expression? { + return emit_array_join(c, ctx, at) + } def override arrayTop() : Expression? { // Feeds the reserve hint (type_has_length covers tables). The backward-index reverse lanes that // also read arrayTop gate on array_source, which is false here — matchTop stays iterator-typed. @@ -146,10 +152,14 @@ class TableAdapter : SourceAdapter { var breakGuard = qmacro($i(takenName) >= $i(takeLimName)) return build_table_walk(lane, srcName, bindName, perElement, breakGuard, at) } - def override wrap_invoke(var stmts : array; retType : TypeDeclPtr; wrapIter : bool; at : LineInfo) : Expression? 
{ + def override invoke_param_type() : TypeDeclPtr { // Const-accepting param: the source table is often a `let`, and a non-const source adds-const cleanly. var tabType = strip_const_ref(clone_type(tabExpr._type)) tabType.flags.constant = true + return tabType + } + def override wrap_invoke(var stmts : array; retType : TypeDeclPtr; wrapIter : bool; at : LineInfo) : Expression? { + var tabType = invoke_param_type() var tabClone = clone_expression(tabExpr) tabClone.genFlags.alwaysSafe = true let sn = srcName diff --git a/doc/source/reference/linq_das.rst b/doc/source/reference/linq_das.rst index 6d699c139..124867e38 100644 --- a/doc/source/reference/linq_das.rst +++ b/doc/source/reference/linq_das.rst @@ -361,6 +361,11 @@ source is built exactly like the first (untyped → array/table, typed → the ``from_in`` dispatch), so it may be a different kind of source than the left — a table works on either side (its kv pair is that side's row, e.g. ``on c.brand equals p.key``); note a table left source walks in slot order. +A right-side table joined on its **bare key** — ``equals p.key``, or the bare +element for a ``table`` set — fuses as a per-row key probe of that table +(no internal join hash gets built); any other right-key expression keeps the +ordinary hashed join. Either way the results are identical — table keys are +unique, so the probed "bucket" is the same 0-or-1 rows the hash would hold. The reader picks one of two emit shapes from the **post-join** clauses (it transpiles before type inference and cannot see the source, so it decides diff --git a/doc/source/reference/linq_fold_patterns.rst b/doc/source/reference/linq_fold_patterns.rst index 3ced024a9..c7ef01e48 100644 --- a/doc/source/reference/linq_fold_patterns.rst +++ b/doc/source/reference/linq_fold_patterns.rst @@ -150,7 +150,7 @@ Source-side entry points - Optional source — only when the ``pugixml`` module is linked (``require ?pugixml`` + ``static_if (typeinfo builtin_module_exists(pugixml))``). 
Emits an inlined DOM child-element walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): the chain body is scanned for the ``Row`` fields it reads, and only those attributes are read via ``read_xml_field`` into scalar locals — unread fields (notably ``string`` fields, whose ``clone_string`` is the alloc cost) are never touched, so a float-only chain runs alloc-free and JIT beats the equivalent SQLite query. A whole-row escape (``to_array`` / identity ``_select(_)`` / pass-to-fn) routes to the full ``build_xml_row`` instead. The ``XmlAdapter`` **rides every pattern row** (``try_splice_patterns`` runs with no ``onlyRow`` restriction); per-row ``requires`` predicates and the adapter's capability hooks (``can_join`` / ``can_group_by`` / ``defers_materialization`` / the ``non_array_source`` gate) decide what fuses, and a shape it can't fuse cascades to tier-2 — see :ref:`linq_fold_xml_patterns` for the full fuse/defer breakdown. ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``) and the node is passed by value (``var root`` — ``_fold``'s macro-arg inference skips the const&→value copy). * - ``unsafe(each_kv(tab))`` / ``keys(tab)`` / ``values(tab)`` - ``extract_table_source`` (``TableAdapter``, ``daslib/linq_fold_table.das``) - - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. 
A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. ``can_join`` / ``can_group_by`` are off and reverse has no backward slot walk — those shapes cascade to tier-2 (the join probe is staged: see ``benchmarks/sql/LINQ_TO_TABLE.md``). ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. + - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. 
The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. 
**Joins fuse on either side** (``can_join`` is on; the adapter rides the shared ``emit_array_join`` through its own ``wrap_source_loop``): a table *lead* walks its pruned slot iterator(s) as the probe loop; a table in the *srcB slot* joined on its bare key — ``d.key`` on the kv lane, the bare element on a ``keys(set)`` source — skips the join's internal ``table>`` entirely and probes the user's table per lead row (``join_keyb_is_bare_key`` + ``build_join_probe_pieces``; unique table keys make the probe ≡ hash semantics exactly). The probe is itself usage-pruned: count-no-where and key-only shapes stay on ``key_exists``, value shapes bind the matched value **by reference** from a ``tab?[k]`` pointer (no copy), and only a whole-pair use binds the kv tuple. A non-bare b-key keeps the hashed build over the kv iterator; ``group_join`` (outer — its result consumes the whole bucket) always keeps it. ``can_group_by`` is off and reverse has no backward slot walk — those shapes cascade to tier-2 (see ``benchmarks/sql/LINQ_TO_TABLE.md``). ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. * - ``unsafe(from_json(jv, type))`` - ``extract_json_source`` (``JsonAdapter``, ``daslib/linq_fold_json.das``) - In-tree source — the adapter is compiled in unconditionally (no ``static_if`` gate, unlike XML's pugixml one), but a program only pulls JSON into scope by requiring ``json`` / ``json_boost`` itself. ``extract_json_source`` matches a ``from_json`` whose first argument is a ``json::JsonValue?``, so a JSON-less program returns null and the chain falls to the array tier. The adapter pulls in **no** json dependency — it emits ``from_json`` / ``read_json_field`` by name (resolved at the user's splice site, like ``linq_fold_decs`` emits ``for_each_archetype``; ``from_JV`` is emitted only for a non-struct element type). 
Emits an inlined ``for (e in jv.value as _array)`` walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): only the keys the chain reads are pulled via ``read_json_field`` by name — unread keys (notably ``string`` fields whose materialization clones) are never touched, so a scalar-only chain skips ~all of the full per-row build (3.6× over the full materialize — see ``benchmarks/micro/json_source_shapes.das``). A whole-row escape reads **every** top-level field by name (``emit_full_row_by_name``), so a custom whole-row ``from_JV(Row)`` override is **not** honored (Option B — this is a flat query source, not a deserializer; materialize the array with an explicit ``from_JV`` first for that). ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``). Deferred materialization mirrors XML: order/distinct/take buffer a cheap ``(orderKey, JsonValue?)`` surrogate and materialize only the K survivors — by name (``emit_full_row_by_name``), so a struct survivor reads each field by key; only a non-struct ``Row`` falls back to ``outBind <- from_JV(handle, type)``. The ``JsonAdapter`` also fuses ``join`` / ``join |> group_by`` (``emit_join_hook`` + ``JsonJoinAdapter`` off ``build_group_by_adapter``'s upstream-join arm), reusing the array-join machinery (``build_join_standalone_pieces`` / ``build_join_adapter_pieces``): srcB is collected into a ``table>`` and the field-pruned array walk is the probe side, so the join key reads only its own field per element (e.g. ``read_json_field(jcur, "brand", …)``). Standalone ``group_join`` and a trailing ``where`` / ``select`` / ``count`` over group-join rows defer to tier-2, mirroring XML. @@ -423,12 +423,16 @@ Array-array equi-join ``emit_array_join`` is the array-source mirror of ``emit_decs_join`` — hashed equi-join over two array / iterator sources. 
Algorithm is identical (collect srcb into ``table>`` in one pass, -then walk srca and probe via ``table.get``) but the per-source -iteration is a plain ``for (elem in src) { ... }`` loop instead of -``for_each_archetype + build_decs_inner_for``. Both sources bind as -invoke parameters (2-source wrap, mirrors ``Zip``). Same primitive -equi-key gate as the decs side; non-primitive keys cascade to -``join_impl_const``. +then walk srca and probe via ``table.get``) but the lead iteration +comes from the adapter (``wrap_source_loop`` / ``bind_name`` / +``invoke_param_type``), so any direct-return loop source rides it — +``ArrayAdapter`` frames a plain ``for (elem in src)``, ``TableAdapter`` +its pruned slot walk (vs ``for_each_archetype + build_decs_inner_for`` +on the decs side). Both sources bind as invoke parameters (2-source +wrap, mirrors ``Zip``). Same primitive equi-key gate as the decs side; +non-primitive keys cascade to ``join_impl_const``. When srcB is a +table walked on its bare key, the internal hash is skipped entirely — +see the table-source row above and the probe row below. .. list-table:: :header-rows: 1 @@ -468,6 +472,28 @@ equi-key gate as the decs side; non-primitive keys cascade to ``join`` is the separate trailing slot. Composes with the trailing ``_where`` / ``_select`` forms. Wrapping lives in the shared ``build_join_standalone_pieces``, so decs / XML / JSON inherit it. + * - ``arrA |> _join(unsafe(each_kv(tab)), , ...)`` (or ``keys(set)`` with a bare-element key; any terminator/where/select form above) + - probe mode (``join_srcb_table_call`` + ``join_keyb_is_bare_key`` → ``build_join_probe_pieces``) + - **Table-srcB probe**: the b-key selector IS the table key, so no + hash and no build loop — srcB binds the user's table itself + (const param) and the per-A probe is a key lookup. Unique table + keys ⇒ bucket ≤ 1 ⇒ probe ≡ hash semantics exactly (b-key is a + bare field read, so skipping its per-B evaluation is + unobservable). 
Usage-pruned like the point-lookup fold: + count-no-where / key-only shapes probe ``key_exists`` (value + never touched), value shapes bind by reference from + ``tab?[k]``, a whole-pair use binds the kv tuple. Non-bare + b-keys and ``group_join`` keep the hashed build over the kv + iterator. Composes with every lead the emitter serves (array + lead, table lead — table×table probes both sides). + * - ``unsafe(each_kv(tabA)) |> _join(srcB, on, into) |> ...`` (table lead; ``keys`` / ``values`` lanes too) + - pattern ``join_general`` → ``TableAdapter.emit_join_hook`` → ``emit_array_join`` + - **Table lead**: same emitter, lead loop framed by + ``TableAdapter.wrap_source_loop`` — the kv usage-pruner sees the + whole probe body (key lambda + result + trailing where/select), + so a join touching only ``c.value.*`` walks ``values(tab)`` + alone. All srcB modes compose (hashed array/iterator srcB, + table-srcB probe); ``group_join`` stays outer over every slot. * - ``arrA |> _group_join(arrB, on, into)`` (+ optional leading ``_where``) - pattern ``join_general`` with the ``group_join`` literal (``isGroupJoin``) - C# GroupJoin (**outer**): one result row per srcA row — ``result(a, @@ -477,10 +503,11 @@ equi-key gate as the decs side; non-primitive keys cascade to "group_join"]``; ``isGroupJoin`` threads through ``build_join_standalone_pieces``, which rebinds the result lambda's 2nd param to the whole bucket (``array``) so the per-group aggregate - runs inside the result. **Array sources only** — decs / XML / JSON group joins - defer to tier-2 (their ``emit_join_hook`` returns ``null`` for + runs inside the result. **Array / table leads only** — decs / XML / JSON + group joins defer to tier-2 (their ``emit_join_hook`` returns ``null`` for ``group_join``); a trailing ``where`` / ``select`` / ``count`` over the - group rows also defers. + group rows also defers, and a table srcB keeps the hashed build (the + probe never serves group joins). * - ``arrA |> _join(arrB, ...) 
|> _group_by(K) |> _select(reduce) |> count() / to_array()`` - ``plan_group_by_core`` via ``SourceAdapter.ArrayJoin`` (chunk N+2) - Cross-arm composition. ``emit_group_by``'s Array branch diff --git a/tests/linq/test_linq_das.das b/tests/linq/test_linq_das.das index bdfce8c2b..7c37fcd07 100644 --- a/tests/linq/test_linq_das.das +++ b/tests/linq/test_linq_das.das @@ -286,7 +286,8 @@ def test_table_arbitrary_range_var_name(t : T?) { [test] def test_table_as_join_right_source(t : T?) { - // a table works on either side of a join (tier-2; the kv pair is that side's row). + // a table works on either side of a join (the kv pair is that side's row). A right-side table + // joined on its bare `.key` fuses as a key probe — no internal join hash. // left side is the array → result follows array order, deterministic let cars <- mk_cars() let prio <- { "eco" => 10, "lux" => 99 } @@ -301,7 +302,7 @@ def test_table_as_join_right_source(t : T?) { def test_table_as_join_left_source(t : T?) { let cars <- mk_cars() let prio <- { "eco" => 10, "lux" => 99 } - // left side is the table → slot order, so sort before asserting + // left side is the table → fused slot walk; slot order, so sort before asserting var rows <- %linq! from p in prio join c in cars on p.key equals c.brand select "{c.name}={p.value}" %% rows |> sort() t |> equal(length(rows), 3) @@ -310,6 +311,29 @@ def test_table_as_join_left_source(t : T?) { t |> equal(rows[2], "mid=10") } +[test] +def test_set_as_join_source(t : T?) { + // a table set joins on its bare element — membership probe + let cars <- mk_cars() + let vip : table <- { "lux" } + let names <- %linq! from c in cars join b in vip on c.brand equals b select c.name %% + t |> equal(length(names), 1) + t |> equal(names[0], "lux") +} + +[test] +def test_table_join_into_group(t : T?) 
{ + // `into` (group join) over a table right source stays OUTER — the probe never fires (its result + // consumes the whole bucket), the hashed walk does + let cars <- mk_cars() + let prio <- { "eco" => 10 } + let rows <- %linq! from c in cars join p in prio on c.brand equals p.key into ps select (Name = c.name, N = length(ps)) %% + t |> equal(length(rows), 3) + t |> equal(rows[0].N, 1) + t |> equal(rows[1].N, 1) + t |> equal(rows[2].N, 0) +} + // ===== orderby (single key, optional `descending`) ===== [test] diff --git a/tests/linq/test_linq_table_source.das b/tests/linq/test_linq_table_source.das index 630960d18..ed0330600 100644 --- a/tests/linq/test_linq_table_source.das +++ b/tests/linq/test_linq_table_source.das @@ -278,6 +278,196 @@ def test_table_point_lookup(t : T?) { } } +typedef IKV = tuple + +// Joins: a table in the srcB slot joined on its bare key (`d.key` / bare set element) probes the user's +// table instead of building the join's internal hash; a table lead rides the same emitter through the +// pruned slot walk. Either way must agree with the hash/hand-loop semantics: inner joins drop misses, +// keep lead duplicates; group joins stay outer. + +[test] +def test_table_join_srcb_probe(t : T?) { + t |> run("inner join on the table key: count + rows in lead order") @(t : T?) 
{ + var ids <- [0, 1, 1, 4, 9, 2, 0, 7] // dups stay, 9/7 miss + var dtab <- make_int_table(5) + let n = _fold(ids |> _join(unsafe(each_kv(dtab)), + $(a : int, d : IKV) => a == d.key, + $(a : int, d : IKV) => d.value) + |> count()) + t |> equal(n, 6) + var got <- _fold(ids |> _join(unsafe(each_kv(dtab)), + $(a : int, d : IKV) => a == d.key, + $(a : int, d : IKV) => (A = a, V = d.value)) + |> to_array()) + var expected <- [(A = 0, V = 0), (A = 1, V = 10), (A = 1, V = 10), (A = 4, V = 40), (A = 2, V = 20), (A = 0, V = 0)] + t |> equal(length(got), length(expected)) + for (i in range(length(expected))) { + t |> equal(got[i].A, expected[i].A) + t |> equal(got[i].V, expected[i].V) + } + delete got + delete expected + delete dtab + delete ids + } + t |> run("trailing where + select on the probed pair") @(t : T?) { + var ids <- [0, 1, 1, 4, 9, 2, 0, 7] + var dtab <- make_int_table(5) + var got <- _fold(ids |> _join(unsafe(each_kv(dtab)), + $(a : int, d : IKV) => a == d.key, + $(a : int, d : IKV) => (A = a, V = d.value)) + |> _where(_.V > 5) + |> _select("{_.A}:{_.V}") + |> to_array()) + t |> equal(length(got), 4) + t |> equal(got[0], "1:10") + t |> equal(got[1], "1:10") + t |> equal(got[2], "4:40") + t |> equal(got[3], "2:20") + delete got + delete dtab + delete ids + } + t |> run("empty table srcB matches nothing") @(t : T?) { + var ids <- [0, 1, 2] + let e : table + let n = _fold(ids |> _join(unsafe(each_kv(e)), + $(a : int, d : IKV) => a == d.key, + $(a : int, d : IKV) => d.value) + |> count()) + t |> equal(n, 0) + delete ids + } + t |> run("set srcB joins on membership") @(t : T?) { + var ids <- [0, 1, 1, 4, 9] + var s : table <- { 1, 4 } + let n = _fold(ids |> _join(unsafe(keys(s)), + $(a : int, k : int) => a == k, + $(a : int, k : int) => a) + |> count()) + t |> equal(n, 3) + delete s + delete ids + } + t |> run("non-bare b key stays hashed and agrees") @(t : T?) 
{ + var ids <- [0, 1, 1, 4, 9, 2, 0, 7] + var dtab <- make_int_table(5) + // value/10 == key for this fixture, so the hashed walk must find the same 6 pairs + let n = _fold(ids |> _join(unsafe(each_kv(dtab)), + $(a : int, d : IKV) => a == d.value / 10, + $(a : int, d : IKV) => d.value) + |> count()) + t |> equal(n, 6) + delete dtab + delete ids + } + t |> run("group join over a table srcB stays outer") @(t : T?) { + var ids <- [0, 9, 4] + var dtab <- make_int_table(5) + var got <- _fold(ids |> _group_join(unsafe(each_kv(dtab)), + $(a : int, d : IKV) => a == d.key, + $(a : int, ds : array) => (A = a, N = length(ds))) + |> to_array()) + t |> equal(length(got), 3) + t |> equal(got[0].N, 1) + t |> equal(got[1].N, 0) + t |> equal(got[2].N, 1) + delete got + delete dtab + delete ids + } + t |> run("table lead joining a table srcB probes both sides") @(t : T?) { + var ctab <- make_int_table(8) + var dtab <- make_int_table(5) + let n = _fold(each_kv(ctab) |> _join(unsafe(each_kv(dtab)), + $(c : IKV, d : IKV) => c.key == d.key, + $(c : IKV, d : IKV) => c.value + d.value) + |> count()) + t |> equal(n, 5) + delete dtab + delete ctab + } +} + +[test] +def test_table_join_lead(t : T?) { + t |> run("table lead, array srcB: count keeps bucket multiplicity") @(t : T?) { + var ctab <- make_int_table(10) + var darr <- [0, 2, 2, 4, 11] + let n = _fold(each_kv(ctab) |> _join(darr, + $(c : IKV, d : int) => c.key == d, + $(c : IKV, d : int) => (K = c.key, V = c.value)) + |> count()) + t |> equal(n, 4) + delete darr + delete ctab + } + t |> run("table lead to_array agrees with a hand loop in slot order") @(t : T?) 
{ + var ctab <- make_int_table(10) + var darr <- [0, 2, 2, 4, 11] + var expected : array + for (k, v in keys(ctab), values(ctab)) { + for (d in darr) { + if (k == d) { + expected |> push(v + d) + } + } + } + var got <- _fold(each_kv(ctab) |> _join(darr, + $(c : IKV, d : int) => c.key == d, + $(c : IKV, d : int) => c.value + d) + |> to_array()) + t |> equal(length(got), length(expected)) + for (i in range(length(expected))) { + t |> equal(got[i], expected[i]) + } + delete got + delete expected + delete darr + delete ctab + } + t |> run("lead where filters before the join") @(t : T?) { + var ctab <- make_int_table(10) + var darr <- [0, 2, 2, 4, 11] + let n = _fold(each_kv(ctab) |> _where(_.key > 1) + |> _join(darr, + $(c : IKV, d : int) => c.key == d, + $(c : IKV, d : int) => c.value) + |> count()) + t |> equal(n, 3) + delete darr + delete ctab + } + t |> run("values lane lead") @(t : T?) { + var ctab <- make_int_table(10) + var darr <- [0, 2, 2, 4, 11] + let n = _fold(values(ctab) |> _join(darr, + $(v : int, d : int) => v == d * 10, + $(v : int, d : int) => v) + |> count()) + t |> equal(n, 4) + delete darr + delete ctab + } + t |> run("table lead group_join stays outer over every slot") @(t : T?) { + var ctab <- make_int_table(10) + var darr <- [0, 2, 2, 4, 11] + var got <- _fold(each_kv(ctab) |> _group_join(darr, + $(c : IKV, d : int) => c.key == d, + $(c : IKV, ds : array) => length(ds)) + |> to_array()) + t |> equal(length(got), 10) + var total = 0 + for (n in got) { + total += n + } + t |> equal(total, 4) + delete got + delete darr + delete ctab + } +} + // Tier-2 over the raw each_kv iterator (no _fold) — the [unsafe_outside_of_for] contract requires the // explicit unsafe(...) wrap at a bare chain head; fused chains rewrite the head before inference. 
From 0b90ee59b8c3f5f0db93e8531f327497f8bbd766 Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Thu, 11 Jun 2026 03:05:08 -0700 Subject: [PATCH 08/11] linq table arc: record fixed-array-rework merge + each_kv re-validation in the plan doc Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index 6cd55c533..e3fb497cc 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -97,6 +97,16 @@ work. Cut the PR only after the rework has landed and been merged in here. At th re-validate the `each_kv` dim-array-value reject overload and `auto(valT)[]` matching — fixed arrays are exactly what is being reworked. +**Merge done (2026-06-11, after stage 5):** rework (#3095) merged in; one conflict in +`daslib/builtin.das` — master deleted the dim-array `values()` overloads (plain `auto(valT)` +now binds the whole `T[N]`), our `each_kv` block kept. Re-validation: `auto(valT)[]` in table +value position still matches dim-valued tables (the reject overload fires its 31400), and the +plain `each_kv` generic still does NOT match `table` (table-position generic matching +doesn't bind fixed-array values), so the explicit rejects remain the right design — without +them the dim case would be a cryptic 30341, not a workable path. The dim-array-valued each_kv +deferred edge is therefore engine-gated (table generic matcher), not ours. Gates green: full +INTERP 10965/10971 (0 failed, 6 skipped), AOT linq 1949/1949, JIT linq 1949/1949. 
+ PR1 findings: - **Pre-existing generator-lowering bug, fixed in PR1**: the yield-for lowering emitted `loop &&= _builtin_iterator_first(...)` per source — short-circuiting `first()` on later From b72f62515cb3a927220d990b8f2a7a2271b83a2a Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Thu, 11 Jun 2026 03:53:59 -0700 Subject: [PATCH 09/11] =?UTF-8?q?linq=5Ffold:=20to=5Ftable=20sink=20?= =?UTF-8?q?=E2=80=94=20fused=20insert-loop=20terminator=20+=20selector-fre?= =?UTF-8?q?e=20tier-2=20forms?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Stage 6 of the table arc (benchmarks/sql/LINQ_TO_TABLE.md), closing the arc. Two layers: 1. Tier-2 surface (daslib/linq.das): selector-free to_table over iterators and arrays — iterator> -> table map, iterator -> table set, plus borrowing array forms with reserve. Iterator params are const-qualified (the 50609 mangler-ICE defuse) so each_kv's -const flavor and to_sequence's -& flavor converge on one instantiation. Duplicate keys keep the last occurrence (das insert semantics, not C#'s throw). 2. Fused emit: to_table joins loop_terminator_family + the ARRAY materializer lane; the new arm rides emit_fold_array_lane via FoldArraySpec.bufDeclStmt (table buffer instead of the array decl) — where/select/ranges plumbing all shared. A (k => v) MakeTuple projection splits so key and value evaluate exactly once; other projections bind to a local; pass-through spells the kv access with the element tuple's real field names so the kv usage-pruner maps them. Reserve fires on unfiltered walks only (table over-reserve is worse than an array's slack), with the take-min variant. Map-vs-set falls out of the resolved terminator type. Declines that keep tier-2: the 3-arg selector form, decs sources (explicit guard — the decs lane's implicit-to_array fall-through would mis-emit an array for a table-typed expr). 
m7: to_table 32.5 vs to_table_staged (materialize + builtin to_table_move) 68.3 ns/elem INTERP (28.8 vs 41.6 JIT). 13 new tests (58/58 in the arc file); full INTERP 10978/10984 0 failed, AOT linq 1962/1962, JIT linq 1962/1962, Sphinx -W clean. results.md re-swept (82 families); skills/linq.md gains the table-source + to_table section (end-of-arc item). Co-Authored-By: Claude Fable 5 --- benchmarks/sql/LINQ_TO_TABLE.md | 34 ++- benchmarks/sql/results.md | 285 ++++++++++---------- benchmarks/sql/table.das | 27 ++ daslib/linq.das | 41 +++ daslib/linq_fold.md | 1 + daslib/linq_fold_common.das | 108 +++++++- daslib/linq_fold_decs.das | 2 + doc/source/reference/linq_fold_patterns.rst | 10 +- skills/linq.md | 24 ++ tests/linq/test_linq_table_source.das | 125 +++++++++ 10 files changed, 508 insertions(+), 149 deletions(-) diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index e3fb497cc..36122143f 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -4,9 +4,37 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco `table` / `table` as the 6th `_fold` source, plus the `to_table` sink. Edited in-place as PRs land. -Status: **stage 5 committed** (join probe + table-lead joins; stage 4 = point-lookup folds, -ac441c4a0; stage 3 = `%linq!` table sources, 29d23baf6; stage 2 = TableAdapter + m7, 571fe879e; -stage 1 = `each_kv` builtin, 8751bb9ba). +Status: **stage 6 committed — arc complete** (to_table sink; stage 5 = join probe + table-lead +joins, 2742f6db2; stage 4 = point-lookup folds, ac441c4a0; stage 3 = `%linq!` table sources, +29d23baf6; stage 2 = TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba; +master's fixed-array rework merged in after stage 5, 1ab3e6a67). 
+ +Stage 6 findings: +- **Tier-2 surface required for typing**: `_fold`'s argument must fully type before the macro + runs, so the selector-free `to_table` generics in `daslib/linq.das` are load-bearing — map vs + set in the fused emit falls out of the *resolved* terminator type (`secondType == void`), not + from chain inspection. Iterator forms are const-qualified (`tuple<…> const` / `auto(keyT) + const`) — the standard 50609 mangler-ICE defuse — and the named kv tuple matches the + positional `tuple` generic directly. +- **The fused arm is ~60 lines riding existing machinery**: `to_table` joins + `loop_terminator_family` + the ARRAY materializer lane; a new `FoldArraySpec.bufDeclStmt` slot + swaps the array buffer decl for the table decl and `emit_fold_array_lane` does the rest + (where/select/ranges/reserve plumbing shared with to_array chains). +- **Field names matter for the kv pruner**: the pass-through insert must spell `it.key` / + `it.value` (the element tuple's real field names), not positional `._0`/`._1` — the row-usage + scanner maps named fields only, and an unmapped reference leaves the bind var undeclared. + A `(k => v)` MakeTuple projection splits so each side evaluates exactly once. +- **`to_table_move` is not a chain terminator**: over an iterator there is nothing to steal — + elements are yielded temporaries, so "move" reduces to clone. The consuming builtin + `to_table_move(array)` forms still serve materialized arrays (the bench staged baseline uses + exactly that); a fused move of non-copyable select-temps stays a deferred edge + (fused-kv-non-copyable). +- **decs needs an explicit decline**: `emit_loop_or_count_lane_decs` falls through unknown + terminators to its implicit-to_array arm, which would mis-emit an array for a table-typed + expr — guarded with `if (termName == "to_table") return null` (tier-2 cascade). 
+- **where-after-select + any terminator already cascades** (pre-existing lane behavior, count + and to_table alike) — not a stage-6 regression; left as-is. +- m7: `to_table` 32.3 vs `to_table_staged` 71.5 ns/elem INTERP (~2.2× over materialize-then-convert). Stage 5 findings: - **`emit_array_join` generalized instead of a parallel `emit_table_join`**: the lead loop, bind diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md index 4a9015f62..30402ff49 100644 --- a/benchmarks/sql/results.md +++ b/benchmarks/sql/results.md @@ -19,7 +19,9 @@ are stable now). values-only / zipped slot walks; key-equality `where` + terminator folds to an O(1) probe — the `point_lookup` / `point_lookup_scan` pair measures it; joins fuse on either side, and a table srcB joined on its bare key probes the table instead of building the join hash — the `join_probe` / - `join_probe_build` pair measures it; group_by / reverse defer to tier-2 until their stages land). + `join_probe_build` pair measures it; a trailing `to_table()` inserts straight into the result + table with no intermediate array — the `to_table` / `to_table_staged` pair measures it; + group_by / reverse defer to tier-2). `0.00` = early-exit terminator below timer resolution ("free"). Chain shapes are in `benchmarks/README.md`; the splice arms each fires are in `doc/source/reference/linq_fold_patterns.rst`. 
@@ -34,171 +36,175 @@ signal, JIT deltas as indicative.** | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 34.8 | 5.9 | 5.8 | 60.6 | 159.5 | 19.2 | -| `all_match` | 27.5 | 3.5 | 3.4 | 56.1 | 154.1 | 16.4 | +| `aggregate_match` | 34.9 | 5.9 | 5.8 | 60.8 | 158.9 | 19.5 | +| `all_match` | 27.7 | 3.5 | 3.4 | 56.0 | 153.2 | 15.9 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.6 | 5.9 | 8.8 | 58.4 | 164.3 | 17.3 | -| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 30.6 | -| `bare_order_where` | 284.5 | 117.8 | 126.7 | 300.9 | 291.5 | 163.8 | -| `chained_select_collapse` | — | 18.3 | 17.5 | 70.4 | 162.2 | 28.0 | -| `chained_where` | 36.1 | 6.6 | 7.1 | 104.9 | 183.8 | 24.1 | -| `contains_match` | 0.0 | 2.2 | 1.4 | 29.1 | 72.0 | 6.6 | -| `count_aggregate` | 29.8 | 4.1 | 4.1 | 63.7 | 155.9 | 20.3 | -| `cross_join` | 12556.2 | 3697.8 | — | 4012.8 | 4069.8 | — | +| `average_aggregate` | 30.3 | 5.9 | 8.7 | 58.5 | 163.4 | 17.3 | +| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 30.1 | +| `bare_order_where` | 282.9 | 118.2 | 125.0 | 300.5 | 290.8 | 163.1 | +| `chained_select_collapse` | — | 17.8 | 17.5 | 70.4 | 161.7 | 27.7 | +| `chained_where` | 41.5 | 6.6 | 7.1 | 104.8 | 182.1 | 24.0 | +| `contains_match` | 0.0 | 2.2 | 1.4 | 28.9 | 71.5 | 6.6 | +| `count_aggregate` | 29.6 | 4.3 | 4.1 | 63.5 | 154.0 | 20.2 | +| `cross_join` | 12896.3 | 3681.4 | — | 4018.5 | 4096.4 | — | | `decs_count_bare_pred` | — | — | 4.1 | — | — | — | -| `distinct_by_count` | 41.0 | 15.7 | 15.6 | 70.6 | 160.7 | 26.6 | -| `distinct_by_order_take` | 239.3 | 22.1 | 23.4 | 123.7 | 163.1 | 48.5 | -| `distinct_by_order_to_array` | 238.9 | 22.1 | 23.5 | 124.2 | 163.1 | 48.8 | -| `distinct_count` | 41.0 | 15.8 | 15.8 | 70.8 | 162.4 | 27.0 | -| `distinct_count_pred` | 254.3 | 15.8 | 15.9 | 112.2 | 177.8 | 26.8 | +| `distinct_by_count` | 41.4 | 15.7 | 15.7 | 70.4 | 161.3 | 26.8 | +| 
`distinct_by_order_take` | 239.9 | 22.3 | 23.3 | 123.9 | 162.0 | 48.8 | +| `distinct_by_order_to_array` | 237.8 | 22.3 | 23.3 | 124.3 | 162.5 | 48.8 | +| `distinct_count` | 41.8 | 15.9 | 15.7 | 70.7 | 161.8 | 27.0 | +| `distinct_count_pred` | 252.1 | 15.8 | 15.6 | 111.9 | 176.7 | 26.6 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 171.8 | 29.2 | 29.3 | 123.7 | 197.4 | — | -| `groupby_count` | 141.9 | 19.5 | 19.5 | 75.0 | 167.5 | 162.7 | -| `groupby_first` | 252.6 | 19.5 | 20.2 | 72.2 | 162.7 | — | -| `groupby_having_count` | 141.8 | 19.5 | 19.5 | 74.8 | 169.1 | — | -| `groupby_having_hidden_sum` | 175.7 | 23.3 | 22.6 | 118.8 | 192.7 | — | -| `groupby_having_post_where` | 171.2 | 20.8 | 20.8 | 114.6 | 189.2 | — | -| `groupby_max` | 173.9 | 24.9 | 25.4 | 120.5 | 193.1 | — | -| `groupby_min` | 173.7 | 25.0 | 25.1 | 120.0 | 192.9 | — | -| `groupby_multi_reducer` | 190.8 | 30.2 | 30.6 | 124.9 | 196.2 | — | -| `groupby_select_order` | 170.9 | 20.8 | 20.8 | 114.8 | 188.6 | — | -| `groupby_select_sum` | 198.9 | 38.6 | 38.2 | 101.7 | 195.2 | — | -| `groupby_sum` | 170.8 | 20.8 | 20.8 | 114.9 | 188.4 | 192.8 | -| `groupby_where_count` | 76.0 | 14.1 | 14.3 | 116.6 | 186.3 | — | -| `groupby_where_sum` | 86.7 | 14.1 | 14.7 | 116.4 | 186.4 | — | -| `join_count` | 38.3 | 51.3 | 64.6 | 113.1 | 183.4 | 65.6 | -| `join_groupby_count` | 157.6 | 77.4 | 88.8 | 177.7 | 230.9 | — | -| `join_groupby_to_array` | 189.1 | 78.0 | 90.6 | 215.4 | 213.5 | — | -| `join_probe` | — | — | — | — | — | 47.3 | -| `join_probe_build` | — | — | — | — | — | 79.1 | -| `join_select` | 152.6 | 72.5 | 84.7 | 188.7 | 214.4 | — | -| `join_where_count` | 48.6 | 61.6 | 76.8 | 160.4 | 199.8 | 81.4 | -| `last_match` | 0.0 | 6.1 | 13.9 | 65.1 | 159.7 | 31.0 | -| `long_count_aggregate` | 29.1 | 4.1 | 4.1 | 
63.4 | 154.3 | 21.2 | -| `max_aggregate` | 30.7 | 6.0 | 6.8 | 58.6 | 163.1 | 17.0 | -| `min_aggregate` | 31.2 | 6.0 | 6.9 | 58.7 | 163.6 | 17.0 | -| `order_by_multi_key` | 348.8 | 272.2 | 282.9 | 458.7 | 449.2 | 334.0 | -| `order_distinct_take` | 137.8 | 15.9 | 99.3 | 72.5 | 162.8 | 31.3 | -| `order_reverse_normalized` | 38.1 | 16.3 | 20.0 | 70.7 | 170.6 | — | -| `order_take_desc` | 38.5 | 16.2 | 20.4 | 70.1 | 170.4 | 33.3 | +| `groupby_average` | 171.0 | 29.4 | 29.0 | 123.0 | 196.4 | — | +| `groupby_count` | 142.4 | 19.2 | 19.1 | 74.8 | 167.1 | 164.5 | +| `groupby_first` | 251.1 | 19.2 | 19.7 | 72.1 | 162.2 | — | +| `groupby_having_count` | 142.0 | 19.1 | 19.1 | 74.7 | 166.3 | — | +| `groupby_having_hidden_sum` | 176.6 | 22.3 | 22.3 | 118.0 | 187.9 | — | +| `groupby_having_post_where` | 173.2 | 20.5 | 20.4 | 114.4 | 187.4 | — | +| `groupby_max` | 173.5 | 24.9 | 24.8 | 119.6 | 191.4 | — | +| `groupby_min` | 173.8 | 25.3 | 24.8 | 119.6 | 192.5 | — | +| `groupby_multi_reducer` | 190.5 | 30.4 | 30.0 | 124.7 | 196.1 | — | +| `groupby_select_order` | 172.1 | 20.5 | 20.4 | 114.3 | 188.6 | — | +| `groupby_select_sum` | 199.6 | 38.5 | 38.0 | 101.5 | 194.4 | — | +| `groupby_sum` | 172.1 | 20.5 | 20.4 | 114.6 | 187.6 | 194.6 | +| `groupby_where_count` | 76.4 | 14.1 | 14.2 | 115.1 | 185.8 | — | +| `groupby_where_sum` | 87.5 | 14.2 | 14.5 | 116.0 | 186.7 | — | +| `join_count` | 38.4 | 51.4 | 63.6 | 112.9 | 183.8 | 65.4 | +| `join_groupby_count` | 158.4 | 77.8 | 87.8 | 177.4 | 233.1 | — | +| `join_groupby_to_array` | 189.8 | 78.7 | 89.6 | 214.7 | 214.1 | — | +| `join_probe` | — | — | — | — | — | 46.9 | +| `join_probe_build` | — | — | — | — | — | 79.5 | +| `join_select` | 151.8 | 72.8 | 84.9 | 189.5 | 217.4 | — | +| `join_where_count` | 39.7 | 61.7 | 78.7 | 160.5 | 199.8 | 81.6 | +| `last_match` | 0.0 | 5.9 | 14.0 | 65.0 | 159.2 | 31.0 | +| `long_count_aggregate` | 29.9 | 4.1 | 4.1 | 63.4 | 154.0 | 20.1 | +| `max_aggregate` | 31.1 | 6.0 | 6.8 | 58.7 | 162.1 | 16.9 | +| 
`min_aggregate` | 31.0 | 6.0 | 6.9 | 58.7 | 162.9 | 17.0 | +| `order_by_multi_key` | 340.9 | 270.9 | 279.5 | 459.2 | 446.7 | 336.4 | +| `order_distinct_take` | 138.7 | 15.9 | 98.6 | 72.6 | 162.8 | 31.6 | +| `order_reverse_normalized` | 38.8 | 16.3 | 19.8 | 70.9 | 169.9 | — | +| `order_take_desc` | 38.5 | 16.3 | 19.9 | 70.1 | 170.8 | 33.3 | | `point_lookup` | — | — | — | — | — | 0.0 | -| `point_lookup_scan` | — | — | — | — | — | 8.4 | -| `reverse_distinct_by` | 295.5 | 21.3 | 28.0 | 70.9 | 162.2 | — | -| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.2 | 58.8 | -| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.2 | — | -| `select_count` | 0.1 | 0.0 | 2.2 | 69.3 | 2.2 | 0.0 | -| `select_many` | — | 190.7 | — | — | — | — | -| `select_where` | 207.9 | 11.2 | 19.5 | 195.5 | 188.7 | 37.6 | -| `select_where_count` | 32.4 | 5.1 | 7.4 | 64.6 | 158.7 | 21.7 | -| `select_where_order_take` | 36.3 | 12.3 | 15.1 | 72.7 | 164.5 | 34.5 | -| `select_where_sum` | 37.2 | 7.5 | 7.5 | 66.5 | 164.6 | 23.3 | -| `single_match` | 0.0 | 2.9 | 5.5 | 58.4 | 151.5 | 22.6 | -| `skip_take` | 0.5 | 0.1 | 0.2 | 3.0 | 2.8 | 0.3 | -| `skip_while_match` | 3.5 | 5.3 | 5.3 | 59.9 | 153.1 | 18.3 | -| `sort_first` | 37.9 | 11.0 | 13.3 | 64.9 | 167.0 | 32.0 | -| `sort_take` | 38.4 | 16.3 | 20.9 | 70.5 | 171.5 | 33.3 | -| `sort_take_select` | 38.2 | 16.3 | 20.9 | 71.0 | 170.8 | 33.2 | -| `sum_aggregate` | 29.6 | 2.1 | 2.1 | 54.4 | 153.0 | 13.5 | -| `sum_where` | 32.1 | 4.4 | 11.5 | 63.8 | 154.6 | 21.3 | -| `take_count` | 3.9 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | +| `point_lookup_scan` | — | — | — | — | — | 8.3 | +| `reverse_distinct_by` | 295.3 | 21.3 | 28.2 | 71.1 | 161.9 | — | +| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.1 | 58.5 | +| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.1 | — | +| `select_count` | 0.1 | 0.0 | 2.2 | 68.3 | 2.2 | 0.0 | +| `select_many` | — | 191.7 | — | — | — | — | +| `select_where` | 204.1 | 11.2 | 19.3 | 197.1 | 183.4 | 37.7 | +| `select_where_count` | 32.5 | 5.1 | 7.4 | 64.9 
| 156.9 | 22.7 | +| `select_where_order_take` | 37.1 | 12.3 | 14.8 | 72.8 | 165.4 | 35.3 | +| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.5 | 161.9 | 25.0 | +| `single_match` | 0.0 | 2.9 | 5.5 | 58.2 | 151.2 | 22.6 | +| `skip_take` | 0.5 | 0.1 | 0.2 | 3.1 | 2.8 | 0.3 | +| `skip_while_match` | 3.5 | 5.3 | 5.3 | 60.0 | 153.2 | 18.2 | +| `sort_first` | 38.4 | 11.1 | 13.3 | 65.1 | 166.7 | 32.2 | +| `sort_take` | 38.7 | 16.3 | 20.0 | 70.8 | 170.4 | 33.1 | +| `sort_take_select` | 38.7 | 16.3 | 20.1 | 71.3 | 170.6 | 33.3 | +| `sum_aggregate` | 30.5 | 2.1 | 2.1 | 54.6 | 153.2 | 13.4 | +| `sum_where` | 33.2 | 4.3 | 4.2 | 63.4 | 154.6 | 20.4 | +| `take_count` | 3.8 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | | `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.3 | 1.1 | 0.3 | | `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | 0.1 | | `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | 0.2 | -| `take_while_match` | 7.8 | 2.4 | 2.4 | 30.2 | 75.6 | 16.4 | -| `to_array_filter` | 70.2 | 11.8 | 11.8 | 71.5 | 165.1 | 29.0 | -| `where_join_count` | 41.2 | 29.1 | 41.7 | 132.7 | 168.6 | — | -| `zip_count_pred` | 39.3 | 15.9 | — | 315.0 | 321.2 | — | -| `zip_dot_product` | 46.2 | 12.6 | 10.6 | 309.2 | 319.0 | — | -| `zip_dot_product_3arg` | 46.2 | 12.8 | — | 309.4 | 320.7 | — | -| `zip_reverse_to_array` | — | 31.7 | — | 345.0 | 353.4 | — | +| `take_while_match` | 7.8 | 2.4 | 2.4 | 30.1 | 76.2 | 16.4 | +| `to_array_filter` | 71.1 | 11.8 | 11.7 | 71.3 | 164.3 | 28.9 | +| `to_table` | — | — | — | — | — | 32.5 | +| `to_table_staged` | — | — | — | — | — | 68.3 | +| `where_join_count` | 41.5 | 28.8 | 40.9 | 132.1 | 167.0 | — | +| `zip_count_pred` | 39.5 | 15.8 | — | 318.5 | 320.2 | — | +| `zip_dot_product` | 47.2 | 12.6 | 10.8 | 312.7 | 318.6 | — | +| `zip_dot_product_3arg` | 47.1 | 12.7 | — | 312.8 | 317.5 | — | +| `zip_reverse_to_array` | — | 31.4 | — | 348.8 | 352.2 | — | ## JIT | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | 
|---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 35.0 | 0.3 | 0.6 | 21.7 | 27.1 | 13.5 | -| `all_match` | 27.9 | 0.3 | 0.2 | 18.1 | 26.2 | 13.5 | +| `aggregate_match` | 35.1 | 0.3 | 0.6 | 22.8 | 26.2 | 13.5 | +| `all_match` | 27.9 | 0.3 | 0.2 | 17.5 | 25.3 | 13.6 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.5 | 1.0 | 3.6 | 18.1 | 25.7 | 13.5 | -| `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.2 | -| `bare_order_where` | 188.1 | 35.3 | 35.5 | 106.7 | 53.3 | 79.0 | -| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 33.5 | 14.1 | -| `chained_where` | 36.1 | 0.6 | 0.8 | 35.7 | 32.0 | 17.7 | -| `contains_match` | 0.0 | 0.2 | 0.1 | 17.5 | 9.2 | 4.7 | -| `count_aggregate` | 29.6 | 0.3 | 0.6 | 20.6 | 26.4 | 13.5 | -| `cross_join` | 5976.1 | 733.7 | — | 837.5 | 767.7 | — | +| `average_aggregate` | 30.5 | 1.0 | 3.5 | 17.4 | 24.7 | 13.5 | +| `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.1 | +| `bare_order_where` | 186.2 | 34.1 | 35.0 | 104.9 | 52.8 | 78.8 | +| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 33.5 | 14.0 | +| `chained_where` | 36.9 | 0.6 | 0.8 | 34.7 | 31.3 | 17.8 | +| `contains_match` | 0.0 | 0.2 | 0.1 | 17.5 | 8.9 | 4.7 | +| `count_aggregate` | 29.7 | 0.3 | 0.6 | 17.5 | 25.5 | 13.4 | +| `cross_join` | 5965.9 | 731.0 | — | 833.2 | 770.0 | — | | `decs_count_bare_pred` | — | — | 0.6 | — | — | — | -| `distinct_by_count` | 41.2 | 1.1 | 1.1 | 20.6 | 33.6 | 14.1 | -| `distinct_by_order_take` | 239.4 | 1.7 | 2.6 | 47.4 | 39.2 | 30.1 | -| `distinct_by_order_to_array` | 239.3 | 1.7 | 2.7 | 47.4 | 38.9 | 30.1 | -| `distinct_count` | 41.3 | 1.1 | 1.1 | 20.5 | 33.7 | 14.1 | -| `distinct_count_pred` | 252.4 | 1.1 | 1.3 | 37.4 | 43.4 | 14.1 | +| `distinct_by_count` | 41.7 | 1.1 | 1.1 | 20.6 | 33.5 | 14.0 | +| `distinct_by_order_take` | 239.3 | 1.7 | 2.6 | 46.3 | 38.8 | 30.1 | +| `distinct_by_order_to_array` | 240.2 | 1.7 | 2.7 | 46.4 | 38.7 | 30.3 | +| `distinct_count` | 41.6 | 1.1 | 1.1 | 20.6 | 33.5 | 14.0 | +| 
`distinct_count_pred` | 251.7 | 1.1 | 1.3 | 37.7 | 43.8 | 14.0 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 170.7 | 1.6 | 1.9 | 35.9 | 44.3 | — | -| `groupby_count` | 141.5 | 1.3 | 1.5 | 20.6 | 32.7 | 42.9 | -| `groupby_first` | 252.2 | 1.3 | 2.3 | 20.6 | 33.3 | — | -| `groupby_having_count` | 141.3 | 1.3 | 1.5 | 20.6 | 33.3 | — | -| `groupby_having_hidden_sum` | 175.6 | 1.5 | 1.7 | 36.0 | 45.2 | — | -| `groupby_having_post_where` | 171.9 | 1.6 | 2.0 | 35.9 | 44.3 | — | -| `groupby_max` | 172.8 | 1.5 | 1.9 | 36.0 | 45.9 | — | -| `groupby_min` | 173.4 | 1.5 | 1.8 | 35.9 | 45.9 | — | -| `groupby_multi_reducer` | 190.6 | 1.6 | 2.0 | 36.2 | 46.1 | — | -| `groupby_select_order` | 170.6 | 1.4 | 1.9 | 35.7 | 44.2 | — | -| `groupby_select_sum` | 198.6 | 2.8 | 3.2 | 32.2 | 39.7 | — | -| `groupby_sum` | 170.3 | 1.4 | 1.7 | 35.8 | 44.2 | 51.5 | -| `groupby_where_count` | 76.0 | 0.9 | 1.3 | 36.1 | 41.8 | — | -| `groupby_where_sum` | 86.7 | 0.9 | 1.3 | 36.0 | 41.7 | — | -| `join_count` | 38.3 | 10.9 | 11.7 | 43.5 | 71.4 | 33.1 | -| `join_groupby_count` | 157.6 | 18.2 | 20.1 | 68.5 | 89.9 | — | -| `join_groupby_to_array` | 189.7 | 17.6 | 19.5 | 80.3 | 36.2 | — | -| `join_probe` | — | — | — | — | — | 24.2 | +| `groupby_average` | 171.5 | 1.5 | 1.9 | 35.5 | 45.5 | — | +| `groupby_count` | 142.1 | 1.3 | 1.5 | 20.6 | 33.8 | 42.7 | +| `groupby_first` | 251.9 | 1.3 | 2.3 | 20.6 | 34.4 | — | +| `groupby_having_count` | 141.3 | 1.3 | 1.5 | 20.6 | 33.9 | — | +| `groupby_having_hidden_sum` | 176.9 | 1.5 | 1.7 | 35.5 | 45.2 | — | +| `groupby_having_post_where` | 171.1 | 1.4 | 1.9 | 35.5 | 44.1 | — | +| `groupby_max` | 173.4 | 1.5 | 1.9 | 35.5 | 45.8 | — | +| `groupby_min` | 172.8 | 1.5 | 1.8 | 35.6 | 45.8 | — | +| `groupby_multi_reducer` | 190.2 | 1.6 | 1.9 | 35.8 | 46.1 | — | +| 
`groupby_select_order` | 170.9 | 1.4 | 1.9 | 35.4 | 44.3 | — | +| `groupby_select_sum` | 200.0 | 2.8 | 3.2 | 31.8 | 39.9 | — | +| `groupby_sum` | 170.9 | 1.4 | 1.6 | 35.5 | 44.3 | 51.2 | +| `groupby_where_count` | 76.3 | 0.9 | 1.3 | 35.6 | 41.8 | — | +| `groupby_where_sum` | 87.6 | 0.9 | 1.3 | 35.6 | 41.9 | — | +| `join_count` | 38.2 | 10.9 | 11.8 | 42.6 | 71.5 | 32.2 | +| `join_groupby_count` | 156.9 | 17.6 | 19.5 | 68.3 | 89.8 | — | +| `join_groupby_to_array` | 189.8 | 17.5 | 19.4 | 79.3 | 36.1 | — | +| `join_probe` | — | — | — | — | — | 24.0 | | `join_probe_build` | — | — | — | — | — | 38.1 | -| `join_select` | 95.4 | 19.7 | 21.7 | 75.0 | 94.3 | — | -| `join_where_count` | 39.4 | 18.9 | 20.8 | 64.4 | 78.4 | 37.9 | -| `last_match` | 0.0 | 0.5 | 1.4 | 18.9 | 26.8 | 22.9 | -| `long_count_aggregate` | 29.0 | 0.3 | 0.6 | 20.5 | 26.4 | 13.5 | -| `max_aggregate` | 30.7 | 0.3 | 0.5 | 18.4 | 27.7 | 13.5 | -| `min_aggregate` | 30.7 | 0.3 | 0.5 | 18.4 | 27.7 | 13.5 | -| `order_by_multi_key` | 252.6 | 53.4 | 55.0 | 125.4 | 71.9 | 129.1 | -| `order_distinct_take` | 137.9 | 1.1 | 75.7 | 20.9 | 36.0 | 14.0 | -| `order_reverse_normalized` | 38.2 | 0.7 | 1.4 | 24.6 | 28.5 | — | -| `order_take_desc` | 38.1 | 0.7 | 1.4 | 24.6 | 28.4 | 17.7 | +| `join_select` | 94.0 | 19.6 | 21.7 | 73.8 | 95.2 | — | +| `join_where_count` | 39.8 | 18.9 | 20.8 | 63.5 | 78.3 | 37.8 | +| `last_match` | 0.0 | 0.5 | 1.4 | 18.2 | 25.9 | 22.9 | +| `long_count_aggregate` | 29.2 | 0.3 | 0.6 | 17.5 | 25.5 | 13.4 | +| `max_aggregate` | 31.0 | 0.3 | 0.5 | 17.4 | 27.1 | 13.4 | +| `min_aggregate` | 31.1 | 0.3 | 0.5 | 17.4 | 27.0 | 13.5 | +| `order_by_multi_key` | 250.0 | 53.1 | 54.7 | 123.6 | 71.3 | 129.4 | +| `order_distinct_take` | 138.1 | 1.1 | 75.3 | 20.9 | 35.7 | 14.0 | +| `order_reverse_normalized` | 38.5 | 0.7 | 1.3 | 22.0 | 27.7 | — | +| `order_take_desc` | 38.2 | 0.7 | 1.3 | 22.0 | 27.5 | 17.8 | | `point_lookup` | — | — | — | — | — | 0.0 | | `point_lookup_scan` | — | — | — | — | — | 6.0 | -| 
`reverse_distinct_by` | 295.4 | 1.5 | 3.2 | 20.6 | 34.6 | — | -| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 26.9 | +| `reverse_distinct_by` | 295.7 | 1.5 | 3.2 | 20.5 | 34.4 | — | +| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | 26.9 | | `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.9 | — | -| `select_count` | 0.1 | 0.0 | 0.0 | 66.0 | 0.0 | 0.0 | -| `select_many` | — | 62.7 | — | — | — | — | -| `select_where` | 109.1 | 4.1 | 5.3 | 76.2 | 23.0 | 28.1 | -| `select_where_count` | 32.3 | 0.3 | 0.6 | 18.5 | 27.2 | 13.4 | -| `select_where_order_take` | 36.5 | 0.7 | 1.4 | 19.0 | 27.9 | 23.0 | -| `select_where_sum` | 37.1 | 0.4 | 0.6 | 18.0 | 26.3 | 13.4 | -| `single_match` | 0.0 | 0.4 | 1.1 | 46.3 | 23.2 | 17.4 | -| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.1 | -| `skip_while_match` | 3.5 | 0.4 | 0.4 | 45.8 | 22.7 | 13.3 | -| `sort_first` | 37.9 | 0.4 | 1.3 | 18.1 | 27.5 | 17.3 | -| `sort_take` | 37.9 | 0.7 | 1.4 | 24.6 | 28.3 | 17.8 | -| `sort_take_select` | 37.8 | 0.7 | 1.4 | 24.6 | 28.4 | 17.8 | -| `sum_aggregate` | 29.9 | 0.3 | 0.1 | 23.2 | 25.6 | 13.5 | -| `sum_where` | 32.1 | 0.3 | 0.6 | 18.5 | 27.2 | 13.4 | -| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.3 | 0.2 | +| `select_count` | 0.1 | 0.0 | 0.0 | 67.0 | 0.0 | 0.0 | +| `select_many` | — | 62.5 | — | — | — | — | +| `select_where` | 110.7 | 4.1 | 5.3 | 74.8 | 22.1 | 27.9 | +| `select_where_count` | 32.6 | 0.3 | 0.6 | 17.4 | 26.3 | 13.4 | +| `select_where_order_take` | 36.7 | 0.7 | 1.3 | 18.4 | 27.3 | 23.1 | +| `select_where_sum` | 37.2 | 0.4 | 0.6 | 17.4 | 25.6 | 13.4 | +| `single_match` | 0.0 | 0.4 | 1.1 | 46.2 | 22.3 | 17.3 | +| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.3 | 0.2 | +| `skip_while_match` | 3.4 | 0.4 | 0.4 | 46.0 | 21.8 | 13.2 | +| `sort_first` | 38.4 | 0.4 | 1.3 | 17.4 | 26.7 | 17.2 | +| `sort_take` | 38.6 | 0.7 | 1.3 | 22.0 | 27.9 | 17.8 | +| `sort_take_select` | 38.3 | 0.7 | 1.3 | 21.9 | 27.7 | 17.8 | +| `sum_aggregate` | 30.6 | 0.3 | 0.1 | 17.7 | 24.9 | 13.5 | +| `sum_where` | 33.0 | 0.3 
| 0.6 | 17.4 | 26.3 | 13.4 | +| `take_count` | 1.9 | 0.1 | 0.1 | 1.2 | 0.3 | 0.2 | | `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.4 | 0.1 | 0.1 | | `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | | `take_where_count` | 0.9 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | -| `take_while_match` | 7.8 | 0.2 | 0.3 | 17.1 | 9.3 | 13.5 | -| `to_array_filter` | 47.4 | 3.3 | 3.3 | 21.5 | 35.1 | 20.2 | -| `where_join_count` | 39.4 | 5.8 | 6.8 | 49.7 | 42.3 | — | -| `zip_count_pred` | 39.4 | 0.1 | — | 117.0 | 33.9 | — | -| `zip_dot_product` | 46.5 | 0.1 | 0.1 | 117.1 | 33.8 | — | -| `zip_dot_product_3arg` | 46.4 | 0.1 | — | 116.9 | 33.7 | — | -| `zip_reverse_to_array` | — | 4.5 | — | 128.4 | 51.3 | — | +| `take_while_match` | 7.8 | 0.2 | 0.3 | 17.4 | 8.9 | 13.4 | +| `to_array_filter` | 48.9 | 3.2 | 3.3 | 20.8 | 35.5 | 20.2 | +| `to_table` | — | — | — | — | — | 28.8 | +| `to_table_staged` | — | — | — | — | — | 41.6 | +| `where_join_count` | 41.3 | 5.7 | 6.8 | 48.8 | 41.8 | — | +| `zip_count_pred` | 39.8 | 0.1 | — | 115.3 | 33.8 | — | +| `zip_dot_product` | 47.3 | 0.1 | 0.1 | 115.4 | 33.8 | — | +| `zip_dot_product_3arg` | 47.1 | 0.1 | — | 115.3 | 33.7 | — | +| `zip_reverse_to_array` | — | 4.5 | — | 127.0 | 51.4 | — | ## Missing lanes (the `—` cells) @@ -214,9 +220,10 @@ Each empty cell's reason is also in the bench `.das` file's comment; SQL gaps ar - **`reverse_distinct_by` m4 / m5f** — array uses the backward-index walk; non-array sources fuse the forward keep-last splice (decs 27.6/5.0, XML 74.5/22.2); SQL uses MAX(pk). - **`order_distinct_take` m4 vs m3f** — `unique_key` hashes workhorse keys directly (array `int`) but string-interpolates structs (decs `DecsBrand`); the gap is per-element string hashing, not decs-walk. `distinct_by_count` is the key-based variant (m4 parity). - **`zip_reverse_to_array` / `zip_*` SQL / Decs** — `reverse` has no SQL order key; zip is not relational / not expressible over one archetype walk. By design. (XML/JSON zip lanes are lit, partially fused.) 
-- **m7 absent families** — `zip_*` / `cross_join` (lockstep over an unordered slot walk is meaningless), `select_many` (flat fixture, no nested array field), `order_reverse_normalized` / `reverse_take_select` / `reverse_distinct_by` (no backward slot walk; `reverse_take` is kept as the single deferral marker), the group-by tail beyond `groupby_count`/`groupby_sum` (table group_by fusion is staged — see `LINQ_TO_TABLE.md`; the two marker cells track the tier-2 cost until then) plus the join-composition lanes (`join_select` / `where_join_count` would fuse today but aren't instantiated; `join_groupby_*` needs the staged group_by), `decs_count_bare_pred` (decs-only). +- **m7 absent families** — `zip_*` / `cross_join` (lockstep over an unordered slot walk is meaningless), `select_many` (flat fixture, no nested array field), `order_reverse_normalized` / `reverse_take_select` / `reverse_distinct_by` (no backward slot walk; `reverse_take` is kept as the single deferral marker), the group-by tail beyond `groupby_count`/`groupby_sum` (table group_by fusion is a named deferred edge — see `LINQ_TO_TABLE.md`; the two marker cells track the tier-2 cost) plus the join-composition lanes (`join_select` / `where_join_count` would fuse today but aren't instantiated; `join_groupby_*` needs the deferred group_by), `decs_count_bare_pred` (decs-only). - **`point_lookup` / `point_lookup_scan` non-m7** — m7-only pair: only a table source has a key to probe (`where(kv.key == X)` + terminator → `key_exists` / `tab?[X]`, O(1)); the `_scan` twin forces the same query through the walk (compound `&&` predicate declines the probe) to show the gap. Other sources have no analog by design. - **`join_probe` / `join_probe_build` non-m7** — m7-only A/B pair: a table srcB joined on its bare key probes the user's table per lead row (no internal join hash, no build loop); the `_build` twin feeds the identical rows pre-materialized to a kv array, forcing the hashed build. 
Other sources have no keyed-srcB analog by design. +- **`to_table` / `to_table_staged` non-m7** — m7-only A/B pair for the `to_table()` sink: the fused insert-loop lands the kv chain straight in the result table (reserve from O(1) length); the `_staged` twin materializes the same projection to an array first, then converts via the consuming builtin `to_table_move` — the shape every chain had before the sink arm. The sink itself works over any direct-loop source (the array lane fuses it too); only the bench pair is table-scoped. ## Accepted floors diff --git a/benchmarks/sql/table.das b/benchmarks/sql/table.das index e33e7ee64..d0afb6557 100644 --- a/benchmarks/sql/table.das +++ b/benchmarks/sql/table.das @@ -687,3 +687,30 @@ def to_array_filter_m7(b : B?) { } } } + +[benchmark] +def to_table_m7(b : B?) { + // fused insert-loop sink: the kv chain lands straight in the result table, reserve from O(1) length + b |> run("to_table", N) { + var tab <- _fold(unsafe(each_kv(g_t)) |> _select((_.key => _.value.price)) |> to_table()) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + +[benchmark] +def to_table_staged_m7(b : B?) { + // staged baseline: materialize the kv tuples to an array, then convert — the shape without the sink arm + b |> run("to_table_staged", N) { + var rows <- _fold(unsafe(each_kv(g_t)) |> _select((_.key => _.value.price)) |> to_array()) + var tab <- to_table_move(rows) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} diff --git a/daslib/linq.das b/daslib/linq.das index 206ac14d7..2906aa443 100644 --- a/daslib/linq.das +++ b/daslib/linq.das @@ -150,6 +150,47 @@ def to_table(a : array; key : block<(v : TT -&) : auto>; elementSelect return <- to_table_impl_const(a, type, key, elementSelector) } +def to_table(var a : iterator const>) : table { + //! Collects an iterator of `(key, value)` tuples (e.g. an `each_kv` chain or a `k => v` + //! projection) into a `table`. 
Duplicate keys keep the last occurrence. + var tab : table + for (x in a) { + tab[x._0] := x._1 + } + return <- tab +} + +def to_table(var a : iterator) : table { + //! Collects an iterator of bare hashable keys into the `table` set form. + var tab : table + for (at in a) { + __builtin_table_set_insert(tab, at) + } + return <- tab +} + +def to_table(a : array>) : table { + //! Collects an array of `(key, value)` tuples into a `table` without consuming + //! the source (values are cloned). Duplicate keys keep the last occurrence. + var tab : table + tab |> reserve(length(a)) + for (x in a) { + tab[x._0] := x._1 + } + return <- tab +} + +def to_table(a : array) : table { + //! Collects an array of bare hashable keys into the `table` set form without + //! consuming the source. + var tab : table + tab |> reserve(length(a)) + for (at in a) { + __builtin_table_set_insert(tab, at) + } + return <- tab +} + [unused_argument(tt)] def private concat_impl(var a; var b; tt : auto(TT); reserveSize : int) : array { //! Concatenates two arrays or iterators diff --git a/daslib/linq_fold.md b/daslib/linq_fold.md index 67388d8db..741099e13 100644 --- a/daslib/linq_fold.md +++ b/daslib/linq_fold.md @@ -661,6 +661,7 @@ The imperative code has a few subtle co-occurrence rules that may not map cleanl - **2026-05-31 (deferred materialization — `last` + group-by `first`)** — extends the element-handle deferral to the two remaining survivors-≪-N reducers: the full-walk `last`/`last_or_default` terminator (in `emit_early_exit_lane`) and `first`-per-group inside `plan_group_by_core`. `last` cloned the whole `Car` (`lst := it`) on *every* match and kept only the final one; over a deferring source it now stores the node **handle** per match and runs `materialize_handle` once, for the single survivor. 
`group_by(brand) |> select((key, first per group))` pinned the whole row (`slot := it`) in `mk_reducer_first`, forcing `wrap_source_loop` to build every element; a new `mk_reducer_first_deferred` materializes from the handle *inside the table miss-branch*, so the walk field-prunes to just the group key and `build_xml_row` runs only once per distinct group. Both ride the same four `SourceAdapter` hooks — only `XmlAdapter` defers; `array`/`decs` pass `null`/no-defer and stay byte-identical (the `emit_reducer_branches` adapter param defaults to `null`; the group-by gate also requires the bind be the raw element — `itName == bind_name`, i.e. no upstream `_select` rebinds it — since the handle yields the raw row). **Design validated by hand-coded micro-bench first** (the `last_match` / `groupby_first` lanes in `benchmarks/micro/sort_distinct_take_shapes.das`). Wins (m5f INTERP / JIT, string clones 100 000 → K): `last_match` 219 → 65 / 21 (K=1), `groupby_first` 339 → 72 / 22 (K=#brands). Closes `groupby_first` (the last item on the prior entry's floor list). Still not deferred: `bare_order_where` / `order_reverse_normalized` (all rows out), `reverse_distinct_by` (tier-2, no fused emit). - **2026-05-31 (forward keep-last — `reverse |> distinct[_by]` over forward sources)** — the only buffered shape still falling to tier-2 over a forward source. `reverse() |> distinct_by(K) |> to_array()` means "keep the LAST forward row per key, output in reverse-discovery order." The sole fused emit was `emit_reverse_backward_walk_dset_gate` — a backward **index** walk (`src[len-1-k]`) gated `array_source`, so XML / decs / plain iterators (forward-only, no random access) cascaded: `reverse()` materialized all N, then `distinct_by` walked. 
New `emit_reverse_distinct_forward_keeplast` (R-2b, gated by the exact complement `non_array_source`) does a single forward pass instead — `table`, **OVERWRITE** the slot per element (so it ends at the last forward occurrence + its seq), then sort survivors by **descending seq** (`build_surrogate_cmp(true)`) and emit. Output-identical to the backward walk (descending forward-index of each last occurrence), proven by parity vs both `m3f` (array backward walk) and the tier-2 cascade. It rides `emit_terminator_lane` → `wrap_source_loop`, so it's source-generic: **XML defers** (the table holds `(seq, xml_node)` and `build_xml_row` runs only for the K survivors — field-pruned to the key); **decs / iterator** store the full element (no handle), winning single-pass over the cascade's reverse-buffer + second walk. `ctx.top` is `null` for decs (bridge-driven), so `elemType` falls back to `ctx.src->element_type()`; arrays still match the backward-walk row first (registered earlier), so they're byte-identical. **Design validated by hand-coded micro-bench first** (the `reverse_distinct_by` lane in `benchmarks/micro/sort_distinct_take_shapes.das`: INTERP 405.8 → 88.6, JIT 162.6 → 37.0, string clones 100 000 → #keys). Wins: `reverse_distinct_by` m5f **429 → 74 INTERP / 166.6 → 22 JIT** (clones 100 000 → 5), and the previously-`—` decs **m4 lights up at 27.7 / 5.0** (near the array fast path). Closes `reverse_distinct_by` — the last forward-source buffered floor. - **2026-06-11 (table joins — adapter-generalized `emit_array_join` + table-srcB probe)** — table-arc stage 5 (branch `bbatkin/linq-table-each-kv`; plan: `benchmarks/sql/LINQ_TO_TABLE.md`). Two halves. 
(1) **Lead generalization**: `emit_array_join` no longer hand-rolls its `for (tup_a in srcA)` — the lead loop, bind name, and lead invoke-param spelling come from the adapter (`wrap_source_loop(LoopDispatch(Each=null))` / `bind_name(at)` / new `SourceAdapter.invoke_param_type()` capability, default `invoke_src_param_type(arrayTop())`), so `TableAdapter` just sets `can_join() = true` and routes `emit_join_hook` to the same emitter: a table-lead join walks the kv usage-pruned slot iterator(s) — a join body touching only `c.value.*` walks `values(tab)` alone — and group joins stay outer over every slot. decs/xml/json hooks untouched (nested-callback walks). (2) **Table-srcB probe**: when the join's srcb is `each_kv(tab)` / `keys(set)` joined on its **bare key** (`join_srcb_table_call` + `join_keyb_is_bare_key` on the peeled keyb), the emitter skips the internal `table>` + build loop entirely — srcB binds the user's table (const param) and the per-A probe is a key lookup, usage-pruned like the point-lookup fold (count-no-where / key-only → `key_exists`, value shapes → by-ref bind off `unsafe(tab?[k])`, whole-pair → kv-tuple bind). Unique table keys ⇒ probe ≡ hash semantics exactly; a bare field read is pure by construction so skipping keyb's per-B evaluation is unobservable; non-bare keybs and `group_join` (result consumes the whole bucket) keep the hashed build. Plumbing: per-pair statements factored into `build_join_pair_core` (`JoinPairCore`), shared by `build_join_standalone_pieces` (keeps the group-join arm + `get`-bucket wrap — hash-mode AST unchanged for the decs/xml/json callers) and the new `build_join_probe_pieces`. m7: `join_count` / `join_where_count` (table lead) leave tier-2; new `join_probe` vs `join_probe_build` A/B lanes. +- **2026-06-11 (`to_table` sink — fused insert-loop terminator)** — table-arc stage 6 (branch `bbatkin/linq-table-each-kv`; plan: `benchmarks/sql/LINQ_TO_TABLE.md`). Two layers. 
(1) **Tier-2 surface** (`daslib/linq.das`): selector-free `to_table` over iterators and arrays — `iterator const>` → `table` map (insert via `tab[x._0] := x._1`, builtin `to_table` clone semantics), `iterator` → `table` set (`__builtin_table_set_insert`), plus borrowing `array>` / `array` forms with reserve (builtin only had the consuming `to_table_move` for dynamic arrays). The iterator params are **const-qualified** (`tuple<…> const` / `auto(keyT) const`) — the 50609 mangler-ICE defuse — so the `-const` flavor from `each_kv` chains and the `-&` flavor from `to_sequence` converge on one instantiation. The named kv tuple (`tuple`) matches the positional `tuple` generic directly. Duplicate keys keep the last occurrence (das insert semantics, not C#'s throw). (2) **Fused emit**: `to_table` joins `loop_terminator_family` + `classify_terminator`'s ARRAY (materializer) lane; the new arm in `emit_loop_or_count_lane` rides `emit_fold_array_lane` via a new `FoldArraySpec.bufDeclStmt` slot (replaces the array buffer decl with `var acc : table<…>`) — where/select/ranges/post-take-where plumbing all shared. Per-element insert by shape: a `(k => v)` `ExprMakeTuple` projection **splits** so key and value each evaluate exactly once (`acc[k] = v` direct, no tuple temp); other projections bind `let kvb := proj` once then index; pass-through spells the kv access with the element tuple's **real field names** (`.key`/`.value`) so the kv usage-pruner maps them (positional `._0` would not bind) — a bare `each_kv(tab).to_table()` is a reserve-ahead table clone through the pruned walk, and `keys(tab)` chains land in the set form via `insert`. Reserve fires only on unfiltered walks (`can_reserve_by_length` + no where — a thinned table over-reserves hash buckets, stricter than the array arm), with the take-min variant. Map-vs-set falls out of the terminator call's resolved type (`secondType == void`). 
Declines that keep tier-2: the 3-arg selector `to_table(key, elementSelector)`, decs sources (explicit guard in `emit_loop_or_count_lane_decs` — its implicit-to_array fall-through would mis-emit an array for a table-typed expr), MakeTuple projections of arity ≠ 2. m7: `to_table` 32.3 vs `to_table_staged` (fused-to_array + builtin `to_table_move`) 71.5 ns/elem INTERP (~2.2×). ## Open questions diff --git a/daslib/linq_fold_common.das b/daslib/linq_fold_common.das index 3037749f2..dbe37f40a 100644 --- a/daslib/linq_fold_common.das +++ b/daslib/linq_fold_common.das @@ -427,7 +427,7 @@ var alias_table : table> <- { "min_max_average", "min_max_average_by", "any", "all", "contains", "first", "first_or_default", "last", "last_or_default", "single", "single_or_default", - "element_at", "element_at_or_default"], + "element_at", "element_at_or_default", "to_table"], // PR D1 — order-by-with-key (excludes bare `order` / `order_descending` which lack a key arg). // Used by plan_group_by's trailing_order slot. "order_by_family" => ["order_by", "order_by_descending"] @@ -956,8 +956,9 @@ enum LinqLane { def classify_terminator(name : string) : LinqLane { if (name == "count") return LinqLane.COUNTER // take/skip/take_while/skip_while trailing (after to_array strip) → ARRAY lane with implicit materialization. + // to_table is a materializer like the no-terminator (to_array) shape — same ARRAY lane, table buffer. if (name == "where_" || name == "select" || name == "take" || name == "skip" - || name == "take_while" || name == "skip_while") return LinqLane.ARRAY + || name == "take_while" || name == "skip_while" || name == "to_table") return LinqLane.ARRAY if (name == "sum" || name == "min" || name == "max" || name == "average" || name == "long_count") return LinqLane.ACCUMULATOR // EARLY_EXIT is also the dispatch lane for full-walk single-return terminators (last/single/element_at/aggregate) — same emit_early_exit_lane shape, different per-op state. 
if (name == "first" || name == "first_or_default" || name == "any" || name == "all" || name == "contains" @@ -1354,6 +1355,7 @@ struct FoldArraySpec { prologueStmts : array // BEFORE bufDecl — takeN bind, etc. bufElemType : TypeDeclPtr bufName : string + bufDeclStmt : Expression? // non-null replaces the default `var buf : array` decl (e.g. a table buffer for to_table) postBufDeclStmts : array // AFTER bufDecl + optional dsetDecl, BEFORE reserve — e.g. early-return guards (bounded_heap) reserveStmts : array // AFTER postBufDecl, BEFORE source-loop — caller composes preCondStmts : array // bound-vars INSIDE for-loop body, OUTSIDE if-gate @@ -1429,8 +1431,12 @@ def emit_fold_array_lane(var spec : FoldArraySpec; var adapter : SourceAdapter?; bodyStmts |> push_from(spec.prologueStmts) let bufName = spec.bufName var bufElemType = spec.bufElemType - bodyStmts |> push <| qmacro_expr() { - var $i(bufName) : array<$t(bufElemType)> + if (spec.bufDeclStmt != null) { + bodyStmts |> push <| spec.bufDeclStmt + } else { + bodyStmts |> push <| qmacro_expr() { + var $i(bufName) : array<$t(bufElemType)> + } } if (spec.distinctGate != null) { let dg = spec.distinctGate @@ -2366,6 +2372,100 @@ def emit_loop_or_count_lane(var c : Captures; var ctx : EmitCtx; at : LineInfo) prepend_binds(stmts, intermediateBinds) wrap_with_ranges(stmts, skipExpr, takeExpr, skipWhileCond, takeWhileCond, names) loopBody = prepend_precond(wrap_with_condition(stmts_to_expr(stmts), whereCond), preCondStmts) + } elif (lastName == "to_table") { + // to_table materializer — the array arm's skeleton with a table buffer and key inserts. + // Map form splits a `(k => v)` projection so each side evaluates once; other projections + // bind to a local first. Duplicate keys keep the last occurrence (das insert semantics). 
+ var termCall = c.single["term"] + // null type, or the selector-based to_table(key, elementSelector) — keep the tier-2 path + if (termCall._type == null || length(termCall.arguments) != 1) return null + var tabType = strip_const_ref(clone_type(termCall._type)) + let isSet = tabType.secondType == null || tabType.secondType.baseType == Type.tVoid + var stmts : array + var pushExpr : Expression? + if (isSet) { + var keyExpr = projection != null ? projection : qmacro($i(itName)) + pushExpr = qmacro_expr() { + $i(accName) |> insert($e(keyExpr)) + } + } elif (projection == null) { + // spell the kv access with the element tuple's real field names — the kv usage-pruner + // maps named fields (`.key`/`.value` on an each_kv lane); positional `._0` would not bind + var f0 = "_0" + var f1 = "_1" + let elemTupT = top._type.firstType + if (elemTupT != null && elemTupT.argNames |> length == 2) { + f0 = string(elemTupT.argNames[0]) + f1 = string(elemTupT.argNames[1]) + } + pushExpr = qmacro_expr() { + $i(accName)[$i(itName).$f(f0)] := $i(itName).$f(f1) + } + } else { + var proj = projection + if (proj is ExprRef2Value) { + proj = (proj as ExprRef2Value).subexpr + } + if (proj is ExprMakeTuple) { + var mt = proj as ExprMakeTuple + if (mt.values |> length != 2) return null + var keyExpr = mt.values[0] + var valExpr = mt.values[1] + pushExpr = qmacro_expr() { + $i(accName)[$e(keyExpr)] := $e(valExpr) + } + } else { + let kvbName = qn("kvb", at) + var bindInit = qmacro_expr() { + let $i(kvbName) := $e(projection) + } + var bindInsert = qmacro_expr() { + $i(accName)[$i(kvbName)._0] := $i(kvbName)._1 + } + var bindStmts : array <- [bindInit, bindInsert] + pushExpr = stmts_to_expr(bindStmts) + } + } + stmts |> push(wrap_with_condition(pushExpr, postTakeWhereCond)) + prepend_binds(stmts, intermediateBinds) + wrap_with_ranges(stmts, skipExpr, takeExpr, skipWhileCond, takeWhileCond, names) + var perElementPush = stmts_to_expr(stmts) + var prologueStmts : array + 
append_ranges_prelude(prologueStmts, skipExpr, takeExpr, skipWhileCond, names) + // Reserve only on an unfiltered walk — a where-thinned table over-reserves hash buckets + // (worse than an array's slack), so the gate is stricter than the array arm's. + var reserveStmts : array + if (ctx.src->can_reserve_by_length() && whereCond == null && postTakeWhereCond == null) { + let rtop = ctx.src->arrayTop() + if (rtop != null && rtop._type != null && type_has_length(rtop._type)) { + if (takeExpr != null) { + reserveStmts |> push <| qmacro_expr() { + $i(accName) |> reserve($e(takeExpr) < length($i(srcName)) ? $e(takeExpr) : length($i(srcName))) + } + } else { + reserveStmts |> push <| qmacro_expr() { + $i(accName) |> reserve(length($i(srcName))) + } + } + } + } + var bufDecl = qmacro_expr() { + var $i(accName) : $t(tabType) + } + var tailStmts : array + tailStmts |> push(buffer_return(accName, false)) + return emit_fold_array_lane(FoldArraySpec( + bufDeclStmt = bufDecl, + bufElemType = elementType, + bufName = accName, + prologueStmts <- prologueStmts, + reserveStmts <- reserveStmts, + preCondStmts <- preCondStmts, + whereCond = whereCond, + perElementPush = perElementPush, + tailStmts <- tailStmts, + wrapIter = false + ), ctx.src, at) } else { // Range-prelude in prologueStmts so the lane emits it BEFORE bufDecl — matches the emit_array_lane shape. var stmts : array diff --git a/daslib/linq_fold_decs.das b/daslib/linq_fold_decs.das index 67076715a..1d6b661e4 100644 --- a/daslib/linq_fold_decs.das +++ b/daslib/linq_fold_decs.das @@ -324,6 +324,8 @@ def emit_loop_or_count_lane_decs(var bridge : DecsBridgeShape?; tupName : string if (rangeInfo.postTakeWhereCond == null) return null } // Terminator classification mirrors plan_decs_unroll imperative (linq_fold.das:5781-5791); differs from classify_terminator's 4-lane split because decs has dedicated min_max_by / walk / element_at emit fns with hoisted state. 
+ // to_table is not implemented for decs — decline before the implicit-to_array arm mis-emits an array for a table-typed expr. + if (termName == "to_table") return null let isAccum = (termName == "count" || termName == "long_count" || termName == "sum" || termName == "min" || termName == "max" || termName == "average") let isEarlyExit = (termName == "first" || termName == "first_or_default" diff --git a/doc/source/reference/linq_fold_patterns.rst b/doc/source/reference/linq_fold_patterns.rst index c7ef01e48..fb22a137e 100644 --- a/doc/source/reference/linq_fold_patterns.rst +++ b/doc/source/reference/linq_fold_patterns.rst @@ -150,7 +150,7 @@ Source-side entry points - Optional source — only when the ``pugixml`` module is linked (``require ?pugixml`` + ``static_if (typeinfo builtin_module_exists(pugixml))``). Emits an inlined DOM child-element walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): the chain body is scanned for the ``Row`` fields it reads, and only those attributes are read via ``read_xml_field`` into scalar locals — unread fields (notably ``string`` fields, whose ``clone_string`` is the alloc cost) are never touched, so a float-only chain runs alloc-free and JIT beats the equivalent SQLite query. A whole-row escape (``to_array`` / identity ``_select(_)`` / pass-to-fn) routes to the full ``build_xml_row`` instead. The ``XmlAdapter`` **rides every pattern row** (``try_splice_patterns`` runs with no ``onlyRow`` restriction); per-row ``requires`` predicates and the adapter's capability hooks (``can_join`` / ``can_group_by`` / ``defers_materialization`` / the ``non_array_source`` gate) decide what fuses, and a shape it can't fuse cascades to tier-2 — see :ref:`linq_fold_xml_patterns` for the full fuse/defer breakdown. ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``) and the node is passed by value (``var root`` — ``_fold``'s macro-arg inference skips the const&→value copy). 
* - ``unsafe(each_kv(tab))`` / ``keys(tab)`` / ``values(tab)`` - ``extract_table_source`` (``TableAdapter``, ``daslib/linq_fold_table.das``) - - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. 
``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. **Joins fuse on either side** (``can_join`` is on; the adapter rides the shared ``emit_array_join`` through its own ``wrap_source_loop``): a table *lead* walks its pruned slot iterator(s) as the probe loop; a table in the *srcB slot* joined on its bare key — ``d.key`` on the kv lane, the bare element on a ``keys(set)`` source — skips the join's internal ``table>`` entirely and probes the user's table per lead row (``join_keyb_is_bare_key`` + ``build_join_probe_pieces``; unique table keys make the probe ≡ hash semantics exactly). The probe is itself usage-pruned: count-no-where and key-only shapes stay on ``key_exists``, value shapes bind the matched value **by reference** from a ``tab?[k]`` pointer (no copy), and only a whole-pair use binds the kv tuple. A non-bare b-key keeps the hashed build over the kv iterator; ``group_join`` (outer — its result consumes the whole bucket) always keeps it. ``can_group_by`` is off and reverse has no backward slot walk — those shapes cascade to tier-2 (see ``benchmarks/sql/LINQ_TO_TABLE.md``). ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. + - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. 
A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. 
**Joins fuse on either side** (``can_join`` is on; the adapter rides the shared ``emit_array_join`` through its own ``wrap_source_loop``): a table *lead* walks its pruned slot iterator(s) as the probe loop; a table in the *srcB slot* joined on its bare key — ``d.key`` on the kv lane, the bare element on a ``keys(set)`` source — skips the join's internal ``table>`` entirely and probes the user's table per lead row (``join_keyb_is_bare_key`` + ``build_join_probe_pieces``; unique table keys make the probe ≡ hash semantics exactly). The probe is itself usage-pruned: count-no-where and key-only shapes stay on ``key_exists``, value shapes bind the matched value **by reference** from a ``tab?[k]`` pointer (no copy), and only a whole-pair use binds the kv tuple. A non-bare b-key keeps the hashed build over the kv iterator; ``group_join`` (outer — its result consumes the whole bucket) always keeps it. ``can_group_by`` is off and reverse has no backward slot walk — those shapes cascade to tier-2 (see ``benchmarks/sql/LINQ_TO_TABLE.md``). **``to_table()`` sinks fuse** (table-buffer materializer row above): the chain inserts straight into the result table — a bare ``each_kv(tab).to_table()`` is a reserve-ahead table clone through the fused walk, and a ``keys(tab)`` chain lands in the ``table`` set form. ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. * - ``unsafe(from_json(jv, type))`` - ``extract_json_source`` (``JsonAdapter``, ``daslib/linq_fold_json.das``) - In-tree source — the adapter is compiled in unconditionally (no ``static_if`` gate, unlike XML's pugixml one), but a program only pulls JSON into scope by requiring ``json`` / ``json_boost`` itself. ``extract_json_source`` matches a ``from_json`` whose first argument is a ``json::JsonValue?``, so a JSON-less program returns null and the chain falls to the array tier. 
The adapter pulls in **no** json dependency — it emits ``from_json`` / ``read_json_field`` by name (resolved at the user's splice site, like ``linq_fold_decs`` emits ``for_each_archetype``; ``from_JV`` is emitted only for a non-struct element type). Emits an inlined ``for (e in jv.value as _array)`` walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): only the keys the chain reads are pulled via ``read_json_field`` by name — unread keys (notably ``string`` fields whose materialization clones) are never touched, so a scalar-only chain skips ~all of the full per-row build (3.6× over the full materialize — see ``benchmarks/micro/json_source_shapes.das``). A whole-row escape reads **every** top-level field by name (``emit_full_row_by_name``), so a custom whole-row ``from_JV(Row)`` override is **not** honored (Option B — this is a flat query source, not a deserializer; materialize the array with an explicit ``from_JV`` first for that). ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``). Deferred materialization mirrors XML: order/distinct/take buffer a cheap ``(orderKey, JsonValue?)`` surrogate and materialize only the K survivors — by name (``emit_full_row_by_name``), so a struct survivor reads each field by key; only a non-struct ``Row`` falls back to ``outBind <- from_JV(handle, type)``. The ``JsonAdapter`` also fuses ``join`` / ``join |> group_by`` (``emit_join_hook`` + ``JsonJoinAdapter`` off ``build_group_by_adapter``'s upstream-join arm), reusing the array-join machinery (``build_join_standalone_pieces`` / ``build_join_adapter_pieces``): srcB is collected into a ``table>`` and the field-pruned array walk is the probe side, so the join key reads only its own field per element (e.g. ``read_json_field(jcur, "brand", …)``). Standalone ``group_join`` and a trailing ``where`` / ``select`` / ``count`` over group-join rows defer to tier-2, mirroring XML. 
@@ -192,6 +192,9 @@ Array-source patterns * - ``._where(P).take_while(P2).<...>`` / ``.skip_while(P2).<...>`` - ``plan_loop_or_count`` (predicate-driven ranges) - ``take_while`` exits on first non-match; ``skip_while`` toggles state. + * - ``._where(P)._select(K => V).to_table()`` (and bare / set forms) + - ``plan_loop_or_count`` (table-buffer materializer) + - Insert-loop straight into the result table — no intermediate array. A ``(k => v)`` tuple projection splits so key and value each evaluate once; other tuple projections bind to a local; a scalar chain lands in the ``table`` set form. Reserve from O(1) source length on unfiltered walks. Duplicate keys keep the last occurrence (das ``insert`` semantics, not C#'s throw). The selector-based ``to_table(key, elementSelector)`` and decs sources keep the tier-2 path. * - ``._order_by(K).first()`` / ``.first_or_default()`` - ``plan_order_family`` (streaming-min) → ``emit_streaming_min`` - Single ``var best`` + ``var seen``, no buffer; one comparison per element. @@ -676,8 +679,9 @@ Common cases that fall back: ``join_impl``. - **Aggregations on lazy groupings**: ``_group_by_lazy(K)._select(F)`` with a non-bucket-reducing ``_select``. -- **Materialization-only chains** that the standard linq surface - already lowers efficiently — e.g. ``to_table()`` on a finite array. +- **Selector-based ``to_table(key, elementSelector)``** — the 3-arg form + keeps its tier-2 generic; only the selector-free ``to_table()`` + terminator splices (see the table-buffer materializer row above). - **Chained ``_select(f) |> _select(g)`` with an impure inner** (``_ % N``, ``_ / N``, user-call inner that the typer can't prove pure). 
The ``collapse_chained_selects`` pre-pass is gated on diff --git a/skills/linq.md b/skills/linq.md index 8be49b8db..675c06be0 100644 --- a/skills/linq.md +++ b/skills/linq.md @@ -98,6 +98,30 @@ There are several tripwires to know about — they're not arbitrary, they fall o - **String `join(arr, sep)` lives in `strings` / `strings_boost`** — `linq` itself has a different `join` (SQL-style two-iterator inner join with key + result projection). They coexist; the typer picks the right one by argument types. If you see "module strings_boost is not visible" and "missing argument blk" pointed at your join call, you're missing `require daslib/strings_boost` (or `require strings`). - **The `_` placeholder is local to the closest enclosing `_(...)`.** If you nest, give inner closures explicit names (`@@(x) => ...`). Don't try to shadow `_` between outer and inner shorthand calls. +### Table sources and the `to_table` sink + +A `table` (or `table` set) is a first-class chain source — no key/value arrays needed: + +```das +// each_kv yields (key, value) named tuples; keys/values give one lane. +// Wrap the head in unsafe(...) — the sources are [unsafe_outside_of_for]. +let pricey = _fold(unsafe(each_kv(cars)) |> _where(_.value.price > 500) |> count()) +let ids <- _fold(unsafe(keys(cars)) |> _where(_ > 100) |> to_array()) + +// to_table() lands a chain in a table: kv (or any (k => v) tuple) chain → table, +// scalar chain → table set. Duplicate keys keep the last occurrence. +var byId <- _fold(each(orders) |> _select(_.id => _.total) |> to_table()) +var index <- _fold(unsafe(each_kv(cars)) |> _where(_.value.in_stock) |> to_table()) +``` + +The fused emitter walks only the iterators the chain touches (a `.value`-only chain never +touches keys), folds `where(kv.key == X) … first/any/count` to an O(1) probe, joins on a bare +table key by probing the table instead of hashing, and inserts straight into the `to_table` +result with no intermediate array. 
`%linq!` queries dispatch table sources automatically +(`from kv in tab`). Slot order is unspecified — don't write order-sensitive expectations over +table chains. The 3-arg `to_table(it, keyBlock, elementSelectorBlock)` ToDictionary form also +exists (tier-2 only). Full pattern reference: `doc/source/reference/linq_fold_patterns.rst`. + ## Don't mix styles Pick **one** style per transformation and stay in it: diff --git a/tests/linq/test_linq_table_source.das b/tests/linq/test_linq_table_source.das index ed0330600..8900c05f2 100644 --- a/tests/linq/test_linq_table_source.das +++ b/tests/linq/test_linq_table_source.das @@ -280,6 +280,10 @@ def test_table_point_lookup(t : T?) { typedef IKV = tuple +def flip_kv(kv : IKV) : tuple { + return (kv.value => kv.key) +} + // Joins: a table in the srcB slot joined on its bare key (`d.key` / bare set element) probes the user's // table instead of building the join's internal hash; a table lead rides the same emitter through the // pruned slot walk. Either way must agree with the hash/hand-loop semantics: inner joins drop misses, @@ -494,3 +498,124 @@ def test_each_kv_tier2(t : T?) { t |> equal(n, 3) } } + +[test] +def test_to_table_sink(t : T?) { + t |> run("kv pass-through + where agrees with a hand loop") @(t : T?) { + var src <- make_int_table(6) + var expected : table + for (k, v in keys(src), values(src)) { + if (k > 1) { + expected[k] = v + } + } + var got <- _fold(each_kv(src) |> _where(_.key > 1) |> to_table()) + t |> equal(length(got), length(expected)) + for (k, v in keys(expected), values(expected)) { + t |> equal(got?[k] ?? -1, v) + } + delete got + delete expected + delete src + } + t |> run("projection remaps keys and values") @(t : T?) { + var src <- make_int_table(5) + var got <- _fold(each_kv(src) |> _select((_.key * 2) => _.value + 1) |> to_table()) + t |> equal(length(got), 5) + t |> equal(got?[8] ?? 
-1, 41) + delete got + delete src + } + t |> run("bare to_table clones the table through the fused walk") @(t : T?) { + var src <- make_int_table(5) + var got <- _fold(each_kv(src) |> to_table()) + t |> equal(length(got), 5) + t |> equal(got?[0] ?? -1, 0) + t |> equal(got?[4] ?? -1, 40) + delete got + delete src + } + t |> run("keys chain lands in the set form") @(t : T?) { + var src <- make_int_table(5) + var got <- _fold(keys(src) |> _where(_ != 3) |> to_table()) + t |> equal(length(got), 4) + t |> success(key_exists(got, 0) && !key_exists(got, 3)) + delete got + delete src + } + t |> run("array source with a tuple projection") @(t : T?) { + var arr <- [for (i in range(4)); i + 1] + var got <- _fold(arr |> _select(_ => _ * _) |> to_table()) + t |> equal(length(got), 4) + t |> equal(got?[3] ?? -1, 9) + delete got + delete arr + } + t |> run("duplicate keys keep the last occurrence") @(t : T?) { + var arr <- [for (i in range(6)); i] + var got <- _fold(arr |> _select((_ % 2) => _) |> to_table()) + t |> equal(length(got), 2) + t |> equal(got?[0] ?? -1, 4) + t |> equal(got?[1] ?? -1, 5) + delete got + delete arr + } + t |> run("take bounds the walk and the reserve") @(t : T?) { + var src <- make_int_table(5) + var got <- _fold(keys(src) |> take(2) |> to_table()) + t |> equal(length(got), 2) + delete got + delete src + } + t |> run("non-tuple-literal projection rides the bind arm") @(t : T?) { + var src <- make_int_table(5) + var got <- _fold(each_kv(src) |> _select(flip_kv(_)) |> to_table()) + t |> equal(length(got), 5) + t |> equal(got?[40] ?? -1, 4) + delete got + delete src + } + t |> run("string keys through the fused map arm") @(t : T?) { + var src <- make_int_table(4) + var got <- _fold(each_kv(src) |> _select(("k{_.key}" => _.value)) |> to_table()) + t |> equal(length(got), 4) + t |> equal(got?["k2"] ?? -1, 20) + delete got + delete src + } + t |> run("tier-2 iterator to_table agrees with the fused emit") @(t : T?) 
{ + var src <- make_int_table(6) + var fused <- _fold(each_kv(src) |> _where(_.key % 2 == 0) |> to_table()) + var tier2 <- to_table(unsafe(each_kv(src)) |> _where(_.key % 2 == 0)) + t |> equal(length(fused), length(tier2)) + for (k, v in keys(tier2), values(tier2)) { + t |> equal(fused?[k] ?? -1, v) + } + delete fused + delete tier2 + delete src + } + t |> run("array-input to_table forms borrow the source") @(t : T?) { + var pairs <- [for (i in range(3)); (i => i * i)] + var m <- to_table(pairs) + t |> equal(length(pairs), 3) // source intact + t |> equal(length(m), 3) + t |> equal(m?[2] ?? -1, 4) + var bare <- [for (i in range(4)); i * 100] + var s <- to_table(bare) + t |> equal(length(bare), 4) + t |> success(key_exists(s, 300)) + delete s + delete bare + delete m + delete pairs + } + t |> run("selector-based to_table keeps its tier-2 path") @(t : T?) { + var arr <- [for (i in range(4)); i] + var got <- to_table(to_sequence(arr), $(x : int) => x, $(x : int) => x * 10) + t |> equal(length(got), 4) + t |> equal(got?[3] ?? -1, 30) + delete got + delete arr + } +} From 9331bbc2ce26016528b061008a8199b3cfdcd6a3 Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Thu, 11 Jun 2026 09:14:37 -0700 Subject: [PATCH 10/11] bench: fill the m7 column + light up to_table across all in-memory sources MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The m7 column had 26 empty cells; only 7 were principled (zip_* x4 / cross_join — lockstep pairing over an unordered slot walk is meaningless; select_many — flat fixture; decs_count_bare_pred — decs-only). The rest were scoping debt: - join_select / where_join_count — fuse today via the stage-5 join work; lanes simply hadn't been written. where_join_count lands at 46.8 ns/elem INTERP (lead-where pruned join); join_select 222.9 (iterator-typed join bail, tier-2). 
- 12 groupby_* + join_groupby_count/to_array + order_reverse_normalized / reverse_take_select / reverse_distinct_by — instantiated as tier-2-cascade cells (table group_by fusion and a backward slot walk are named deferred edges); the cells now show the cost a future fix would improve. to_table / to_table_staged gain m3f/m4/m5f/m6f lanes (only SQL stays absent — _sql has no table sink): array fuses at 18.7 vs 54.8 staged (~3x), XML 118.2 vs 144.8, JSON 144.3 vs 166.8; decs declines by design and its 144.0 vs 56.8 staged gap is the motivating number for a future decs sink hook. results.md re-swept (all 82 families, m7 dashes 26 -> 7); missing-lanes prose rewritten to match. Co-Authored-By: Claude Fable 5 --- benchmarks/sql/array.das | 27 ++++ benchmarks/sql/decs.das | 27 ++++ benchmarks/sql/json.das | 27 ++++ benchmarks/sql/results.md | 294 +++++++++++++++++++------------------- benchmarks/sql/table.das | 284 ++++++++++++++++++++++++++++++++++++ benchmarks/sql/xml.das | 27 ++++ 6 files changed, 539 insertions(+), 147 deletions(-) diff --git a/benchmarks/sql/array.das b/benchmarks/sql/array.das index 882eabcef..5259714e0 100644 --- a/benchmarks/sql/array.das +++ b/benchmarks/sql/array.das @@ -942,6 +942,33 @@ def to_array_filter_m3f(b : B?) { } } +[benchmark] +def to_table_m3f(b : B?) { + // fused insert-loop sink: the chain lands straight in the result table + b |> run("to_table", N) { + var tab <- _fold(each(g_arr) |> _select((_.id => _.price)) |> to_table()) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + +[benchmark] +def to_table_staged_m3f(b : B?) { + // staged baseline: materialize the kv tuples to an array, then convert + b |> run("to_table_staged", N) { + var rows <- _fold(each(g_arr) |> _select((_.id => _.price)) |> to_array()) + var tab <- to_table_move(rows) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + [benchmark] def where_join_count_m3f(b : B?) 
{ b |> run("where_join_count", N) { diff --git a/benchmarks/sql/decs.das b/benchmarks/sql/decs.das index 8c44b5fd8..8ad4ff833 100644 --- a/benchmarks/sql/decs.das +++ b/benchmarks/sql/decs.das @@ -914,6 +914,33 @@ def to_array_filter_m4(b : B?) { } } +[benchmark] +def to_table_m4(b : B?) { + // fused insert-loop sink: the chain lands straight in the result table + b |> run("to_table", N) { + var tab <- _fold(from_decs_template(type) |> _select((_.id => _.price)) |> to_table()) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + +[benchmark] +def to_table_staged_m4(b : B?) { + // staged baseline: materialize the kv tuples to an array, then convert + b |> run("to_table_staged", N) { + var rows <- _fold(from_decs_template(type) |> _select((_.id => _.price)) |> to_array()) + var tab <- to_table_move(rows) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + [benchmark] def where_join_count_m4(b : B?) { b |> run("where_join_count", N) { diff --git a/benchmarks/sql/json.das b/benchmarks/sql/json.das index 6533ae6d8..b5e5acc45 100644 --- a/benchmarks/sql/json.das +++ b/benchmarks/sql/json.das @@ -894,6 +894,33 @@ def to_array_filter_m6f(b : B?) { } } +[benchmark] +def to_table_m6f(b : B?) { + // fused insert-loop sink: the chain lands straight in the result table + b |> run("to_table", N) { + var tab <- _fold(unsafe(from_json(g_jv, type)) |> _select((_.id => _.price)) |> to_table()) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + +[benchmark] +def to_table_staged_m6f(b : B?) { + // staged baseline: materialize the kv tuples to an array, then convert + b |> run("to_table_staged", N) { + var rows <- _fold(unsafe(from_json(g_jv, type)) |> _select((_.id => _.price)) |> to_array()) + var tab <- to_table_move(rows) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + [benchmark] def where_join_count_m6f(b : B?) 
{ b |> run("where_join_count", N) { diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md index 30402ff49..bb13ccbc7 100644 --- a/benchmarks/sql/results.md +++ b/benchmarks/sql/results.md @@ -36,175 +36,175 @@ signal, JIT deltas as indicative.** | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 34.9 | 5.9 | 5.8 | 60.8 | 158.9 | 19.5 | -| `all_match` | 27.7 | 3.5 | 3.4 | 56.0 | 153.2 | 15.9 | +| `aggregate_match` | 35.0 | 5.9 | 5.9 | 60.5 | 159.7 | 19.0 | +| `all_match` | 27.7 | 3.5 | 3.4 | 56.1 | 153.8 | 15.8 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.3 | 5.9 | 8.7 | 58.5 | 163.4 | 17.3 | +| `average_aggregate` | 30.1 | 6.0 | 8.8 | 60.1 | 163.7 | 17.2 | | `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 30.1 | -| `bare_order_where` | 282.9 | 118.2 | 125.0 | 300.5 | 290.8 | 163.1 | -| `chained_select_collapse` | — | 17.8 | 17.5 | 70.4 | 161.7 | 27.7 | -| `chained_where` | 41.5 | 6.6 | 7.1 | 104.8 | 182.1 | 24.0 | -| `contains_match` | 0.0 | 2.2 | 1.4 | 28.9 | 71.5 | 6.6 | -| `count_aggregate` | 29.6 | 4.3 | 4.1 | 63.5 | 154.0 | 20.2 | -| `cross_join` | 12896.3 | 3681.4 | — | 4018.5 | 4096.4 | — | -| `decs_count_bare_pred` | — | — | 4.1 | — | — | — | -| `distinct_by_count` | 41.4 | 15.7 | 15.7 | 70.4 | 161.3 | 26.8 | -| `distinct_by_order_take` | 239.9 | 22.3 | 23.3 | 123.9 | 162.0 | 48.8 | -| `distinct_by_order_to_array` | 237.8 | 22.3 | 23.3 | 124.3 | 162.5 | 48.8 | -| `distinct_count` | 41.8 | 15.9 | 15.7 | 70.7 | 161.8 | 27.0 | -| `distinct_count_pred` | 252.1 | 15.8 | 15.6 | 111.9 | 176.7 | 26.6 | +| `bare_order_where` | 278.1 | 117.1 | 126.5 | 302.8 | 288.8 | 163.0 | +| `chained_select_collapse` | — | 17.9 | 17.6 | 70.7 | 172.6 | 27.9 | +| `chained_where` | 36.9 | 6.6 | 7.1 | 105.4 | 183.4 | 23.8 | +| `contains_match` | 0.0 | 2.2 | 1.4 | 29.0 | 72.4 | 6.5 | +| `count_aggregate` | 29.7 | 4.2 | 4.1 | 63.6 | 
154.3 | 20.1 | +| `cross_join` | 12597.0 | 3721.0 | — | 4040.3 | 4113.3 | — | +| `decs_count_bare_pred` | — | — | 4.2 | — | — | — | +| `distinct_by_count` | 41.6 | 16.4 | 15.8 | 70.8 | 162.9 | 26.9 | +| `distinct_by_order_take` | 241.1 | 22.1 | 23.7 | 124.2 | 162.4 | 49.2 | +| `distinct_by_order_to_array` | 241.0 | 22.2 | 23.8 | 125.0 | 163.2 | 48.9 | +| `distinct_count` | 41.8 | 15.7 | 15.9 | 70.7 | 162.9 | 27.1 | +| `distinct_count_pred` | 253.4 | 15.9 | 15.9 | 112.7 | 179.4 | 26.7 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 171.0 | 29.4 | 29.0 | 123.0 | 196.4 | — | -| `groupby_count` | 142.4 | 19.2 | 19.1 | 74.8 | 167.1 | 164.5 | -| `groupby_first` | 251.1 | 19.2 | 19.7 | 72.1 | 162.2 | — | -| `groupby_having_count` | 142.0 | 19.1 | 19.1 | 74.7 | 166.3 | — | -| `groupby_having_hidden_sum` | 176.6 | 22.3 | 22.3 | 118.0 | 187.9 | — | -| `groupby_having_post_where` | 173.2 | 20.5 | 20.4 | 114.4 | 187.4 | — | -| `groupby_max` | 173.5 | 24.9 | 24.8 | 119.6 | 191.4 | — | -| `groupby_min` | 173.8 | 25.3 | 24.8 | 119.6 | 192.5 | — | -| `groupby_multi_reducer` | 190.5 | 30.4 | 30.0 | 124.7 | 196.1 | — | -| `groupby_select_order` | 172.1 | 20.5 | 20.4 | 114.3 | 188.6 | — | -| `groupby_select_sum` | 199.6 | 38.5 | 38.0 | 101.5 | 194.4 | — | -| `groupby_sum` | 172.1 | 20.5 | 20.4 | 114.6 | 187.6 | 194.6 | -| `groupby_where_count` | 76.4 | 14.1 | 14.2 | 115.1 | 185.8 | — | -| `groupby_where_sum` | 87.5 | 14.2 | 14.5 | 116.0 | 186.7 | — | -| `join_count` | 38.4 | 51.4 | 63.6 | 112.9 | 183.8 | 65.4 | -| `join_groupby_count` | 158.4 | 77.8 | 87.8 | 177.4 | 233.1 | — | -| `join_groupby_to_array` | 189.8 | 78.7 | 89.6 | 214.7 | 214.1 | — | -| `join_probe` | — | — | — | — | — | 46.9 | -| `join_probe_build` | — | — | — | — | — | 79.5 | -| `join_select` | 151.8 | 72.8 | 
84.9 | 189.5 | 217.4 | — | -| `join_where_count` | 39.7 | 61.7 | 78.7 | 160.5 | 199.8 | 81.6 | -| `last_match` | 0.0 | 5.9 | 14.0 | 65.0 | 159.2 | 31.0 | -| `long_count_aggregate` | 29.9 | 4.1 | 4.1 | 63.4 | 154.0 | 20.1 | -| `max_aggregate` | 31.1 | 6.0 | 6.8 | 58.7 | 162.1 | 16.9 | -| `min_aggregate` | 31.0 | 6.0 | 6.9 | 58.7 | 162.9 | 17.0 | -| `order_by_multi_key` | 340.9 | 270.9 | 279.5 | 459.2 | 446.7 | 336.4 | -| `order_distinct_take` | 138.7 | 15.9 | 98.6 | 72.6 | 162.8 | 31.6 | -| `order_reverse_normalized` | 38.8 | 16.3 | 19.8 | 70.9 | 169.9 | — | -| `order_take_desc` | 38.5 | 16.3 | 19.9 | 70.1 | 170.8 | 33.3 | +| `groupby_average` | 173.6 | 29.2 | 29.3 | 123.6 | 195.4 | 198.4 | +| `groupby_count` | 144.5 | 19.2 | 19.2 | 75.0 | 168.4 | 164.3 | +| `groupby_first` | 253.9 | 19.1 | 19.8 | 72.7 | 163.4 | 164.1 | +| `groupby_having_count` | 142.6 | 19.2 | 19.2 | 75.4 | 168.9 | 186.7 | +| `groupby_having_hidden_sum` | 176.8 | 22.2 | 22.9 | 118.6 | 192.0 | 216.5 | +| `groupby_having_post_where` | 172.4 | 20.5 | 20.5 | 114.6 | 188.7 | 194.8 | +| `groupby_max` | 175.4 | 24.9 | 25.2 | 120.0 | 192.4 | 202.4 | +| `groupby_min` | 175.2 | 24.9 | 25.3 | 120.7 | 193.2 | 204.5 | +| `groupby_multi_reducer` | 192.0 | 30.8 | 30.2 | 125.6 | 196.8 | 232.3 | +| `groupby_select_order` | 172.8 | 20.5 | 20.5 | 115.2 | 188.1 | 195.3 | +| `groupby_select_sum` | 199.8 | 38.7 | 38.7 | 102.3 | 193.5 | 191.2 | +| `groupby_sum` | 176.8 | 20.5 | 20.5 | 114.9 | 188.1 | 194.5 | +| `groupby_where_count` | 76.2 | 13.8 | 14.5 | 116.0 | 185.7 | 165.2 | +| `groupby_where_sum` | 87.7 | 14.1 | 14.9 | 116.5 | 187.3 | 180.6 | +| `join_count` | 38.0 | 51.3 | 64.7 | 113.3 | 183.5 | 66.0 | +| `join_groupby_count` | 160.1 | 76.7 | 89.9 | 178.6 | 230.9 | 259.6 | +| `join_groupby_to_array` | 194.1 | 78.4 | 91.4 | 216.3 | 212.7 | 290.0 | +| `join_probe` | — | — | — | — | — | 46.6 | +| `join_probe_build` | — | — | — | — | — | 79.9 | +| `join_select` | 150.3 | 72.7 | 86.0 | 190.7 | 215.4 | 222.9 | +| 
`join_where_count` | 39.1 | 61.6 | 79.4 | 161.2 | 198.1 | 80.1 | +| `last_match` | 0.0 | 5.9 | 13.9 | 64.8 | 159.6 | 31.0 | +| `long_count_aggregate` | 29.5 | 4.1 | 4.1 | 63.2 | 155.1 | 20.2 | +| `max_aggregate` | 30.8 | 6.2 | 7.0 | 58.6 | 163.2 | 17.4 | +| `min_aggregate` | 31.1 | 6.2 | 6.8 | 58.6 | 163.5 | 17.3 | +| `order_by_multi_key` | 336.9 | 274.1 | 281.9 | 458.6 | 445.5 | 335.6 | +| `order_distinct_take` | 140.6 | 15.9 | 99.4 | 72.3 | 163.8 | 31.6 | +| `order_reverse_normalized` | 38.6 | 16.3 | 20.0 | 70.1 | 170.7 | 33.1 | +| `order_take_desc` | 38.3 | 16.5 | 20.6 | 70.1 | 170.9 | 33.1 | | `point_lookup` | — | — | — | — | — | 0.0 | -| `point_lookup_scan` | — | — | — | — | — | 8.3 | -| `reverse_distinct_by` | 295.3 | 21.3 | 28.2 | 71.1 | 161.9 | — | -| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.1 | 58.5 | -| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.1 | — | -| `select_count` | 0.1 | 0.0 | 2.2 | 68.3 | 2.2 | 0.0 | -| `select_many` | — | 191.7 | — | — | — | — | -| `select_where` | 204.1 | 11.2 | 19.3 | 197.1 | 183.4 | 37.7 | -| `select_where_count` | 32.5 | 5.1 | 7.4 | 64.9 | 156.9 | 22.7 | -| `select_where_order_take` | 37.1 | 12.3 | 14.8 | 72.8 | 165.4 | 35.3 | -| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.5 | 161.9 | 25.0 | -| `single_match` | 0.0 | 2.9 | 5.5 | 58.2 | 151.2 | 22.6 | -| `skip_take` | 0.5 | 0.1 | 0.2 | 3.1 | 2.8 | 0.3 | -| `skip_while_match` | 3.5 | 5.3 | 5.3 | 60.0 | 153.2 | 18.2 | -| `sort_first` | 38.4 | 11.1 | 13.3 | 65.1 | 166.7 | 32.2 | -| `sort_take` | 38.7 | 16.3 | 20.0 | 70.8 | 170.4 | 33.1 | -| `sort_take_select` | 38.7 | 16.3 | 20.1 | 71.3 | 170.6 | 33.3 | -| `sum_aggregate` | 30.5 | 2.1 | 2.1 | 54.6 | 153.2 | 13.4 | -| `sum_where` | 33.2 | 4.3 | 4.2 | 63.4 | 154.6 | 20.4 | -| `take_count` | 3.8 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | +| `point_lookup_scan` | — | — | — | — | — | 8.4 | +| `reverse_distinct_by` | 308.2 | 21.2 | 27.9 | 70.8 | 163.1 | 44.6 | +| `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.4 | 58.9 | +| 
`reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.3 | 58.6 | +| `select_count` | 0.1 | 0.0 | 2.2 | 68.5 | 2.2 | 0.0 | +| `select_many` | — | 192.1 | — | — | — | — | +| `select_where` | 197.4 | 11.2 | 19.4 | 196.4 | 183.1 | 37.8 | +| `select_where_count` | 32.6 | 5.1 | 7.4 | 64.4 | 157.5 | 22.8 | +| `select_where_order_take` | 36.6 | 12.5 | 15.1 | 72.3 | 164.9 | 35.1 | +| `select_where_sum` | 37.1 | 7.4 | 7.5 | 66.3 | 162.5 | 23.6 | +| `single_match` | 0.0 | 2.8 | 5.4 | 58.0 | 151.0 | 22.8 | +| `skip_take` | 0.5 | 0.1 | 0.2 | 3.0 | 2.8 | 0.3 | +| `skip_while_match` | 3.4 | 5.3 | 5.3 | 59.9 | 153.2 | 18.3 | +| `sort_first` | 37.9 | 11.1 | 13.4 | 65.2 | 166.1 | 32.2 | +| `sort_take` | 38.3 | 16.3 | 20.4 | 70.3 | 171.0 | 33.1 | +| `sort_take_select` | 38.3 | 16.3 | 20.2 | 70.7 | 170.5 | 33.2 | +| `sum_aggregate` | 30.2 | 2.1 | 2.1 | 53.9 | 153.3 | 13.5 | +| `sum_where` | 32.8 | 4.2 | 4.3 | 63.4 | 154.2 | 20.5 | +| `take_count` | 3.6 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | | `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.3 | 1.1 | 0.3 | | `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | 0.1 | | `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | 0.2 | -| `take_while_match` | 7.8 | 2.4 | 2.4 | 30.1 | 76.2 | 16.4 | -| `to_array_filter` | 71.1 | 11.8 | 11.7 | 71.3 | 164.3 | 28.9 | -| `to_table` | — | — | — | — | — | 32.5 | -| `to_table_staged` | — | — | — | — | — | 68.3 | -| `where_join_count` | 41.5 | 28.8 | 40.9 | 132.1 | 167.0 | — | -| `zip_count_pred` | 39.5 | 15.8 | — | 318.5 | 320.2 | — | -| `zip_dot_product` | 47.2 | 12.6 | 10.8 | 312.7 | 318.6 | — | -| `zip_dot_product_3arg` | 47.1 | 12.7 | — | 312.8 | 317.5 | — | -| `zip_reverse_to_array` | — | 31.4 | — | 348.8 | 352.2 | — | +| `take_while_match` | 7.8 | 2.4 | 2.4 | 30.1 | 75.7 | 16.4 | +| `to_array_filter` | 70.3 | 11.8 | 11.8 | 70.9 | 163.7 | 29.0 | +| `to_table` | — | 18.7 | 144.0 | 118.2 | 144.3 | 32.2 | +| `to_table_staged` | — | 54.8 | 56.8 | 144.8 | 166.8 | 69.0 | +| `where_join_count` | 41.2 | 29.1 | 41.8 | 131.7 
| 167.5 | 46.8 | +| `zip_count_pred` | 39.4 | 15.9 | — | 317.3 | 319.1 | — | +| `zip_dot_product` | 46.6 | 12.7 | 10.6 | 314.0 | 316.5 | — | +| `zip_dot_product_3arg` | 46.8 | 12.8 | — | 313.0 | 316.7 | — | +| `zip_reverse_to_array` | — | 31.7 | — | 349.3 | 351.4 | — | ## JIT | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 35.1 | 0.3 | 0.6 | 22.8 | 26.2 | 13.5 | -| `all_match` | 27.9 | 0.3 | 0.2 | 17.5 | 25.3 | 13.6 | +| `aggregate_match` | 35.0 | 0.3 | 0.7 | 29.8 | 27.2 | 13.5 | +| `all_match` | 27.9 | 0.3 | 0.2 | 18.8 | 26.2 | 13.5 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.5 | 1.0 | 3.5 | 17.4 | 24.7 | 13.5 | -| `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.1 | -| `bare_order_where` | 186.2 | 34.1 | 35.0 | 104.9 | 52.8 | 78.8 | -| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 33.5 | 14.0 | -| `chained_where` | 36.9 | 0.6 | 0.8 | 34.7 | 31.3 | 17.8 | -| `contains_match` | 0.0 | 0.2 | 0.1 | 17.5 | 8.9 | 4.7 | -| `count_aggregate` | 29.7 | 0.3 | 0.6 | 17.5 | 25.5 | 13.4 | -| `cross_join` | 5965.9 | 731.0 | — | 833.2 | 770.0 | — | +| `average_aggregate` | 30.2 | 1.0 | 3.5 | 18.8 | 25.7 | 13.5 | +| `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 17.2 | +| `bare_order_where` | 185.1 | 34.2 | 35.0 | 105.5 | 53.0 | 78.8 | +| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 33.9 | 14.0 | +| `chained_where` | 36.9 | 0.6 | 0.8 | 36.6 | 32.1 | 17.8 | +| `contains_match` | 0.0 | 0.2 | 0.1 | 17.5 | 9.4 | 4.7 | +| `count_aggregate` | 29.5 | 0.3 | 0.6 | 29.5 | 26.4 | 13.5 | +| `cross_join` | 5991.6 | 734.4 | — | 834.6 | 771.2 | — | | `decs_count_bare_pred` | — | — | 0.6 | — | — | — | -| `distinct_by_count` | 41.7 | 1.1 | 1.1 | 20.6 | 33.5 | 14.0 | -| `distinct_by_order_take` | 239.3 | 1.7 | 2.6 | 46.3 | 38.8 | 30.1 | -| `distinct_by_order_to_array` | 240.2 | 1.7 | 2.7 | 46.4 | 38.7 | 30.3 | -| `distinct_count` | 41.6 | 1.1 | 
1.1 | 20.6 | 33.5 | 14.0 | -| `distinct_count_pred` | 251.7 | 1.1 | 1.3 | 37.7 | 43.8 | 14.0 | +| `distinct_by_count` | 42.1 | 1.1 | 1.1 | 20.6 | 33.9 | 14.1 | +| `distinct_by_order_take` | 249.6 | 1.7 | 2.6 | 45.2 | 39.0 | 30.3 | +| `distinct_by_order_to_array` | 252.5 | 1.7 | 2.7 | 45.5 | 38.9 | 30.2 | +| `distinct_count` | 41.7 | 1.1 | 1.1 | 20.6 | 33.7 | 14.1 | +| `distinct_count_pred` | 265.8 | 1.1 | 1.3 | 37.8 | 43.6 | 14.0 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 171.5 | 1.5 | 1.9 | 35.5 | 45.5 | — | -| `groupby_count` | 142.1 | 1.3 | 1.5 | 20.6 | 33.8 | 42.7 | -| `groupby_first` | 251.9 | 1.3 | 2.3 | 20.6 | 34.4 | — | -| `groupby_having_count` | 141.3 | 1.3 | 1.5 | 20.6 | 33.9 | — | -| `groupby_having_hidden_sum` | 176.9 | 1.5 | 1.7 | 35.5 | 45.2 | — | -| `groupby_having_post_where` | 171.1 | 1.4 | 1.9 | 35.5 | 44.1 | — | -| `groupby_max` | 173.4 | 1.5 | 1.9 | 35.5 | 45.8 | — | -| `groupby_min` | 172.8 | 1.5 | 1.8 | 35.6 | 45.8 | — | -| `groupby_multi_reducer` | 190.2 | 1.6 | 1.9 | 35.8 | 46.1 | — | -| `groupby_select_order` | 170.9 | 1.4 | 1.9 | 35.4 | 44.3 | — | -| `groupby_select_sum` | 200.0 | 2.8 | 3.2 | 31.8 | 39.9 | — | -| `groupby_sum` | 170.9 | 1.4 | 1.6 | 35.5 | 44.3 | 51.2 | -| `groupby_where_count` | 76.3 | 0.9 | 1.3 | 35.6 | 41.8 | — | -| `groupby_where_sum` | 87.6 | 0.9 | 1.3 | 35.6 | 41.9 | — | -| `join_count` | 38.2 | 10.9 | 11.8 | 42.6 | 71.5 | 32.2 | -| `join_groupby_count` | 156.9 | 17.6 | 19.5 | 68.3 | 89.8 | — | -| `join_groupby_to_array` | 189.8 | 17.5 | 19.4 | 79.3 | 36.1 | — | -| `join_probe` | — | — | — | — | — | 24.0 | -| `join_probe_build` | — | — | — | — | — | 38.1 | -| `join_select` | 94.0 | 19.6 | 21.7 | 73.8 | 95.2 | — | -| `join_where_count` | 39.8 | 18.9 | 20.8 | 63.5 | 78.3 | 37.8 | -| `last_match` | 0.0 | 0.5 
| 1.4 | 18.2 | 25.9 | 22.9 | -| `long_count_aggregate` | 29.2 | 0.3 | 0.6 | 17.5 | 25.5 | 13.4 | -| `max_aggregate` | 31.0 | 0.3 | 0.5 | 17.4 | 27.1 | 13.4 | -| `min_aggregate` | 31.1 | 0.3 | 0.5 | 17.4 | 27.0 | 13.5 | -| `order_by_multi_key` | 250.0 | 53.1 | 54.7 | 123.6 | 71.3 | 129.4 | -| `order_distinct_take` | 138.1 | 1.1 | 75.3 | 20.9 | 35.7 | 14.0 | -| `order_reverse_normalized` | 38.5 | 0.7 | 1.3 | 22.0 | 27.7 | — | -| `order_take_desc` | 38.2 | 0.7 | 1.3 | 22.0 | 27.5 | 17.8 | +| `groupby_average` | 177.2 | 1.6 | 1.9 | 37.2 | 45.6 | 51.9 | +| `groupby_count` | 145.8 | 1.3 | 1.5 | 20.6 | 34.1 | 43.9 | +| `groupby_first` | 265.0 | 1.3 | 2.3 | 20.7 | 34.6 | 43.7 | +| `groupby_having_count` | 144.6 | 1.3 | 1.5 | 20.7 | 34.1 | 46.7 | +| `groupby_having_hidden_sum` | 180.4 | 1.5 | 1.7 | 37.0 | 45.4 | 55.0 | +| `groupby_having_post_where` | 177.4 | 1.4 | 2.0 | 37.0 | 44.2 | 51.4 | +| `groupby_max` | 179.1 | 1.5 | 1.9 | 37.1 | 46.0 | 52.0 | +| `groupby_min` | 179.2 | 1.5 | 1.8 | 37.0 | 46.1 | 52.4 | +| `groupby_multi_reducer` | 195.2 | 1.6 | 2.0 | 37.1 | 45.9 | 61.3 | +| `groupby_select_order` | 176.3 | 1.4 | 1.9 | 37.0 | 44.4 | 51.4 | +| `groupby_select_sum` | 205.9 | 2.8 | 3.2 | 33.2 | 39.7 | 73.0 | +| `groupby_sum` | 175.9 | 1.4 | 1.6 | 37.0 | 44.5 | 51.9 | +| `groupby_where_count` | 76.5 | 0.9 | 1.3 | 37.2 | 41.9 | 52.2 | +| `groupby_where_sum` | 87.7 | 0.9 | 1.3 | 36.9 | 42.0 | 56.1 | +| `join_count` | 38.7 | 11.0 | 11.7 | 40.9 | 71.4 | 31.8 | +| `join_groupby_count` | 160.1 | 17.4 | 19.7 | 66.4 | 90.1 | 72.9 | +| `join_groupby_to_array` | 194.1 | 17.9 | 19.8 | 78.4 | 36.1 | 81.1 | +| `join_probe` | — | — | — | — | — | 24.2 | +| `join_probe_build` | — | — | — | — | — | 39.8 | +| `join_select` | 94.0 | 19.7 | 21.8 | 72.2 | 94.4 | 70.1 | +| `join_where_count` | 39.6 | 19.3 | 20.6 | 63.2 | 78.2 | 38.0 | +| `last_match` | 0.0 | 0.5 | 1.4 | 19.6 | 26.9 | 22.8 | +| `long_count_aggregate` | 29.8 | 0.3 | 0.6 | 29.4 | 26.5 | 13.8 | +| `max_aggregate` | 31.0 | 0.3 | 
0.5 | 29.8 | 27.9 | 13.5 | +| `min_aggregate` | 31.2 | 0.3 | 0.5 | 29.8 | 27.7 | 13.5 | +| `order_by_multi_key` | 251.0 | 54.8 | 54.8 | 124.4 | 71.8 | 129.5 | +| `order_distinct_take` | 142.6 | 1.1 | 75.8 | 21.0 | 35.8 | 14.0 | +| `order_reverse_normalized` | 38.7 | 0.7 | 1.4 | 19.8 | 28.6 | 17.8 | +| `order_take_desc` | 38.6 | 0.7 | 1.3 | 19.7 | 28.4 | 17.8 | | `point_lookup` | — | — | — | — | — | 0.0 | | `point_lookup_scan` | — | — | — | — | — | 6.0 | -| `reverse_distinct_by` | 295.7 | 1.5 | 3.2 | 20.5 | 34.4 | — | -| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | 26.9 | -| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.9 | — | -| `select_count` | 0.1 | 0.0 | 0.0 | 67.0 | 0.0 | 0.0 | -| `select_many` | — | 62.5 | — | — | — | — | -| `select_where` | 110.7 | 4.1 | 5.3 | 74.8 | 22.1 | 27.9 | -| `select_where_count` | 32.6 | 0.3 | 0.6 | 17.4 | 26.3 | 13.4 | -| `select_where_order_take` | 36.7 | 0.7 | 1.3 | 18.4 | 27.3 | 23.1 | -| `select_where_sum` | 37.2 | 0.4 | 0.6 | 17.4 | 25.6 | 13.4 | -| `single_match` | 0.0 | 0.4 | 1.1 | 46.2 | 22.3 | 17.3 | -| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.3 | 0.2 | -| `skip_while_match` | 3.4 | 0.4 | 0.4 | 46.0 | 21.8 | 13.2 | -| `sort_first` | 38.4 | 0.4 | 1.3 | 17.4 | 26.7 | 17.2 | -| `sort_take` | 38.6 | 0.7 | 1.3 | 22.0 | 27.9 | 17.8 | -| `sort_take_select` | 38.3 | 0.7 | 1.3 | 21.9 | 27.7 | 17.8 | -| `sum_aggregate` | 30.6 | 0.3 | 0.1 | 17.7 | 24.9 | 13.5 | -| `sum_where` | 33.0 | 0.3 | 0.6 | 17.4 | 26.3 | 13.4 | -| `take_count` | 1.9 | 0.1 | 0.1 | 1.2 | 0.3 | 0.2 | -| `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.4 | 0.1 | 0.1 | +| `reverse_distinct_by` | 297.0 | 1.6 | 3.1 | 20.6 | 34.6 | 18.8 | +| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | 27.0 | +| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.9 | 27.1 | +| `select_count` | 0.1 | 0.0 | 0.0 | 68.1 | 0.0 | 0.0 | +| `select_many` | — | 62.7 | — | — | — | — | +| `select_where` | 108.3 | 4.1 | 5.3 | 75.4 | 23.1 | 28.2 | +| `select_where_count` | 32.9 | 0.3 | 0.6 | 29.9 | 
27.2 | 13.5 | +| `select_where_order_take` | 37.0 | 0.7 | 1.4 | 19.8 | 27.9 | 23.3 | +| `select_where_sum` | 37.4 | 0.4 | 0.6 | 20.4 | 26.2 | 13.4 | +| `single_match` | 0.0 | 0.4 | 1.1 | 46.1 | 23.2 | 17.4 | +| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.2 | +| `skip_while_match` | 3.5 | 0.4 | 0.4 | 46.0 | 22.2 | 13.3 | +| `sort_first` | 38.3 | 0.4 | 1.3 | 18.9 | 27.5 | 17.3 | +| `sort_take` | 38.3 | 0.7 | 1.4 | 19.7 | 28.5 | 17.8 | +| `sort_take_select` | 38.4 | 0.7 | 1.3 | 19.8 | 28.4 | 17.7 | +| `sum_aggregate` | 30.4 | 0.3 | 0.1 | 23.3 | 25.6 | 13.5 | +| `sum_where` | 33.1 | 0.3 | 0.6 | 29.5 | 27.1 | 13.5 | +| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.2 | 0.3 | +| `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.5 | 0.1 | 0.2 | | `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | -| `take_where_count` | 0.9 | 0.0 | 0.0 | 0.2 | 0.0 | 0.1 | -| `take_while_match` | 7.8 | 0.2 | 0.3 | 17.4 | 8.9 | 13.4 | -| `to_array_filter` | 48.9 | 3.2 | 3.3 | 20.8 | 35.5 | 20.2 | -| `to_table` | — | — | — | — | — | 28.8 | -| `to_table_staged` | — | — | — | — | — | 41.6 | -| `where_join_count` | 41.3 | 5.7 | 6.8 | 48.8 | 41.8 | — | -| `zip_count_pred` | 39.8 | 0.1 | — | 115.3 | 33.8 | — | -| `zip_dot_product` | 47.3 | 0.1 | 0.1 | 115.4 | 33.8 | — | -| `zip_dot_product_3arg` | 47.1 | 0.1 | — | 115.3 | 33.7 | — | -| `zip_reverse_to_array` | — | 4.5 | — | 127.0 | 51.4 | — | +| `take_where_count` | 0.9 | 0.0 | 0.0 | 0.3 | 0.0 | 0.1 | +| `take_while_match` | 7.8 | 0.2 | 0.3 | 17.3 | 9.3 | 13.5 | +| `to_array_filter` | 48.5 | 3.3 | 3.4 | 22.2 | 35.4 | 20.3 | +| `to_table` | — | 14.1 | 37.4 | 49.7 | 54.3 | 29.2 | +| `to_table_staged` | — | 25.8 | 26.1 | 53.5 | 64.1 | 42.1 | +| `where_join_count` | 39.6 | 5.8 | 6.8 | 47.7 | 42.1 | 26.9 | +| `zip_count_pred` | 39.2 | 0.1 | — | 112.6 | 34.2 | — | +| `zip_dot_product` | 46.9 | 0.1 | 0.1 | 112.4 | 34.1 | — | +| `zip_dot_product_3arg` | 46.9 | 0.1 | — | 112.4 | 34.1 | — | +| `zip_reverse_to_array` | — | 4.6 | — | 123.5 | 51.8 | — | ## 
Missing lanes (the `—` cells) @@ -220,10 +220,10 @@ Each empty cell's reason is also in the bench `.das` file's comment; SQL gaps ar - **`reverse_distinct_by` m4 / m5f** — array uses the backward-index walk; non-array sources fuse the forward keep-last splice (decs 27.6/5.0, XML 74.5/22.2); SQL uses MAX(pk). - **`order_distinct_take` m4 vs m3f** — `unique_key` hashes workhorse keys directly (array `int`) but string-interpolates structs (decs `DecsBrand`); the gap is per-element string hashing, not decs-walk. `distinct_by_count` is the key-based variant (m4 parity). - **`zip_reverse_to_array` / `zip_*` SQL / Decs** — `reverse` has no SQL order key; zip is not relational / not expressible over one archetype walk. By design. (XML/JSON zip lanes are lit, partially fused.) -- **m7 absent families** — `zip_*` / `cross_join` (lockstep over an unordered slot walk is meaningless), `select_many` (flat fixture, no nested array field), `order_reverse_normalized` / `reverse_take_select` / `reverse_distinct_by` (no backward slot walk; `reverse_take` is kept as the single deferral marker), the group-by tail beyond `groupby_count`/`groupby_sum` (table group_by fusion is a named deferred edge — see `LINQ_TO_TABLE.md`; the two marker cells track the tier-2 cost) plus the join-composition lanes (`join_select` / `where_join_count` would fuse today but aren't instantiated; `join_groupby_*` needs the deferred group_by), `decs_count_bare_pred` (decs-only). +- **m7 absent families** — `zip_*` / `cross_join` (lockstep pairing over an unordered slot walk is meaningless) and `select_many` (flat fixture, no nested array field; array-only). Everything else in the m7 column is instantiated — but read the `groupby_*` / `join_groupby_*` / reverse-family cells as the **tier-2 cascade cost**, not a fused emit: table group_by fusion and a backward slot walk are named deferred edges (see `LINQ_TO_TABLE.md`), so those cells are the numbers a fix would improve. 
- **`point_lookup` / `point_lookup_scan` non-m7** — m7-only pair: only a table source has a key to probe (`where(kv.key == X)` + terminator → `key_exists` / `tab?[X]`, O(1)); the `_scan` twin forces the same query through the walk (compound `&&` predicate declines the probe) to show the gap. Other sources have no analog by design. - **`join_probe` / `join_probe_build` non-m7** — m7-only A/B pair: a table srcB joined on its bare key probes the user's table per lead row (no internal join hash, no build loop); the `_build` twin feeds the identical rows pre-materialized to a kv array, forcing the hashed build. Other sources have no keyed-srcB analog by design. -- **`to_table` / `to_table_staged` non-m7** — m7-only A/B pair for the `to_table()` sink: the fused insert-loop lands the kv chain straight in the result table (reserve from O(1) length); the `_staged` twin materializes the same projection to an array first, then converts via the consuming builtin `to_table_move` — the shape every chain had before the sink arm. The sink itself works over any direct-loop source (the array lane fuses it too); only the bench pair is table-scoped. +- **`to_table` / `to_table_staged` SQL** — `to_table` isn't an SQL terminator (`_sql` pass-through has no table sink). All in-memory sources are instantiated: array / XML / JSON / table fuse the insert-loop sink (`_staged` is the materialize-then-`to_table_move` shape every chain had before the sink arm); decs declines by design (explicit guard in its loop_or_count lane), so its `to_table` cell is the full tier-2 cascade — currently slower than its `_staged` twin, which fuses the array materialization first. That gap is the motivating number for a future decs sink hook. ## Accepted floors diff --git a/benchmarks/sql/table.das b/benchmarks/sql/table.das index d0afb6557..0a28d7afd 100644 --- a/benchmarks/sql/table.das +++ b/benchmarks/sql/table.das @@ -260,6 +260,21 @@ def first_or_default_match_m7(b : B?) 
{ } } +[benchmark] +def groupby_average_m7(b : B?) { + b |> run("groupby_average", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + AvgPrice = _._1 |> select($(c : CarKV) => c.value.price) |> average())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + [benchmark] def groupby_count_m7(b : B?) { b |> run("groupby_count", N) { @@ -274,6 +289,145 @@ def groupby_count_m7(b : B?) { } } +[benchmark] +def groupby_first_m7(b : B?) { + b |> run("groupby_first", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + FirstCar = _._1 |> first())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_having_count_m7(b : B?) { + b |> run("groupby_having_count", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._having(_._1 |> length >= 5) + ._select((Brand = _._0, N = _._1 |> length)) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_having_hidden_sum_m7(b : B?) { + b |> run("groupby_having_hidden_sum", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._having(_._1 |> select($(c : CarKV) => c.value.price) |> sum > 50000) + ._select((Brand = _._0, N = _._1 |> length)) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_having_post_where_m7(b : B?) { + b |> run("groupby_having_post_where", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + Total = _._1 |> select($(c : CarKV) => c.value.price) |> sum())) + ._where(_.Total > 9000000) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_max_m7(b : B?) 
{ + b |> run("groupby_max", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + MaxPrice = _._1 |> select($(c : CarKV) => c.value.price) |> max())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_min_m7(b : B?) { + b |> run("groupby_min", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + MinPrice = _._1 |> select($(c : CarKV) => c.value.price) |> min())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_multi_reducer_m7(b : B?) { + b |> run("groupby_multi_reducer", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + N = _._1 |> length, + TotalPrice = _._1 |> select($(c : CarKV) => c.value.price) |> sum(), + MaxPrice = _._1 |> select($(c : CarKV) => c.value.price) |> max())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_select_order_m7(b : B?) { + b |> run("groupby_select_order", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._group_by(_.value.brand) + ._select((Brand = _._0, + Total = _._1 |> select($(c : CarKV) => c.value.price) |> sum())) + ._order_by(_.Total) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_select_sum_m7(b : B?) { + b |> run("groupby_select_sum", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._select(_.value.price) + ._group_by(_ % 100) + ._select((K = _._0, S = _._1 |> sum())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + [benchmark] def groupby_sum_m7(b : B?) { b |> run("groupby_sum", N) { @@ -289,6 +443,37 @@ def groupby_sum_m7(b : B?) { } } +[benchmark] +def groupby_where_count_m7(b : B?) 
{ + b |> run("groupby_where_count", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._where(_.value.price > 500) + ._group_by(_.value.brand) + ._select((Brand = _._0, N = _._1 |> length)) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def groupby_where_sum_m7(b : B?) { + b |> run("groupby_where_sum", N) { + let groups <- _fold(unsafe(each_kv(g_t)) + ._where(_.value.price > 500) + ._group_by(_.value.brand) + ._select((Brand = _._0, + TotalPrice = _._1 |> select($(c : CarKV) => c.value.price) |> sum())) + .to_array()) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + [benchmark] def join_count_m7(b : B?) { b |> run("join_count", N) { @@ -303,6 +488,37 @@ def join_count_m7(b : B?) { } } +[benchmark] +def join_groupby_count_m7(b : B?) { + b |> run("join_groupby_count", N) { + let groups <- _fold(unsafe(each_kv(g_t)) |> _join(g_dealers, + $(c : CarKV, d : Dealer) => c.value.dealer_id == d.id, + $(c : CarKV, d : Dealer) => (Brand = c.value.brand, DealerId = d.id)) + |> _group_by(_.Brand) + |> _select((Brand = _._0, N = _._1 |> count()))) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + +[benchmark] +def join_groupby_to_array_m7(b : B?) { + b |> run("join_groupby_to_array", N) { + let groups <- _fold(unsafe(each_kv(g_t)) |> _join(g_dealers, + $(c : CarKV, d : Dealer) => c.value.dealer_id == d.id, + $(c : CarKV, _d : Dealer) => (Brand = c.value.brand, Price = c.value.price)) + |> _group_by(_.Brand) + |> _select((Brand = _._0, + Total = _._1 |> select($(t : tuple) => t.Price) |> sum()))) + b |> accept(groups) + if (empty(groups)) { + b->failNow() + } + } +} + [benchmark] def join_probe_m7(b : B?) { // srcB is a table joined on its bare key → fused key probe, no internal join hash @@ -335,6 +551,20 @@ def join_probe_build_m7(b : B?) { } } +[benchmark] +def join_select_m7(b : B?) 
{ + b |> run("join_select", N) { + let rows <- _fold(unsafe(each_kv(g_t)) |> _join(g_dealers, + $(c : CarKV, d : Dealer) => c.value.dealer_id == d.id, + $(c : CarKV, d : Dealer) => (CarName = c.value.name, DealerId = d.id)) + |> _select(_.CarName)) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } +} + [benchmark] def join_where_count_m7(b : B?) { b |> run("join_where_count", N) { @@ -420,6 +650,19 @@ def order_distinct_take_m7(b : B?) { } } +[benchmark] +def order_reverse_normalized_m7(b : B?) { + b |> run("order_reverse_normalized", N) { + unsafe { + let rows <- _fold(each_kv(g_t)._order_by(_.value.price).reverse().take(10).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + [benchmark] def order_take_desc_m7(b : B?) { b |> run("order_take_desc", N) { @@ -457,6 +700,19 @@ def point_lookup_scan_m7(b : B?) { } } +[benchmark] +def reverse_distinct_by_m7(b : B?) { + b |> run("reverse_distinct_by", N) { + unsafe { + let rows <- _fold(each_kv(g_t).reverse()._distinct_by(_.value.brand).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + [benchmark] def reverse_take_m7(b : B?) { b |> run("reverse_take", N) { @@ -470,6 +726,19 @@ def reverse_take_m7(b : B?) { } } +[benchmark] +def reverse_take_select_m7(b : B?) { + b |> run("reverse_take_select", N) { + unsafe { + let rows <- _fold(each_kv(g_t).reverse().take(10)._select(_.value.name).to_array()) + b |> accept(rows) + if (empty(rows)) { + b->failNow() + } + } + } +} + [benchmark] def select_count_m7(b : B?) { b |> run("select_count", N) { @@ -714,3 +983,18 @@ def to_table_staged_m7(b : B?) { delete tab } } + +[benchmark] +def where_join_count_m7(b : B?) 
{ + b |> run("where_join_count", N) { + let c = _fold(unsafe(each_kv(g_t)) |> _where(_.value.price > 500) + |> _join(g_dealers, + $(c : CarKV, d : Dealer) => c.value.dealer_id == d.id, + $(c : CarKV, d : Dealer) => (CarPrice = c.value.price, DealerId = d.id)) + |> count()) + b |> accept(c) + if (c == 0) { + b->failNow() + } + } +} diff --git a/benchmarks/sql/xml.das b/benchmarks/sql/xml.das index b8efcdcfb..fed22b0ea 100644 --- a/benchmarks/sql/xml.das +++ b/benchmarks/sql/xml.das @@ -904,6 +904,33 @@ def to_array_filter_m5f(b : B?) { } } +[benchmark] +def to_table_m5f(b : B?) { + // fused insert-loop sink: the chain lands straight in the result table + b |> run("to_table", N) { + var tab <- _fold(unsafe(from_xml_node(g_root, type)) |> _select((_.id => _.price)) |> to_table()) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + +[benchmark] +def to_table_staged_m5f(b : B?) { + // staged baseline: materialize the kv tuples to an array, then convert + b |> run("to_table_staged", N) { + var rows <- _fold(unsafe(from_xml_node(g_root, type)) |> _select((_.id => _.price)) |> to_array()) + var tab <- to_table_move(rows) + b |> accept(length(tab)) + if (empty(tab)) { + b->failNow() + } + delete tab + } +} + [benchmark] def where_join_count_m5f(b : B?) { b |> run("where_join_count", N) { From 901b014e0a12bbbfd2b6d73562000b07fb763852 Mon Sep 17 00:00:00 2001 From: Boris Batkin Date: Thu, 11 Jun 2026 10:01:59 -0700 Subject: [PATCH 11/11] docs: declare U+2261/U+21D2/U+00D7 for the LaTeX build MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The arc's linq_fold_patterns.rst additions use ≡ / ⇒ / × in prose; pdflatex halts on undeclared unicode (CI docs job failed on U+2261). conf.py's preamble is the documented place for these — verified locally via sphinx -b latex + pdflatex -halt-on-error pass 1. 
Co-Authored-By: Claude Fable 5
---
 doc/source/conf.py | 3 +++
 1 file changed, 3 insertions(+)
diff --git a/doc/source/conf.py b/doc/source/conf.py
index c27f78786..addfc9b45 100644
--- a/doc/source/conf.py
+++ b/doc/source/conf.py
@@ -268,8 +268,11 @@
 \DeclareUnicodeCharacter{2194}{\ensuremath{\leftrightarrow}}
 \DeclareUnicodeCharacter{2195}{\ensuremath{\updownarrow}}
 \DeclareUnicodeCharacter{2260}{\ensuremath{\neq}}
+\DeclareUnicodeCharacter{2261}{\ensuremath{\equiv}}
 \DeclareUnicodeCharacter{2264}{\ensuremath{\leq}}
 \DeclareUnicodeCharacter{2265}{\ensuremath{\geq}}
+\DeclareUnicodeCharacter{21D2}{\ensuremath{\Rightarrow}}
+\DeclareUnicodeCharacter{00D7}{\ensuremath{\times}}
 ''',
 # Latex figure (float) alignment