diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md index 36122143f..3ba956a74 100644 --- a/benchmarks/sql/LINQ_TO_TABLE.md +++ b/benchmarks/sql/LINQ_TO_TABLE.md @@ -4,10 +4,22 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco `table` / `table` as the 6th `_fold` source, plus the `to_table` sink. Edited in-place as PRs land. -Status: **stage 6 committed — arc complete** (to_table sink; stage 5 = join probe + table-lead -joins, 2742f6db2; stage 4 = point-lookup folds, ac441c4a0; stage 3 = `%linq!` table sources, -29d23baf6; stage 2 = TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba; -master's fixed-array rework merged in after stage 5, 1ab3e6a67). +Status: **stage 7 committed** (group_by fusion — `can_group_by` + `build_group_by_adapter` on +`TableAdapter`, riding `plan_group_by_core` with the usage-pruned slot walk as the bucket-fill +loop; stage 6 = to_table sink; stage 5 = join probe + table-lead joins, 2742f6db2; stage 4 = +point-lookup folds, ac441c4a0; stage 3 = `%linq!` table sources, 29d23baf6; stage 2 = +TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba; master's fixed-array +rework merged in after stage 5, 1ab3e6a67; the JIT inline slot walk landed separately, #3100). + +Stage 7 findings: +- **Two overrides were the whole change**: the group_by splice pattern was already adapter-generic + (`can_group_by_source` gate → `build_group_by_adapter` → `plan_group_by_core`), so enabling + tables = `can_group_by() == true` + a fresh-`TableAdapter` `build_group_by_adapter`. The kv + usage-pruner sees the whole accumulation body (key expr + reducer updates + upstream + where/select segments), so a group key over `kv.value.brand` walks `values(tab)` alone. +- m7 `groupby_*` INTERP 144–201 → 30–50 ns/op (count 163→31, ~5×); JIT 44–73 → 8.4–11 (count + 43.5→8.4, another ~5× — the fused emit rides #3100's inline slot walk). 
`join_groupby_*` stays + on the cascade (deferred edge below). Stage 6 findings: - **Tier-2 surface required for typing**: `_fold`'s argument must fully type before the macro @@ -212,6 +224,41 @@ PR1 findings: End of arc: `skills/linq.md` + linq docs mention the table source. +## Late stage (planned) — reducer shapes & general code hygiene + +Cross-source cleanups; none are table-specific. Items 1–2 are user-facing reducer-shape fixes, +items 3–4 are codebase hygiene investigations (the linq_fold surface is workable but "a tad too +unwieldy" — the table adapter took several stages, and many fuses read as "add this hook, because +reasons" rather than falling out of the architecture). + +1. **Identity-lambda reducers**: `_._1 |> max($(v) => v)` (also `min`/`sum`/`average`) fails with + 30303 today — the untyped lambda can't infer on the tier-2 lazy-bucket surface, and + `recognize_reducer_specs` has no identity arm either. Fix both ends: recognize the identity + inner-select and canonicalize to the bare form (`max()`), and make the tier-2 generic accept + it so unfused chains agree. +2. **Untyped inner-select lambda params**: `_._1 |> select($(c) => c.value.price) |> sum()` + requires an explicit param type (`$(c : CarKV)`) — the lazy bucket's `select` doesn't flow the + element type into the lambda. Thread the type through so the annotation becomes optional; + today's explicit-type requirement is a usability trap (the error is an opaque 30303, not + "annotate the param"). +3. **match.das adoption survey (linq_fold* + sqlite_* family)**: flatten_opt's move to + `daslib/match` bought both fewer lines and more readable matchers; the linq spine-walkers are + full of the same hand-rolled `is ExprCall` / `as` / null-guard chains. Prime candidates: + `match_key_probe_side` / `extract_key_probe` (manual `ExprRef2Value` peeling + `ExprField` + checks), `extract_*_source` gates, and the sqlite_linq chain decomposition. 
Survey first, + convert where the match form is a strict readability win. +4. **SourceAdapter interface audit**: all initially-wanted adapters now exist (11 classes: + Array/Zip/ArrayJoin/Decs/DecsJoin/Xml/XmlJoin/Json/JsonJoin/Table/ProjectedSource) — audit the + ~20-method interface against real usage and remove the scaffolding it forces. Known smells: + `arrayTop()`/`arraySrcName()` are still marked "transitional … removed once all consumers move + into subclass methods" yet remain load-bearing (reserve hint reads them; adapters override + them with comments explaining which distant gate reads what); the `build_group_by_adapter` + upstream-join arm repeats ~30 lines of keyaLam/keybLam/resultLam validation across + Array/Decs/Xml/Json; two overlapping reverse hooks (`emit_reverse_skip_into_tail` vs + `emit_reverse_last_backward`); emit fns reconstruct `headCalls` from stringly-keyed captures + ("mirrors emit_loop_or_count_lane_decs"). Goal: a new source should not need "several stages" + of hook-by-hook enablement for the standard fuse set. + ## Risks / watch items - **Mangler ICE 50609** (iterator element-const collision) — `each_kv` yields `-const` non-ref @@ -224,6 +271,10 @@ End of arc: `skills/linq.md` + linq docs mention the table source. ## Deferred edges (named, not built) +- **`join |> group_by` over a table lead**: `TableAdapter.build_group_by_adapter` declines the + upstream-join arm (returns null → tier-2). The fix is a TableJoin analog of `ArrayJoinAdapter` + (lead loop from the pruned slot walk, srcB hash/probe from the stage-5 pieces); the + `join_groupby_*` m7 cells are the numbers it would improve. Revisit on demand. - **Point-lookup conjunct extraction**: `where(kv.key == X && )` (incl. the collapsed multi-where form) could probe and evaluate the residual on the probed element only. The matcher currently declines compound predicates; add when a real chain wants it. 
diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md index 6ddc837b6..83b006a1e 100644 --- a/benchmarks/sql/results.md +++ b/benchmarks/sql/results.md @@ -21,9 +21,10 @@ are stable now). joined on its bare key probes the table instead of building the join hash — the `join_probe` / `join_probe_build` pair measures it; a trailing `to_table()` inserts straight into the result table with no intermediate array — the `to_table` / `to_table_staged` pair measures it; - group_by / reverse defer to tier-2). Under JIT, `keys`/`values` for-loop sources compile to an - inline open-addressed slot walk (no per-element C++ iterator calls), so the m7 JIT column is - fused codegen end to end. + group_by fuses through `plan_group_by_core` with the usage-pruned slot walk as the bucket-fill + loop; join+group_by and reverse defer to tier-2). Under JIT, `keys`/`values` for-loop sources + compile to an inline open-addressed slot walk (no per-element C++ iterator calls), so the m7 + JIT column is fused codegen end to end. `0.00` = early-exit terminator below timer resolution ("free"). Chain shapes are in `benchmarks/README.md`; the splice arms each fires are in `doc/source/reference/linq_fold_patterns.rst`. 
@@ -38,175 +39,175 @@ signal, JIT deltas as indicative.** | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 34.9 | 5.9 | 5.9 | 60.5 | 158.9 | 19.8 | -| `all_match` | 27.8 | 3.5 | 3.5 | 56.5 | 156.6 | 15.8 | +| `aggregate_match` | 34.7 | 5.9 | 6.1 | 60.5 | 158.9 | 19.9 | +| `all_match` | 27.8 | 3.5 | 3.5 | 56.1 | 157.6 | 16.1 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.2 | 6.1 | 8.7 | 58.8 | 157.2 | 17.2 | -| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.1 | 30.2 | -| `bare_order_where` | 279.9 | 117.8 | 125.7 | 302.3 | 292.5 | 163.9 | -| `chained_select_collapse` | — | 17.6 | 17.4 | 70.6 | 154.1 | 28.4 | -| `chained_where` | 36.5 | 6.6 | 7.1 | 105.8 | 177.6 | 23.9 | -| `contains_match` | 0.0 | 2.2 | 1.4 | 27.8 | 70.3 | 6.5 | -| `count_aggregate` | 29.7 | 4.2 | 4.1 | 64.1 | 158.1 | 20.3 | -| `cross_join` | 12594.5 | 3704.9 | — | 4030.6 | 4063.0 | — | +| `average_aggregate` | 30.6 | 6.1 | 8.7 | 58.7 | 157.7 | 17.2 | +| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 30.1 | +| `bare_order_where` | 279.8 | 117.0 | 125.6 | 302.7 | 296.8 | 164.3 | +| `chained_select_collapse` | — | 17.7 | 17.4 | 70.5 | 154.5 | 28.5 | +| `chained_where` | 36.6 | 6.6 | 7.1 | 105.5 | 177.6 | 23.9 | +| `contains_match` | 0.0 | 2.2 | 1.4 | 27.7 | 71.6 | 6.5 | +| `count_aggregate` | 29.4 | 4.2 | 4.1 | 64.2 | 162.2 | 20.3 | +| `cross_join` | 12628.8 | 3713.6 | — | 4051.3 | 4077.4 | — | | `decs_count_bare_pred` | — | — | 4.1 | — | — | — | -| `distinct_by_count` | 41.6 | 15.8 | 15.7 | 70.7 | 156.5 | 27.3 | -| `distinct_by_order_take` | 240.5 | 22.2 | 23.4 | 124.6 | 159.8 | 49.5 | -| `distinct_by_order_to_array` | 240.7 | 22.1 | 23.4 | 125.1 | 163.6 | 49.2 | -| `distinct_count` | 41.2 | 15.6 | 15.6 | 70.6 | 161.2 | 27.5 | -| `distinct_count_pred` | 253.4 | 15.9 | 15.9 | 112.6 | 173.8 | 27.4 | +| `distinct_by_count` | 41.4 | 15.7 | 15.7 | 70.6 | 159.1 | 27.2 | +| 
`distinct_by_order_take` | 242.2 | 22.0 | 23.4 | 124.9 | 158.7 | 49.5 | +| `distinct_by_order_to_array` | 240.8 | 21.9 | 23.4 | 125.2 | 160.5 | 49.5 | +| `distinct_count` | 41.6 | 15.5 | 15.6 | 70.6 | 154.4 | 27.4 | +| `distinct_count_pred` | 254.8 | 15.8 | 16.0 | 112.5 | 170.7 | 27.3 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 171.3 | 29.1 | 29.4 | 124.3 | 187.5 | 197.3 | -| `groupby_count` | 143.5 | 19.1 | 19.1 | 74.8 | 159.1 | 163.1 | -| `groupby_first` | 251.7 | 19.0 | 19.8 | 72.4 | 155.1 | 163.1 | -| `groupby_having_count` | 142.2 | 19.1 | 19.1 | 75.2 | 160.2 | 185.4 | -| `groupby_having_hidden_sum` | 175.9 | 22.2 | 22.3 | 119.0 | 183.5 | 215.6 | -| `groupby_having_post_where` | 172.8 | 20.4 | 20.5 | 115.3 | 181.3 | 194.2 | -| `groupby_max` | 173.1 | 24.9 | 24.8 | 120.5 | 184.1 | 201.8 | -| `groupby_min` | 173.6 | 25.6 | 25.2 | 120.7 | 184.3 | 204.0 | -| `groupby_multi_reducer` | 190.4 | 30.4 | 30.3 | 126.3 | 188.2 | 231.5 | -| `groupby_select_order` | 172.9 | 20.5 | 20.4 | 118.6 | 179.8 | 194.6 | -| `groupby_select_sum` | 198.2 | 38.5 | 38.8 | 102.1 | 185.0 | 188.0 | -| `groupby_sum` | 171.1 | 20.4 | 20.4 | 115.0 | 179.5 | 194.2 | -| `groupby_where_count` | 76.8 | 13.9 | 14.5 | 115.3 | 188.6 | 164.2 | -| `groupby_where_sum` | 87.4 | 14.2 | 14.8 | 116.7 | 187.1 | 179.4 | -| `join_count` | 38.6 | 51.8 | 63.9 | 112.3 | 185.6 | 64.2 | -| `join_groupby_count` | 158.4 | 76.9 | 88.5 | 178.1 | 225.8 | 259.3 | -| `join_groupby_to_array` | 190.4 | 78.3 | 90.6 | 215.6 | 212.2 | 290.2 | -| `join_probe` | — | — | — | — | — | 46.6 | -| `join_probe_build` | — | — | — | — | — | 79.5 | -| `join_select` | 150.9 | 73.5 | 84.6 | 187.9 | 207.1 | 223.5 | -| `join_where_count` | 39.8 | 61.9 | 75.8 | 161.2 | 192.9 | 79.8 | -| `last_match` | 0.0 | 5.8 | 13.9 | 65.3 | 157.9 
| 30.9 | -| `long_count_aggregate` | 29.7 | 4.2 | 4.1 | 63.7 | 158.0 | 20.1 | -| `max_aggregate` | 31.0 | 6.1 | 6.8 | 58.8 | 157.6 | 17.0 | -| `min_aggregate` | 30.9 | 6.1 | 6.8 | 59.0 | 159.3 | 17.0 | -| `order_by_multi_key` | 338.1 | 274.4 | 282.8 | 459.3 | 445.2 | 341.9 | -| `order_distinct_take` | 138.5 | 15.7 | 98.9 | 72.8 | 155.0 | 31.7 | -| `order_reverse_normalized` | 38.4 | 16.2 | 19.9 | 70.5 | 162.1 | 33.1 | -| `order_take_desc` | 38.3 | 16.4 | 19.9 | 70.6 | 162.5 | 33.0 | +| `groupby_average` | 176.5 | 29.0 | 29.3 | 124.3 | 187.3 | 41.8 | +| `groupby_count` | 143.2 | 19.0 | 19.4 | 74.7 | 160.0 | 31.1 | +| `groupby_first` | 253.3 | 19.0 | 20.1 | 72.4 | 156.2 | 40.7 | +| `groupby_having_count` | 141.7 | 19.0 | 19.1 | 75.1 | 159.4 | 31.2 | +| `groupby_having_hidden_sum` | 176.8 | 22.2 | 22.3 | 118.8 | 183.3 | 34.2 | +| `groupby_having_post_where` | 174.3 | 20.3 | 20.4 | 114.7 | 180.1 | 32.4 | +| `groupby_max` | 175.5 | 24.7 | 24.9 | 120.5 | 184.2 | 35.0 | +| `groupby_min` | 173.9 | 25.5 | 25.2 | 121.0 | 183.6 | 34.8 | +| `groupby_multi_reducer` | 191.1 | 30.3 | 30.3 | 126.3 | 189.1 | 43.5 | +| `groupby_select_order` | 172.0 | 20.3 | 20.4 | 115.0 | 179.9 | 32.3 | +| `groupby_select_sum` | 201.3 | 38.5 | 38.5 | 102.1 | 185.6 | 50.3 | +| `groupby_sum` | 170.5 | 20.4 | 20.4 | 115.0 | 179.7 | 32.4 | +| `groupby_where_count` | 76.5 | 13.9 | 14.5 | 115.8 | 181.4 | 29.9 | +| `groupby_where_sum` | 87.2 | 14.2 | 14.8 | 116.6 | 181.9 | 31.3 | +| `join_count` | 38.3 | 52.0 | 65.0 | 112.3 | 177.5 | 65.1 | +| `join_groupby_count` | 157.7 | 77.3 | 88.9 | 177.9 | 224.8 | 260.4 | +| `join_groupby_to_array` | 191.6 | 79.1 | 91.3 | 215.2 | 211.3 | 289.4 | +| `join_probe` | — | — | — | — | — | 46.5 | +| `join_probe_build` | — | — | — | — | — | 79.1 | +| `join_select` | 150.4 | 74.1 | 85.0 | 188.7 | 211.5 | 222.8 | +| `join_where_count` | 39.7 | 62.3 | 76.2 | 161.1 | 193.0 | 79.6 | +| `last_match` | 0.0 | 5.8 | 14.0 | 65.2 | 159.4 | 30.8 | +| `long_count_aggregate` | 29.8 | 4.2 
| 4.1 | 63.8 | 158.0 | 21.4 | +| `max_aggregate` | 31.3 | 6.1 | 6.8 | 58.8 | 157.3 | 16.9 | +| `min_aggregate` | 31.4 | 6.1 | 6.8 | 59.0 | 157.5 | 16.9 | +| `order_by_multi_key` | 341.1 | 274.6 | 283.0 | 459.8 | 450.9 | 334.7 | +| `order_distinct_take` | 138.7 | 15.7 | 99.0 | 72.6 | 155.7 | 31.6 | +| `order_reverse_normalized` | 38.4 | 16.3 | 20.0 | 70.5 | 162.8 | 32.9 | +| `order_take_desc` | 38.5 | 16.4 | 20.0 | 70.6 | 162.2 | 32.9 | | `point_lookup` | — | — | — | — | — | 0.0 | | `point_lookup_scan` | — | — | — | — | — | 8.3 | -| `reverse_distinct_by` | 296.9 | 21.1 | 27.7 | 71.7 | 154.5 | 44.4 | +| `reverse_distinct_by` | 296.8 | 21.1 | 28.3 | 71.3 | 155.8 | 43.8 | | `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.2 | 58.5 | -| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.2 | 58.7 | -| `select_count` | 0.1 | 0.0 | 2.2 | 63.4 | 2.2 | 0.0 | -| `select_many` | — | 189.8 | — | — | — | — | -| `select_where` | 199.2 | 11.2 | 19.2 | 197.4 | 186.5 | 37.8 | -| `select_where_count` | 33.0 | 5.2 | 7.5 | 65.2 | 150.0 | 23.2 | -| `select_where_order_take` | 37.0 | 12.2 | 14.9 | 72.5 | 163.1 | 34.7 | -| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.2 | 158.2 | 24.2 | -| `single_match` | 0.0 | 2.9 | 5.4 | 56.2 | 148.2 | 22.8 | +| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.2 | 58.6 | +| `select_count` | 0.1 | 0.0 | 2.2 | 65.1 | 2.2 | 0.0 | +| `select_many` | — | 189.4 | — | — | — | — | +| `select_where` | 195.0 | 11.2 | 19.3 | 197.8 | 188.5 | 37.6 | +| `select_where_count` | 33.0 | 5.2 | 7.4 | 65.3 | 150.2 | 23.0 | +| `select_where_order_take` | 37.0 | 12.2 | 14.8 | 72.7 | 162.8 | 34.6 | +| `select_where_sum` | 37.4 | 7.4 | 7.5 | 66.2 | 157.7 | 24.1 | +| `single_match` | 0.0 | 2.9 | 5.4 | 55.9 | 148.2 | 23.0 | | `skip_take` | 0.5 | 0.1 | 0.2 | 3.1 | 2.8 | 0.3 | -| `skip_while_match` | 3.5 | 5.3 | 5.3 | 57.9 | 150.2 | 18.2 | -| `sort_first` | 38.4 | 11.0 | 13.4 | 65.7 | 162.3 | 31.7 | -| `sort_take` | 38.6 | 16.1 | 20.3 | 70.7 | 163.1 | 33.2 | -| `sort_take_select` | 38.6 
| 16.4 | 20.2 | 70.9 | 161.9 | 33.2 | -| `sum_aggregate` | 30.0 | 2.1 | 2.1 | 54.8 | 156.9 | 13.5 | -| `sum_where` | 33.0 | 4.3 | 4.3 | 63.7 | 157.9 | 20.7 | -| `take_count` | 3.7 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | +| `skip_while_match` | 3.5 | 5.3 | 5.3 | 57.7 | 149.0 | 18.2 | +| `sort_first` | 38.2 | 11.0 | 13.4 | 65.6 | 159.4 | 31.6 | +| `sort_take` | 38.6 | 16.1 | 20.4 | 70.5 | 163.4 | 33.0 | +| `sort_take_select` | 38.2 | 16.4 | 20.3 | 71.0 | 163.1 | 33.1 | +| `sum_aggregate` | 30.2 | 2.1 | 2.1 | 54.5 | 156.8 | 13.5 | +| `sum_where` | 33.0 | 4.3 | 4.3 | 63.7 | 158.1 | 20.4 | +| `take_count` | 3.6 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 | | `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.4 | 1.1 | 0.3 | | `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | 0.1 | | `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | 0.2 | -| `take_while_match` | 7.8 | 2.4 | 2.5 | 29.0 | 72.6 | 16.9 | -| `to_array_filter` | 70.7 | 11.7 | 11.8 | 71.7 | 163.7 | 28.9 | -| `to_table` | — | 18.6 | 141.9 | 118.5 | 140.3 | 32.1 | -| `to_table_staged` | — | 54.7 | 56.6 | 143.3 | 165.1 | 68.5 | -| `where_join_count` | 41.6 | 29.4 | 41.0 | 132.1 | 171.8 | 46.7 | -| `zip_count_pred` | 39.4 | 15.8 | — | 316.5 | 317.6 | — | -| `zip_dot_product` | 49.8 | 12.6 | 10.6 | 312.4 | 313.7 | — | -| `zip_dot_product_3arg` | 50.2 | 12.7 | — | 312.5 | 313.9 | — | -| `zip_reverse_to_array` | — | 32.1 | — | 347.2 | 351.8 | — | +| `take_while_match` | 7.8 | 2.4 | 2.5 | 29.0 | 72.5 | 16.6 | +| `to_array_filter` | 70.6 | 11.7 | 11.7 | 71.8 | 161.9 | 28.9 | +| `to_table` | — | 18.7 | 141.8 | 118.5 | 140.1 | 32.2 | +| `to_table_staged` | — | 54.6 | 56.6 | 143.0 | 164.0 | 68.6 | +| `where_join_count` | 39.7 | 29.5 | 41.1 | 132.2 | 166.3 | 46.8 | +| `zip_count_pred` | 39.1 | 15.8 | — | 317.6 | 318.8 | — | +| `zip_dot_product` | 46.9 | 12.6 | 10.6 | 312.4 | 315.1 | — | +| `zip_dot_product_3arg` | 46.8 | 12.7 | — | 311.9 | 315.7 | — | +| `zip_reverse_to_array` | — | 32.1 | — | 348.5 | 349.7 | — | ## JIT | Benchmark | SQL (m1) | 
Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) | |---|---:|---:|---:|---:|---:|---:| -| `aggregate_match` | 35.0 | 0.3 | 0.7 | 29.7 | 27.3 | 7.3 | -| `all_match` | 27.7 | 0.3 | 0.2 | 18.8 | 25.3 | 7.2 | +| `aggregate_match` | 34.7 | 0.3 | 0.7 | 29.6 | 25.8 | 7.2 | +| `all_match` | 27.5 | 0.3 | 0.2 | 18.8 | 24.9 | 7.2 | | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `average_aggregate` | 30.3 | 1.0 | 3.6 | 18.8 | 24.4 | 7.4 | +| `average_aggregate` | 30.1 | 1.0 | 3.6 | 18.5 | 24.5 | 7.4 | | `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 8.9 | -| `bare_order_where` | 185.9 | 33.8 | 35.0 | 106.0 | 51.7 | 68.2 | -| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 32.1 | 8.1 | -| `chained_where` | 36.7 | 0.6 | 0.9 | 36.4 | 31.8 | 10.4 | -| `contains_match` | 0.0 | 0.2 | 0.1 | 16.8 | 9.2 | 2.5 | -| `count_aggregate` | 29.4 | 0.3 | 0.6 | 29.5 | 25.1 | 7.3 | -| `cross_join` | 5962.9 | 719.2 | — | 833.9 | 771.0 | — | +| `bare_order_where` | 184.9 | 33.8 | 35.0 | 105.9 | 51.7 | 68.4 | +| `chained_select_collapse` | — | 1.1 | 1.1 | 20.5 | 31.9 | 8.1 | +| `chained_where` | 36.6 | 0.6 | 0.9 | 36.3 | 29.9 | 10.6 | +| `contains_match` | 0.0 | 0.2 | 0.1 | 19.3 | 8.8 | 2.5 | +| `count_aggregate` | 29.8 | 0.3 | 0.6 | 29.1 | 25.1 | 7.3 | +| `cross_join` | 5967.3 | 717.5 | — | 831.1 | 766.7 | — | | `decs_count_bare_pred` | — | — | 0.6 | — | — | — | -| `distinct_by_count` | 41.6 | 1.1 | 1.1 | 20.6 | 31.9 | 8.0 | -| `distinct_by_order_take` | 238.9 | 1.7 | 2.6 | 45.3 | 37.2 | 19.6 | -| `distinct_by_order_to_array` | 239.5 | 1.7 | 2.7 | 45.5 | 37.0 | 19.5 | -| `distinct_count` | 41.4 | 1.1 | 1.1 | 20.7 | 33.1 | 8.0 | -| `distinct_count_pred` | 252.1 | 1.1 | 1.3 | 37.7 | 43.6 | 8.0 | +| `distinct_by_count` | 41.4 | 1.1 | 1.1 | 20.5 | 32.0 | 8.0 | +| `distinct_by_order_take` | 238.2 | 1.7 | 2.6 | 45.3 | 37.2 | 19.5 | +| `distinct_by_order_to_array` | 239.9 | 1.7 | 2.7 | 45.3 | 37.1 | 19.7 | +| `distinct_count` | 41.6 | 1.1 | 1.1 | 20.5 | 32.1 | 8.0 | +| 
`distinct_count_pred` | 252.3 | 1.1 | 1.3 | 37.6 | 41.8 | 8.0 | | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 | | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | -| `groupby_average` | 171.5 | 1.5 | 1.8 | 37.2 | 44.8 | 51.8 | -| `groupby_count` | 162.3 | 1.3 | 1.5 | 20.7 | 32.2 | 43.5 | -| `groupby_first` | 251.8 | 1.3 | 2.3 | 20.7 | 34.0 | 43.4 | -| `groupby_having_count` | 142.7 | 1.3 | 1.5 | 20.6 | 33.4 | 46.2 | -| `groupby_having_hidden_sum` | 175.1 | 1.5 | 1.9 | 37.0 | 43.0 | 54.5 | -| `groupby_having_post_where` | 171.4 | 1.4 | 1.9 | 37.0 | 42.0 | 51.4 | -| `groupby_max` | 173.0 | 1.5 | 1.9 | 37.1 | 43.6 | 51.9 | -| `groupby_min` | 172.4 | 1.5 | 1.9 | 38.2 | 43.4 | 52.6 | -| `groupby_multi_reducer` | 193.2 | 1.6 | 1.9 | 37.2 | 43.7 | 60.8 | -| `groupby_select_order` | 170.5 | 1.4 | 1.6 | 37.9 | 41.9 | 51.8 | -| `groupby_select_sum` | 196.9 | 2.8 | 3.2 | 33.5 | 37.7 | 73.3 | -| `groupby_sum` | 171.5 | 1.4 | 1.9 | 37.9 | 42.0 | 52.1 | -| `groupby_where_count` | 76.5 | 0.9 | 1.3 | 37.2 | 39.7 | 53.7 | -| `groupby_where_sum` | 87.4 | 0.9 | 1.3 | 37.1 | 39.7 | 57.7 | -| `join_count` | 38.3 | 11.2 | 12.5 | 40.8 | 68.0 | 25.2 | -| `join_groupby_count` | 157.5 | 17.2 | 19.3 | 66.4 | 86.0 | 73.1 | -| `join_groupby_to_array` | 190.7 | 17.8 | 19.7 | 78.6 | 35.8 | 81.4 | -| `join_probe` | — | — | — | — | — | 16.7 | -| `join_probe_build` | — | — | — | — | — | 33.2 | -| `join_select` | 91.8 | 19.6 | 21.7 | 73.5 | 89.8 | 70.1 | -| `join_where_count` | 39.2 | 19.2 | 20.6 | 63.3 | 77.3 | 31.7 | -| `last_match` | 0.0 | 0.5 | 1.4 | 19.6 | 25.1 | 12.1 | -| `long_count_aggregate` | 30.0 | 0.3 | 0.6 | 29.4 | 25.1 | 7.3 | -| `max_aggregate` | 31.0 | 0.3 | 0.5 | 29.7 | 26.3 | 7.5 | -| `min_aggregate` | 31.0 | 0.3 | 0.5 | 29.7 | 26.2 | 7.4 | -| `order_by_multi_key` | 242.6 | 53.3 | 54.4 | 124.6 | 70.5 | 119.3 | -| `order_distinct_take` | 138.6 | 1.1 
| 75.8 | 20.9 | 34.1 | 8.1 | -| `order_reverse_normalized` | 38.5 | 0.7 | 1.3 | 19.8 | 27.0 | 11.1 | -| `order_take_desc` | 38.8 | 0.7 | 1.3 | 19.8 | 26.9 | 10.0 | +| `groupby_average` | 175.4 | 1.5 | 1.8 | 37.0 | 42.9 | 8.9 | +| `groupby_count` | 153.2 | 1.3 | 1.5 | 20.5 | 32.3 | 8.4 | +| `groupby_first` | 252.8 | 1.3 | 2.3 | 20.5 | 32.9 | 10.0 | +| `groupby_having_count` | 141.8 | 1.3 | 1.5 | 20.5 | 32.3 | 8.5 | +| `groupby_having_hidden_sum` | 176.3 | 1.5 | 1.9 | 36.9 | 43.0 | 8.7 | +| `groupby_having_post_where` | 172.1 | 1.4 | 1.9 | 36.9 | 42.2 | 8.5 | +| `groupby_max` | 176.7 | 1.5 | 1.9 | 37.1 | 45.2 | 8.6 | +| `groupby_min` | 177.2 | 1.5 | 1.9 | 38.3 | 45.6 | 8.5 | +| `groupby_multi_reducer` | 191.7 | 1.6 | 1.9 | 37.2 | 43.7 | 9.0 | +| `groupby_select_order` | 176.6 | 1.4 | 1.6 | 37.8 | 41.9 | 8.4 | +| `groupby_select_sum` | 204.9 | 2.8 | 3.2 | 33.5 | 37.7 | 22.8 | +| `groupby_sum` | 171.6 | 1.4 | 1.9 | 37.7 | 42.0 | 8.4 | +| `groupby_where_count` | 76.9 | 0.9 | 1.3 | 37.1 | 39.7 | 11.2 | +| `groupby_where_sum` | 90.4 | 0.9 | 1.3 | 37.1 | 39.7 | 11.2 | +| `join_count` | 38.6 | 11.2 | 12.6 | 40.7 | 68.3 | 25.1 | +| `join_groupby_count` | 157.4 | 17.2 | 19.2 | 66.2 | 86.0 | 73.5 | +| `join_groupby_to_array` | 191.5 | 17.8 | 19.6 | 78.4 | 35.8 | 80.6 | +| `join_probe` | — | — | — | — | — | 16.6 | +| `join_probe_build` | — | — | — | — | — | 31.6 | +| `join_select` | 93.0 | 19.6 | 21.7 | 73.2 | 90.5 | 69.5 | +| `join_where_count` | 48.9 | 19.1 | 20.6 | 62.9 | 77.6 | 31.5 | +| `last_match` | 0.0 | 0.5 | 1.4 | 19.5 | 25.5 | 12.0 | +| `long_count_aggregate` | 29.7 | 0.3 | 0.6 | 29.3 | 25.1 | 7.3 | +| `max_aggregate` | 30.7 | 0.3 | 0.5 | 29.6 | 26.3 | 7.3 | +| `min_aggregate` | 31.0 | 0.3 | 0.5 | 29.6 | 26.3 | 7.4 | +| `order_by_multi_key` | 245.0 | 53.4 | 54.5 | 124.5 | 70.3 | 118.9 | +| `order_distinct_take` | 138.7 | 1.1 | 75.0 | 20.8 | 34.4 | 8.0 | +| `order_reverse_normalized` | 38.7 | 0.7 | 1.3 | 19.6 | 27.0 | 9.5 | +| `order_take_desc` | 38.7 | 0.7 | 1.3 | 
19.6 | 27.0 | 9.3 | | `point_lookup` | — | — | — | — | — | 0.0 | | `point_lookup_scan` | — | — | — | — | — | 3.0 | -| `reverse_distinct_by` | 296.3 | 1.6 | 3.2 | 20.6 | 32.6 | 10.9 | -| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | 19.3 | -| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | 19.2 | -| `select_count` | 0.1 | 0.0 | 0.0 | 67.0 | 0.0 | 0.0 | -| `select_many` | — | 61.4 | — | — | — | — | -| `select_where` | 107.8 | 4.1 | 5.3 | 76.0 | 22.2 | 17.9 | -| `select_where_count` | 32.6 | 0.3 | 0.6 | 29.5 | 25.9 | 7.4 | -| `select_where_order_take` | 36.8 | 0.7 | 1.4 | 19.8 | 26.6 | 13.0 | -| `select_where_sum` | 37.4 | 0.4 | 0.6 | 20.4 | 24.8 | 7.5 | -| `single_match` | 0.0 | 0.4 | 1.1 | 45.9 | 22.2 | 9.7 | -| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.1 | -| `skip_while_match` | 3.5 | 0.4 | 0.4 | 46.1 | 21.7 | 7.8 | -| `sort_first` | 38.1 | 0.4 | 1.3 | 18.8 | 26.1 | 9.3 | -| `sort_take` | 38.6 | 0.7 | 1.3 | 19.8 | 27.0 | 9.7 | -| `sort_take_select` | 38.6 | 0.7 | 1.3 | 19.8 | 26.9 | 9.6 | -| `sum_aggregate` | 30.2 | 0.3 | 0.0 | 22.5 | 24.2 | 7.3 | -| `sum_where` | 32.9 | 0.3 | 0.6 | 29.6 | 25.8 | 7.3 | -| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.2 | 0.1 | +| `reverse_distinct_by` | 296.5 | 1.6 | 3.2 | 20.5 | 34.2 | 11.1 | +| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 19.5 | +| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 19.0 | +| `select_count` | 0.1 | 0.0 | 0.0 | 64.6 | 0.0 | 0.0 | +| `select_many` | — | 61.3 | — | — | — | — | +| `select_where` | 107.1 | 4.2 | 5.2 | 75.6 | 21.9 | 17.7 | +| `select_where_count` | 32.8 | 0.3 | 0.6 | 29.4 | 25.9 | 7.2 | +| `select_where_order_take` | 37.1 | 0.7 | 1.4 | 19.6 | 26.6 | 12.9 | +| `select_where_sum` | 37.2 | 0.4 | 0.6 | 20.2 | 24.9 | 7.4 | +| `single_match` | 0.0 | 0.4 | 1.1 | 44.2 | 22.3 | 9.2 | +| `skip_take` | 0.3 | 0.0 | 0.0 | 1.2 | 0.2 | 0.1 | +| `skip_while_match` | 3.5 | 0.4 | 0.4 | 44.4 | 21.8 | 7.6 | +| `sort_first` | 38.2 | 0.4 | 1.3 | 18.7 | 26.2 | 9.3 | +| `sort_take` | 38.4 | 0.7 | 
1.3 | 19.6 | 27.2 | 9.6 | +| `sort_take_select` | 38.5 | 0.7 | 1.4 | 19.5 | 27.1 | 9.6 | +| `sum_aggregate` | 30.3 | 0.3 | 0.0 | 22.4 | 24.2 | 7.3 | +| `sum_where` | 33.1 | 0.3 | 0.6 | 29.5 | 25.8 | 7.3 | +| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.3 | 0.1 | | `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.5 | 0.1 | 0.0 | | `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 | | `take_where_count` | 0.9 | 0.0 | 0.0 | 0.3 | 0.0 | 0.0 | -| `take_while_match` | 7.8 | 0.2 | 0.3 | 16.9 | 9.0 | 7.3 | -| `to_array_filter` | 48.4 | 3.3 | 3.3 | 22.2 | 33.7 | 13.2 | -| `to_table` | — | 14.1 | 37.1 | 49.6 | 52.2 | 20.6 | -| `to_table_staged` | — | 25.8 | 26.2 | 53.2 | 61.3 | 33.6 | -| `where_join_count` | 41.6 | 6.0 | 6.7 | 48.0 | 40.6 | 19.9 | -| `zip_count_pred` | 39.3 | 0.1 | — | 114.0 | 33.7 | — | -| `zip_dot_product` | 46.9 | 0.1 | 0.1 | 113.6 | 33.5 | — | -| `zip_dot_product_3arg` | 46.6 | 0.1 | — | 113.9 | 33.7 | — | -| `zip_reverse_to_array` | — | 4.5 | — | 125.3 | 50.7 | — | +| `take_while_match` | 7.8 | 0.2 | 0.3 | 19.0 | 9.0 | 7.3 | +| `to_array_filter` | 48.4 | 3.3 | 3.3 | 22.0 | 33.5 | 13.0 | +| `to_table` | — | 14.1 | 36.9 | 49.4 | 52.0 | 21.1 | +| `to_table_staged` | — | 25.8 | 26.0 | 53.0 | 61.5 | 33.9 | +| `where_join_count` | 41.7 | 6.0 | 6.7 | 47.6 | 40.5 | 19.7 | +| `zip_count_pred` | 39.6 | 0.1 | — | 113.5 | 33.5 | — | +| `zip_dot_product` | 46.9 | 0.1 | 0.1 | 113.6 | 33.3 | — | +| `zip_dot_product_3arg` | 46.8 | 0.1 | — | 113.6 | 33.3 | — | +| `zip_reverse_to_array` | — | 4.5 | — | 124.9 | 50.6 | — | ## Missing lanes (the `—` cells) @@ -222,7 +223,7 @@ Each empty cell's reason is also in the bench `.das` file's comment; SQL gaps ar - **`reverse_distinct_by` m4 / m5f** — array uses the backward-index walk; non-array sources fuse the forward keep-last splice (decs 27.6/5.0, XML 74.5/22.2); SQL uses MAX(pk). 
- **`order_distinct_take` m4 vs m3f** — `unique_key` hashes workhorse keys directly (array `int`) but string-interpolates structs (decs `DecsBrand`); the gap is per-element string hashing, not decs-walk. `distinct_by_count` is the key-based variant (m4 parity). - **`zip_reverse_to_array` / `zip_*` SQL / Decs** — `reverse` has no SQL order key; zip is not relational / not expressible over one archetype walk. By design. (XML/JSON zip lanes are lit, partially fused.) -- **m7 absent families** — `zip_*` / `cross_join` (lockstep pairing over an unordered slot walk is meaningless) and `select_many` (flat fixture, no nested array field; array-only). Everything else in the m7 column is instantiated — but read the `groupby_*` / `join_groupby_*` / reverse-family cells as the **tier-2 cascade cost**, not a fused emit: table group_by fusion and a backward slot walk are named deferred edges (see `LINQ_TO_TABLE.md`), so those cells are the numbers a fix would improve. +- **m7 absent families** — `zip_*` / `cross_join` (lockstep pairing over an unordered slot walk is meaningless) and `select_many` (flat fixture, no nested array field; array-only). Everything else in the m7 column is instantiated, and the `groupby_*` family is a fused emit (`plan_group_by_core` over the usage-pruned slot walk). The remaining cascade cells are `join_groupby_*` (join |> group_by over a table lead declines) and the reverse family (no backward slot walk) — both named deferred edges (see `LINQ_TO_TABLE.md`), so those cells are the numbers a fix would improve. - **`point_lookup` / `point_lookup_scan` non-m7** — m7-only pair: only a table source has a key to probe (`where(kv.key == X)` + terminator → `key_exists` / `tab?[X]`, O(1)); the `_scan` twin forces the same query through the walk (compound `&&` predicate declines the probe) to show the gap. Other sources have no analog by design. 
- **`join_probe` / `join_probe_build` non-m7** — m7-only A/B pair: a table srcB joined on its bare key probes the user's table per lead row (no internal join hash, no build loop); the `_build` twin feeds the identical rows pre-materialized to a kv array, forcing the hashed build. Other sources have no keyed-srcB analog by design. - **`to_table` / `to_table_staged` SQL** — `to_table` isn't an SQL terminator (`_sql` pass-through has no table sink). All in-memory sources are instantiated: array / XML / JSON / table fuse the insert-loop sink (`_staged` is the materialize-then-`to_table_move` shape every chain had before the sink arm); decs declines by design (explicit guard in its loop_or_count lane), so its `to_table` cell is the full tier-2 cascade — currently slower than its `_staged` twin, which fuses the array materialization first. That gap is the motivating number for a future decs sink hook. diff --git a/daslib/linq_fold_table.das b/daslib/linq_fold_table.das index 89b20b630..d744c1884 100644 --- a/daslib/linq_fold_table.das +++ b/daslib/linq_fold_table.das @@ -125,9 +125,18 @@ class TableAdapter : SourceAdapter { def override const can_join() : bool { return true // rides emit_array_join: direct-return lead loop via wrap_source_loop } + def override const can_group_by() : bool { + return true // plan_group_by_core drives the bucket-fill through wrap_source_loop (kv pruning free) + } def override emit_join_hook(var c : Captures; var ctx : EmitCtx; at : LineInfo) : Expression? { return emit_array_join(c, ctx, at) } + def override build_group_by_adapter(var c : Captures; var ctx : EmitCtx; at : LineInfo) : SourceAdapter? { + // join |> group_by over a table lead stays a deferred edge (LINQ_TO_TABLE.md) — null bails to tier-2 + if (c.single |> key_exists("upstream_join")) return null + return new TableAdapter(tabExpr = clone_expression(tabExpr), srcName = qn("tsrc", at), + elemType = clone_type(elemType), lane = lane) + } def override arrayTop() : Expression? 
{ // Feeds the reserve hint (type_has_length covers tables). The backward-index reverse lanes that // also read arrayTop gate on array_source, which is false here — matchTop stays iterator-typed. diff --git a/doc/source/reference/linq_fold_patterns.rst b/doc/source/reference/linq_fold_patterns.rst index fb22a137e..b3f17fcf5 100644 --- a/doc/source/reference/linq_fold_patterns.rst +++ b/doc/source/reference/linq_fold_patterns.rst @@ -150,7 +150,7 @@ Source-side entry points - Optional source — only when the ``pugixml`` module is linked (``require ?pugixml`` + ``static_if (typeinfo builtin_module_exists(pugixml))``). Emits an inlined DOM child-element walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): the chain body is scanned for the ``Row`` fields it reads, and only those attributes are read via ``read_xml_field`` into scalar locals — unread fields (notably ``string`` fields, whose ``clone_string`` is the alloc cost) are never touched, so a float-only chain runs alloc-free and JIT beats the equivalent SQLite query. A whole-row escape (``to_array`` / identity ``_select(_)`` / pass-to-fn) routes to the full ``build_xml_row`` instead. The ``XmlAdapter`` **rides every pattern row** (``try_splice_patterns`` runs with no ``onlyRow`` restriction); per-row ``requires`` predicates and the adapter's capability hooks (``can_join`` / ``can_group_by`` / ``defers_materialization`` / the ``non_array_source`` gate) decide what fuses, and a shape it can't fuse cascades to tier-2 — see :ref:`linq_fold_xml_patterns` for the full fuse/defer breakdown. ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``) and the node is passed by value (``var root`` — ``_fold``'s macro-arg inference skips the const&→value copy). 
* - ``unsafe(each_kv(tab))`` / ``keys(tab)`` / ``values(tab)`` - ``extract_table_source`` (``TableAdapter``, ``daslib/linq_fold_table.das``) - - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. 
``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. **Joins fuse on either side** (``can_join`` is on; the adapter rides the shared ``emit_array_join`` through its own ``wrap_source_loop``): a table *lead* walks its pruned slot iterator(s) as the probe loop; a table in the *srcB slot* joined on its bare key — ``d.key`` on the kv lane, the bare element on a ``keys(set)`` source — skips the join's internal ``table>`` entirely and probes the user's table per lead row (``join_keyb_is_bare_key`` + ``build_join_probe_pieces``; unique table keys make the probe ≡ hash semantics exactly). The probe is itself usage-pruned: count-no-where and key-only shapes stay on ``key_exists``, value shapes bind the matched value **by reference** from a ``tab?[k]`` pointer (no copy), and only a whole-pair use binds the kv tuple. A non-bare b-key keeps the hashed build over the kv iterator; ``group_join`` (outer — its result consumes the whole bucket) always keeps it. ``can_group_by`` is off and reverse has no backward slot walk — those shapes cascade to tier-2 (see ``benchmarks/sql/LINQ_TO_TABLE.md``). **``to_table()`` sinks fuse** (table-buffer materializer row above): the chain inserts straight into the result table — a bare ``each_kv(tab).to_table()`` is a reserve-ahead table clone through the fused walk, and a ``keys(tab)`` chain lands in the ``table`` set form. ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. + - In-tree source — recognized by name **plus** a table-typed argument (``table`` / ``table``), so an unrelated user ``keys`` never fires it. 
The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. 
**Joins fuse on either side** (``can_join`` is on; the adapter rides the shared ``emit_array_join`` through its own ``wrap_source_loop``): a table *lead* walks its pruned slot iterator(s) as the probe loop; a table in the *srcB slot* joined on its bare key — ``d.key`` on the kv lane, the bare element on a ``keys(set)`` source — skips the join's internal ``table>`` entirely and probes the user's table per lead row (``join_keyb_is_bare_key`` + ``build_join_probe_pieces``; unique table keys make the probe ≡ hash semantics exactly). The probe is itself usage-pruned: count-no-where and key-only shapes stay on ``key_exists``, value shapes bind the matched value **by reference** from a ``tab?[k]`` pointer (no copy), and only a whole-pair use binds the kv tuple. A non-bare b-key keeps the hashed build over the kv iterator; ``group_join`` (outer — its result consumes the whole bucket) always keeps it. **``group_by`` fuses** (``can_group_by`` is on; ``build_group_by_adapter`` hands ``plan_group_by_core`` a fresh ``TableAdapter``, so the bucket-fill loop is the usage-pruned slot walk — a group key over ``kv.value.brand`` walks ``values(tab)`` alone) for the plain-lead shape only: ``join |> group_by`` over a table lead declines (the upstream-join arm returns null) and reverse has no backward slot walk — those shapes cascade to tier-2 (see ``benchmarks/sql/LINQ_TO_TABLE.md``). **``to_table()`` sinks fuse** (table-buffer materializer row above): the chain inserts straight into the result table — a bare ``each_kv(tab).to_table()`` is a reserve-ahead table clone through the fused walk, and a ``keys(tab)`` chain lands in the ``table`` set form. ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference. 
* - ``unsafe(from_json(jv, type))`` - ``extract_json_source`` (``JsonAdapter``, ``daslib/linq_fold_json.das``) - In-tree source — the adapter is compiled in unconditionally (no ``static_if`` gate, unlike XML's pugixml one), but a program only pulls JSON into scope by requiring ``json`` / ``json_boost`` itself. ``extract_json_source`` matches a ``from_json`` whose first argument is a ``json::JsonValue?``, so a JSON-less program returns null and the chain falls to the array tier. The adapter pulls in **no** json dependency — it emits ``from_json`` / ``read_json_field`` by name (resolved at the user's splice site, like ``linq_fold_decs`` emits ``for_each_archetype``; ``from_JV`` is emitted only for a non-struct element type). Emits an inlined ``for (e in jv.value as _array)`` walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): only the keys the chain reads are pulled via ``read_json_field`` by name — unread keys (notably ``string`` fields whose materialization clones) are never touched, so a scalar-only chain skips ~all of the full per-row build (3.6× over the full materialize — see ``benchmarks/micro/json_source_shapes.das``). A whole-row escape reads **every** top-level field by name (``emit_full_row_by_name``), so a custom whole-row ``from_JV(Row)`` override is **not** honored (Option B — this is a flat query source, not a deserializer; materialize the array with an explicit ``from_JV`` first for that). ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``). Deferred materialization mirrors XML: order/distinct/take buffer a cheap ``(orderKey, JsonValue?)`` surrogate and materialize only the K survivors — by name (``emit_full_row_by_name``), so a struct survivor reads each field by key; only a non-struct ``Row`` falls back to ``outBind <- from_JV(handle, type)``. 
The ``JsonAdapter`` also fuses ``join`` / ``join |> group_by`` (``emit_join_hook`` + ``JsonJoinAdapter`` off ``build_group_by_adapter``'s upstream-join arm), reusing the array-join machinery (``build_join_standalone_pieces`` / ``build_join_adapter_pieces``): srcB is collected into a ``table>`` and the field-pruned array walk is the probe side, so the join key reads only its own field per element (e.g. ``read_json_field(jcur, "brand", …)``). Standalone ``group_join`` and a trailing ``where`` / ``select`` / ``count`` over group-join rows defer to tier-2, mirroring XML. @@ -497,6 +497,16 @@ see the table-source row above and the probe row below. so a join touching only ``c.value.*`` walks ``values(tab)`` alone. All srcB modes compose (hashed array/iterator srcB, table-srcB probe); ``group_join`` stays outer over every slot. + * - ``unsafe(each_kv(tab)) |> _group_by(K) |> _select(reduce) |> ...`` (``keys`` / ``values`` lanes too; ``having`` / trailing ``where`` / ``order_by`` / ``count`` compose) + - pattern ``group_by`` → ``TableAdapter.build_group_by_adapter`` → ``plan_group_by_core`` + - **Table lead group_by**: ``build_group_by_adapter`` hands the + planner a fresh ``TableAdapter``, so the bucket-fill loop is + framed by ``wrap_source_loop`` and the kv usage-pruner sees the + whole accumulation body (key expr + reducer updates + upstream + where/select segments) — a group key over ``kv.value.brand`` + walks ``values(tab)`` alone. ``join |> group_by`` over a table + lead declines (the upstream-join arm returns null) and cascades + to tier-2. 
* - ``arrA |> _group_join(arrB, on, into)`` (+ optional leading ``_where``) - pattern ``join_general`` with the ``group_join`` literal (``isGroupJoin``) - C# GroupJoin (**outer**): one result row per srcA row — ``result(a, diff --git a/tests/linq/test_linq_table_source.das b/tests/linq/test_linq_table_source.das index 8900c05f2..6c78bec35 100644 --- a/tests/linq/test_linq_table_source.das +++ b/tests/linq/test_linq_table_source.das @@ -619,3 +619,113 @@ def test_to_table_sink(t : T?) { delete arr } } + +[test] +def test_table_group_by(t : T?) { + t |> run("kv lane: count + sum per group agrees with a hand loop") @(t : T?) { + var tab <- make_int_table(20) + let groups <- _fold(each_kv(tab) + ._group_by(_.key % 3) + ._select((K = _._0, N = _._1 |> count(), S = _._1 |> select($(kv : IKV) => kv.value) |> sum())) + .to_array()) + var expN : table + var expS : table + for (k, v in keys(tab), values(tab)) { + expN[k % 3] ++ + expS[k % 3] += v + } + t |> equal(length(groups), length(expN)) + for (g in groups) { + t |> equal(g.N, expN?[g.K] ?? -1) + t |> equal(g.S, expS?[g.K] ?? -1) + } + delete expN + delete expS + delete tab + } + t |> run("values lane: group key + reducer over struct values") @(t : T?) { + var tab <- make_pt_table() + let groups <- _fold(values(tab) + ._group_by(_.x % 2) + ._select((K = _._0, MaxY = _._1 |> select($(p : Pt) => p.y) |> max())) + .to_array()) + t |> equal(length(groups), 2) + for (g in groups) { + t |> equal(g.MaxY, g.K == 0 ? 40 : 30) + } + delete tab + } + t |> run("keys lane: group over raw keys") @(t : T?) { + var tab <- make_int_table(10) + let groups <- _fold(keys(tab)._group_by(_ % 4)._select((K = _._0, N = _._1 |> length)).to_array()) + t |> equal(length(groups), 4) + var total = 0 + for (g in groups) { + total += g.N + } + t |> equal(total, 10) + delete tab + } + t |> run("upstream where + select segments feed the bucket fill") @(t : T?) 
{ + var tab <- make_int_table(20) + let groups <- _fold(each_kv(tab) + ._where(_.key % 2 == 0) + ._select(_.value) + ._group_by(_ % 40) + ._select((K = _._0, N = _._1 |> length)) + .to_array()) + // even keys 0..18 -> values 0,20,..,180; % 40 buckets {0, 20}, 5 each + t |> equal(length(groups), 2) + for (g in groups) { + t |> equal(g.N, 5) + } + delete tab + } + t |> run("having + trailing where + order") @(t : T?) { + var tab <- make_int_table(12) + let groups <- _fold(each_kv(tab) + ._group_by(_.key % 5) + ._having(_._1 |> length >= 2) + ._select((K = _._0, S = _._1 |> select($(kv : IKV) => kv.value) |> sum())) + ._where(_.S > 60) + ._order_by(_.S) + .to_array()) + // buckets 0..4 sizes 3,3,2,2,2; sums 150,180,90,110,130 -> having keeps all, where keeps S>60 + t |> equal(length(groups), 5) + var prev = -1 + for (g in groups) { + t |> success(g.S > 60 && g.S >= prev) + prev = g.S + } + delete tab + } + t |> run("count terminator counts groups") @(t : T?) { + var tab <- make_int_table(10) + t |> equal(_fold(each_kv(tab)._group_by(_.key % 3)._select((K = _._0, N = _._1 |> length)).count()), 3) + delete tab + } + t |> run("empty table yields no groups") @(t : T?) { + let e : table + let groups <- _fold(each_kv(e)._group_by(_.key)._select((K = _._0, N = _._1 |> length)).to_array()) + t |> equal(length(groups), 0) + } + t |> run("fused agrees with the tier-2 iterator path") @(t : T?) { + var tab <- make_int_table(15) + let fused <- _fold(each_kv(tab) + ._group_by(_.key % 4) + ._select((K = _._0, N = _._1 |> count())) + .to_array()) + var tier2 <- unsafe(each_kv(tab)) |> _group_by(_.key % 4) |> _select((K = _._0, N = _._1 |> length)) |> to_array() + t |> equal(length(fused), length(tier2)) + var t2 : table + for (g in tier2) { + t2[g.K] = g.N + } + for (g in fused) { + t |> equal(g.N, t2?[g.K] ?? -1) + } + delete t2 + delete tier2 + delete tab + } +}
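Reviewer sketch, not part of the patch: a small Python model of what the fused table `group_by` in the tests above is expected to compute. It rests on two assumptions. First, the usage pruner is modelled as a three-way choice (keys-only, values-only, zipped) driven by which sides of `kv` the accumulation body reads, as the docs describe. Second, `make_int_table(n)` is assumed to map key `k` to `k * 10`, which is inferred from the test comments rather than shown in this diff. `pick_walk` and `group_count_sum` are hypothetical names, and a Python `dict` stands in for the daslang table.

```python
from collections import defaultdict


def pick_walk(reads_key: bool, reads_value: bool) -> str:
    """Model of the documented pruning rule (an assumption, not the real emitter).

    Only .value read      -> walk values(tab) alone
    Only .key, or neither -> walk keys(tab) alone
    Both sides read       -> zipped two-iterator walk
    """
    if reads_key and reads_value:
        return "zipped"
    if reads_value:
        return "values"
    return "keys"


def group_count_sum(tab, key_of):
    """Single-pass bucket fill: one count and one sum accumulator per group key."""
    counts = defaultdict(int)
    sums = defaultdict(int)
    for k, v in tab.items():
        g = key_of(k, v)
        counts[g] += 1
        sums[g] += v
    return {g: (counts[g], sums[g]) for g in counts}


# Assumed shape of make_int_table(12): key k -> value k * 10.
tab = {k: k * 10 for k in range(12)}

# Mirrors the "having + trailing where + order" test: group key is kv.key % 5.
groups = group_count_sum(tab, lambda k, v: k % 5)
kept = sorted(s for (n, s) in groups.values() if n >= 2 and s > 60)
```

Under these assumptions `kept` holds five sums in ascending order, which matches the test's expectation that `having` and the trailing `where` both keep every group. A key that reads only `kv.value` would make `pick_walk` select the values-only walk, the case the docs illustrate with `kv.value.brand`.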