diff --git a/benchmarks/sql/LINQ_TO_TABLE.md b/benchmarks/sql/LINQ_TO_TABLE.md
index 36122143f..3ba956a74 100644
--- a/benchmarks/sql/LINQ_TO_TABLE.md
+++ b/benchmarks/sql/LINQ_TO_TABLE.md
@@ -4,10 +4,22 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco
 `table<K;V>` / `table<K>` as the 6th `_fold` source, plus the `to_table` sink.
 Edited in-place as PRs land.
 
-Status: **stage 6 committed — arc complete** (to_table sink; stage 5 = join probe + table-lead
-joins, 2742f6db2; stage 4 = point-lookup folds, ac441c4a0; stage 3 = `%linq!` table sources,
-29d23baf6; stage 2 = TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba;
-master's fixed-array rework merged in after stage 5, 1ab3e6a67).
+Status: **stage 7 committed** (group_by fusion — `can_group_by` + `build_group_by_adapter` on
+`TableAdapter`, riding `plan_group_by_core` with the usage-pruned slot walk as the bucket-fill
+loop; stage 6 = to_table sink; stage 5 = join probe + table-lead joins, 2742f6db2; stage 4 =
+point-lookup folds, ac441c4a0; stage 3 = `%linq!` table sources, 29d23baf6; stage 2 =
+TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba; master's fixed-array
+rework merged in after stage 5, 1ab3e6a67; the JIT inline slot walk landed separately, #3100).
+
+Stage 7 findings:
+- **Two overrides were the whole change**: the group_by splice pattern was already adapter-generic
+  (`can_group_by_source` gate → `build_group_by_adapter` → `plan_group_by_core`), so enabling
+  tables = `can_group_by() == true` + a fresh-`TableAdapter` `build_group_by_adapter`. The kv
+  usage-pruner sees the whole accumulation body (key expr + reducer updates + upstream
+  where/select segments), so a group key over `kv.value.brand` walks `values(tab)` alone.
+- m7 `groupby_*` INTERP 163–232 → 30–50 ns/op (count 163→31, ~5×); JIT 43–73 → 8.4–23 (count
+  43.5→8.4, another ~5× — the fused emit rides #3100's inline slot walk). `join_groupby_*` stays
+  on the cascade (deferred edge below).
 
 Stage 6 findings:
 - **Tier-2 surface required for typing**: `_fold`'s argument must fully type before the macro
@@ -212,6 +224,41 @@ PR1 findings:
 
 End of arc: `skills/linq.md` + linq docs mention the table source.
 
+## Late stage (planned) — reducer shapes & general code hygiene
+
+Cross-source cleanups; none are table-specific. Items 1–2 are user-facing reducer-shape fixes,
+items 3–4 are codebase hygiene investigations (the linq_fold surface is workable but "a tad too
+unwieldy" — the table adapter took several stages, and many fuses read as "add this hook, because
+reasons" rather than falling out of the architecture).
+
+1. **Identity-lambda reducers**: `_._1 |> max($(v) => v)` (also `min`/`sum`/`average`) fails with
+   30303 today — the untyped lambda can't infer on the tier-2 lazy-bucket surface, and
+   `recognize_reducer_specs` has no identity arm either. Fix both ends: recognize the identity
+   inner-select and canonicalize to the bare form (`max()`), and make the tier-2 generic accept
+   it so unfused chains agree.
+2. **Untyped inner-select lambda params**: `_._1 |> select($(c) => c.value.price) |> sum()`
+   requires an explicit param type (`$(c : CarKV)`) — the lazy bucket's `select` doesn't flow the
+   element type into the lambda. Thread the type through so the annotation becomes optional;
+   today's explicit-type requirement is a usability trap (the error is an opaque 30303, not
+   "annotate the param").
+3. **match.das adoption survey (linq_fold* + sqlite_* family)**: flatten_opt's move to
+   `daslib/match` bought both fewer lines and more readable matchers; the linq spine-walkers are
+   full of the same hand-rolled `is ExprCall` / `as` / null-guard chains. Prime candidates:
+   `match_key_probe_side` / `extract_key_probe` (manual `ExprRef2Value` peeling + `ExprField`
+   checks), `extract_*_source` gates, and the sqlite_linq chain decomposition. Survey first,
+   convert where the match form is a strict readability win.
+4. **SourceAdapter interface audit**: all initially-wanted adapters now exist (11 classes:
+   Array/Zip/ArrayJoin/Decs/DecsJoin/Xml/XmlJoin/Json/JsonJoin/Table/ProjectedSource) — audit the
+   ~20-method interface against real usage and remove the scaffolding it forces. Known smells:
+   `arrayTop()`/`arraySrcName()` are still marked "transitional … removed once all consumers move
+   into subclass methods" yet remain load-bearing (reserve hint reads them; adapters override
+   them with comments explaining which distant gate reads what); the `build_group_by_adapter`
+   upstream-join arm repeats ~30 lines of keyaLam/keybLam/resultLam validation across
+   Array/Decs/Xml/Json; two overlapping reverse hooks (`emit_reverse_skip_into_tail` vs
+   `emit_reverse_last_backward`); emit fns reconstruct `headCalls` from stringly-keyed captures
+   ("mirrors emit_loop_or_count_lane_decs"). Goal: a new source should not need "several stages"
+   of hook-by-hook enablement for the standard fuse set.
+
 ## Risks / watch items
 
 - **Mangler ICE 50609** (iterator element-const collision) — `each_kv` yields `-const` non-ref
@@ -224,6 +271,10 @@ End of arc: `skills/linq.md` + linq docs mention the table source.
 
 ## Deferred edges (named, not built)
 
+- **`join |> group_by` over a table lead**: `TableAdapter.build_group_by_adapter` declines the
+  upstream-join arm (returns null → tier-2). The fix is a TableJoin analog of `ArrayJoinAdapter`
+  (lead loop from the pruned slot walk, srcB hash/probe from the stage-5 pieces); the
+  `join_groupby_*` m7 cells are the numbers it would improve. Revisit on demand.
 - **Point-lookup conjunct extraction**: `where(kv.key == X && <residual>)` (incl. the collapsed
   multi-where form) could probe and evaluate the residual on the probed element only. The matcher
   currently declines compound predicates; add when a real chain wants it.
diff --git a/benchmarks/sql/results.md b/benchmarks/sql/results.md
index 6ddc837b6..83b006a1e 100644
--- a/benchmarks/sql/results.md
+++ b/benchmarks/sql/results.md
@@ -21,9 +21,10 @@ are stable now).
   joined on its bare key probes the table instead of building the join hash — the `join_probe` /
   `join_probe_build` pair measures it; a trailing `to_table()` inserts straight into the result
   table with no intermediate array — the `to_table` / `to_table_staged` pair measures it;
-  group_by / reverse defer to tier-2). Under JIT, `keys`/`values` for-loop sources compile to an
-  inline open-addressed slot walk (no per-element C++ iterator calls), so the m7 JIT column is
-  fused codegen end to end.
+  group_by fuses through `plan_group_by_core` with the usage-pruned slot walk as the bucket-fill
+  loop; join+group_by and reverse defer to tier-2). Under JIT, `keys`/`values` for-loop sources
+  compile to an inline open-addressed slot walk (no per-element C++ iterator calls), so the m7
+  JIT column is fused codegen end to end.
 
 `0.00` = early-exit terminator below timer resolution ("free"). Chain shapes are in
 `benchmarks/README.md`; the splice arms each fires are in `doc/source/reference/linq_fold_patterns.rst`.
@@ -38,175 +39,175 @@ signal, JIT deltas as indicative.**
 
 | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) |
 |---|---:|---:|---:|---:|---:|---:|
-| `aggregate_match` | 34.9 | 5.9 | 5.9 | 60.5 | 158.9 | 19.8 |
-| `all_match` | 27.8 | 3.5 | 3.5 | 56.5 | 156.6 | 15.8 |
+| `aggregate_match` | 34.7 | 5.9 | 6.1 | 60.5 | 158.9 | 19.9 |
+| `all_match` | 27.8 | 3.5 | 3.5 | 56.1 | 157.6 | 16.1 |
 | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
-| `average_aggregate` | 30.2 | 6.1 | 8.7 | 58.8 | 157.2 | 17.2 |
-| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.1 | 30.2 |
-| `bare_order_where` | 279.9 | 117.8 | 125.7 | 302.3 | 292.5 | 163.9 |
-| `chained_select_collapse` | — | 17.6 | 17.4 | 70.6 | 154.1 | 28.4 |
-| `chained_where` | 36.5 | 6.6 | 7.1 | 105.8 | 177.6 | 23.9 |
-| `contains_match` | 0.0 | 2.2 | 1.4 | 27.8 | 70.3 | 6.5 |
-| `count_aggregate` | 29.7 | 4.2 | 4.1 | 64.1 | 158.1 | 20.3 |
-| `cross_join` | 12594.5 | 3704.9 | — | 4030.6 | 4063.0 | — |
+| `average_aggregate` | 30.6 | 6.1 | 8.7 | 58.7 | 157.7 | 17.2 |
+| `bare_last` | — | 4.2 | 0.0 | 0.0 | 4.2 | 30.1 |
+| `bare_order_where` | 279.8 | 117.0 | 125.6 | 302.7 | 296.8 | 164.3 |
+| `chained_select_collapse` | — | 17.7 | 17.4 | 70.5 | 154.5 | 28.5 |
+| `chained_where` | 36.6 | 6.6 | 7.1 | 105.5 | 177.6 | 23.9 |
+| `contains_match` | 0.0 | 2.2 | 1.4 | 27.7 | 71.6 | 6.5 |
+| `count_aggregate` | 29.4 | 4.2 | 4.1 | 64.2 | 162.2 | 20.3 |
+| `cross_join` | 12628.8 | 3713.6 | — | 4051.3 | 4077.4 | — |
 | `decs_count_bare_pred` | — | — | 4.1 | — | — | — |
-| `distinct_by_count` | 41.6 | 15.8 | 15.7 | 70.7 | 156.5 | 27.3 |
-| `distinct_by_order_take` | 240.5 | 22.2 | 23.4 | 124.6 | 159.8 | 49.5 |
-| `distinct_by_order_to_array` | 240.7 | 22.1 | 23.4 | 125.1 | 163.6 | 49.2 |
-| `distinct_count` | 41.2 | 15.6 | 15.6 | 70.6 | 161.2 | 27.5 |
-| `distinct_count_pred` | 253.4 | 15.9 | 15.9 | 112.6 | 173.8 | 27.4 |
+| `distinct_by_count` | 41.4 | 15.7 | 15.7 | 70.6 | 159.1 | 27.2 |
+| `distinct_by_order_take` | 242.2 | 22.0 | 23.4 | 124.9 | 158.7 | 49.5 |
+| `distinct_by_order_to_array` | 240.8 | 21.9 | 23.4 | 125.2 | 160.5 | 49.5 |
+| `distinct_count` | 41.6 | 15.5 | 15.6 | 70.6 | 154.4 | 27.4 |
+| `distinct_count_pred` | 254.8 | 15.8 | 16.0 | 112.5 | 170.7 | 27.3 |
 | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
 | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.4 | 0.3 | 0.0 |
 | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
 | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
-| `groupby_average` | 171.3 | 29.1 | 29.4 | 124.3 | 187.5 | 197.3 |
-| `groupby_count` | 143.5 | 19.1 | 19.1 | 74.8 | 159.1 | 163.1 |
-| `groupby_first` | 251.7 | 19.0 | 19.8 | 72.4 | 155.1 | 163.1 |
-| `groupby_having_count` | 142.2 | 19.1 | 19.1 | 75.2 | 160.2 | 185.4 |
-| `groupby_having_hidden_sum` | 175.9 | 22.2 | 22.3 | 119.0 | 183.5 | 215.6 |
-| `groupby_having_post_where` | 172.8 | 20.4 | 20.5 | 115.3 | 181.3 | 194.2 |
-| `groupby_max` | 173.1 | 24.9 | 24.8 | 120.5 | 184.1 | 201.8 |
-| `groupby_min` | 173.6 | 25.6 | 25.2 | 120.7 | 184.3 | 204.0 |
-| `groupby_multi_reducer` | 190.4 | 30.4 | 30.3 | 126.3 | 188.2 | 231.5 |
-| `groupby_select_order` | 172.9 | 20.5 | 20.4 | 118.6 | 179.8 | 194.6 |
-| `groupby_select_sum` | 198.2 | 38.5 | 38.8 | 102.1 | 185.0 | 188.0 |
-| `groupby_sum` | 171.1 | 20.4 | 20.4 | 115.0 | 179.5 | 194.2 |
-| `groupby_where_count` | 76.8 | 13.9 | 14.5 | 115.3 | 188.6 | 164.2 |
-| `groupby_where_sum` | 87.4 | 14.2 | 14.8 | 116.7 | 187.1 | 179.4 |
-| `join_count` | 38.6 | 51.8 | 63.9 | 112.3 | 185.6 | 64.2 |
-| `join_groupby_count` | 158.4 | 76.9 | 88.5 | 178.1 | 225.8 | 259.3 |
-| `join_groupby_to_array` | 190.4 | 78.3 | 90.6 | 215.6 | 212.2 | 290.2 |
-| `join_probe` | — | — | — | — | — | 46.6 |
-| `join_probe_build` | — | — | — | — | — | 79.5 |
-| `join_select` | 150.9 | 73.5 | 84.6 | 187.9 | 207.1 | 223.5 |
-| `join_where_count` | 39.8 | 61.9 | 75.8 | 161.2 | 192.9 | 79.8 |
-| `last_match` | 0.0 | 5.8 | 13.9 | 65.3 | 157.9 | 30.9 |
-| `long_count_aggregate` | 29.7 | 4.2 | 4.1 | 63.7 | 158.0 | 20.1 |
-| `max_aggregate` | 31.0 | 6.1 | 6.8 | 58.8 | 157.6 | 17.0 |
-| `min_aggregate` | 30.9 | 6.1 | 6.8 | 59.0 | 159.3 | 17.0 |
-| `order_by_multi_key` | 338.1 | 274.4 | 282.8 | 459.3 | 445.2 | 341.9 |
-| `order_distinct_take` | 138.5 | 15.7 | 98.9 | 72.8 | 155.0 | 31.7 |
-| `order_reverse_normalized` | 38.4 | 16.2 | 19.9 | 70.5 | 162.1 | 33.1 |
-| `order_take_desc` | 38.3 | 16.4 | 19.9 | 70.6 | 162.5 | 33.0 |
+| `groupby_average` | 176.5 | 29.0 | 29.3 | 124.3 | 187.3 | 41.8 |
+| `groupby_count` | 143.2 | 19.0 | 19.4 | 74.7 | 160.0 | 31.1 |
+| `groupby_first` | 253.3 | 19.0 | 20.1 | 72.4 | 156.2 | 40.7 |
+| `groupby_having_count` | 141.7 | 19.0 | 19.1 | 75.1 | 159.4 | 31.2 |
+| `groupby_having_hidden_sum` | 176.8 | 22.2 | 22.3 | 118.8 | 183.3 | 34.2 |
+| `groupby_having_post_where` | 174.3 | 20.3 | 20.4 | 114.7 | 180.1 | 32.4 |
+| `groupby_max` | 175.5 | 24.7 | 24.9 | 120.5 | 184.2 | 35.0 |
+| `groupby_min` | 173.9 | 25.5 | 25.2 | 121.0 | 183.6 | 34.8 |
+| `groupby_multi_reducer` | 191.1 | 30.3 | 30.3 | 126.3 | 189.1 | 43.5 |
+| `groupby_select_order` | 172.0 | 20.3 | 20.4 | 115.0 | 179.9 | 32.3 |
+| `groupby_select_sum` | 201.3 | 38.5 | 38.5 | 102.1 | 185.6 | 50.3 |
+| `groupby_sum` | 170.5 | 20.4 | 20.4 | 115.0 | 179.7 | 32.4 |
+| `groupby_where_count` | 76.5 | 13.9 | 14.5 | 115.8 | 181.4 | 29.9 |
+| `groupby_where_sum` | 87.2 | 14.2 | 14.8 | 116.6 | 181.9 | 31.3 |
+| `join_count` | 38.3 | 52.0 | 65.0 | 112.3 | 177.5 | 65.1 |
+| `join_groupby_count` | 157.7 | 77.3 | 88.9 | 177.9 | 224.8 | 260.4 |
+| `join_groupby_to_array` | 191.6 | 79.1 | 91.3 | 215.2 | 211.3 | 289.4 |
+| `join_probe` | — | — | — | — | — | 46.5 |
+| `join_probe_build` | — | — | — | — | — | 79.1 |
+| `join_select` | 150.4 | 74.1 | 85.0 | 188.7 | 211.5 | 222.8 |
+| `join_where_count` | 39.7 | 62.3 | 76.2 | 161.1 | 193.0 | 79.6 |
+| `last_match` | 0.0 | 5.8 | 14.0 | 65.2 | 159.4 | 30.8 |
+| `long_count_aggregate` | 29.8 | 4.2 | 4.1 | 63.8 | 158.0 | 21.4 |
+| `max_aggregate` | 31.3 | 6.1 | 6.8 | 58.8 | 157.3 | 16.9 |
+| `min_aggregate` | 31.4 | 6.1 | 6.8 | 59.0 | 157.5 | 16.9 |
+| `order_by_multi_key` | 341.1 | 274.6 | 283.0 | 459.8 | 450.9 | 334.7 |
+| `order_distinct_take` | 138.7 | 15.7 | 99.0 | 72.6 | 155.7 | 31.6 |
+| `order_reverse_normalized` | 38.4 | 16.3 | 20.0 | 70.5 | 162.8 | 32.9 |
+| `order_take_desc` | 38.5 | 16.4 | 20.0 | 70.6 | 162.2 | 32.9 |
 | `point_lookup` | — | — | — | — | — | 0.0 |
 | `point_lookup_scan` | — | — | — | — | — | 8.3 |
-| `reverse_distinct_by` | 296.9 | 21.1 | 27.7 | 71.7 | 154.5 | 44.4 |
+| `reverse_distinct_by` | 296.8 | 21.1 | 28.3 | 71.3 | 155.8 | 43.8 |
 | `reverse_take` | 0.1 | 0.0 | 0.2 | 0.0 | 26.2 | 58.5 |
-| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.2 | 58.7 |
-| `select_count` | 0.1 | 0.0 | 2.2 | 63.4 | 2.2 | 0.0 |
-| `select_many` | — | 189.8 | — | — | — | — |
-| `select_where` | 199.2 | 11.2 | 19.2 | 197.4 | 186.5 | 37.8 |
-| `select_where_count` | 33.0 | 5.2 | 7.5 | 65.2 | 150.0 | 23.2 |
-| `select_where_order_take` | 37.0 | 12.2 | 14.9 | 72.5 | 163.1 | 34.7 |
-| `select_where_sum` | 37.1 | 7.5 | 7.5 | 66.2 | 158.2 | 24.2 |
-| `single_match` | 0.0 | 2.9 | 5.4 | 56.2 | 148.2 | 22.8 |
+| `reverse_take_select` | 0.0 | 0.0 | 0.2 | 0.0 | 26.2 | 58.6 |
+| `select_count` | 0.1 | 0.0 | 2.2 | 65.1 | 2.2 | 0.0 |
+| `select_many` | — | 189.4 | — | — | — | — |
+| `select_where` | 195.0 | 11.2 | 19.3 | 197.8 | 188.5 | 37.6 |
+| `select_where_count` | 33.0 | 5.2 | 7.4 | 65.3 | 150.2 | 23.0 |
+| `select_where_order_take` | 37.0 | 12.2 | 14.8 | 72.7 | 162.8 | 34.6 |
+| `select_where_sum` | 37.4 | 7.4 | 7.5 | 66.2 | 157.7 | 24.1 |
+| `single_match` | 0.0 | 2.9 | 5.4 | 55.9 | 148.2 | 23.0 |
 | `skip_take` | 0.5 | 0.1 | 0.2 | 3.1 | 2.8 | 0.3 |
-| `skip_while_match` | 3.5 | 5.3 | 5.3 | 57.9 | 150.2 | 18.2 |
-| `sort_first` | 38.4 | 11.0 | 13.4 | 65.7 | 162.3 | 31.7 |
-| `sort_take` | 38.6 | 16.1 | 20.3 | 70.7 | 163.1 | 33.2 |
-| `sort_take_select` | 38.6 | 16.4 | 20.2 | 70.9 | 161.9 | 33.2 |
-| `sum_aggregate` | 30.0 | 2.1 | 2.1 | 54.8 | 156.9 | 13.5 |
-| `sum_where` | 33.0 | 4.3 | 4.3 | 63.7 | 157.9 | 20.7 |
-| `take_count` | 3.7 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 |
+| `skip_while_match` | 3.5 | 5.3 | 5.3 | 57.7 | 149.0 | 18.2 |
+| `sort_first` | 38.2 | 11.0 | 13.4 | 65.6 | 159.4 | 31.6 |
+| `sort_take` | 38.6 | 16.1 | 20.4 | 70.5 | 163.4 | 33.0 |
+| `sort_take_select` | 38.2 | 16.4 | 20.3 | 71.0 | 163.1 | 33.1 |
+| `sum_aggregate` | 30.2 | 2.1 | 2.1 | 54.5 | 156.8 | 13.5 |
+| `sum_where` | 33.0 | 4.3 | 4.3 | 63.7 | 158.1 | 20.4 |
+| `take_count` | 3.6 | 0.2 | 0.4 | 2.9 | 2.7 | 0.5 |
 | `take_count_filtered` | 1.1 | 0.2 | 0.2 | 1.4 | 1.1 | 0.3 |
 | `take_sum_aggregate` | 0.8 | 0.1 | 0.1 | 0.6 | 0.5 | 0.1 |
 | `take_where_count` | 0.9 | 0.1 | 0.1 | 0.7 | 0.6 | 0.2 |
-| `take_while_match` | 7.8 | 2.4 | 2.5 | 29.0 | 72.6 | 16.9 |
-| `to_array_filter` | 70.7 | 11.7 | 11.8 | 71.7 | 163.7 | 28.9 |
-| `to_table` | — | 18.6 | 141.9 | 118.5 | 140.3 | 32.1 |
-| `to_table_staged` | — | 54.7 | 56.6 | 143.3 | 165.1 | 68.5 |
-| `where_join_count` | 41.6 | 29.4 | 41.0 | 132.1 | 171.8 | 46.7 |
-| `zip_count_pred` | 39.4 | 15.8 | — | 316.5 | 317.6 | — |
-| `zip_dot_product` | 49.8 | 12.6 | 10.6 | 312.4 | 313.7 | — |
-| `zip_dot_product_3arg` | 50.2 | 12.7 | — | 312.5 | 313.9 | — |
-| `zip_reverse_to_array` | — | 32.1 | — | 347.2 | 351.8 | — |
+| `take_while_match` | 7.8 | 2.4 | 2.5 | 29.0 | 72.5 | 16.6 |
+| `to_array_filter` | 70.6 | 11.7 | 11.7 | 71.8 | 161.9 | 28.9 |
+| `to_table` | — | 18.7 | 141.8 | 118.5 | 140.1 | 32.2 |
+| `to_table_staged` | — | 54.6 | 56.6 | 143.0 | 164.0 | 68.6 |
+| `where_join_count` | 39.7 | 29.5 | 41.1 | 132.2 | 166.3 | 46.8 |
+| `zip_count_pred` | 39.1 | 15.8 | — | 317.6 | 318.8 | — |
+| `zip_dot_product` | 46.9 | 12.6 | 10.6 | 312.4 | 315.1 | — |
+| `zip_dot_product_3arg` | 46.8 | 12.7 | — | 311.9 | 315.7 | — |
+| `zip_reverse_to_array` | — | 32.1 | — | 348.5 | 349.7 | — |
 
 ## JIT
 
 | Benchmark | SQL (m1) | Array (m3f) | Decs (m4) | XML fold (m5f) | JSON fold (m6f) | Table fold (m7) |
 |---|---:|---:|---:|---:|---:|---:|
-| `aggregate_match` | 35.0 | 0.3 | 0.7 | 29.7 | 27.3 | 7.3 |
-| `all_match` | 27.7 | 0.3 | 0.2 | 18.8 | 25.3 | 7.2 |
+| `aggregate_match` | 34.7 | 0.3 | 0.7 | 29.6 | 25.8 | 7.2 |
+| `all_match` | 27.5 | 0.3 | 0.2 | 18.8 | 24.9 | 7.2 |
 | `any_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
-| `average_aggregate` | 30.3 | 1.0 | 3.6 | 18.8 | 24.4 | 7.4 |
+| `average_aggregate` | 30.1 | 1.0 | 3.6 | 18.5 | 24.5 | 7.4 |
 | `bare_last` | — | 0.4 | 0.0 | 0.0 | 0.0 | 8.9 |
-| `bare_order_where` | 185.9 | 33.8 | 35.0 | 106.0 | 51.7 | 68.2 |
-| `chained_select_collapse` | — | 1.1 | 1.1 | 20.6 | 32.1 | 8.1 |
-| `chained_where` | 36.7 | 0.6 | 0.9 | 36.4 | 31.8 | 10.4 |
-| `contains_match` | 0.0 | 0.2 | 0.1 | 16.8 | 9.2 | 2.5 |
-| `count_aggregate` | 29.4 | 0.3 | 0.6 | 29.5 | 25.1 | 7.3 |
-| `cross_join` | 5962.9 | 719.2 | — | 833.9 | 771.0 | — |
+| `bare_order_where` | 184.9 | 33.8 | 35.0 | 105.9 | 51.7 | 68.4 |
+| `chained_select_collapse` | — | 1.1 | 1.1 | 20.5 | 31.9 | 8.1 |
+| `chained_where` | 36.6 | 0.6 | 0.9 | 36.3 | 29.9 | 10.6 |
+| `contains_match` | 0.0 | 0.2 | 0.1 | 19.3 | 8.8 | 2.5 |
+| `count_aggregate` | 29.8 | 0.3 | 0.6 | 29.1 | 25.1 | 7.3 |
+| `cross_join` | 5967.3 | 717.5 | — | 831.1 | 766.7 | — |
 | `decs_count_bare_pred` | — | — | 0.6 | — | — | — |
-| `distinct_by_count` | 41.6 | 1.1 | 1.1 | 20.6 | 31.9 | 8.0 |
-| `distinct_by_order_take` | 238.9 | 1.7 | 2.6 | 45.3 | 37.2 | 19.6 |
-| `distinct_by_order_to_array` | 239.5 | 1.7 | 2.7 | 45.5 | 37.0 | 19.5 |
-| `distinct_count` | 41.4 | 1.1 | 1.1 | 20.7 | 33.1 | 8.0 |
-| `distinct_count_pred` | 252.1 | 1.1 | 1.3 | 37.7 | 43.6 | 8.0 |
+| `distinct_by_count` | 41.4 | 1.1 | 1.1 | 20.5 | 32.0 | 8.0 |
+| `distinct_by_order_take` | 238.2 | 1.7 | 2.6 | 45.3 | 37.2 | 19.5 |
+| `distinct_by_order_to_array` | 239.9 | 1.7 | 2.7 | 45.3 | 37.1 | 19.7 |
+| `distinct_count` | 41.6 | 1.1 | 1.1 | 20.5 | 32.1 | 8.0 |
+| `distinct_count_pred` | 252.3 | 1.1 | 1.3 | 37.6 | 41.8 | 8.0 |
 | `distinct_take` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
 | `element_at_match` | 0.0 | 0.0 | 0.0 | 0.1 | 0.0 | 0.0 |
 | `first_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
 | `first_or_default_match` | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 |
-| `groupby_average` | 171.5 | 1.5 | 1.8 | 37.2 | 44.8 | 51.8 |
-| `groupby_count` | 162.3 | 1.3 | 1.5 | 20.7 | 32.2 | 43.5 |
-| `groupby_first` | 251.8 | 1.3 | 2.3 | 20.7 | 34.0 | 43.4 |
-| `groupby_having_count` | 142.7 | 1.3 | 1.5 | 20.6 | 33.4 | 46.2 |
-| `groupby_having_hidden_sum` | 175.1 | 1.5 | 1.9 | 37.0 | 43.0 | 54.5 |
-| `groupby_having_post_where` | 171.4 | 1.4 | 1.9 | 37.0 | 42.0 | 51.4 |
-| `groupby_max` | 173.0 | 1.5 | 1.9 | 37.1 | 43.6 | 51.9 |
-| `groupby_min` | 172.4 | 1.5 | 1.9 | 38.2 | 43.4 | 52.6 |
-| `groupby_multi_reducer` | 193.2 | 1.6 | 1.9 | 37.2 | 43.7 | 60.8 |
-| `groupby_select_order` | 170.5 | 1.4 | 1.6 | 37.9 | 41.9 | 51.8 |
-| `groupby_select_sum` | 196.9 | 2.8 | 3.2 | 33.5 | 37.7 | 73.3 |
-| `groupby_sum` | 171.5 | 1.4 | 1.9 | 37.9 | 42.0 | 52.1 |
-| `groupby_where_count` | 76.5 | 0.9 | 1.3 | 37.2 | 39.7 | 53.7 |
-| `groupby_where_sum` | 87.4 | 0.9 | 1.3 | 37.1 | 39.7 | 57.7 |
-| `join_count` | 38.3 | 11.2 | 12.5 | 40.8 | 68.0 | 25.2 |
-| `join_groupby_count` | 157.5 | 17.2 | 19.3 | 66.4 | 86.0 | 73.1 |
-| `join_groupby_to_array` | 190.7 | 17.8 | 19.7 | 78.6 | 35.8 | 81.4 |
-| `join_probe` | — | — | — | — | — | 16.7 |
-| `join_probe_build` | — | — | — | — | — | 33.2 |
-| `join_select` | 91.8 | 19.6 | 21.7 | 73.5 | 89.8 | 70.1 |
-| `join_where_count` | 39.2 | 19.2 | 20.6 | 63.3 | 77.3 | 31.7 |
-| `last_match` | 0.0 | 0.5 | 1.4 | 19.6 | 25.1 | 12.1 |
-| `long_count_aggregate` | 30.0 | 0.3 | 0.6 | 29.4 | 25.1 | 7.3 |
-| `max_aggregate` | 31.0 | 0.3 | 0.5 | 29.7 | 26.3 | 7.5 |
-| `min_aggregate` | 31.0 | 0.3 | 0.5 | 29.7 | 26.2 | 7.4 |
-| `order_by_multi_key` | 242.6 | 53.3 | 54.4 | 124.6 | 70.5 | 119.3 |
-| `order_distinct_take` | 138.6 | 1.1 | 75.8 | 20.9 | 34.1 | 8.1 |
-| `order_reverse_normalized` | 38.5 | 0.7 | 1.3 | 19.8 | 27.0 | 11.1 |
-| `order_take_desc` | 38.8 | 0.7 | 1.3 | 19.8 | 26.9 | 10.0 |
+| `groupby_average` | 175.4 | 1.5 | 1.8 | 37.0 | 42.9 | 8.9 |
+| `groupby_count` | 153.2 | 1.3 | 1.5 | 20.5 | 32.3 | 8.4 |
+| `groupby_first` | 252.8 | 1.3 | 2.3 | 20.5 | 32.9 | 10.0 |
+| `groupby_having_count` | 141.8 | 1.3 | 1.5 | 20.5 | 32.3 | 8.5 |
+| `groupby_having_hidden_sum` | 176.3 | 1.5 | 1.9 | 36.9 | 43.0 | 8.7 |
+| `groupby_having_post_where` | 172.1 | 1.4 | 1.9 | 36.9 | 42.2 | 8.5 |
+| `groupby_max` | 176.7 | 1.5 | 1.9 | 37.1 | 45.2 | 8.6 |
+| `groupby_min` | 177.2 | 1.5 | 1.9 | 38.3 | 45.6 | 8.5 |
+| `groupby_multi_reducer` | 191.7 | 1.6 | 1.9 | 37.2 | 43.7 | 9.0 |
+| `groupby_select_order` | 176.6 | 1.4 | 1.6 | 37.8 | 41.9 | 8.4 |
+| `groupby_select_sum` | 204.9 | 2.8 | 3.2 | 33.5 | 37.7 | 22.8 |
+| `groupby_sum` | 171.6 | 1.4 | 1.9 | 37.7 | 42.0 | 8.4 |
+| `groupby_where_count` | 76.9 | 0.9 | 1.3 | 37.1 | 39.7 | 11.2 |
+| `groupby_where_sum` | 90.4 | 0.9 | 1.3 | 37.1 | 39.7 | 11.2 |
+| `join_count` | 38.6 | 11.2 | 12.6 | 40.7 | 68.3 | 25.1 |
+| `join_groupby_count` | 157.4 | 17.2 | 19.2 | 66.2 | 86.0 | 73.5 |
+| `join_groupby_to_array` | 191.5 | 17.8 | 19.6 | 78.4 | 35.8 | 80.6 |
+| `join_probe` | — | — | — | — | — | 16.6 |
+| `join_probe_build` | — | — | — | — | — | 31.6 |
+| `join_select` | 93.0 | 19.6 | 21.7 | 73.2 | 90.5 | 69.5 |
+| `join_where_count` | 48.9 | 19.1 | 20.6 | 62.9 | 77.6 | 31.5 |
+| `last_match` | 0.0 | 0.5 | 1.4 | 19.5 | 25.5 | 12.0 |
+| `long_count_aggregate` | 29.7 | 0.3 | 0.6 | 29.3 | 25.1 | 7.3 |
+| `max_aggregate` | 30.7 | 0.3 | 0.5 | 29.6 | 26.3 | 7.3 |
+| `min_aggregate` | 31.0 | 0.3 | 0.5 | 29.6 | 26.3 | 7.4 |
+| `order_by_multi_key` | 245.0 | 53.4 | 54.5 | 124.5 | 70.3 | 118.9 |
+| `order_distinct_take` | 138.7 | 1.1 | 75.0 | 20.8 | 34.4 | 8.0 |
+| `order_reverse_normalized` | 38.7 | 0.7 | 1.3 | 19.6 | 27.0 | 9.5 |
+| `order_take_desc` | 38.7 | 0.7 | 1.3 | 19.6 | 27.0 | 9.3 |
 | `point_lookup` | — | — | — | — | — | 0.0 |
 | `point_lookup_scan` | — | — | — | — | — | 3.0 |
-| `reverse_distinct_by` | 296.3 | 1.6 | 3.2 | 20.6 | 32.6 | 10.9 |
-| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | 19.3 |
-| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.8 | 19.2 |
-| `select_count` | 0.1 | 0.0 | 0.0 | 67.0 | 0.0 | 0.0 |
-| `select_many` | — | 61.4 | — | — | — | — |
-| `select_where` | 107.8 | 4.1 | 5.3 | 76.0 | 22.2 | 17.9 |
-| `select_where_count` | 32.6 | 0.3 | 0.6 | 29.5 | 25.9 | 7.4 |
-| `select_where_order_take` | 36.8 | 0.7 | 1.4 | 19.8 | 26.6 | 13.0 |
-| `select_where_sum` | 37.4 | 0.4 | 0.6 | 20.4 | 24.8 | 7.5 |
-| `single_match` | 0.0 | 0.4 | 1.1 | 45.9 | 22.2 | 9.7 |
-| `skip_take` | 0.3 | 0.0 | 0.0 | 1.3 | 0.2 | 0.1 |
-| `skip_while_match` | 3.5 | 0.4 | 0.4 | 46.1 | 21.7 | 7.8 |
-| `sort_first` | 38.1 | 0.4 | 1.3 | 18.8 | 26.1 | 9.3 |
-| `sort_take` | 38.6 | 0.7 | 1.3 | 19.8 | 27.0 | 9.7 |
-| `sort_take_select` | 38.6 | 0.7 | 1.3 | 19.8 | 26.9 | 9.6 |
-| `sum_aggregate` | 30.2 | 0.3 | 0.0 | 22.5 | 24.2 | 7.3 |
-| `sum_where` | 32.9 | 0.3 | 0.6 | 29.6 | 25.8 | 7.3 |
-| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.2 | 0.1 |
+| `reverse_distinct_by` | 296.5 | 1.6 | 3.2 | 20.5 | 34.2 | 11.1 |
+| `reverse_take` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 19.5 |
+| `reverse_take_select` | 0.0 | 0.0 | 0.0 | 0.0 | 3.7 | 19.0 |
+| `select_count` | 0.1 | 0.0 | 0.0 | 64.6 | 0.0 | 0.0 |
+| `select_many` | — | 61.3 | — | — | — | — |
+| `select_where` | 107.1 | 4.2 | 5.2 | 75.6 | 21.9 | 17.7 |
+| `select_where_count` | 32.8 | 0.3 | 0.6 | 29.4 | 25.9 | 7.2 |
+| `select_where_order_take` | 37.1 | 0.7 | 1.4 | 19.6 | 26.6 | 12.9 |
+| `select_where_sum` | 37.2 | 0.4 | 0.6 | 20.2 | 24.9 | 7.4 |
+| `single_match` | 0.0 | 0.4 | 1.1 | 44.2 | 22.3 | 9.2 |
+| `skip_take` | 0.3 | 0.0 | 0.0 | 1.2 | 0.2 | 0.1 |
+| `skip_while_match` | 3.5 | 0.4 | 0.4 | 44.4 | 21.8 | 7.6 |
+| `sort_first` | 38.2 | 0.4 | 1.3 | 18.7 | 26.2 | 9.3 |
+| `sort_take` | 38.4 | 0.7 | 1.3 | 19.6 | 27.2 | 9.6 |
+| `sort_take_select` | 38.5 | 0.7 | 1.4 | 19.5 | 27.1 | 9.6 |
+| `sum_aggregate` | 30.3 | 0.3 | 0.0 | 22.4 | 24.2 | 7.3 |
+| `sum_where` | 33.1 | 0.3 | 0.6 | 29.5 | 25.8 | 7.3 |
+| `take_count` | 1.8 | 0.1 | 0.1 | 1.2 | 0.3 | 0.1 |
 | `take_count_filtered` | 1.1 | 0.0 | 0.0 | 0.5 | 0.1 | 0.0 |
 | `take_sum_aggregate` | 0.8 | 0.0 | 0.0 | 0.2 | 0.0 | 0.0 |
 | `take_where_count` | 0.9 | 0.0 | 0.0 | 0.3 | 0.0 | 0.0 |
-| `take_while_match` | 7.8 | 0.2 | 0.3 | 16.9 | 9.0 | 7.3 |
-| `to_array_filter` | 48.4 | 3.3 | 3.3 | 22.2 | 33.7 | 13.2 |
-| `to_table` | — | 14.1 | 37.1 | 49.6 | 52.2 | 20.6 |
-| `to_table_staged` | — | 25.8 | 26.2 | 53.2 | 61.3 | 33.6 |
-| `where_join_count` | 41.6 | 6.0 | 6.7 | 48.0 | 40.6 | 19.9 |
-| `zip_count_pred` | 39.3 | 0.1 | — | 114.0 | 33.7 | — |
-| `zip_dot_product` | 46.9 | 0.1 | 0.1 | 113.6 | 33.5 | — |
-| `zip_dot_product_3arg` | 46.6 | 0.1 | — | 113.9 | 33.7 | — |
-| `zip_reverse_to_array` | — | 4.5 | — | 125.3 | 50.7 | — |
+| `take_while_match` | 7.8 | 0.2 | 0.3 | 19.0 | 9.0 | 7.3 |
+| `to_array_filter` | 48.4 | 3.3 | 3.3 | 22.0 | 33.5 | 13.0 |
+| `to_table` | — | 14.1 | 36.9 | 49.4 | 52.0 | 21.1 |
+| `to_table_staged` | — | 25.8 | 26.0 | 53.0 | 61.5 | 33.9 |
+| `where_join_count` | 41.7 | 6.0 | 6.7 | 47.6 | 40.5 | 19.7 |
+| `zip_count_pred` | 39.6 | 0.1 | — | 113.5 | 33.5 | — |
+| `zip_dot_product` | 46.9 | 0.1 | 0.1 | 113.6 | 33.3 | — |
+| `zip_dot_product_3arg` | 46.8 | 0.1 | — | 113.6 | 33.3 | — |
+| `zip_reverse_to_array` | — | 4.5 | — | 124.9 | 50.6 | — |
 <!-- BENCH:TABLES END -->
 
 ## Missing lanes (the `—` cells)
@@ -222,7 +223,7 @@ Each empty cell's reason is also in the bench `.das` file's comment; SQL gaps ar
 - **`reverse_distinct_by` m4 / m5f** — array uses the backward-index walk; non-array sources fuse the forward keep-last splice (decs 27.6/5.0, XML 74.5/22.2); SQL uses MAX(pk).
 - **`order_distinct_take` m4 vs m3f** — `unique_key` hashes workhorse keys directly (array `int`) but string-interpolates structs (decs `DecsBrand`); the gap is per-element string hashing, not decs-walk. `distinct_by_count` is the key-based variant (m4 parity).
 - **`zip_reverse_to_array` / `zip_*` SQL / Decs** — `reverse` has no SQL order key; zip is not relational / not expressible over one archetype walk. By design. (XML/JSON zip lanes are lit, partially fused.)
-- **m7 absent families** — `zip_*` / `cross_join` (lockstep pairing over an unordered slot walk is meaningless) and `select_many` (flat fixture, no nested array field; array-only). Everything else in the m7 column is instantiated — but read the `groupby_*` / `join_groupby_*` / reverse-family cells as the **tier-2 cascade cost**, not a fused emit: table group_by fusion and a backward slot walk are named deferred edges (see `LINQ_TO_TABLE.md`), so those cells are the numbers a fix would improve.
+- **m7 absent families** — `zip_*` / `cross_join` (lockstep pairing over an unordered slot walk is meaningless) and `select_many` (flat fixture, no nested array field; array-only). Everything else in the m7 column is instantiated, and the `groupby_*` family is a fused emit (`plan_group_by_core` over the usage-pruned slot walk). The remaining cascade cells are `join_groupby_*` (join |> group_by over a table lead declines) and the reverse family (no backward slot walk) — both named deferred edges (see `LINQ_TO_TABLE.md`), so those cells are the numbers a fix would improve.
 - **`point_lookup` / `point_lookup_scan` non-m7** — m7-only pair: only a table source has a key to probe (`where(kv.key == X)` + terminator → `key_exists` / `tab?[X]`, O(1)); the `_scan` twin forces the same query through the walk (compound `&&` predicate declines the probe) to show the gap. Other sources have no analog by design.
 - **`join_probe` / `join_probe_build` non-m7** — m7-only A/B pair: a table srcB joined on its bare key probes the user's table per lead row (no internal join hash, no build loop); the `_build` twin feeds the identical rows pre-materialized to a kv array, forcing the hashed build. Other sources have no keyed-srcB analog by design.
 - **`to_table` / `to_table_staged` SQL** — `to_table` isn't an SQL terminator (`_sql` pass-through has no table sink). All in-memory sources are instantiated: array / XML / JSON / table fuse the insert-loop sink (`_staged` is the materialize-then-`to_table_move` shape every chain had before the sink arm); decs declines by design (explicit guard in its loop_or_count lane), so its `to_table` cell is the full tier-2 cascade — currently slower than its `_staged` twin, which fuses the array materialization first. That gap is the motivating number for a future decs sink hook.
diff --git a/daslib/linq_fold_table.das b/daslib/linq_fold_table.das
index 89b20b630..d744c1884 100644
--- a/daslib/linq_fold_table.das
+++ b/daslib/linq_fold_table.das
@@ -125,9 +125,18 @@ class TableAdapter : SourceAdapter {
     def override const can_join() : bool {
         return true   // rides emit_array_join: direct-return lead loop via wrap_source_loop
     }
+    def override const can_group_by() : bool {
+        return true   // plan_group_by_core drives the bucket-fill through wrap_source_loop (kv pruning free)
+    }
     def override emit_join_hook(var c : Captures; var ctx : EmitCtx; at : LineInfo) : Expression? {
         return emit_array_join(c, ctx, at)
     }
+    def override build_group_by_adapter(var c : Captures; var ctx : EmitCtx; at : LineInfo) : SourceAdapter? {
+        // join |> group_by over a table lead stays a deferred edge (LINQ_TO_TABLE.md) — null bails to tier-2
+        if (c.single |> key_exists("upstream_join")) return null
+        return new TableAdapter(tabExpr = clone_expression(tabExpr), srcName = qn("tsrc", at),
+                                elemType = clone_type(elemType), lane = lane)
+    }
     def override arrayTop() : Expression? {
         // Feeds the reserve hint (type_has_length covers tables). The backward-index reverse lanes that
         // also read arrayTop gate on array_source, which is false here — matchTop stays iterator-typed.
diff --git a/doc/source/reference/linq_fold_patterns.rst b/doc/source/reference/linq_fold_patterns.rst
index fb22a137e..b3f17fcf5 100644
--- a/doc/source/reference/linq_fold_patterns.rst
+++ b/doc/source/reference/linq_fold_patterns.rst
@@ -150,7 +150,7 @@ Source-side entry points
      - Optional source — only when the ``pugixml`` module is linked (``require ?pugixml`` + ``static_if (typeinfo builtin_module_exists(pugixml))``). Emits an inlined DOM child-element walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): the chain body is scanned for the ``Row`` fields it reads, and only those attributes are read via ``read_xml_field`` into scalar locals — unread fields (notably ``string`` fields, whose ``clone_string`` is the alloc cost) are never touched, so a float-only chain runs alloc-free and JIT beats the equivalent SQLite query. A whole-row escape (``to_array`` / identity ``_select(_)`` / pass-to-fn) routes to the full ``build_xml_row`` instead. The ``XmlAdapter`` **rides every pattern row** (``try_splice_patterns`` runs with no ``onlyRow`` restriction); per-row ``requires`` predicates and the adapter's capability hooks (``can_join`` / ``can_group_by`` / ``defers_materialization`` / the ``non_array_source`` gate) decide what fuses, and a shape it can't fuse cascades to tier-2 — see :ref:`linq_fold_xml_patterns` for the full fuse/defer breakdown. ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``) and the node is passed by value (``var root`` — ``_fold``'s macro-arg inference skips the const&→value copy).
    * - ``unsafe(each_kv(tab))`` / ``keys(tab)`` / ``values(tab)``
      - ``extract_table_source`` (``TableAdapter``, ``daslib/linq_fold_table.das``)
-     - In-tree source — recognized by name **plus** a table-typed argument (``table<K;V>`` / ``table<K>``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. 
**Joins fuse on either side** (``can_join`` is on; the adapter rides the shared ``emit_array_join`` through its own ``wrap_source_loop``): a table *lead* walks its pruned slot iterator(s) as the probe loop; a table in the *srcB slot* joined on its bare key — ``d.key`` on the kv lane, the bare element on a ``keys(set)`` source — skips the join's internal ``table<KEY; array<TUPB>>`` entirely and probes the user's table per lead row (``join_keyb_is_bare_key`` + ``build_join_probe_pieces``; unique table keys make the probe ≡ hash semantics exactly). The probe is itself usage-pruned: count-no-where and key-only shapes stay on ``key_exists``, value shapes bind the matched value **by reference** from a ``tab?[k]`` pointer (no copy), and only a whole-pair use binds the kv tuple. A non-bare b-key keeps the hashed build over the kv iterator; ``group_join`` (outer — its result consumes the whole bucket) always keeps it. ``can_group_by`` is off and reverse has no backward slot walk — those shapes cascade to tier-2 (see ``benchmarks/sql/LINQ_TO_TABLE.md``). **``to_table()`` sinks fuse** (table-buffer materializer row above): the chain inserts straight into the result table — a bare ``each_kv(tab).to_table()`` is a reserve-ahead table clone through the fused walk, and a ``keys(tab)`` chain lands in the ``table<K>`` set form. ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference.
+     - In-tree source — recognized by name **plus** a table-typed argument (``table<K;V>`` / ``table<K>``), so an unrelated user ``keys`` never fires it. The kv lane (``each_kv``) binds ``kv.key`` / ``kv.value`` and **usage-prunes the walk**: a chain touching only ``.value`` walks ``values(tab)`` alone, only ``.key`` (or neither) walks ``keys(tab)`` alone — half the slot-skip work of the zipped two-iterator form, which is emitted only when both sides (or the whole pair) are read. A whole-pair escape binds a named-tuple copy, so the kv lane fuses **copyable value types only** — a non-copyable-valued ``each_kv`` falls through and the surviving instantiation concept-asserts (error 31400; the keys/values lanes still fuse such tables). Bare ``count()`` / ``long_count()`` folds to O(1) ``length(tab)``; a plain ``distinct`` over raw keys/kv elements is **dropped** before matching (keys are unique by construction; only uniqueness-preserving prefix ops allow the drop — a preceding ``select`` keeps the distinct, and the values lane always keeps it). **Point-lookup folds** (``try_table_point_lookup``): a key-equality ``where`` (``kv.key == X``, bare ``k == X`` on the keys lane, either operand order; predicate-form ``any(p)`` / ``count(p)`` too) against a loop-invariant, side-effect-free ``X`` folds the whole walk to an O(1) probe — ``any`` / keys-lane ``contains(X)`` → ``key_exists(tab, X)``, ``count`` → ``key_exists ? 1 : 0``, ``first`` / ``first_or_default`` (± one trailing ``select``) → a ``tab?[X]`` probe with the scan's exact semantics (panic on a missing ``first``, eagerly-bound default value otherwise). Anything else — compound ``&&`` predicates, other comparison operators, an ``X`` that reads the binder or has side effects (the scan evaluates ``X`` per element, the probe once) — keeps the scan. ``order_by`` / ``take`` / ``first`` observe the table's unspecified slot order, exactly like a hand ``for (k, v in keys(t), values(t))`` loop. 
**Joins fuse on either side** (``can_join`` is on; the adapter rides the shared ``emit_array_join`` through its own ``wrap_source_loop``): a table *lead* walks its pruned slot iterator(s) as the probe loop; a table in the *srcB slot* joined on its bare key — ``d.key`` on the kv lane, the bare element on a ``keys(set)`` source — skips the join's internal ``table<KEY; array<TUPB>>`` entirely and probes the user's table per lead row (``join_keyb_is_bare_key`` + ``build_join_probe_pieces``; unique table keys make the probe ≡ hash semantics exactly). The probe is itself usage-pruned: count-no-where and key-only shapes stay on ``key_exists``, value shapes bind the matched value **by reference** from a ``tab?[k]`` pointer (no copy), and only a whole-pair use binds the kv tuple. A non-bare b-key keeps the hashed build over the kv iterator; ``group_join`` (outer — its result consumes the whole bucket) always keeps it. **``group_by`` fuses** (``can_group_by`` is on; ``build_group_by_adapter`` hands ``plan_group_by_core`` a fresh ``TableAdapter``, so the bucket-fill loop is the usage-pruned slot walk — a group key over ``kv.value.brand`` walks ``values(tab)`` alone) for the plain-lead shape only: ``join |> group_by`` over a table lead declines (the upstream-join arm returns null) and reverse has no backward slot walk — those shapes cascade to tier-2 (see ``benchmarks/sql/LINQ_TO_TABLE.md``). **``to_table()`` sinks fuse** (table-buffer materializer row above): the chain inserts straight into the result table — a bare ``each_kv(tab).to_table()`` is a reserve-ahead table clone through the fused walk, and a ``keys(tab)`` chain lands in the ``table<K>`` set form. ``unsafe`` is required at an unfused chain head (the sources are ``[unsafe_outside_of_for]``); fused chains rewrite the head before inference.
    * - ``unsafe(from_json(jv, type<Row>))``
      - ``extract_json_source`` (``JsonAdapter``, ``daslib/linq_fold_json.das``)
      - In-tree source — the adapter is compiled in unconditionally (no ``static_if`` gate, unlike XML's pugixml one), but a program only pulls JSON into scope by requiring ``json`` / ``json_boost`` itself. ``extract_json_source`` matches a ``from_json`` whose first argument is a ``json::JsonValue?``, so a JSON-less program returns null and the chain falls to the array tier. The adapter pulls in **no** json dependency — it emits ``from_json`` / ``read_json_field`` by name (resolved at the user's splice site, like ``linq_fold_decs`` emits ``for_each_archetype``; ``from_JV`` is emitted only for a non-struct element type). Emits an inlined ``for (e in jv.value as _array)`` walk replacing the generator, and **field-prunes** the per-element materialization (pass 2b): only the keys the chain reads are pulled via ``read_json_field`` by name — unread keys (notably ``string`` fields whose materialization clones) are never touched, so a scalar-only chain skips ~all of the full per-row build (3.6× over the full materialize — see ``benchmarks/micro/json_source_shapes.das``). A whole-row escape reads **every** top-level field by name (``emit_full_row_by_name``), so a custom whole-row ``from_JV(Row)`` override is **not** honored (Option B — this is a flat query source, not a deserializer; materialize the array with an explicit ``from_JV`` first for that). ``unsafe`` is required (the source is ``[unsafe_outside_of_for]``). Deferred materialization mirrors XML: order/distinct/take buffer a cheap ``(orderKey, JsonValue?)`` surrogate and materialize only the K survivors — by name (``emit_full_row_by_name``), so a struct survivor reads each field by key; only a non-struct ``Row`` falls back to ``outBind <- from_JV(handle, type<Row>)``. 
The ``JsonAdapter`` also fuses ``join`` / ``join |> group_by`` (``emit_join_hook`` + ``JsonJoinAdapter`` off ``build_group_by_adapter``'s upstream-join arm), reusing the array-join machinery (``build_join_standalone_pieces`` / ``build_join_adapter_pieces``): srcB is collected into a ``table<KEY; array<TUPB>>`` and the field-pruned array walk is the probe side, so the join key reads only its own field per element (e.g. ``read_json_field(jcur, "brand", …)``). Standalone ``group_join`` and a trailing ``where`` / ``select`` / ``count`` over group-join rows defer to tier-2, mirroring XML.
@@ -497,6 +497,16 @@ see the table-source row above and the probe row below.
        so a join touching only ``c.value.*`` walks ``values(tab)``
        alone. All srcB modes compose (hashed array/iterator srcB,
        table-srcB probe); ``group_join`` stays outer over every slot.
+   * - ``unsafe(each_kv(tab)) |> _group_by(K) |> _select(reduce) |> ...`` (``keys`` / ``values`` lanes too; ``having`` / trailing ``where`` / ``order_by`` / ``count`` compose)
+     - pattern ``group_by`` → ``TableAdapter.build_group_by_adapter`` → ``plan_group_by_core``
+     - **Table-lead group_by**: ``build_group_by_adapter`` hands the
+       planner a fresh ``TableAdapter``, so the bucket-fill loop is
+       framed by ``wrap_source_loop`` and the kv usage-pruner sees the
+       whole accumulation body (key expr + reducer updates + upstream
+       where/select segments) — a group key over ``kv.value.brand``
+       walks ``values(tab)`` alone. ``join |> group_by`` over a table
+       lead declines (the upstream-join arm returns null) and cascades
+       to tier-2.
    * - ``arrA |> _group_join(arrB, on, into)`` (+ optional leading ``_where``)
      - pattern ``join_general`` with the ``group_join`` literal (``isGroupJoin``)
      - C# GroupJoin (**outer**): one result row per srcA row — ``result(a,
diff --git a/tests/linq/test_linq_table_source.das b/tests/linq/test_linq_table_source.das
index 8900c05f2..6c78bec35 100644
--- a/tests/linq/test_linq_table_source.das
+++ b/tests/linq/test_linq_table_source.das
@@ -619,3 +619,113 @@ def test_to_table_sink(t : T?) {
         delete arr
     }
 }
+
+[test]
+def test_table_group_by(t : T?) {
+    t |> run("kv lane: count + sum per group agrees with a hand loop") @(t : T?) {
+        var tab <- make_int_table(20)
+        let groups <- _fold(each_kv(tab)
+                            ._group_by(_.key % 3)
+                            ._select((K = _._0, N = _._1 |> count(), S = _._1 |> select($(kv : IKV) => kv.value) |> sum()))
+                            .to_array())
+        var expN : table<int; int>
+        var expS : table<int; int>
+        for (k, v in keys(tab), values(tab)) {
+            expN[k % 3] ++
+            expS[k % 3] += v
+        }
+        t |> equal(length(groups), length(expN))
+        for (g in groups) {
+            t |> equal(g.N, expN?[g.K] ?? -1)
+            t |> equal(g.S, expS?[g.K] ?? -1)
+        }
+        delete expN
+        delete expS
+        delete tab
+    }
+    t |> run("values lane: group key + reducer over struct values") @(t : T?) {
+        var tab <- make_pt_table()
+        let groups <- _fold(values(tab)
+                            ._group_by(_.x % 2)
+                            ._select((K = _._0, MaxY = _._1 |> select($(p : Pt) => p.y) |> max()))
+                            .to_array())
+        t |> equal(length(groups), 2)
+        for (g in groups) {
+            t |> equal(g.MaxY, g.K == 0 ? 40 : 30)
+        }
+        delete tab
+    }
+    t |> run("keys lane: group over raw keys") @(t : T?) {
+        var tab <- make_int_table(10)
+        let groups <- _fold(keys(tab)._group_by(_ % 4)._select((K = _._0, N = _._1 |> length)).to_array())
+        t |> equal(length(groups), 4)
+        var total = 0
+        for (g in groups) {
+            total += g.N
+        }
+        t |> equal(total, 10)
+        delete tab
+    }
+    t |> run("upstream where + select segments feed the bucket fill") @(t : T?) {
+        var tab <- make_int_table(20)
+        let groups <- _fold(each_kv(tab)
+                            ._where(_.key % 2 == 0)
+                            ._select(_.value)
+                            ._group_by(_ % 40)
+                            ._select((K = _._0, N = _._1 |> length))
+                            .to_array())
+        // even keys 0..18 -> values 0,20,..,180; % 40 buckets {0, 20}, 5 each
+        t |> equal(length(groups), 2)
+        for (g in groups) {
+            t |> equal(g.N, 5)
+        }
+        delete tab
+    }
+    t |> run("having + trailing where + order") @(t : T?) {
+        var tab <- make_int_table(12)
+        let groups <- _fold(each_kv(tab)
+                            ._group_by(_.key % 5)
+                            ._having(_._1 |> length >= 2)
+                            ._select((K = _._0, S = _._1 |> select($(kv : IKV) => kv.value) |> sum()))
+                            ._where(_.S > 60)
+                            ._order_by(_.S)
+                            .to_array())
+        // buckets 0..4 sizes 3,3,2,2,2; sums 150,180,90,110,130 -> having keeps all 5 (each size >= 2), where keeps all 5 (each S > 60)
+        t |> equal(length(groups), 5)
+        var prev = -1
+        for (g in groups) {
+            t |> success(g.S > 60 && g.S >= prev)
+            prev = g.S
+        }
+        delete tab
+    }
+    t |> run("count terminator counts groups") @(t : T?) {
+        var tab <- make_int_table(10)
+        t |> equal(_fold(each_kv(tab)._group_by(_.key % 3)._select((K = _._0, N = _._1 |> length)).count()), 3)
+        delete tab
+    }
+    t |> run("empty table yields no groups") @(t : T?) {
+        let e : table<int; int>
+        let groups <- _fold(each_kv(e)._group_by(_.key)._select((K = _._0, N = _._1 |> length)).to_array())
+        t |> equal(length(groups), 0)
+    }
+    t |> run("fused agrees with the tier-2 iterator path") @(t : T?) {
+        var tab <- make_int_table(15)
+        let fused <- _fold(each_kv(tab)
+                           ._group_by(_.key % 4)
+                           ._select((K = _._0, N = _._1 |> count()))
+                           .to_array())
+        var tier2 <- unsafe(each_kv(tab)) |> _group_by(_.key % 4) |> _select((K = _._0, N = _._1 |> length)) |> to_array()
+        t |> equal(length(fused), length(tier2))
+        var t2 : table<int; int>
+        for (g in tier2) {
+            t2[g.K] = g.N
+        }
+        for (g in fused) {
+            t |> equal(g.N, t2?[g.K] ?? -1)
+        }
+        delete t2
+        delete tier2
+        delete tab
+    }
+}