Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 55 additions & 4 deletions benchmarks/sql/LINQ_TO_TABLE.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,22 @@ Sibling of [LINQ.md](LINQ.md) / [LINQ_TO_DECS.md](LINQ_TO_DECS.md). Plan of reco
`table<K;V>` / `table<K>` as the 6th `_fold` source, plus the `to_table` sink.
Edited in-place as PRs land.

Status: **stage 6 committed — arc complete** (to_table sink; stage 5 = join probe + table-lead
joins, 2742f6db2; stage 4 = point-lookup folds, ac441c4a0; stage 3 = `%linq!` table sources,
29d23baf6; stage 2 = TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba;
master's fixed-array rework merged in after stage 5, 1ab3e6a67).
Status: **stage 7 committed** (group_by fusion — `can_group_by` + `build_group_by_adapter` on
`TableAdapter`, riding `plan_group_by_core` with the usage-pruned slot walk as the bucket-fill
loop; stage 6 = to_table sink; stage 5 = join probe + table-lead joins, 2742f6db2; stage 4 =
point-lookup folds, ac441c4a0; stage 3 = `%linq!` table sources, 29d23baf6; stage 2 =
TableAdapter + m7, 571fe879e; stage 1 = `each_kv` builtin, 8751bb9ba; master's fixed-array
rework merged in after stage 5, 1ab3e6a67; the JIT inline slot walk landed separately, #3100).

Stage 7 findings:
- **Two overrides were the whole change**: the group_by splice pattern was already adapter-generic
(`can_group_by_source` gate → `build_group_by_adapter` → `plan_group_by_core`), so enabling
tables = `can_group_by() == true` + a fresh-`TableAdapter` `build_group_by_adapter`. The kv
usage-pruner sees the whole accumulation body (key expr + reducer updates + upstream
where/select segments), so a group key over `kv.value.brand` walks `values(tab)` alone.
- m7 `groupby_*` INTERP 144–201 → 30–50 ns/op (count 163→31, ~5×); JIT 44–73 → 8.4–11 (count
43.5→8.4, another ~5× — the fused emit rides #3100's inline slot walk). `join_groupby_*` stays
on the cascade (deferred edge below).

Stage 6 findings:
- **Tier-2 surface required for typing**: `_fold`'s argument must fully type before the macro
Expand Down Expand Up @@ -212,6 +224,41 @@ PR1 findings:

End of arc: `skills/linq.md` + linq docs mention the table source.

## Late stage (planned) — reducer shapes & general code hygiene

Cross-source cleanups; none are table-specific. Items 1–2 are user-facing reducer-shape fixes,
items 3–4 are codebase hygiene investigations (the linq_fold surface is workable but "a tad too
unwieldy" — the table adapter took several stages, and many fuses read as "add this hook, because
reasons" rather than falling out of the architecture).

1. **Identity-lambda reducers**: `_._1 |> max($(v) => v)` (also `min`/`sum`/`average`) fails with
30303 today — the untyped lambda can't infer on the tier-2 lazy-bucket surface, and
`recognize_reducer_specs` has no identity arm either. Fix both ends: recognize the identity
inner-select and canonicalize to the bare form (`max()`), and make the tier-2 generic accept
it so unfused chains agree.
2. **Untyped inner-select lambda params**: `_._1 |> select($(c) => c.value.price) |> sum()`
requires an explicit param type (`$(c : CarKV)`) — the lazy bucket's `select` doesn't flow the
element type into the lambda. Thread the type through so the annotation becomes optional;
today's explicit-type requirement is a usability trap (the error is an opaque 30303, not
"annotate the param").
3. **match.das adoption survey (linq_fold* + sqlite_* family)**: flatten_opt's move to
`daslib/match` bought both fewer lines and more readable matchers; the linq spine-walkers are
full of the same hand-rolled `is ExprCall` / `as` / null-guard chains. Prime candidates:
`match_key_probe_side` / `extract_key_probe` (manual `ExprRef2Value` peeling + `ExprField`
checks), `extract_*_source` gates, and the sqlite_linq chain decomposition. Survey first,
convert where the match form is a strict readability win.
4. **SourceAdapter interface audit**: all initially-wanted adapters now exist (11 classes:
Array/Zip/ArrayJoin/Decs/DecsJoin/Xml/XmlJoin/Json/JsonJoin/Table/ProjectedSource) — audit the
~20-method interface against real usage and remove the scaffolding it forces. Known smells:
`arrayTop()`/`arraySrcName()` are still marked "transitional … removed once all consumers move
into subclass methods" yet remain load-bearing (reserve hint reads them; adapters override
them with comments explaining which distant gate reads what); the `build_group_by_adapter`
upstream-join arm repeats ~30 lines of keyaLam/keybLam/resultLam validation across
Array/Decs/Xml/Json; two overlapping reverse hooks (`emit_reverse_skip_into_tail` vs
`emit_reverse_last_backward`); emit fns reconstruct `headCalls` from stringly-keyed captures
("mirrors emit_loop_or_count_lane_decs"). Goal: a new source should not need "several stages"
of hook-by-hook enablement for the standard fuse set.

## Risks / watch items

- **Mangler ICE 50609** (iterator element-const collision) — `each_kv` yields `-const` non-ref
Expand All @@ -224,6 +271,10 @@ End of arc: `skills/linq.md` + linq docs mention the table source.

## Deferred edges (named, not built)

- **`join |> group_by` over a table lead**: `TableAdapter.build_group_by_adapter` declines the
upstream-join arm (returns null → tier-2). The fix is a TableJoin analog of `ArrayJoinAdapter`
(lead loop from the pruned slot walk, srcB hash/probe from the stage-5 pieces); the
`join_groupby_*` m7 cells are the numbers it would improve. Revisit on demand.
- **Point-lookup conjunct extraction**: `where(kv.key == X && <residual>)` (incl. the collapsed
multi-where form) could probe and evaluate the residual on the probed element only. The matcher
currently declines compound predicates; add when a real chain wants it.
Expand Down
Loading
Loading