linq: tables (each_kv / keys / values) as the 6th _fold source + the to_table sink #3099
Merged
… lowering over an empty source
each_kv(tab) yields (key, value) named tuples — read-only copies, strict can_copy gate on the
value type (no clone fallback; matches insert's own gate). Explicit reject overloads for
table<K> void values ("iterate keys() instead") and dim-array values, which otherwise mis-bind
and cascade. Pure daslib: a generator zipping the keys/values builtin slot-walk iterators.
PR1 of the LINQ table-source arc (plan: benchmarks/sql/LINQ_TO_TABLE.md).
Also fixes a pre-existing generator-lowering bug exposed by the empty-table test: the yield-for
lowering emitted `loop &&= _builtin_iterator_first(...)` per source, short-circuiting first()
on later sources when an earlier one came up empty — but end_loop closes ALL sources, and
closing a never-opened container iterator unlocks a container whose lock magic was already
cleared ("table/array magic mismatch on unlock"). Reachable on master by any generator zipping
two lockable containers with the first one empty. Now emits `loop = first(...) && loop`,
matching SimNode_ForWithIterator's always-evaluate-first semantics.
Regression: tests/language/generator_zip_empty.das (written first, failed, now green).
Validation: full INTERP suite 10891/0 fail; AOT tests/language 1054 + tests/linq 1893; JIT lane
green on new files; lint (MCP + CI) clean; das2rst no stubs/Uncategorized; Sphinx clean.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
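The short-circuit described in this commit can be modeled in a few lines. This is a hedged Python sketch, not the code in `src/ast/ast_generate.cpp`: `LockedIter`, `open_short_circuit`, `open_always` and `end_loop` are hypothetical names, and the "lock" is reduced to a boolean flag.

```python
class LockedIter:
    """Toy container iterator: first() takes the lock, close() releases it."""

    def __init__(self, items):
        self.items = list(items)
        self.opened = False

    def first(self):
        self.opened = True              # models locking the container
        return len(self.items) > 0

    def close(self):
        if not self.opened:
            # models unlocking a container whose lock was never taken
            raise RuntimeError("magic mismatch on unlock")
        self.opened = False


def open_short_circuit(sources):
    # Old lowering: loop &&= first(src). first() is skipped once loop is False.
    loop = True
    for src in sources:
        loop = loop and src.first()
    return loop


def open_always(sources):
    # New lowering: loop = first(src) && loop. first() runs for every source.
    loop = True
    for src in sources:
        loop = src.first() and loop
    return loop


def end_loop(sources):
    # end_loop closes ALL sources, opened or not.
    for src in sources:
        src.close()


def run(open_fn):
    sources = [LockedIter([]), LockedIter([1, 2])]   # first source is empty
    open_fn(sources)
    try:
        end_loop(sources)
        return "ok"
    except RuntimeError as err:
        return str(err)
```

Both variants compute the same `loop` value (false, since the first source is empty); only the always-evaluate form leaves every source in a state that `end_loop` can safely close.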
…array rework merges in
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ench lane

daslib/linq_fold_table.das: TableAdapter + extract_table_source. each_kv/keys/values chain heads (name + table-typed-arg match) emit fused slot walks inside a single-param invoke binding the table. The kv lane usage-prunes the walk from the body's it.key/it.value reads: one side touched -> single-iterator keys()/values() walk (half the slot-skip work), both -> zipped two-iterator for, whole-pair escape -> named-tuple bind (copyable values only; non-copyable falls through and the surviving each_kv instantiation concept-asserts).

Bare count/long_count folds to O(1) length(tab); plain distinct over raw keys/kv elements is dropped (keys unique by construction; uniqueness-preserving prefixes only). group_by/join/reverse defer to tier-2 (staged: point-lookup folds, join probe; see benchmarks/sql/LINQ_TO_TABLE.md).

Notable mechanics: the qmacro grammar allows $i() only in the FIRST iterator slot of a multi-source for, so the kv zip header uses literal loop-var names (ZipAdapter's itA/itB trade); keys() yields non-const elements, so the engine-visible bind is a let-rebind (workhorse copy, free); the dispatcher clears removeConstant on cloned element types so the -const iterator spelling doesn't leak into buffer types and break push_clone unification.

benchmarks/sql/table.das: m7 lane (45 families, kv-form chains, order-insensitive guards) + fixture_table in _common + m7 column in _update_results + results.md re-sweep (2026-06-10). INTERP profile: pruned scans sit between array and XML (sum_aggregate 13.4 ns/elem vs array 2.1 / XML 54.3 / JSON 146.7; contains_match 6.6 keys-pruned); deferred markers groupby ~160-190 / join ~195-230 / reverse_take 58.7 flag the staged tier-2 cells.

tests/linq/test_linq_table_source.das: 24 fused-vs-hand-loop agreement tests across all lanes (count shortcuts, accumulators, early-exit, to_array slot-order, order/distinct/take, dropped-distinct correctness, values-distinct stays real, iterator-typed result, set form, tier-2 heads).

Docs: linq_fold_patterns.rst source row, linq_fold.md layout, LINQ_TO_TABLE.md findings.

Validation: full INTERP suite 10912/0 fail; AOT tests/linq 1914; JIT lane green; MCP + CI lint clean; Sphinx clean; full 6-lane bench sweep regenerated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
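The usage-pruning rule in this commit amounts to a small decision over which element fields the chain body reads. The sketch below is a Python analogy under stated assumptions: `plan_walk`, the `"it"` marker for a whole-pair escape, and the use of a `dict` as a stand-in for a table are all hypothetical; the real pruner works on the daslang AST.

```python
def plan_walk(reads):
    """Pick a walk from the (non-empty) set of element fields the body reads.

    reads is a set drawn from {"key", "value", "it"}; "it" stands for the
    whole pair escaping (for example, being pushed into a result as-is).
    """
    if "it" in reads:
        return "tuple"      # whole-pair escape: bind the named tuple
    if reads == {"key"}:
        return "keys"       # single-iterator keys() walk
    if reads == {"value"}:
        return "values"     # single-iterator values() walk
    return "zip"            # both sides touched: zipped two-iterator loop


def sum_values(tab):
    # Body reads only the value side, so only values are walked.
    total = 0
    for v in tab.values():
        total += v
    return total


def bare_count(tab):
    # A bare count needs no walk at all: it folds to the table length.
    return len(tab)


def distinct_keys(tab):
    # Keys are unique by construction, so a plain distinct is a no-op.
    return list(tab.keys())
```

For instance, a `sum` over `it.value` plans a `"values"` walk, while a predicate on `it.key` followed by a projection of `it.value` plans `"zip"`.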
… the plan doc
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…arg from_in

`from kv in tab` over table<K;V> → each_kv (kv.key/kv.value), table<K> set → keys, anything else → each (arrays unchanged, ast-verified identical emission). The reader can't tell an array from a table, so every untyped fused source now emits `from_in(src)` and FromInMacro dispatches by the inferred value type.

FromInMacro rejects switch from `return call` to macro_error + return null (the _sql idiom): returning the call report-ast-changes every pass and churns to the 50-pass infer cap (30507). The not-inferred arm also gates on isAutoOrAlias and doubles as the defer for local sources whose type settles a pass later.

Joins over tables already work on either side at tier-2 (tested both ways); cross/SelectMany over tables stays a named deferred edge in LINQ_TO_TABLE.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
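The three-way dispatch can be pictured as follows. This is a loose Python analogy only: FromInMacro dispatches at compile time on the inferred type, whereas this sketch checks the runtime type, and `from_in`, `KV`, and the `dict` / `set` / `list` stand-ins for `table<K;V>` / `table<K>` / arrays are assumptions made for illustration.

```python
from collections import namedtuple

# Named pair mirroring the kv.key / kv.value access described above.
KV = namedtuple("KV", ["key", "value"])


def from_in(src):
    """Return (lane, iterator) for an untyped source."""
    if isinstance(src, dict):
        # map-like table: iterate (key, value) named tuples
        return "each_kv", (KV(k, v) for k, v in src.items())
    if isinstance(src, (set, frozenset)):
        # set-like table: iterate keys only
        return "keys", iter(src)
    # anything else (arrays, ranges, ...): plain element iteration
    return "each", iter(src)
```

A query over a map then reads `kv.key` and `kv.value` without any annotation at the call site, while array sources keep going through the plain element lane.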
… → O(1) probe

try_table_point_lookup runs ahead of pattern dispatch in the table arm: any / keys-lane contains → key_exists, count → key_exists ? 1 : 0, first / first_or_default (± one trailing select) → an unsafe(tab?[X]) probe with the scan's exact semantics (panic on missing first, eagerly-bound default). Predicate-form any(p)/count(p) and either operand order match too.

X must be loop-invariant AND side-effect free: the scan evaluates X per element, a probe once; a regression test pins per-element evaluation for an impure X. Compound && predicates (incl. collapsed multi-where) decline the probe; conjunct extraction is a named deferred edge in LINQ_TO_TABLE.md.

m7 INTERP: point_lookup 0.0 ns/elem vs point_lookup_scan 8.4 (the same query forced through the walk); results.md re-swept.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
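Why the fold requires a side-effect-free `X` is easiest to see by counting evaluations. The following Python sketch is an illustration under assumptions: `any_scan`, `any_probe` and `make_counter` are hypothetical names, a `dict` stands in for the table, and `X` is passed as a callable so that each evaluation is observable.

```python
def any_scan(tab, x):
    # Scan form: where(kv.key == X) |> any(). X is evaluated per element.
    for k in tab:
        if k == x():
            return True
    return False


def any_probe(tab, x):
    # Probe form: a single key-existence check. X is evaluated once.
    return x() in tab


def make_counter(value):
    """Build an 'impure' X that records how many times it was evaluated."""
    calls = [0]

    def x():
        calls[0] += 1
        return value

    return x, calls
```

With a pure `X` the two forms agree on the result and differ only in cost. With an impure `X` the evaluation count differs, which is the observable behavior that makes the fold decline.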
…srcB key probe

Stage 5 of the table arc (benchmarks/sql/LINQ_TO_TABLE.md). Two halves:

1. Lead generalization: emit_array_join takes its lead loop, bind name, and lead invoke-param spelling from the adapter (wrap_source_loop / bind_name / new SourceAdapter.invoke_param_type), so TableAdapter sets can_join=true and routes emit_join_hook to the same emitter. Table-lead joins walk the kv usage-pruned slot iterators (a join touching only c.value.* walks values(tab) alone), group joins stay outer over every slot.
2. Table-srcB probe: a join whose srcb is each_kv(tab)/keys(set) joined on its bare key skips the internal table<KEY; array<TUPB>> + build loop. srcB binds the user's table and the per-A probe is a key lookup, usage-pruned like the point-lookup fold (count/key-only -> key_exists, value shapes -> by-ref bind off tab?[k], whole-pair -> kv tuple). Unique table keys make probe == hash semantics exactly; non-bare keybs and group joins keep the hashed build.

Per-pair statements factored into build_join_pair_core, shared by build_join_standalone_pieces (group-join arm + bucket wrap unchanged for the decs/xml/json callers) and the new build_join_probe_pieces.

m7 sweep: join_count 195.0 -> 65.6 ns/elem INTERP, join_where_count 229.1 -> 81.4; new join_probe 47.3 vs join_probe_build 79.1 (probe ~1.7x on identical rows).

Tests: fused-vs-hand-loop agreement both leads, probe shapes, declines (non-bare keyb, group join), %linq! set-srcB + into forms. INTERP 10947/0, AOT+JIT linq 1949/1949, Sphinx -W clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
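The claim that a probe matches the hashed build when srcB is a table joined on its bare key can be checked on a toy model. This Python sketch is an assumption-laden analogy: `join_hashed` and `join_probe` are hypothetical names, a `dict` stands in for the user's table, and the result shape `(lead_row, value)` is chosen only for the example.

```python
def join_hashed(lead, srcb, key_a):
    # Generic inner join: build key -> list-of-rows buckets from srcB,
    # then look each lead row up in the buckets.
    buckets = {}
    for k, v in srcb.items():
        buckets.setdefault(k, []).append(v)
    out = []
    for a in lead:
        for v in buckets.get(key_a(a), []):
            out.append((a, v))
    return out


def join_probe(lead, srcb, key_a):
    # Probe mode: srcB is already keyed by the join key, so skip the
    # build loop and look the key up in the table directly.
    out = []
    for a in lead:
        k = key_a(a)
        if k in srcb:
            out.append((a, srcb[k]))
    return out
```

Because table keys are unique, every bucket in `join_hashed` holds exactly one row, so the two functions return the same rows in the same lead order; the probe version simply avoids building the intermediate buckets.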
…each-kv
# Conflicts:
# daslib/builtin.das
…on in the plan doc
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ee tier-2 forms

Stage 6 of the table arc (benchmarks/sql/LINQ_TO_TABLE.md), closing the arc. Two layers:

1. Tier-2 surface (daslib/linq.das): selector-free to_table over iterators and arrays: iterator<tuple<K;V>> -> table<K;V> map, iterator<K> -> table<K> set, plus borrowing array forms with reserve. Iterator params are const-qualified (the 50609 mangler-ICE defuse) so each_kv's -const flavor and to_sequence's -& flavor converge on one instantiation. Duplicate keys keep the last occurrence (das insert semantics, not C#'s throw).
2. Fused emit: to_table joins loop_terminator_family + the ARRAY materializer lane; the new arm rides emit_fold_array_lane via FoldArraySpec.bufDeclStmt (table buffer instead of the array decl), with where/select/ranges plumbing all shared. A (k => v) MakeTuple projection splits so key and value evaluate exactly once; other projections bind to a local; pass-through spells the kv access with the element tuple's real field names so the kv usage-pruner maps them. Reserve fires on unfiltered walks only (table over-reserve is worse than an array's slack), with the take-min variant. Map-vs-set falls out of the resolved terminator type.

Declines that keep tier-2: the 3-arg selector form, decs sources (explicit guard: the decs lane's implicit-to_array fall-through would mis-emit an array for a table-typed expr).

m7: to_table 32.5 vs to_table_staged (materialize + builtin to_table_move) 68.3 ns/elem INTERP (28.8 vs 41.6 JIT). 13 new tests (58/58 in the arc file); full INTERP 10978/10984 0 failed, AOT linq 1962/1962, JIT linq 1962/1962, Sphinx -W clean. results.md re-swept (82 families); skills/linq.md gains the table-source + to_table section (end-of-arc item).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
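The last-occurrence-wins rule and the fused-versus-staged distinction can be sketched as below. This is a Python model under assumptions, not the daslib implementation: `to_table`, `to_set`, `to_table_fused` and `to_table_staged` are hypothetical helper names, and a `dict` / `set` stand in for `table<K;V>` / `table<K>`.

```python
def to_table(pairs):
    # Map form: (key, value) pairs in, table out. A later insert with the
    # same key overwrites the earlier one, so the last occurrence wins.
    tab = {}
    for k, v in pairs:
        tab[k] = v
    return tab


def to_set(keys):
    # Set form: bare keys in, key-only table out.
    return set(keys)


def to_table_staged(src, pred, key_fn, value_fn):
    # Staged: materialize the filtered, projected pairs, then convert.
    buf = [(key_fn(x), value_fn(x)) for x in src if pred(x)]
    return to_table(buf)


def to_table_fused(src, pred, key_fn, value_fn):
    # Fused: one loop inserts straight into the result table; key and
    # value are each evaluated exactly once per surviving element.
    tab = {}
    for x in src:
        if pred(x):
            tab[key_fn(x)] = value_fn(x)
    return tab
```

Both shapes yield the same table; the fused loop only skips the intermediate buffer, which is the saving the `to_table` vs `to_table_staged` benchmark pair measures.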
…urces

The m7 column had 26 empty cells; only 7 were principled (zip_* x4 / cross_join: lockstep pairing over an unordered slot walk is meaningless; select_many: flat fixture; decs_count_bare_pred: decs-only). The rest were scoping debt:

- join_select / where_join_count: fuse today via the stage-5 join work; lanes simply hadn't been written. where_join_count lands at 46.8 ns/elem INTERP (lead-where pruned join); join_select 222.9 (iterator-typed join bail, tier-2).
- 12 groupby_* + join_groupby_count/to_array + order_reverse_normalized / reverse_take_select / reverse_distinct_by: instantiated as tier-2-cascade cells (table group_by fusion and a backward slot walk are named deferred edges); the cells now show the cost a future fix would improve.

to_table / to_table_staged gain m3f/m4/m5f/m6f lanes (only SQL stays absent: _sql has no table sink): array fuses at 18.7 vs 54.8 staged (~3x), XML 118.2 vs 144.8, JSON 144.3 vs 166.8; decs declines by design and its 144.0 vs 56.8 staged gap is the motivating number for a future decs sink hook.

results.md re-swept (all 82 families, m7 dashes 26 -> 7); missing-lanes prose rewritten to match.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Pull request overview
This PR makes table<K;V> / table<K> first-class _fold sources (via a new TableAdapter) and adds a selector-free to_table() sink that can fuse as a terminator, enabling better fusion and O(1) shortcuts/probes for common table-shaped LINQ patterns.
Changes:
- Add `each_kv` builtin iteration for tables (with compile-time rejections for unsupported value types) and fix generator zip lowering to always call `first()` for every source.
- Introduce `TableAdapter` for `_fold` to fuse `each_kv` / `keys` / `values` chains, including usage-pruned walks, point-lookup folding, and join optimizations (table-lead joins + table-srcB probe mode).
- Add selector-free `to_table()` tier-2 overloads and a fused `_fold` sink; expand tests, docs, and benchmarks to cover the new behavior and add an m7 “table” benchmark lane.
Reviewed changes
Copilot reviewed 29 out of 29 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| tests/linq/test_linq_table_source.das | New _fold-level correctness tests for table sources, joins, point-lookup folds, and to_table() fusion. |
| tests/linq/test_linq_das.das | Adds %linq! coverage for untyped table/table-set from dispatch and join scenarios. |
| tests/linq/failed_linq_das_table.das | Adds compile-fail coverage for table-source inference rejections in %linq!. |
| tests/language/table_each_kv.das | Adds language-level tests for the new each_kv iterator behavior. |
| tests/language/generator_zip_empty.das | Regression test for generator multi-source loop lowering with empty sources. |
| tests/language/failed_each_kv.das | Compile-fail coverage for each_kv rejection cases (non-copyable/void/dim-array values). |
| src/ast/ast_generate.cpp | Fixes generator lowering to ensure _builtin_iterator_first runs for each zipped source. |
| skills/linq.md | Documents table sources and selector-free to_table() usage at a high level. |
| doc/source/stdlib/handmade/function-builtin-each_kv-0xdb81e5ca7a0e3baa.rst | Adds builtin docs entry for each_kv. |
| doc/source/reference/linq_fold_patterns.rst | Documents new table-source adapter behavior, point-lookup folds, join probe mode, and to_table() sink pattern. |
| doc/source/reference/linq_das.rst | Updates %linq! docs to reflect from_in dispatch for untyped sources, including tables. |
| doc/reflections/das2rst.das | Adds each_kv to the containers grouping for docs generation. |
| daslib/linq.das | Adds selector-free to_table overloads for iterators/arrays (map + set forms). |
| daslib/linq_fold.md | Updates architecture notes to include the new linq_fold_table module. |
| daslib/linq_fold.das | Wires in linq_fold_table and adds dispatcher logic for table source extraction/splicing. |
| daslib/linq_fold_table.das | Implements TableAdapter, table-source extraction, point-lookup folding, and redundant-distinct dropping. |
| daslib/linq_fold_decs.das | Declines to_table for decs to avoid incorrect fallback behavior. |
| daslib/linq_fold_common.das | Adds invoke_param_type capability, to_table as a terminator lane, and join refactors + probe-mode support. |
| daslib/linq_das.das | Updates %linq! transpilation to use from_in(...) for untyped sources and extends FromInMacro for table dispatch. |
| daslib/builtin.das | Adds each_kv builtin + rejection overloads and docs comments. |
| benchmarks/sql/xml.das | Adds XML lane benchmarks for fused to_table vs staged baseline. |
| benchmarks/sql/table.das | New m7 lane benchmarks for table source fusion, probes, joins, and to_table. |
| benchmarks/sql/results.md | Extends benchmark report to include the new table lane and to_table results. |
| benchmarks/sql/LINQ_TO_TABLE.md | Adds/updates the “arc plan” document detailing design decisions and deferred edges for the table arc. |
| benchmarks/sql/json.das | Adds JSON lane benchmarks for fused to_table vs staged baseline. |
| benchmarks/sql/decs.das | Adds decs lane benchmarks for to_table (tier-2 cascade) vs staged baseline. |
| benchmarks/sql/array.das | Adds array lane benchmarks for fused to_table vs staged baseline. |
| benchmarks/sql/_update_results.das | Extends the results generator to include the new m7 lane and header. |
| benchmarks/sql/_common.das | Adds a shared fixture_table helper for the new m7 benchmark lane. |
The arc's linq_fold_patterns.rst additions use ≡ / ⇒ / × in prose; pdflatex halts on undeclared unicode (CI docs job failed on U+2261). conf.py's preamble is the documented place for these; verified locally via sphinx -b latex + pdflatex -halt-on-error pass 1.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
pull Bot pushed a commit to forksnd/daScript that referenced this pull request on Jun 11, 2026
A for-loop over keys(tab)/values(tab) (incl. the fused linq kv zips) compiled to a heap-allocated C++ TableIterator + first/next call per element per lane. Workhorse-keyed tables are open-addressed at every capacity (GaijinEntertainment#3025), so the walk is a flat ctrl-byte scan, now emitted inline: lock once, scan ctrl[slot] > CTRL_TOMBSTONE, keys copy the slot key out (past-end guarded, like the C++ iterator), values bind a pointer into the data block, close re-checks the data base (modified-during-iteration on shared/hopeless tables that bypass the lock) and unlocks. String / non-workhorse keys keep the generic iterator (different liveness regimes).

Detection: the daslib generics instantiate into the compiling module as builtin`keys`<hash>, matched by that compiler-generated prefix (the plain-name + module-$ check never fired; instances don't keep either). The skipped source call never allocates an iterator, mirroring count().

Glue: jit_table_lock/unlock (module_jit.cpp wrapping builtin_table_lock/unlock; engine mapping + DAS_API symbol for the exe/dll paths). LLVM_JIT_CODEGEN_VERSION 0x25 -> 0x26.

m7 JIT spot numbers (ns/elem): count/sum/max_aggregate 13.4 -> 7.3, chained_where 17.8 -> 10.4, join_count 33 -> 25.2, join_probe 24 -> 16.6, groupby_count ~160 -> 44.1, reverse_take ~70 -> 19.3, point_lookup_scan 6.0 -> 3.0, last_match -> 12.0. Full sweep + results.md refresh after the table-arc PR (GaijinEntertainment#3099) merges and this branch rebases onto it.

Gates: JIT tests/linq 1962/1962, tests/language 1054/1054, jit_tests + decs + json green, exe-build smoke links (the GaijinEntertainment#3025 dll-glob lesson), new tests/jit_tests/table_walk.das 8/8 INTERP+JIT (incl. is_jit_function firing checks, tombstones, by-ref values, break-unlock, locked-iteration panic, string-key fallback). CI lint clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
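The inline walk described in this commit is, in outline, a linear scan over control bytes. The Python sketch below is a rough model only: the `FlatTable` layout, the numeric values of `CTRL_EMPTY` and `CTRL_TOMBSTONE`, and the integer lock counter are assumptions for illustration; only the `ctrl[slot] > CTRL_TOMBSTONE` liveness test is taken from the commit text.

```python
CTRL_EMPTY = 0        # assumed encoding: never-used slot
CTRL_TOMBSTONE = 1    # assumed encoding: erased slot


class FlatTable:
    """Toy open-addressed table: parallel ctrl / keys / values arrays."""

    def __init__(self, ctrl, keys, values):
        self.ctrl = ctrl
        self.keys = keys
        self.values = values
        self.lock = 0


def collect_live(tab):
    """Lock once, scan every slot, keep the live ones, unlock on exit."""
    out = []
    tab.lock += 1                     # lock once for the whole walk
    try:
        for slot in range(len(tab.ctrl)):
            # A slot is live when its ctrl byte is above the tombstone mark.
            if tab.ctrl[slot] > CTRL_TOMBSTONE:
                out.append((tab.keys[slot], tab.values[slot]))
    finally:
        tab.lock -= 1                 # unlock even if the body exits early
    return out
```

There is no iterator object and no per-element call in this shape, which is the overhead the commit reports removing from the JIT lane.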
Tables become a first-class `_fold` source (`each_kv` / `keys` / `values` chains fuse through a new `TableAdapter`), plus the selector-free `to_table()` sink for every direct-loop source. Plan of record with settled decisions, per-stage findings, and named deferred edges: `benchmarks/sql/LINQ_TO_TABLE.md`.

Stage commits (each independently reviewable)

1. `each_kv` builtin (8751bb9): `(key, value)` named-tuple iteration over tables (strict can_copy gate, reject overloads for void/dim values). Also fixes a pre-existing generator-lowering bug: yield-for emitted short-circuiting `loop &&= _builtin_iterator_first(...)`, so an empty first source closed a never-opened iterator ("magic mismatch on unlock").
2. `TableAdapter` (571fe87): kv usage-pruned slot walks: a chain touching only `.value` walks `values(tab)` alone, key-only shapes walk `keys(tab)` alone; O(1) bare `count()`; redundant `distinct` over keys/kv dropped. New m7 bench lane.
3. `%linq!` table sources (29d23ba): `from kv in tab` dispatches `each_kv` (map) / `keys` (set) by argument type, no annotation needed.
4. `where(kv.key == X)` + `any` / `count` / `first[_or_default]` / `contains` folds the whole walk to an O(1) `key_exists` / `tab?[X]` probe when `X` is loop-invariant and side-effect-free; everything else keeps the scan (per-element evaluation of impure `X` is observable, covered by a regression test).
5. `emit_array_join` generalized to an adapter-driven lead loop (any direct-return source rides it), so table leads join through the pruned slot walk. A table srcB joined on its bare key (`d.key` / bare set element) skips the join's internal hash entirely and probes the user's table per lead row; unique table keys make the probe ≡ hash semantics exactly.
6. `to_table` sink (b72f625): fused insert-loop terminator (a `k => v` projection splits so each side evaluates once; reserve from O(1) length on unfiltered walks) + selector-free tier-2 forms over iterators and arrays. Duplicate keys keep the last occurrence (das insert semantics, not C#'s throw).

Mid-arc, master's fixed-array rework was merged in (1ab3e6a) and `each_kv`'s dim-array reject overloads re-validated against the new `auto(valT)[]` matching rules. A final bench pass (9331bbc) fills the m7 column (26 empty cells down to 7 genuinely-inapplicable ones) and lights `to_table` lanes across array/decs/XML/JSON.

Numbers (INTERP ns/elem, n = 100k; full matrix in `benchmarks/sql/results.md`)

- `sum` over values
- `point_lookup` vs same query as scan
- `join_count` (table lead)
- `join_probe` vs forced hashed build
- `to_table` vs materialize-then-convert

Gates

Full INTERP 10984 tests / 0 failed; full AOT 10304 / 0 failed; JIT `tests/linq` 1962/1962 on a clean cache + verifier smoke; CI lint 0 issues across all 20 changed `.das` files; detect-dupe clean (bench-lane parallelism only); das2rst no stubs / no Uncategorized; Sphinx -W clean; `results.md` re-swept on an idle machine.

🤖 Generated with Claude Code