
linq late-stage cleanups: reducer shapes, adapter hygiene, match.das AST patterns (+ preflight pdflatex gate) #3110

Merged
borisbat merged 11 commits into master from bbatkin/match-ast-conversions
Jun 12, 2026
Conversation

@borisbat

Collaborator

Late-stage cleanup arc for linq (plan of record: benchmarks/sql/LINQ_TO_TABLE.md "Late stage"): user-facing reducer-shape fixes, adapter hygiene, and match.das adoption for AST matchers — plus the preflight side-quest (first mac run + a new docs/pdflatex gate).

Reducer shapes (items 1+2)

  • Selector overloads sum / min / max / average(src, selector) in daslib/linq.das (iterator + array forms) — the C# 2-arg spellings. first / count deliberately excluded: their C# 2-arg forms take a predicate, not a selector.
  • Untyped lambdas on the group-bucket surface: _._1 |> select(@(x) => x * 2) |> sum failed with 30303 on BOTH tiers, because the fully generic tier-2 params can't flow a type into an untyped lambda, and the fold macro reads ._type, so tier-2 must accept the spelling first. Fix: BucketLambdaStamper in linq_boost stamps the bucket element type onto untyped single-param lambdas of direct _._1 |> op(lam) calls before inference (same mechanism as the existing outer-param injection in visit()).
  • 2-arg recognizer arm in is_bucket_reducer_call — the direct-selector spelling fuses like the inner-select form; an identity lambda (max(@(v) => v)) canonicalizes to the bare reducer.
  • 18 new tests (tests/linq/test_linq_reducer_shapes.das), green on interp / JIT / AOT; fused-vs-unfused parity asserted.
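
The reducer semantics listed above can be modeled outside daslang. What follows is a minimal Python sketch, not the daslib/linq.das code: `max_sel`, `max_by`, `sum_sel`, `average_sel`, and `canonicalize` are hypothetical names, and the text-based identity check is an assumption that only mimics the idea of folding an identity selector into the bare reducer. The `<r>_inner_select` shape name follows the commit message.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

T = TypeVar("T")
K = TypeVar("K")


def max_sel(src: Iterable[T], selector: Callable[[T], K]) -> K:
    """2-arg reducer: returns the largest projection (C# Max(selector) style)."""
    return max(selector(x) for x in src)


def max_by(src: Iterable[T], key: Callable[[T], K]) -> T:
    """Element-returning form: returns the element whose key is largest."""
    return max(src, key=key)


def sum_sel(src: Iterable[T], selector: Callable[[T], float]) -> float:
    """2-arg sum: adds up the projected values."""
    return sum(selector(x) for x in src)


def average_sel(src: Iterable[T], selector: Callable[[T], float]) -> float:
    """2-arg average: mean of the projected values (src must be non-empty)."""
    values = [selector(x) for x in src]
    return sum(values) / len(values)


def canonicalize(reducer: str, selector_src: Optional[str]) -> Tuple[str, Optional[str]]:
    """Toy recognizer (hypothetical): map a reducer call to its canonical shape.

    An identity selector such as "v => v" collapses to the bare reducer;
    any other selector is treated like the inner-select form.
    """
    if selector_src is None:
        return (reducer, None)
    param, _, body = selector_src.partition("=>")
    if param.strip() == body.strip():
        return (reducer, None)
    return (reducer + "_inner_select", selector_src)
```

In this model `max_sel` returns the projected value while `max_by` returns the whole element, which mirrors the split the PR describes between the selector overloads and the element-returning form.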

Adapter hygiene (item 4, partial by design)

  • 4A: the upstream-join validation block was byte-identical in the Array / Json / Xml build_group_by_adapter arms — extracted extract_upstream_join_core + extract_upstream_join_array_srcb into linq_fold_common; decs keeps extract_decs_bridge and shares only the core.
  • 4B: the transitional getters arrayTop() / arraySrcName() renamed to loop_source_expr() / loop_source_name() with an honest contract: they're the generic-lane source feed (the shared array-shaped lanes emit for/length() against the name and read the expr for compile-time facts; null = decline), not an array-only leftover. linq_fold.md updated to match.
  • Deferred with reasons (recorded in LINQ_TO_TABLE.md): stringly-Captures → typed ChainView; per-source dispatch can't become a registry (macro modules compile into separate contexts).

match.das adoption (item 3) + ExprRef2Value transparency

  • daslib/match now peels ExprRef2Value for AST class patterns and $v captures (match_peel_r2v), mirroring qmatch's transparency rule — the typer's wrappers have no surface syntax, so a clean ExprField(...) pattern should match a wrapped source. An explicit ExprRef2Value(...) pattern still matches the wrapper itself. 7 new tests (tests/match/test_match_r2v_peel.das).
  • match_expr now accepts das_string fields compared against string locals (language-level compare; previously a false "mismatching expression type" error).
  • Converted the hand-rolled is/as/peel ladders that fit: the key-probe matchers in linq_fold_table and the bare-key join matcher in linq_fold_common (now one shared is_bare_key_ref); extract_key_probe's == leaf goes through qmatch. Declined with reasons: is_bucket_reducer_call (match is statement-shaped; the ladder isn't the function) and extract_decs_bridge (match.das array patterns reject das-vector scrutinees).
  • Toolbox docs: skills/das_macros.md gains a "match (daslib/match) — pairs with qmatch" section (division of labor, limits, canonical conversions) and the stale "qmatch auto-peel is TODO" claim is fixed; CLAUDE.md idiomatic-forms row added.
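
The transparency rule described above can be illustrated with a small model. This is a Python sketch under stated assumptions, not the daslib/match macro: the dataclasses only borrow the AST class names, and `peel_r2v`, `match_key_field`, and `match_explicit_wrapper` are hypothetical helpers standing in for what a pattern like ExprField(name="key", value=ExprVar(...)) would expand to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Expr:
    """Base class for the toy AST nodes."""


@dataclass
class ExprVar(Expr):
    name: str


@dataclass
class ExprField(Expr):
    value: Expr
    name: str


@dataclass
class ExprRef2Value(Expr):
    subexpr: Expr


def peel_r2v(e: Optional[Expr]) -> Optional[Expr]:
    """Strip every leading ExprRef2Value wrapper; None passes through."""
    while isinstance(e, ExprRef2Value):
        e = e.subexpr
    return e


def match_key_field(e: Optional[Expr]) -> Optional[str]:
    """Model of ExprField(name="key", value=ExprVar($v)).

    The source is peeled at every nesting level before the class test,
    so wrappers between the levels do not break the match.
    Returns the variable name on success, None otherwise.
    """
    outer = peel_r2v(e)
    if not isinstance(outer, ExprField) or outer.name != "key":
        return None
    inner = peel_r2v(outer.value)
    if not isinstance(inner, ExprVar):
        return None
    return inner.name


def match_explicit_wrapper(e: Optional[Expr]) -> bool:
    """Model of an explicit ExprRef2Value(...) pattern: no peeling applied."""
    return isinstance(e, ExprRef2Value)
```

A source wrapped at both levels, such as `ExprRef2Value(ExprField(value=ExprRef2Value(ExprVar("row")), name="key"))`, matches the clean pattern in this model, while the explicit wrapper pattern still sees the outer wrapper.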

preflight: first mac run + docs/pdflatex gate

  • Tool discovery off PATH: sphinx-build via ~/Library/Python/*/bin (mac pip --user) / ~/.local/bin, latexmk via /Library/TeX/texbin.
  • New gate 7 (docs/pdflatex): compiles both PDFs via latexmk -cd, deliberately stricter than CI. CI's PDF steps are continue-on-error, so an undeclared unicode char ships broken release PDFs without going red (the U+2261 incident). Locally it's a FAIL with a one-line remedy.
  • doc/source/conf.py declares 7 more unicode chars (≈ ∈ ∉ √ ∞ ∧ ∨) per the "support them as they come" policy. No Windows behavior change.
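
The "PATH wins, then known install locations" discovery order can be sketched as follows. This is a Python model, not the code in utils/preflight/main.das (which is daslang): `find_tool` and the two fallback lists are hypothetical names, and only the directory patterns themselves come from the PR text.

```python
import glob
import os
import shutil
from typing import Iterable, Optional

# Fallback directories named in the PR; the constant names are illustrative.
SPHINX_FALLBACKS = ["~/Library/Python/*/bin", "~/.local/bin"]
LATEXMK_FALLBACKS = ["/Library/TeX/texbin"]


def find_tool(name: str, fallback_dir_patterns: Iterable[str]) -> Optional[str]:
    """Return a path to `name`, preferring PATH over the fallback directories."""
    found = shutil.which(name)
    if found:
        return found  # PATH wins when the tool is present there
    for pattern in fallback_dir_patterns:
        # Expand "~" and wildcards such as the Python version directory.
        for directory in sorted(glob.glob(os.path.expanduser(pattern))):
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None  # caller decides whether a missing tool skips or fails the gate
```

A caller would use it as `find_tool("sphinx-build", SPHINX_FALLBACKS)` or `find_tool("latexmk", LATEXMK_FALLBACKS)`. What happens when the tool is missing is left to the caller here, since the PR text does not spell that policy out.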

Bench re-sweep

benchmarks/sql/results.md re-swept (m1–m7, INTERP + JIT) per the living-doc rule. No perf-path changes in this PR — deltas are run-to-run noise (JIT on Apple silicon is high-variance per the results.md preamble).

Also

  • skills/writing_tests.md documents the tests/.das_test directory filter and its root-path caveat (the filter only loads at the --test ROOT, so subtree sweeps walk gated dirs unfiltered), plus the current options no_aot AOT-hash limitation.
  • detect-dupe pass over the diff folded two findings (one bare-key matcher, shared preflight skip-gate helper).

Validation

  • preflight --full: 16 gates pass (format, lint, dasgen, ci-das, all seven docs gates incl. pdflatex, tests-cpp, interp, JIT, AOT, sequence smoke; cpp-syntax skipped — no C++ changed).
  • Full suites: interp / JIT / AOT green; new tests covered on all three tiers.

🤖 Generated with Claude Code

borisbat and others added 10 commits June 11, 2026 21:07
…oop-source getters

extract_upstream_join_core + extract_upstream_join_array_srcb in linq_fold_common
replace the ~30-line keya/keyb/result + srcB validation copied across the
Array/Decs/Json/Xml build_group_by_adapter arms (Json/Xml/Array were byte-identical;
Decs keeps its extract_decs_bridge srcB path).

arrayTop()/arraySrcName() -> loop_source_expr()/loop_source_name(): the 'transitional'
comment described a migration that should never finish — the pair IS the generic-lane
interface (counter/early-exit/dedup/order/hashed-join lanes key off it). Contract
documented on the base class + linq_fold.md.

Also drops two pre-existing unused 'require strings' (STYLE030) in touched files.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
First full-tier run on mac: 13 passed / 0 failed — the utility was already
cross-platform-clean except doc-tool discovery. sphinx-build probe now falls
back to ~/Library/Python/*/bin (mac pip --user) and ~/.local/bin when PATH
misses; latexmk falls back to /Library/TeX/texbin (BasicTeX). PATH wins when
present; Windows behavior unchanged.

New 7th doc gate docs/pdflatex compiles both PDFs via latexmk -cd, mirroring
CI's latex-action, but fatal where CI is continue-on-error, so an undeclared
unicode char fails at pre-push instead of shipping broken release PDFs
(the U+2261 incident class). One-line remedy documented in conf.py, whose
preamble also gains a verified margin of likely-next symbols.

Validated on mac: all 7 doc gates pass, both PDFs compile.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Late-stage items 1+2 (LINQ_TO_TABLE.md). Three pieces:

- linq.das: 2-arg projected-reducer overloads sum/min/max/average(src, selector)
  over arrays and iterators (C# Max(selector) parity — returns the projection;
  max_by stays the element-returning form). first/count excluded by design:
  their C# 2-arg forms take predicates, not selectors.

- linq_boost: BucketLambdaStamper — when the chain element is the group_by_lazy
  shape tuple<K, array<E>>, untyped single-param lambdas passed to direct
  <bind>._1 |> select/reducer calls get E stamped before the rewrite (same
  mechanism as the existing outer-param injection), so the previously opaque
  error[30303] spelling now compiles identically on fused and unfused paths.
  Root cause: tier-2 must type first — the fold macro reads ._type for accType,
  so the 30303 cascaded into _fold's 50503 decline.

- linq_fold_common: is_bucket_reducer_call accepts the 2-arg direct selector
  form as <r>_inner_select; an identity lambda canonicalizes to the bare
  reducer (max($(v) => v) ≡ max()).

18 new tests (tests/linq/test_linq_reducer_shapes.das) green on interp, JIT,
and AOT; full linq/json/xml/sqlite/decs suites pass (3977 tests).
linq_fold_patterns.rst group_by row updated with the accepted reducer
spellings.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Post-typer ExprRef2Value wrappers are invisible (no surface syntax), so AST
class patterns now peel them on the source side at every nesting level —
mirroring qmatch's rule. Without this, nested patterns like
ExprField(value = ExprVar(...)) fail against post-infer sources because the
wrapper sits between the levels, degrading every conversion to
capture-peel-rematch.

- match_struct's AST arm wraps the access in match_peel_r2v() before the
  null guard + is/as cast; an explicit ExprRef2Value(...) pattern still
  matches the wrapper itself (match.das can spell it; qmatch can't).
- $v captures of Expression-typed values bind the peeled node (consistent
  with qmatch's $e).
- match_peel_r2v ships both pointee flavors; das's const-correctness wall
  (a const-pointee pointer can't escape to a mutable copy) picks the right
  one per call site.

7 new tests in tests/match/test_match_r2v_peel.das; full match (61) and
flatten (245, the main match consumer) suites green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
The stage-B hand-rolled is/as/ExprRef2Value ladders collapse to nested class
patterns (riding the new ExprRef2Value transparency):

- match_key_probe_side's 4-deep peel ladder → is_key_ref, one match with an
  ExprField(name="key", value=ExprVar(...)) arm per lane (linq_fold_table).
- extract_key_probe's ==-leaf decompose → qmatch(leaf, $e(lhs) == $e(rhs)).
- join_keyb_is_bare_key (linq_fold_common) → same two-arm match shape.
- match.das: match_expr() gains the das_string == string exemption the
  constant path already had (an ExprVar name couldn't be compared against a
  runtime string local without it).

Assessed and declined: is_bucket_reducer_call (match is statement-shaped;
the tuple-returning recognizer with string-set alternation reads better
hand-rolled) and extract_decs_bridge (match.das array patterns reject
das-vector scrutinees — ExprCall.arguments / ExprBlock.list can't be
destructured; a fit only if the library grows das-vector patterns).

linq (1993) / match (61) / sqlite (904) suites green; point-lookup behavior
identical (71/71 table-source tests).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
skills/das_macros.md gains the 'match (daslib/match)' toolbox section — the
qmatch↔match division of labor (pattern-as-source-syntax vs node-class
destructuring), the feature set with the canonical key-probe example, and the
real limits (das-vector fields can't be destructured; statement-shaped match
doesn't fit tuple-returning recognizers). Also fixes the stale 'auto-peel
inside qmatch remains a TODO' claim — it landed (ast_match.das header,
test-pinned in test_field_typed_source.das) — and folds the peel guidance
into one 'ExprRef2Value transparency' section covering both matchers.

CLAUDE.md idiomatic-forms table points hand-rolled is/as ladders at the two
matchers. LINQ_TO_TABLE.md late-stage section updated with status + the
ast_match/match attribution fix.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… caveat

can_visit_folder gates no_aot/ast/ast_match dirs under --use-aot/-jit and
module dirs by availability — but only when the filter sits at the --test
ROOT. Sweeping a subtree (--test tests/flatten) bypasses it and produces
false 50101s from the deliberately interp-only dirs. Also notes the current
in-file 'options no_aot' AOT-hash breakage (master fix incoming).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ared skip-gate helper

is_key_ref (table point-lookup) was structurally identical to join_keyb_is_bare_key
(common, join keyb) — renamed the common one to is_bare_key_ref and pointed both
call sites at it. preflight's two docs skip helpers now share skip_gates().

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
No perf-path changes in this PR — deltas are run-to-run noise per the
living-doc rule (every PR touching linq_fold*/linq/linq_boost re-sweeps).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Public only so match-expanded code can call it cross-module; users never
call it directly, same as match_type/match_expr.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings June 12, 2026 04:45

Copilot AI left a comment


Pull request overview

This PR advances the “late stage” LINQ cleanup work by (1) expanding reducer call shapes to better match C# (selector overloads + canonicalization), (2) improving fold adapter hygiene via shared helpers and clearer adapter contracts, and (3) adopting daslib/match patterns (including ExprRef2Value transparency) for AST matchers. It also strengthens local tooling by adding a stricter docs/pdflatex preflight gate and updating docs/benchmarks accordingly.

Changes:

  • Add selector overloads for sum/min/max/average(src, selector) and update fold recognition to fuse direct-selector and inner-select forms (incl. identity-lambda canonicalization).
  • Make daslib/match peel ExprRef2Value transparently for AST class patterns and $v captures; convert selected hand-rolled matcher ladders in the LINQ fold modules to match/qmatch.
  • Enhance preflight docs tooling discovery and add a local docs/pdflatex gate; update docs and benchmark living docs.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| utils/preflight/README.md | Updates preflight documentation to reflect 7 docs gates. |
| utils/preflight/main.das | Adds tool discovery for sphinx-build/latexmk, introduces docs/pdflatex gate, and refactors skip helpers. |
| tests/match/test_match_r2v_peel.das | Adds tests ensuring daslib/match peels ExprRef2Value for AST patterns and captures. |
| tests/linq/test_linq_reducer_shapes.das | Adds tests for selector reducers, bucket-surface untyped lambdas, and reducer-shape parity. |
| skills/writing_tests.md | Documents .das_test directory filter behavior and AOT/JIT subtree caveat. |
| skills/preflight.md | Documents the new 7th docs gate (pdflatex) and tool discovery behavior. |
| skills/das_macros.md | Updates matcher guidance, adds match vs qmatch guidance, and documents ExprRef2Value transparency. |
| modules/dasPUGIXML/daslib/linq_fold_xml.das | Deduplicates upstream-join validation via shared helpers; renames adapter getter override to loop_source_name. |
| doc/source/reference/linq_fold_patterns.rst | Documents accepted reducer spellings and bucket-surface lambda stamping behavior. |
| doc/source/conf.py | Extends LaTeX unicode declarations and clarifies policy comments. |
| doc/reflections/das2rst.das | Hides match_peel_r2v in “Implementation details” for generated docs grouping. |
| daslib/match.das | Adds match_peel_r2v, peels ExprRef2Value for AST class patterns and $v captures, and relaxes match_expr string-like compares. |
| daslib/linq.das | Adds selector overloads for max/min/sum/average for iterator and array sources. |
| daslib/linq_fold.md | Updates adapter contract docs from arrayTop/arraySrcName to loop_source_expr/loop_source_name. |
| daslib/linq_fold_table.das | Adopts daslib/match, updates adapter getters, and simplifies key-probe recognition using shared bare-key matcher + qmatch. |
| daslib/linq_fold_json.das | Deduplicates upstream-join validation via shared helpers; renames adapter getter override to loop_source_name; removes unused strings require. |
| daslib/linq_fold_decs.das | Deduplicates upstream-join validation via shared helpers for the decs join-by-group-by adapter path. |
| daslib/linq_fold_common.das | Renames generic-lane source feed API, adds reducer-shape recognition updates (2-arg reducers + identity canonicalization), adds shared upstream-join extraction helpers, and converts bare-key matcher to match. |
| daslib/linq_fold_array.das | Updates adapter getters, deduplicates upstream-join validation via shared helpers; removes unused strings require. |
| daslib/linq_boost.das | Adds bucket-surface lambda stamping pass to type untyped lambdas over group buckets pre-inference. |
| CLAUDE.md | Updates guidance on replacing is/as/peel ladders with qmatch/match. |
| benchmarks/sql/results.md | Refreshes benchmark results sweep (living doc update). |
| benchmarks/sql/LINQ_TO_TABLE.md | Updates “Late stage” plan status and records what’s completed/deferred. |


Comment thread daslib/linq_fold_common.das Outdated
Copilot review: the hand-rolled single-if peel missed nested wrappers and
violated the house rule this PR documents (route through the peel helper).
A miss was benign (falls to the inner-select path, same result) but the
canonicalization now matches the conservative while-peel convention.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
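
The difference between the single-if peel and the while-peel convention can be shown in a few lines. This is a Python illustration with hypothetical names (`Node`, `peel_once`, `peel_all`), not the code from linq_fold_common, and it assumes, as the commit message implies, that wrappers can nest.

```python
from typing import Optional


class Node:
    """Toy AST node: `kind` is the class name, `subexpr` the wrapped child."""

    def __init__(self, kind: str, subexpr: "Optional[Node]" = None):
        self.kind = kind
        self.subexpr = subexpr


def peel_once(e: Optional[Node]) -> Optional[Node]:
    """Single-if peel: removes at most one ExprRef2Value wrapper."""
    if e is not None and e.kind == "ExprRef2Value":
        e = e.subexpr
    return e


def peel_all(e: Optional[Node]) -> Optional[Node]:
    """While-peel: keeps stripping until no ExprRef2Value wrapper remains."""
    while e is not None and e.kind == "ExprRef2Value":
        e = e.subexpr
    return e
```

For a doubly wrapped variable, `peel_once` still returns a wrapper while `peel_all` reaches the ExprVar node, so a matcher built on the single-if form misses the nested case.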

Copilot AI left a comment


Pull request overview

Copilot reviewed 23 out of 23 changed files in this pull request and generated 1 comment.

Comment thread daslib/match.das
Comment on lines +113 to +122
def public match_peel_r2v(var e : Expression?) : Expression? {
    //! Strip post-typer ``ExprRef2Value`` wrappers. The ``match`` macro emits this around the
    //! source side of AST class patterns so a clean pattern (``ExprField(...)``) matches a
    //! typer-wrapped source, mirroring ``qmatch``'s transparency rule. ``$v`` captures of
    //! Expression-typed values go through it too, so they bind the peeled node.
    while (e != null && e is ExprRef2Value) {
        e = (e as ExprRef2Value).subexpr
    }
    return e
}
@borisbat borisbat merged commit 96f4686 into master Jun 12, 2026
35 checks passed

2 participants