Skip to content

refactor: dedupe async/coroutine machinery (net −176 LOC)#63

Merged
ivarvong merged 10 commits into
mainfrom
refactor/dedupe-async-machinery
May 13, 2026
Merged

refactor: dedupe async/coroutine machinery (net −176 LOC)#63
ivarvong merged 10 commits into
mainfrom
refactor/dedupe-async-machinery

Conversation

@ivarvong
Copy link
Copy Markdown
Owner

Summary

Eight small-to-medium refactors of the async/coroutine machinery in lib/pyex/interpreter* plus two bonus cleanups. Each commit is independently reviewable and individually passes mix test + mix dialyzer.

Net change: 390 deletions, 214 insertions across 5 files. 5360 tests, 0 failures. Dialyzer clean throughout (and Unnecessary Skips: 0 after the last commit).

Commits, in landing order

  1. Collapse four :cont_call_*_resume frames into one — pos/kw/star/dstar shared a frame shape and differed only in how the sent value was folded into the partial-args state. One frame + an arg_slot() discriminator + a shared Calls.incorporate_value/6 helper replace four near-identical resume handlers and four near-identical eval clauses. −126 LOC.

  2. Type-separate coroutine signals from pyvalues in the yield channel — names Interpreter.coroutine_signal() for {:asyncio_sleep, _} / {:asyncio_capability_call, _, _, _} so the yield channel typechecks cleanly. Removes two .dialyzer_ignore.exs entries (:unused_fun and :pattern_match on asyncio.ex).

  3. Unify :gen_awaiting_capability into :gen_awaiting_send — the two iter pool entries had identical shape and used the same step_via_send resume; the only difference (advance policy) is now encoded in the value's type. One state, one set of clauses.

  4. Collapse drive_iter_to_completion and drive_after_capability — near-copy-paste drive loops differing only in advance helper and recursion target. Now a single function with a mode parameter (:buffered on entry, latches to :fresh after first cap dispatch).

  5. Atomic awaitable-iter allocationCtx.new_awaiting_capability_iterator/4 now takes a sentinel_builder closure that receives the freshly-allocated cap_id. Removes a direct Map.put into ctx.iterators from Interpreter.call_function.

  6. Name invoke + resume as Invocation.dispatch_capability/5 — small concept-naming commit. Honest about scope in the message: this is naming, not deduplication (the parallel path splits the two halves across host threads and can't use the combined helper).

  7. Delete dead :builtin_kw_raw code — the tag was never constructed anywhere in lib/ or test/. Both the call_function clause and the Invocation.call_builtin_kw_raw/5 helper that pattern-matched on it were unreachable.

  8. Drop 11 stale .dialyzer_ignore.exs entriesmix dialyzer --list-unused-filters reported these as no longer matching any current warning. Unnecessary Skips: 0 after.

Honest scope notes

  • I claimed a 9th refactor in my original plan (always-fresh trampoline, eliminate the cap-state peek workaround). Tracing it carefully showed sleep_coroutine constructs iters that genuinely depend on buffered :gen_pending semantics; fixing that requires restructuring how sleep iters are built, which is outside dedupe scope. Flagged in the commit message for refactor Property tests with Polars oracle + interpreter optimizations #4: asyncio.sleep(0.1) wall-clocks at ~220 ms on main because the buffered sentinel re-surfaces and Process.sleep runs twice. Pre-existing on main, not fixed here.

  • The unused_fun filter for lib/pyex/stdlib/asyncio.ex and pattern_match for the same file are both gone (Add crypto stdlib, default-deny network, boto3/sql capability gates #2). The remaining ignore list is now 17 entries, all of which Dialyzer confirms are still load-bearing.

Test plan

  • mix format --check-formatted
  • mix compile --warnings-as-errors
  • mix test — 5360 tests, 0 failures
  • mix dialyzer — 0 errors, 0 unnecessary skips

🤖 Generated with Claude Code

ivarvong and others added 10 commits May 11, 2026 09:32
The four call-arg resume frames (:cont_call_pos_resume,
:cont_call_kw_resume, :cont_call_star_resume, :cont_call_dstar_resume)
shared a frame shape and only differed in how the sent value was folded
into the partial-args state.  Each had a near-identical handler in
Pyex.Interpreter.resume_generator_with_send (~25 LOC × 4) and a
near-identical eval clause in Pyex.Interpreter.Calls.eval_remaining_and_call.

Collapse to a single :cont_call_resume frame carrying an arg_slot()
discriminator (:pos | {:kw, name} | :star | :dstar).  The slot-specific
folding logic moves into Calls.incorporate_value/6, shared by both
forward evaluation and the resume path so they cannot drift.

Mechanical: one frame replaces four, one handler replaces four, one
eval clause replaces four.  Behavior unchanged — same tests pass.

5360 tests, 0 failures.  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduces `Interpreter.coroutine_signal()` — a named type for the
internal control-flow tokens that flow through the yield channel
alongside user `pyvalue()`s:

  - `{:asyncio_sleep, ms}` (emitted by `asyncio.sleep`)
  - `{:asyncio_capability_call, cap_id, fun, args}` (emitted by a call
    to an `{:awaitable, _}` capability)

The `:yielded` signal's value slot widens from `pyvalue()` to
`pyvalue() | coroutine_signal()`, and `advance_coroutine_one_step` /
`interpret_yield_sentinel` specs widen to match.

This removes the two `.dialyzer_ignore.exs` entries for asyncio.ex
that were suppressing `:unused_fun` and `:pattern_match` warnings —
both fired because Dialyzer correctly noticed
`{:asyncio_capability_call, ...}` was not a `pyvalue()`, then ruled
the parallel-dispatch branch in `round_robin_gather` dead.  With the
sentinel shape named in the channel type, those branches typecheck
and the ignores are no longer needed.

No runtime behavior change.  5360 tests, 0 failures.  Dialyzer clean
(both ignores deleted, no new ones added).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two iter pool entries had identical shape `{tag, value, cont, gen_env}`
and overlapping mechanics — both used `step_via_send` to resume — differing
only in `advance_iter_one`'s policy: `:gen_awaiting_send` auto-advances
with `nil` (Python `next(g)` semantics), `:gen_awaiting_capability` surfaces
the sentinel and waits for the trampoline to call `resume_capability/4`.

After the previous commit named `coroutine_signal()`, that policy split is
encoded in the value's type itself.  The unified `:gen_awaiting_send` state
now dispatches at advance time:

  - value is `{:asyncio_capability_call, _, _, _}` → surface, don't advance
  - value is a `pyvalue()` → auto-advance with `nil`

Drops:
  - The `:gen_awaiting_capability` clause in `Ctx.iter_next`
  - The `:gen_awaiting_capability` variant from the `iter_next/2` spec
  - The standalone `:gen_awaiting_capability` clause in `advance_iter_one`

`Ctx.new_awaiting_capability_iterator/4` keeps its name (the function still
captures the awaitable-capability use case) but now produces a
`:gen_awaiting_send` entry.  `Invocation.resume_capability/4` pattern-matches
on the signal shape inside `:gen_awaiting_send` to verify it's the right kind
of paused iter.

No runtime behavior change.  5360 tests, 0 failures.  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two drive loops were a near-copy-paste pair: identical pumping
logic, differing only in (a) which advance helper they called
(`advance_iter_one` buffered vs `advance_iter_fresh`) and (b) which
they recursed into.

Collapse to a single `drive_iter_to_completion/4` that takes a `mode`
parameter (`:buffered` on entry, latches to `:fresh` after the first
capability dispatch).  A small `advance_by_mode/4` dispatches on the
mode tag.

Behavior preserved.  The buffered/fresh asymmetry in `advance_iter_one`
vs `advance_iter_fresh` is left intact — they handle different
construction shapes of `:gen_pending` (pre-staged sentinel iters from
`Ctx.new_generator_iterator/4` vs post-yield iters from
`set_gen_pending/5`), which is the underlying reason buffered semantics
still has a real use case.  Eliminating the two advance helpers would
require restructuring `sleep_coroutine` to use an unstarted-body
construction; out of scope for this refactor.

Noted in passing: the buffered path in the sequential trampoline
causes `asyncio.sleep(0.1)` to wall-clock at ~220 ms (Process.sleep
runs twice because the buffered sentinel re-surfaces after the body
exhausts).  Pre-existing on main, not introduced or fixed here.

5360 tests, 0 failures.  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously `Interpreter.call_function({:awaitable, _})` had to allocate
a capability iter with a placeholder `nil` sentinel (because the sentinel
embeds the iter's own id, which isn't known until allocation), then
reach directly into `ctx.iterators` with `Map.put` to re-stamp the entry
with the real sentinel.  That breach of `Ctx` encapsulation lived in the
hot path for every `{:awaitable, _}` call.

`Ctx.new_awaiting_capability_iterator/4` now takes a `sentinel_builder`
closure of arity 1 that receives the freshly-allocated `cap_id` and
returns the sentinel to store.  Allocation and entry-set happen
atomically inside `Ctx`; the caller never touches `ctx.iterators`.

Call site at `Interpreter.call_function({:awaitable, _})` drops from
~17 lines (allocate / build sentinel / re-stamp via direct Map.put) to
~8 lines (single allocation call).

5360 tests, 0 failures.  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ty/5

The sequential trampoline did `invoke_capability(...) → resume_capability(...)`
inline; that pair is now `dispatch_capability/5`, used by
`drive_iter_to_completion`.

Note on scope: I originally claimed this refactor would also unify three
loops (sequential drive, post-cap drive, and parallel gather).  After
the earlier commit collapsed the two drives, only one sequential call
site remains.  The parallel path in `Pyex.Stdlib.Asyncio` still calls
`invoke_capability` (inside `Task.async_stream`) and `resume_capability`
(in the result reduce) separately — they can't share `dispatch_capability`
because the fan-out splits them across host threads.  So this is more
about naming the concept than reducing duplication.  Net LOC: ~+22, all
docstring + spec.

5360 tests, 0 failures.  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`{:builtin_kw_raw, fun}` is referenced exactly twice — by its own
`call_function` clause in `Pyex.Interpreter` and by the helper it
delegates to (`Invocation.call_builtin_kw_raw/5`).  The tag is never
constructed anywhere in `lib/` or `test/`, so both are unreachable.

Pure deletion, no test or behavior change.

5360 tests, 0 failures.  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`mix dialyzer --list-unused-filters` reported these as not matching any
current warning — they were left behind as the underlying issues got
fixed in earlier work.  Pruned the entries (and the now-orphaned
comment blocks):

  - lib/pyex/interpreter.ex             :call_without_opaque
  - lib/pyex/interpreter/assignments.ex :pattern_match
  - lib/pyex/interpreter/assignments.ex :unused_fun
  - lib/pyex/interpreter/class_lookup.ex :call_without_opaque
  - lib/pyex/interpreter/format.ex      :pattern_match
  - lib/pyex/interpreter/iterables.ex   :pattern_match
  - lib/pyex/interpreter/unittest.ex    :pattern_match
  - lib/pyex/lambda.ex                  :pattern_match
  - lib/pyex/stdlib/jinja2.ex           :call_without_opaque
  - lib/pyex/stdlib/markdown.ex         :invalid_contract
  - lib/pyex/stdlib/requests.ex         :pattern_match_cov

After this commit `Unnecessary Skips: 0`.  5360 tests, 0 failures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The earlier "drop 11 stale entries" commit pruned ignores based on
`mix dialyzer --list-unused-filters` from local Elixir 1.20.  Two of
those entries are still live on CI's Elixir 1.19.5:

  - lib/pyex/interpreter.ex   :call_without_opaque (1 warning, line 171)
  - lib/pyex/stdlib/jinja2.ex :call_without_opaque (9 warnings)

The ignore file is intentionally version-multiplexed (per AGENTS.md and
the comment on the MapSet-opaque block).  Single-version "unused filters"
reports do NOT prove an entry is dead.  Restore the two entries; the
comment now spells out the rule.

After this fix, local reports `Unnecessary Skips: 2` (the two CI-only
entries — that's expected and intentional, not a regression).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Opening a PR is not the end of the task.  Strengthen the existing
guidance in both CLAUDE.md and AGENTS.md so the next agent does not
hand off to the user before CI is green:

  - After `gh pr create`, run `gh pr checks <PR>` and treat any failure
    as part of the same task.
  - For failing jobs, fetch the log via `gh run view <run-id> --log-failed`
    and address the failure (or document the explanation) before stopping.
  - Specifically: do NOT prune `.dialyzer_ignore.exs` based on local
    `--list-unused-filters` output — CI runs a different Elixir/OTP
    version where the ignore may still be live (see prior commit).

Triggered by failing CI on PR #63, where I declared the work done after
`gh pr create` and the user had to discover the failed Dialyzer job
themselves.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ivarvong ivarvong merged commit b8015c2 into main May 13, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant