feat: async/await as cooperative coroutines (Phase 1) + lazy itertools.islice #58

Merged: ivarvong merged 5 commits into `main` from `feat/async-coroutines` on May 10, 2026

@ivarvong commented May 9, 2026

Why

TODO.txt's audit ranked async/await as Tier-1 #1 — the single largest blocker for LLM-emitted code, since modern FastAPI handlers and agent loops are `async def` by default and one such function blew up Pyex at parse time. Lazy `itertools.islice` was Tier-1 #6 — a flagship correctness gap given the README's generators-as-continuations framing (`list(islice(infinite_gen, 5))` used to time out).

Constraint

Pyex is a pure function on the BEAM. No spawning, no global state, no `Process.*` (the `BannedCallTracer` checks every CI build). The runtime doesn't own concurrency on the tenant's behalf — the host application does. So whatever "async" looks like, it can't reach for BEAM processes inside Pyex itself.

Decision

Coroutines as tagged sync functions, driven by a synchronous trampoline.

  • The `:function` pyvalue grew a `kind` field (`:sync` | `:async`) — a single discriminator instead of a parallel `:async_function` tag that would have required N parallel pattern-match clauses across attribute lookup, bound-method construction, callable?, repr, etc.
  • Calling an async function binds parameters and returns a `:coroutine` value without running the body. `await` and `asyncio.run` drive that coroutine via `Pyex.Interpreter.Invocation.drive_coroutine/3`.
  • Async generators (`async def` + `yield`) ride the existing `:lazy_iter` machinery, so FastAPI streaming patterns work transparently.
  • Strict on shape: `await 42` and `asyncio.run(99)` raise a CPython-shaped `TypeError`. The error message includes a hint about the most common LLM mistake (forgetting to call the async function).

This trades fan-out parallelism for simplicity and determinism: results match CPython, but wall-clock time is sequential when fan-out matters. Phase 2 territory is a host-driven trampoline that lets the Elixir application actually overlap awaitable capabilities (HTTP, DB) as real BEAM Tasks. The wire shape for that lives in the divergence tests, not in a speculative API.
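
To make the decision concrete, here is a minimal plain-Python model of the idea. It is not Pyex's Elixir code: `Function`, `Coroutine`, `call_function` and `drive_coroutine` are hypothetical stand-ins that only mirror the names used in this description, and the error wording is merely modelled on CPython's. The sketch shows a single `kind` discriminator deciding whether a call runs the body or returns an un-run coroutine, which a synchronous driver later runs to completion.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Function:
    name: str
    body: Callable[..., Any]
    kind: str = "sync"  # "sync" or "async": one discriminator, one value shape


@dataclass(frozen=True)
class Coroutine:
    name: str
    body: Callable[..., Any]
    args: tuple


def call_function(fn: Function, *args: Any) -> Any:
    """Sync functions run immediately; async ones return an un-run coroutine."""
    if fn.kind == "async":
        return Coroutine(fn.name, fn.body, args)
    return fn.body(*args)


def drive_coroutine(value: Any) -> Any:
    """Synchronous driver: run the coroutine body to completion, strictly."""
    if not isinstance(value, Coroutine):
        raise TypeError(
            f"object {type(value).__name__} can't be used in 'await' expression"
        )
    return value.body(*value.args)


# Hypothetical async functions; "await" is modelled as drive_coroutine(...).
fetch = Function("fetch", lambda user_id: {"id": user_id, "name": "Ada"}, kind="async")
handler = Function(
    "handler", lambda req: drive_coroutine(call_function(fetch, req)), kind="async"
)

pending = call_function(handler, 42)  # nothing has run yet
result = drive_coroutine(pending)     # the body runs here, synchronously
```

Because the discriminator lives inside one value shape, code that only cares about "is this callable?" never has to distinguish the two kinds; only the call site does.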

What ships

```python
import asyncio

async def fetch(user_id):
    return {"id": user_id, "name": "Ada"}

async def handler(req):
    user = await fetch(req)
    return user

asyncio.run(handler(42))
# → {'id': 42, 'name': 'Ada'}
```

  • Parser: `async def`, `async for`, `async with`, `await` (at unary precedence)
  • Interpreter: kind-aware dispatch, strict await, bound-method routing for async instance / static / class methods
  • `asyncio` module: `run`, `gather`, `sleep`, `create_task`, `ensure_future`, `wait_for`, `iscoroutine`, `iscoroutinefunction`
  • Real `Task` value with `.result()` / `.done()` / `.cancel()` / `.exception()`
  • `gather(return_exceptions=True)` returns real exception instances (not strings) so `isinstance(r, ValueError)` works
  • Async generators via the existing lazy_iter machinery
  • 52 conformance tests in `test/pyex/async_conformance_test.exs`
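
As an illustration of the `gather(return_exceptions=True)` item above, the following is ordinary CPython `asyncio` code of the kind that bullet is meant to support. The function names `ok` and `boom` are made up for the example, and the actual conformance tests are not reproduced here.

```python
import asyncio


async def ok():
    return 1


async def boom():
    raise ValueError("bad input")


async def main():
    # With return_exceptions=True, failures come back as values in the result
    # list instead of propagating out of gather.
    return await asyncio.gather(ok(), boom(), return_exceptions=True)


results = asyncio.run(main())
failed = [r for r in results if isinstance(r, ValueError)]
```

If the captured exception were returned as a string, the `isinstance` filter would silently find nothing, which is why returning real instances matters.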

Phase 1 divergences from CPython (pinned in tests)

  • `gather` is sequential, not interleaved at await points
  • `create_task` drives eagerly rather than scheduling
  • Nested `asyncio.run` is silently allowed (CPython errors)
  • Async list comprehensions (`[x async for x in g()]`) not yet parsed

Each is a real test that demonstrates the divergence rather than asserting behavior identical to CPython. When a future Phase 2 changes any of these, the corresponding test will flip — exactly the signal we want.
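
A sketch of what a shared-state program for the first divergence might look like, written as standard CPython `asyncio`. The names `worker` and `log` are illustrative and are not taken from the test suite. On CPython the two workers interleave at each `await asyncio.sleep(0)`, so the log reads `ABABAB`; the commit message for the conformance suite reports `AAABBB` for Pyex Phase 1.

```python
import asyncio

log = []


async def worker(tag):
    for _ in range(3):
        log.append(tag)
        # Yield to the event loop so the other worker can run.
        await asyncio.sleep(0)


async def main():
    await asyncio.gather(worker("A"), worker("B"))


asyncio.run(main())
order = "".join(log)  # "ABABAB" on CPython
```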

Bonus: lazy `itertools.islice`

Independent fix shipped as the first commit. `islice` over an infinite generator used to time out because the iterator was materialized into a list before the builtin ran. Fixed by registering `islice` in the no-drain set and converting it to an `:islice_call` signal evaluated by a bounded-step iterator handler. Same shape extends naturally to `takewhile`/`dropwhile`/`filterfalse` if those become load-bearing.
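
The bounded-step idea can be sketched in plain Python. This is an assumption-laden illustration, not the Elixir handler: `bounded_islice` and `naturals` are hypothetical names, and the sketch only shows that slicing an infinite generator needs to pull at most `stop` items rather than materializing the source first.

```python
import itertools


def naturals(pulls):
    """Infinite generator that records every pull in the `pulls` list."""
    n = 0
    while True:
        pulls.append(n)
        yield n
        n += 1


def bounded_islice(iterable, start, stop, step=1):
    """Yield items at indexes start, start+step, ... below stop.

    Pulls at most `stop` items from the source, so it terminates even when
    the source never ends.
    """
    iterator = iter(iterable)
    next_wanted = start
    for index in range(stop):
        try:
            item = next(iterator)
        except StopIteration:
            return  # the source ended before reaching `stop`
        if index == next_wanted:
            yield item
            next_wanted += step


pulls = []
first_five = list(bounded_islice(naturals(pulls), 0, 5))  # [0, 1, 2, 3, 4]
strided = list(bounded_islice(naturals([]), 2, 10, 3))    # [2, 5, 8]
```

The strided call yields three items, which matches `ceil((stop - start) / step)` for `start=2, stop=10, step=3`.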

Commits

  • `itertools.islice: stop cleanly on infinite generators` — independent fix
  • `refactor: add :sync | :async kind discriminator to function values` — pure prep, no behavior change (5295 tests pass before/after)
  • `feat: async/await as cooperative coroutines` — the substance
  • `test: async conformance suite (52 tests, sharp coverage)` — including pinned divergences
  • `docs: README, TODO, CHANGELOG for async/await + islice`

Test plan

  • `mix format --check-formatted` clean
  • `mix compile --warnings-as-errors` clean
  • `mix test` — 5344 tests, 0 failures, 2 skipped
  • `mix dialyzer` — 41 errors / 41 skipped (baseline unchanged)
  • Phase 1 divergence tests demonstrate real divergences, not CPython-identical behavior
  • `asyncio.run` strict on non-coroutine input
  • `await` strict on non-awaitable input
  • `gather(return_exceptions=True)` returns real exception instances (`isinstance(r, ValueError)` works)
  • `Task.done()`/`.result()`/`.cancel()`/`.exception()` work
  • Async methods (instance, @staticmethod, @classmethod, subclass override)
  • `async for` over sync iterables and async generators
  • FastAPI-shaped async handler round-trip

🤖 Generated with Claude Code

ivarvong and others added 5 commits May 9, 2026 16:57

Calling list(islice(g(), 5)) against `def g(): while True: yield i`
used to time out — Pyex.Interpreter.Invocation.maybe_drain_args
was draining the iterator into a list before the builtin ran, and
the drain never finished.

Fix is in three pieces:

  1. itertools.islice is registered in Builtins.no_drain_builtin_funcs,
     so an :iterator argument flows through to do_islice unchanged.
     The capture comes from a public Itertools.islice_capture/0 to
     guarantee MapSet identity (a local &do_islice/1 inside Itertools
     and an external &Itertools.do_islice/1 are distinct function
     values on the BEAM).
  2. do_islice grows :iterator clauses that return an :islice_call
     signal {iter, start, stop, step}. List/tuple/range arguments
     stay on the eager path.
  3. BuiltinResults.eval_islice steps the iterator at most
     ceil((stop - start) / step) times via Ctx.iter_next +
     step_generator, normalizing :list / :gen_pending /
     :gen_awaiting_send / :instance into a uniform advance-iter loop.

Same shape extends naturally to takewhile / dropwhile / filterfalse
if any of those become load-bearing for an LLM-emitted workload.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Pure preparation for async support, no behavior change.  The
:function pyvalue grows a 7th element naming its kind (:sync today;
:async lands in the next commit).

Why a discriminator vs. a parallel :async_function tag: every
dispatch site that pattern-matches on :function — attribute lookup,
bound-method construction, callable?, py_repr/py_type, lambda's
FastAPI handler path, abc.abstractmethod, dataclass detection,
helpers' refresh_closure / update_closure_env / function_attr —
would have needed a parallel clause for :async_function.  Adding a
field instead lets one set of patterns dispatch polymorphically and
makes "is this thing async?" a property check rather than a tag
comparison.

Mechanically: 65 sites across 11 files.  All-underscore patterns
(`{:function, _, _, _, _, _}`) become `{:function, _, _, _, _, _, _}`
via sed; named-binding patterns (`{:function, name, params, body, env,
is_generator}`) get `kind` added by hand so destructure-and-rebuild
sites preserve the original kind.  Two creation sites — `def`
evaluation and lambda — produce `:sync`.  __init__ rebuilds with
`:sync` (async __init__ would need its own design).

Suite: 5295 tests, 0 failures (one fastapi_test pattern needed an
extra `_kind`).  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Coroutines are sync-defined functions tagged kind: :async.  Calling
one binds parameters into the closure environment and returns a
{:coroutine, name, body, call_env} value without running the body.
`await` drives that coroutine via Invocation.drive_coroutine/3 — a
synchronous trampoline that runs the body to completion in its
captured env and unwraps the return value.  Async generators ride
the existing :lazy_iter machinery (kind :sync, is_generator true)
so FastAPI streaming patterns work transparently.
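
For reference, the streaming shape this paragraph refers to looks roughly like the following in ordinary Python. The names `stream_tokens` and `collect` are invented for the example, and it deliberately uses an `async for` body rather than an async comprehension, since the latter is listed as not yet parsed.

```python
import asyncio


async def stream_tokens(text):
    # `async def` + `yield` makes this an async generator.
    for word in text.split():
        yield word


async def collect(text):
    out = []
    async for token in stream_tokens(text):
        out.append(token.upper())
    return out


tokens = asyncio.run(collect("hello async world"))
# → ['HELLO', 'ASYNC', 'WORLD']
```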

## Surface

  * Parser: `async def`, `async for`, `async with`, `await`.  The
    four NotImplementedError sites are gone; bare `async` (without
    def/for/with) gets a clear SyntaxError.  `await` parses at unary
    precedence to match CPython.
  * Interpreter: eval({:def, meta, ...}) honors meta[:async]; the
    new call_function clause for kind=:async builds a coroutine.
    eval({:await, ...}) drives via Invocation.drive_coroutine,
    which is strict on shape (TypeError on non-awaitables, matching
    CPython).
  * Bound-method dispatch: Invocation.call_bound_method routes
    :async methods through call_function with self prepended so
    `instance.method()` returns a coroutine rather than a value.
  * asyncio module: `run`, `gather`, `sleep`, `create_task`,
    `ensure_future`, `wait_for`, `iscoroutine`,
    `iscoroutinefunction`.
  * `gather(return_exceptions=True)` wraps captured exceptions as
    real {:instance, exception_class, ...} values built via
    Interpreter.exception_instance_class, so callers can do
    `isinstance(r, ValueError)` against gather results.
  * `gather` returns a Task (not a bare list) so the canonical
    `await asyncio.gather(...)` works against strict await.
  * `asyncio.create_task` / `ensure_future` drive the coroutine
    eagerly and wrap the result in a Task that reports done() →
    True, result() → value, cancel() → False.  Methods exposed via
    Pyex.Methods.resolve.

## Strict by default

`await 42` and `asyncio.run(99)` both raise TypeError now (CPython
parity).  The earlier "permissive sync-sugar tolerance" hid bugs
LLM-emitted code should learn to fix.  asyncio.run's error message
includes a hint about forgetting to call the async function — the
most common LLM mistake.
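
The mistake in question is easy to reproduce in CPython. The sketch below uses invented names (`fetch`, `forgot_to_call`, `correct`, `outcome`) and checks only the exception type, because the exact message text differs between CPython versions.

```python
import asyncio


async def fetch():
    return "ok"


async def forgot_to_call():
    # Bug: awaits the function object itself instead of the coroutine it returns.
    return await fetch


async def correct():
    return await fetch()


def outcome(coro):
    """Run a coroutine and report either its value or the TypeError name."""
    try:
        return asyncio.run(coro)
    except TypeError as exc:
        return type(exc).__name__


broken = outcome(forgot_to_call())  # "TypeError"
fixed = outcome(correct())          # "ok"
```

A permissive runtime would let the broken version pass silently, and the caller would receive a function object instead of a result.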

## Deliberate non-additions

  * No Pyex.run_async/2.  The earlier draft reserved a
    {:done, value} | {:suspended, frame, awaiting} contract with no
    code path producing :suspended.  Adding it back when there's a
    real second use case (a host-driven event loop) avoids
    speculative API design.
  * gather is sequential, not concurrent.  Same answer as CPython,
    slower wall-clock when fan-out matters.  Phase 2 territory.
  * Async list comprehensions ([x async for x in g()]) not yet
    parsed.  Workaround: build via async-for body, or consume the
    async gen via sync `for` (works because async generators ride
    lazy_iter).

5292 tests, 0 failures.  Dialyzer clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

One test per distinct behavior — no micro-variations.  Organized so
a reader can scan the describe blocks and see what Phase 1 promises.

The "Phase 1 divergences from CPython (pinned)" describe block is
the important one for honesty: each test demonstrates a real
observable divergence and pins it.  When Phase 2 changes the
behavior (real interleaving, scheduled tasks, etc.) those tests
will flip, which is exactly the signal we want.

  * gather is sequential — pinned with shared-state interleaving
    (CPython gives ABABAB; Pyex gives AAABBB)
  * create_task drives eagerly — pinned with side-effect ordering
    (CPython prints "after-create" before "ran"; Pyex prints "ran"
    first because the task is already done)
  * nested asyncio.run is silently allowed — CPython errors with
    "asyncio.run() cannot be called from a running event loop";
    Pyex Phase 1 has no concept of "running event loop"
  * async list comprehensions ([x async for x in g()]) parser-rejected

Plus: asyncio.sleep returns a Task wrapping nil so the canonical
`await asyncio.sleep(t)` works against strict await.
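
The side-effect ordering used to pin the `create_task` divergence can be sketched as follows. This is ordinary CPython code with illustrative names (`job`, `events`), not the test from the suite. On CPython with the default task factory the task is only scheduled by `create_task`, so "after-create" is recorded first; per the list above, Pyex Phase 1 records "ran" first because the task has already been driven.

```python
import asyncio

events = []


async def job():
    events.append("ran")


async def main():
    task = asyncio.create_task(job())  # CPython: scheduled, not yet run
    events.append("after-create")
    await task                         # the task body runs once main suspends


asyncio.run(main())
# CPython order: ['after-create', 'ran']
```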

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

README: add an "async / await with a synchronous trampoline" bullet
to the "What it runs" section.  Calls out the gather sequencing
trade-off and points to the conformance suite for divergences.
asyncio shows up in the stdlib grid.

TODO: new "Recently fixed (async/await — Phase 1)" section captures
the surface that landed and lists the four pinned divergences.
Test count updated.

CHANGELOG: new Unreleased entry with three sections (async/await,
islice, the function-kind refactor).  Each line is a behavior the
reader can verify in the test suite, not a feature claim.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ivarvong merged commit 41509e6 into main on May 10, 2026
6 checks passed