Add support for out-of-tree builtin modules #559
Initial scaffolding for an example demonstrating how to define and use an out-of-tree builtin VM module that wraps a C library (qrcodegen by Project Nayuki, MIT).

Layout:
- vendor/qrcodegen/: vendored upstream sources + Makefile mirroring spy/libspy/Makefile (native/native-static/wasi/emscripten targets).
- spyvm_qrcodegen/: importable Python package exposing a MODULE instance of ModuleRegistry. C-build metadata (include_dirs, libraries, ...) is sketched as TODO; ModuleRegistry.build_info is not implemented yet.
- demo/: main.spy + spy.toml manifest. spy.toml lists out-of-tree modules via extra-vm-modules = [...]; equivalent CLI is --extra-vm-module <path> (additive on top of spy.toml; --no-spy-toml disables the manifest).

The bindings and CLI/manifest loader are not implemented yet.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
I think this looks reasonable. I'm trying to decide if there are any drawbacks to putting the libraries and flags inside the Python registry for the module, rather than setting them externally (like on the spy command line). I'm wondering if setting them on the spy command line might be needed when the C library is distributed separately from the SPy module and its location varies depending on system setup. (Ex: Maybe I'm using the system installation of libxml, maybe I'm using the one in my conda environment?)
That's a good point, and I don't know what the best tradeoff is. My reasoning is that "which concrete C lib to link" is a property of the extension module, not of the final consumer. The CPython parallel is this:
The parallel doesn't fully apply to spyvm_modules because it's not clear whether we should treat them as source distributions or "compiled" artifacts. In the qrcodegen example, you must go there and type So probably, a fully (over?) engineered system would be:
However, I'm not sure that at this stage we need this level of complexity (especially keeping in mind that out-of-tree vm modules might not be the proper long-term solution). I think we can get 90% of the benefits and 1% of complexity by keeping the build info in the Happy to be convinced otherwise :)
Adds spy/libspy/flags.py, a CLI (python -m spy.libspy.flags) that prints compiler flags, CC, AR, and include paths for each supported target. This makes spy/libspy/flags.py the single source of truth for libspy-compatible build flags, so out-of-tree module Makefiles stay in sync automatically. Updates examples/out-of-tree/vendor/qrcodegen/Makefile to derive its flags from the new helper instead of duplicating them by hand. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dling

Adds spy/libspy/bundle.py with link_bundle(), which links multiple .a archives (wasm32-wasi-musl) into a single reactor .wasm via zig cc. All archives are wrapped in --whole-archive so export-only symbols survive.

Extends CTest in spy/tests/support.py with:
- c_compile_archive(): compile a C source string to a .a using libspy flags
- wasm_link_bundle(): thin wrapper around link_bundle() for tests

Adds test_bundle_multiple_archives to TestLLWasm, confirming that two independently-compiled archives share globals and are callable through a single LLWasmInstance.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
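For readers following along, here is a rough sketch of what the link step described above might look like as an argument list. The real link_bundle() is not shown in this thread, so the function name `zig_link_argv`, the `exports` parameter, and the exact flag selection are assumptions for illustration. The flags themselves (`-mexec-model=reactor`, `-Wl,--whole-archive`, `-Wl,--export=`) are standard clang/wasm-ld options that zig cc forwards.

```python
from pathlib import Path


def zig_link_argv(archives: list[Path], exports: list[str], out: Path) -> list[str]:
    """Hypothetical helper: build a `zig cc` command line for a reactor bundle.

    This only constructs the argv; it does not invoke the toolchain.
    """
    argv = [
        "zig", "cc",
        "-target", "wasm32-wasi-musl",
        "-mexec-model=reactor",  # reactor: no main(), the host calls exports
        "-o", str(out),
    ]
    # Wrap every archive in --whole-archive so that objects which are only
    # reachable through exports are not dropped by the linker.
    argv.append("-Wl,--whole-archive")
    argv += [str(a) for a in archives]
    argv.append("-Wl,--no-whole-archive")
    # Assumption: exported symbols are requested explicitly, one flag each.
    argv += [f"-Wl,--export={name}" for name in exports]
    return argv
```

Passing the result to `subprocess.run(argv, check=True)` would perform the link, assuming zig is on PATH.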
Adds spy/libspy/bundle_cache.py with get_or_build_bundle(), which looks up a pre-built bundle under build/wasm-bundles/<sha256>/ and builds it on cache miss. The cache key covers archive contents, sorted exports, and the zig toolchain version, so any change to inputs produces a new bundle. force_rebuild=True bypasses the lookup. Adds test_bundle_cache and test_bundle_cache_invalidation to TestLLWasm. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
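A minimal sketch of how such a cache key could be computed, assuming the three inputs named above (archive contents, sorted exports, zig version) are simply fed into one SHA-256. The function name `bundle_cache_key` and its signature are hypothetical; the actual bundle_cache.py may combine the inputs differently.

```python
import hashlib
from pathlib import Path


def bundle_cache_key(archives: list[Path], exports: list[str], zig_version: str) -> str:
    """Hypothetical helper: hash every input that affects the linked bundle."""
    h = hashlib.sha256()
    for archive in archives:
        data = archive.read_bytes()
        # Length-prefix each field so adjacent inputs cannot blur together.
        h.update(len(data).to_bytes(8, "little"))
        h.update(data)
    # Sort the exports so their order on the caller's side does not matter.
    for name in sorted(exports):
        encoded = name.encode("utf-8")
        h.update(len(encoded).to_bytes(8, "little"))
        h.update(encoded)
    h.update(zig_version.encode("utf-8"))
    return h.hexdigest()
```

The hex digest would then name the directory under build/wasm-bundles/. Note that this sketch hashes archives in the order given, on the assumption that link order can matter.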
Adds get_LLMOD() to spy/libspy/__init__.py: with no extras it returns the prebuilt LLMOD unchanged; with extra archives it links libspy.a + extras into a bundle via bundle_cache.get_or_build_bundle() and returns a new LLWasmModule for it. Adds all_export_names() to LLWasmModuleBase / LLWasmModule (wasmtime) to expose the list of symbol names exported by a compiled .wasm module. Updates CTest.c_compile_archive to pass -I<libspy-include> and the SPY_DEBUG/RELEASE flag, so out-of-tree archives can #include "spy.h". Adds test_get_LLMOD_with_extra_archive: a side archive calls spy_str_alloc from libspy, verifying that the bundle-with-libspy path works end-to-end. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ModuleRegistry gains a wasm_archive field. When an out-of-tree module sets it, SPyVM.__init__ collects the archive path before creating self.ll and passes it to get_LLMOD, which builds a single bundled .wasm containing libspy + all extra archives. Empty wasm_archive (the default) leaves the existing fast path untouched. _load_extra_vm_module is replaced by _import_extra_vm_module, which returns the registry rather than immediately registering it, so archives can be collected before ll is created. Module registration is then done in the same loop as before. Updates examples/out-of-tree/spyvm_qrcodegen/__init__.py to set wasm_archive pointing at the vendored qrcodegen wasi archive, removing the stale build_info sketch. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
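The ordering constraint described above (import registries, collect archives, create ll, then register) can be sketched roughly as follows. `RegistrySketch` and `start_vm` are hypothetical stand-ins, not the real ModuleRegistry or SPyVM.__init__, and using `None` for the "empty" default is an assumption.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class RegistrySketch:
    """Hypothetical stand-in for ModuleRegistry; only the fields used here."""
    modname: str
    wasm_archive: Optional[Path] = None  # None stands in for the "empty" default


def start_vm(
    registries: list[RegistrySketch],
    get_llmod: Callable[[list[Path]], object],
) -> tuple[object, list[str]]:
    # Phase 1: gather archives from the already-imported registries.
    archives = [r.wasm_archive for r in registries if r.wasm_archive is not None]
    # Phase 2: create the low-level module once, with every archive known.
    # With an empty list, get_llmod can return the prebuilt module unchanged.
    ll = get_llmod(archives)
    # Phase 3: only now register the modules against the VM.
    registered = [r.modname for r in registries]
    return ll, registered
```

The point of returning the registry from the import step, rather than registering immediately, is that phase 2 needs the complete archive list before any module is registered.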
Adds spy/cli/spy_toml.py: reads extra-vm-modules from spy.toml (relative paths resolved against the toml file's directory) and merges them with any --extra-vm-module CLI flags. Adds --extra-vm-module and --no-spy-toml options to Base_Args so every CLI command inherits them automatically. Updates init_vm() to read spy.toml from the source directory and pass the combined module list to SPyVM(). Switches from async_new() to the sync SPyVM() constructor since WASM bundling is synchronous and we don't run in a Pyodide context from the CLI. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When no extra modules are requested, init_vm() falls back to async_new() so that Pyodide/Node execution (where LLMOD is None and requires async WASM loading) continues to work. The sync SPyVM() path is only taken when extra_vm_modules is non-empty, since bundling requires a native zig toolchain that is not available in Pyodide. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire up the qrcodegen out-of-tree module end to end: C glue (qrcodegen_spy.c) bridging spy str/bytes to the vendored qrcodegen library, the Python bindings (encode/get_size/get_module), and a demo/main.spy that encodes a string and renders the QR code as ANSI color blocks.
Introduce CModuleBuildInfo in ModuleRegistry so out-of-tree modules can declare the C-build metadata (headers, include dirs, archives) the C backend needs:
- registry.py: new CModuleBuildInfo dataclass with archive_specs (build_dir/<target>/archive_name layout, mirroring libspy/build/), include_dirs, and headers. ModuleRegistry gains a build_info field.
- vm.py: collect build_infos in a dict keyed by modname as modules are registered.
- context.py: add_include_maybe emits #include for out-of-tree builtins that have a c_header declared in their build_info.
- cbackend.py: collect extra include dirs and target-resolved archives from vm.c_build_infos and pass them to NinjaWriter.
- ninja.py: NinjaWriter.write accepts extra_include_dirs and extra_archives, adding them to cflags/ldflags.

The mymod test fixture gains a C implementation (mymod.c/mymod.h + Makefile) and sets build_info so test_simple now passes on the C backend too.

The spyvm_qrcodegen example is updated to bundle glue + vendored library into a single archive (one archive per target, same layout as libspy) and exposes build_info so `spy build` works end-to-end.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
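To make the build_dir/<target>/archive_name layout concrete, here is a rough sketch of how per-module build info could be turned into compiler and linker flags. `BuildInfoSketch`, `collect_flags`, and `include_lines` are hypothetical names; the field names follow the commit message, but the real CModuleBuildInfo in registry.py may be shaped differently.

```python
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class BuildInfoSketch:
    """Hypothetical stand-in for CModuleBuildInfo."""
    build_dir: Path
    archive_name: str
    include_dirs: list[Path] = field(default_factory=list)
    headers: list[str] = field(default_factory=list)

    def archive_for(self, target: str) -> Path:
        # One archive per target, laid out as build_dir/<target>/archive_name.
        return self.build_dir / target / self.archive_name


def collect_flags(
    infos: dict[str, BuildInfoSketch], target: str
) -> tuple[list[str], list[str]]:
    """Gather extra cflags and ldflags from build infos keyed by modname."""
    cflags: list[str] = []
    ldflags: list[str] = []
    for modname in sorted(infos):  # sorted for a deterministic build file
        info = infos[modname]
        cflags += [f"-I{d}" for d in info.include_dirs]
        ldflags.append(str(info.archive_for(target)))
    return cflags, ldflags


def include_lines(info: BuildInfoSketch) -> list[str]:
    """The #include lines a generated C file would need for this module."""
    return [f'#include "{h}"' for h in info.headers]
```

Keeping the target out of the stored data and resolving it at build time is what lets a single declaration serve native, wasi, and emscripten builds.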
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@seibert it took much longer than expected, but now I have something which seems to work. To run