Cargo resolver-v2 host/target feature-world split#157
Conversation
Resolve crate features in two worlds — target and host (proc-macro + build-script dependency trees and their transitive deps) — matching cargo `resolver = "2"` semantics. Today features are unified globally per (crate, platform-triple), so a single feature-enabled node (e.g. a TLS backend pulled in by a target-only dependency) leaks into every proc-macro and build-script tree, inflating the host/exec build graph. Resolver: - per-world feature state with lazy host-world allocation - edge-kind world classification: build-dependency edges and edges into proc-macros cross into the host world; classification is sticky downward through the host subgraph - in-round worklist draining and per-world optional / `dep:` / `dep?/feature` handling - workspace members are world boundaries: member edges never cross worlds and member build scripts keep base instances - fixes a first-match `break` bug in subfeature forwarding and a `_MAX_ROUNDS` off-by-one Rendering: - crates whose worlds diverge get `<name>_host` / `_bs_host` siblings; generated dep edges select the world - per-triple world-merged base views - `@platforms//:incompatible` stubs for never-activated crates - `worlds_report.json` divergence report in the hub from_cargo attrs: - `resolver_feature_worlds = "unified" | "split"` (default "unified", byte-identical generated output) gates the new behavior - `proc_macro_packages` supplies the proc-macro package set Adds resolver and rendering unit tests for the split.
|
@stepango-xai Hi, thanks for taking a stab at this! This overunification has been a long-standing worry of mine and I'd be excited to get things split properly. That said, I feel like we are adding a lot of code here and I would love if we can get to a shorter implementation that is easier to verify/review. One thing that might help is to not retain the unified mode; it's strictly less correct :) Another thing to keep in mind, while the current unification inflates the build graph in terms of the union, it also decreases the build graph in terms of not duplicating crates (or at least it will once we land hermeticbuild/rules_rust#24 so we stop optimizing build scripts and path-mapping will be able to dedupe them in same-platform builds). I'm not saying it's a reason not to pursue this fix, but just mentioning for completeness :) Anyway, I'm happy to see some interest here and look forward to reviewing a cleaner submission once you've had a chance to deslop a bit! |
|
@dzbarsky thanks for the feedback, I was thinking about removing unified mode, but ultimately decided to keep both for backward compatibility since I was not sure if this is smth team will want to keep. I'll update the PR with simplified version. |
Make the resolver-v2 feature-world split the single, always-on algorithm: remove the legacy 'unified' resolver and the resolver_feature_worlds flag threaded through the resolver/rendering (the dual data paths, in-round-vs- plain iteration fork, and the preserved first-match break bug go with it). Drop the worlds_report.json hub diagnostic and revert to the inline _hub_repo(contents=...) form. Split behavior is unchanged; //rs/private unit tests: 50/50 pass (the only removed test is the unified-mode parity case).
With unified mode gone there is only one resolver, so the 'split mode' / 'split-mode' qualifiers no longer distinguish anything. Reword comments and debug strings, rename world_split_attrs -> host_world_attrs, and drop the redundant split_ prefix from the feature tests (kept split_lockfile_packages, which is unrelated). The feature name 'feature-world split' stays. No behavior change; //rs/private tests 50/50.
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 74cfd05bb9
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
Replace the offline proc_macro_packages JSON with automatic detection: the sparse index carries no proc-macro bit, so for each sparse+ crate download its .crate archive in parallel (sha256-verified), extract Cargo.toml, and read [lib] proc-macro. The bit is stamped into the crate's fact and persisted in the facts cache, so warm runs reuse it with no re-download. Facts written before this change lack is_proc_macro and are treated as stale (re-sniffed) to avoid silently misclassifying registry proc-macros on a warm cache. Shared dl-template parsing / .crate URL construction factored into registry_utils (parse_dl_template, crate_archive_url) and reused by the spoke crate_repository and registry_config_repository (no behavior change there). Removes the proc_macro_packages from_cargo attr and all its plumbing. Verification: //rs/private 50/50; new test/proc_macro_sniff integration test builds @proc_macro_sniff//:proc-macro2_host, which exists only because serde_derive was sniffed as a proc-macro and crossed into the host world.
|
That's pretty much the last change I was planning at this point, running validations internally. @dzbarsky Do let me know if you have any more ideas for improvements/simplifications or extra testing |
BTW can you share an example that was failing for you? I'm assuming you are cross-compiling; was the failure in proc macros or build scripts? I am guessing/hoping it was in build scripts, feels like proc macros would hopefully be simple enough that they don't need platform-specific features (in which case we can remove all the proc-macro complications added here). My concern is that, as you've discovered, there's not a good way to discover if a crate is a proc macro without downloading a lot of bytes. That would be a pretty major regression in the current performance. |
An unactivated crate renders only the @platforms//:incompatible stub, so rust_crate never creates its <binary>__bin targets. Emitting the <name>-<version>__<binary> hub aliases anyway left them dangling. Skip them for unactivated crates (gen_binaries on an unreached crate is a no-op). Addresses a review comment.
|
@dzbarsky thanks — answers, plus a proposal that I think lines up with where you're steering: The failing example. Cross-compiling (linux target; host/exec = the build platform). The expensive exec-config action was the
So it wasn't purely build scripts; the On the proc-macro complications / the download cost. You're right that detection is the expensive part — and it's only needed for proc-macros. Build-dependency edges are already known from the manifest/index, so a build-script-only world split needs no archive sniffing at all. That suggests splitting this into:
I'm happy to cut this PR down to (1) and drop the sniffing entirely — that removes most of the review surface you flagged. Does that match what you had in mind? If (2) is ever wanted, the facts cache keeps warm runs free, but I agree paying it on cold resolution isn't great. |
Host world captures build-dependency subtrees only; proc-macro edges no longer cross worlds. This removes the need to know which crates are proc-macros at resolution time, so all archive-sniffing goes away: the downloader / registry_utils / registry_config / crate_repository changes revert to upstream, and the is_proc_macro resolution plumbing + the proc_macro_sniff test workspace are removed. Kept: build-dep crossing, workspace-member world boundaries, the amphibious member-build-dep handling, and the full rendering layer (_host/_bs_host siblings, classify_worlds, render_world_views, unactivated stubs). The pre-existing rust_crate is_proc_macro *compile* attr (rule selection) is untouched. Net -480 lines vs the full split. //rs/private 45/45; @build_dep_features builds end-to-end; buildifier clean.
Reduce to build-script-only feature-world split (no proc-macro sniffing)
9566c82 removed the is_proc_macro resolution plumbing but did not regenerate the test lockfile, so its facts cache still carried the sniff-era residue: an is_proc_macro field on every registry crate fact (2527 entries), a spurious syn 2.0.118 entry (pinned by no test Cargo.lock), and 4 reordered dep lists. The build-script-only resolver never writes or reads is_proc_macro for sparse crates, and warm runs copy the dead field through verbatim, so it never self-healed. Regenerating from scratch (bazel mod deps with fresh facts) reproduces the pre-PR lockfile byte-for-byte (sha256 match), shrinking the PR's lockfile diff from 2528/2527 to zero.
|
@dzbarsky I think we should be good to go with build-script support. I'll run some benchmarks internally to see how much benefits we'll get with build-script only comparing to proc-macro and probably follow up another PR. |
|
@stepango-xai I was also curious - when you cross-compile, are you doing so exclusively or do you target host as well? I.e. I'm wondering if something like this PR (which would make build script optimization match the target optimziation, vs always forcing |
|
Not exclusively cross - we build for the host platform too e.g. aarch64-apple-darwin, aarch64-unknown-linux-gnu, x86_64-unknown-linux-gnu, so it's a mix: same-platform CI (e.g. x86_64-linux exec building x86_64-linux), plus genuinely cross configs - local dev on aarch64-darwin building the linux targets, and cross-arch (arm64 <> x86_64). I think hermeticbuild/rules_rust#24 + path-mapping and the world split would work together well:
|
|
Cool, for now I've cut bazelbuild/bazel-central-registry#9368 which contains hermeticbuild/rules_rust#24 - this should let you enable path mapping and get exec/target dedupe on same-platform builds, assuming you have a new enough Bazel with bazelbuild/bazel#29542 For the feature split - I want to take a bit of time to prototype some alternatives and see if I can come up with something else that is a bit more streamlined, thanks for your patience! |
|
Sounds great! No problem at all. Let me test out the 0.0.93. Let me know if I can help with anything. |
|
@stepango-xai still a bit sloppy but want to give #158 a spin? Note that neither that PR nor this one will help your sqlx issue since that's a proc macro, but should help with build scripts. For proc macros I think we need to lobby Cargo folks to make it easier to statically determine which crates are proc macros. Luckily we have some on payroll now, lemme poke them when I'm back from pat leave :) |
|
@dzbarsky thanks a lot! I'll test it out once I can make monorepo to built with 0.0.93, non-hermetic build scripts is all over the place. Some metadata that will help detecting proc macros easy would be 🔥 And congrats with new family member 🎉 |
|
You can allowlist existing violations to make it easier. But yeah there's a lot of naughty crates out there |
rules_rsresolves crate features globally per(crate, platform-triple), so a feature enabled only for a runtime/target dependency (a TLS backend, an allocator, …) leaks into every build-script subtree, dragging runtime-only work — C builds, heavy codegen — into the exec configuration that nothing in the build world actually needs. This PR resolves features in two worlds — target (normal-dep state) and host (build-dependency subtrees + their transitive deps) — matching cargoresolver = "2"semantics, so host-world (exec-config) compiles stop carrying runtime-only features.Why
A workspace links
rustlswith theaws-lc-rsbackend through a normal (target) dependency. Today that single unifiedrustlsnode carriesaws-lc-rsinto every build-script tree too, forcing theaws-lc-sysC build into the exec configuration even though no build script needs it. Cargo's resolver v2 resolves the build-dependency graph separately (here,ring-only); this PR brings that behavior torules_rs, dropping such builds out of the exec cone.Scope note: the split is build-script-only. Proc-macro edges do not cross worlds, so there is no archive sniffing and no network fetch at resolution time — an earlier revision's proc-macro detection was removed. The pre-existing
rust_crateis_proc_macrocompile attr (rule selection) is untouched.Changes
rs/private/resolver.bzl(+300) — per-world feature state. Target world is eager; the host world is lazily materialized only for crates it actually reaches. Work items arepackage_index * 2 + world. Edge classification (cargo resolver v2): build-dependency edges cross into the host world, other edges stay in the originating world, and host-ness is sticky downward through the host subgraph. Workspace members are world boundaries (their gazelle-owned single targets must link one instance per crate). "Amphibious" handling for member[build-dependencies]: the edge stays target-world for activation/labels, but its features are also seeded into the host instance so a crate shared by a member's build script and a build-dep tree agrees across worlds. Also fixes a first-matchbreakbug in subfeature forwarding and a_MAX_ROUNDSoff-by-one.cargo_workspace_graph.bzl+339,rust_crate.bzl+93) —classify_worldstags every crateunactivated/target_only/host_only/identical/divergent/label_divergent. Crates whose worlds diverge (or whose host-world dep labels diverge) get a sibling:<name>_host(and_bs_hostfor the build-script variant); generated dep edges select the world. Divergence propagates along host-world normal-dep edges (iterative Kahn, deps before dependents). Crates no world activates render a loud@platforms//:incompatiblestub, andgen_binarieshub aliases are skipped for them.rs/extensions.bzl(+60) /rs/private/repository_utils.bzl(+51) — wire per-world resolution into the module extension and thread the host-world attrs through spoke generation; emit a debug class-count report.rs/private/cargo_workspace_graph_test.bzl(+957) — unit coverage for edge classification, world boundaries, amphibious member build-deps, divergence propagation, and the rendering classes.This is the single resolver behavior (no opt-in flag) — a deliberate question for reviewers, see Open questions.
Validation
bazel test //rs/private:all→ 45 / 45 passbuildifier --mode=check.bzltest/build_dep_featuresdepends onswc_ecma_preset_env = "34.0.0", whose graph diverges between the runtime and build-script worlds.The renderer emits 8
_hostworld-split siblings for the crates that diverge (build-dep crates likecccompile[for tool]in the exec config):test/MODULE.bazel.lockregenerates byte-identical to pre-PR (sha256 match) — net zero lockfile diff.Open questions (feedback wanted)
worlds_report.jsondivergence report in the hub; dropped here to keep the surface minimal. Re-add behind a debug flag if useful?Branch:
stepango-xai:feature-worlds-poc· net +1,746 / −54 across 6 Starlark files · validated end-to-end against real crates.io crates.