Cargo resolver-v2 host/target feature-world split by stepango-xai · Pull Request #157 · hermeticbuild/rules_rs

stepango-xai · 2026-06-17T17:18:59Z

rules_rs resolves crate features globally per (crate, platform-triple), so a feature enabled only for a runtime/target dependency (a TLS backend, an allocator, …) leaks into every build-script subtree, dragging runtime-only work — C builds, heavy codegen — into the exec configuration that nothing in the build world actually needs. This PR resolves features in two worlds — target (normal-dep state) and host (build-dependency subtrees + their transitive deps) — matching cargo resolver = "2" semantics, so host-world (exec-config) compiles stop carrying runtime-only features.

Why

A workspace links rustls with the aws-lc-rs backend through a normal (target) dependency. Today that single unified rustls node carries aws-lc-rs into every build-script tree too, forcing the aws-lc-sys C build into the exec configuration even though no build script needs it. Cargo's resolver v2 resolves the build-dependency graph separately (here, ring-only); this PR brings that behavior to rules_rs, dropping such builds out of the exec cone.

_{Scope note: the split is build-script-only. Proc-macro edges do not cross worlds, so there is no archive sniffing and no network fetch at resolution time — an earlier revision's proc-macro detection was removed. The pre-existing rust_crate is_proc_macro compile attr (rule selection) is untouched.}

Changes

rs/private/resolver.bzl (+300) — per-world feature state. Target world is eager; the host world is lazily materialized only for crates it actually reaches. Work items are package_index * 2 + world. Edge classification (cargo resolver v2): build-dependency edges cross into the host world, other edges stay in the originating world, and host-ness is sticky downward through the host subgraph. Workspace members are world boundaries (their gazelle-owned single targets must link one instance per crate). "Amphibious" handling for member [build-dependencies]: the edge stays target-world for activation/labels, but its features are also seeded into the host instance so a crate shared by a member's build script and a build-dep tree agrees across worlds. Also fixes a first-match break bug in subfeature forwarding and a _MAX_ROUNDS off-by-one.
Rendering (cargo_workspace_graph.bzl +339, rust_crate.bzl +93) — classify_worlds tags every crate unactivated / target_only / host_only / identical / divergent / label_divergent. Crates whose worlds diverge (or whose host-world dep labels diverge) get a sibling :<name>_host (and _bs_host for the build-script variant); generated dep edges select the world. Divergence propagates along host-world normal-dep edges (iterative Kahn, deps before dependents). Crates no world activates render a loud @platforms//:incompatible stub, and gen_binaries hub aliases are skipped for them.
rs/extensions.bzl (+60) / rs/private/repository_utils.bzl (+51) — wire per-world resolution into the module extension and thread the host-world attrs through spoke generation; emit a debug class-count report.
rs/private/cargo_workspace_graph_test.bzl (+957) — unit coverage for edge classification, world boundaries, amphibious member build-deps, divergence propagation, and the rendering classes.

_{This is the single resolver behavior (no opt-in flag) — a deliberate question for reviewers, see Open questions.}

Validation

Check	Result
Resolver + rendering unit tests	`bazel test //rs/private:all` → 45 / 45 pass
`buildifier --mode=check`	clean on all 6 changed `.bzl`
End-to-end on a real build-dep-heavy tree	`test/build_dep_features` depends on `swc_ecma_preset_env = "34.0.0"`, whose graph diverges between the runtime and build-script worlds. `$ bazel build @build_dep_features//:swc_ecma_preset_env # 344 packages / 31,109 targets configured # Build completed successfully, 1679 total actions → libswc_ecma_preset_env.rlib` The renderer emits 8 `_host` world-split siblings for the crates that diverge (build-dep crates like `cc` compile `[for tool]` in the exec config): `serde_host serde_core_host serde_json_host serde_derive_host syn_host quote_host proc-macro2_host precomputed-map_host`
Lockfile churn	`test/MODULE.bazel.lock` regenerates byte-identical to pre-PR (sha256 match) — net zero lockfile diff.

Open questions (feedback wanted)

Topic	Question
Single algorithm vs flag	This replaces the resolver outright rather than gating behind an opt-in attr. Prefer a transition flag, or commit to resolver-v2 semantics as the default?
Proc-macro worlds	Deliberately out of scope: proc-macro edges don't cross worlds, so no proc-macro detection is needed. Splitting proc-macro worlds too would require a detection mechanism (previously prototyped via archive sniffing, since dropped). Worth pursuing later, or is build-script scope the right stopping point?
Diagnostics	An earlier revision emitted a `worlds_report.json` divergence report in the hub; dropped here to keep the surface minimal. Re-add behind a debug flag if useful?

_{Branch: stepango-xai:feature-worlds-poc · net +1,746 / −54 across 6 Starlark files · validated end-to-end against real crates.io crates.}

Resolve crate features in two worlds — target and host (proc-macro + build-script dependency trees and their transitive deps) — matching cargo `resolver = "2"` semantics. Today features are unified globally per (crate, platform-triple), so a single feature-enabled node (e.g. a TLS backend pulled in by a target-only dependency) leaks into every proc-macro and build-script tree, inflating the host/exec build graph. Resolver: - per-world feature state with lazy host-world allocation - edge-kind world classification: build-dependency edges and edges into proc-macros cross into the host world; classification is sticky downward through the host subgraph - in-round worklist draining and per-world optional / `dep:` / `dep?/feature` handling - workspace members are world boundaries: member edges never cross worlds and member build scripts keep base instances - fixes a first-match `break` bug in subfeature forwarding and a `_MAX_ROUNDS` off-by-one Rendering: - crates whose worlds diverge get `<name>_host` / `_bs_host` siblings; generated dep edges select the world - per-triple world-merged base views - `@platforms//:incompatible` stubs for never-activated crates - `worlds_report.json` divergence report in the hub from_cargo attrs: - `resolver_feature_worlds = "unified" | "split"` (default "unified", byte-identical generated output) gates the new behavior - `proc_macro_packages` supplies the proc-macro package set Adds resolver and rendering unit tests for the split.

dzbarsky · 2026-06-17T20:07:08Z

@stepango-xai Hi, thanks for taking a stab at this! This overunification has been a long-standing worry of mine and I'd be excited to get things split properly.

That said, I feel like we are adding a lot of code here and I would love if we can get to a shorter implementation that is easier to verify/review. One thing that might help is to not retain the unified mode; it's strictly less correct :)

Another thing to keep in mind, while the current unification inflates the build graph in terms of the union, it also decreases the build graph in terms of not duplicating crates (or at least it will once we land hermeticbuild/rules_rust#24 so we stop optimizing build scripts and path-mapping will be able to dedupe them in same-platform builds). I'm not saying it's a reason not to pursue this fix, but just mentioning for completeness :)

Anyway, I'm happy to see some interest here and look forward to reviewing a cleaner submission once you've had a chance to deslop a bit!

stepango-xai · 2026-06-17T20:22:36Z

@dzbarsky thanks for the feedback, I was thinking about removing unified mode, but ultimately decided to keep both for backward compatibility since I was not sure if this is smth team will want to keep. I'll update the PR with simplified version.

Make the resolver-v2 feature-world split the single, always-on algorithm: remove the legacy 'unified' resolver and the resolver_feature_worlds flag threaded through the resolver/rendering (the dual data paths, in-round-vs- plain iteration fork, and the preserved first-match break bug go with it). Drop the worlds_report.json hub diagnostic and revert to the inline _hub_repo(contents=...) form. Split behavior is unchanged; //rs/private unit tests: 50/50 pass (the only removed test is the unified-mode parity case).

With unified mode gone there is only one resolver, so the 'split mode' / 'split-mode' qualifiers no longer distinguish anything. Reword comments and debug strings, rename world_split_attrs -> host_world_attrs, and drop the redundant split_ prefix from the feature tests (kept split_lockfile_packages, which is unrelated). The feature name 'feature-world split' stays. No behavior change; //rs/private tests 50/50.

dzbarsky · 2026-06-18T02:50:34Z

@codex review

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74cfd05bb9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Replace the offline proc_macro_packages JSON with automatic detection: the sparse index carries no proc-macro bit, so for each sparse+ crate download its .crate archive in parallel (sha256-verified), extract Cargo.toml, and read [lib] proc-macro. The bit is stamped into the crate's fact and persisted in the facts cache, so warm runs reuse it with no re-download. Facts written before this change lack is_proc_macro and are treated as stale (re-sniffed) to avoid silently misclassifying registry proc-macros on a warm cache. Shared dl-template parsing / .crate URL construction factored into registry_utils (parse_dl_template, crate_archive_url) and reused by the spoke crate_repository and registry_config_repository (no behavior change there). Removes the proc_macro_packages from_cargo attr and all its plumbing. Verification: //rs/private 50/50; new test/proc_macro_sniff integration test builds @proc_macro_sniff//:proc-macro2_host, which exists only because serde_derive was sniffed as a proc-macro and crossed into the host world.

stepango-xai · 2026-06-18T16:42:13Z

That's pretty much the last change I was planning at this point, running validations internally. @dzbarsky Do let me know if you have any more ideas for improvements/simplifications or extra testing

dzbarsky · 2026-06-18T16:45:24Z

That's pretty much the last change I was planning at this point, running validations internally. @dzbarsky Do let me know if you have any more ideas for improvements/simplifications or extra testing

BTW can you share an example that was failing for you? I'm assuming you are cross-compiling; was the failure in proc macros or build scripts? I am guessing/hoping it was in build scripts, feels like proc macros would hopefully be simple enough that they don't need platform-specific features (in which case we can remove all the proc-macro complications added here).

My concern is that, as you've discovered, there's not a good way to discover if a crate is a proc macro without downloading a lot of bytes. That would be a pretty major regression in the current performance.

An unactivated crate renders only the @platforms//:incompatible stub, so rust_crate never creates its <binary>__bin targets. Emitting the <name>-<version>__<binary> hub aliases anyway left them dangling. Skip them for unactivated crates (gen_binaries on an unreached crate is a no-op). Addresses a review comment.

stepango-xai · 2026-06-18T16:51:19Z

@dzbarsky thanks — answers, plus a proposal that I think lines up with where you're steering:

The failing example. Cross-compiling (linux target; host/exec = the build platform). The expensive exec-config action was the aws-lc-sys C build. It got dragged into the host world through both kinds of edge:

build scripts — crates whose build.rs links tonic with a rustls/aws-lc-rs TLS feature, and
a proc-macro — sqlx-macros → sqlx-core → rustls/aws-lc-rs.

So it wasn't purely build scripts; the sqlx-macros chain was a real contributor (the headline was aws-lc-sys leaving the exec config entirely, which needed both cut).

On the proc-macro complications / the download cost. You're right that detection is the expensive part — and it's only needed for proc-macros. Build-dependency edges are already known from the manifest/index, so a build-script-only world split needs no archive sniffing at all. That suggests splitting this into:

build-script-only feature-world split — no downloader.bzl/registry_utils.bzl changes, no .crate downloads, much less code; captures the -sys/build-script bloat (the majority of it).
proc-macro feature-splitting as a separate, optional follow-up, only if the residual proc-macro-world bloat justifies the detection cost.

I'm happy to cut this PR down to (1) and drop the sniffing entirely — that removes most of the review surface you flagged. Does that match what you had in mind? If (2) is ever wanted, the facts cache keeps warm runs free, but I agree paying it on cold resolution isn't great.

Host world captures build-dependency subtrees only; proc-macro edges no longer cross worlds. This removes the need to know which crates are proc-macros at resolution time, so all archive-sniffing goes away: the downloader / registry_utils / registry_config / crate_repository changes revert to upstream, and the is_proc_macro resolution plumbing + the proc_macro_sniff test workspace are removed. Kept: build-dep crossing, workspace-member world boundaries, the amphibious member-build-dep handling, and the full rendering layer (_host/_bs_host siblings, classify_worlds, render_world_views, unactivated stubs). The pre-existing rust_crate is_proc_macro *compile* attr (rule selection) is untouched. Net -480 lines vs the full split. //rs/private 45/45; @build_dep_features builds end-to-end; buildifier clean.

stepango-xai · 2026-06-18T18:20:44Z

e.g. https://github.com/stepango-xai/rules_rs/compare/feature-worlds-poc...stepango-xai:rules_rs:feature-worlds-build-deps-only?expand=1

Reduce to build-script-only feature-world split (no proc-macro sniffing)

9566c82 removed the is_proc_macro resolution plumbing but did not regenerate the test lockfile, so its facts cache still carried the sniff-era residue: an is_proc_macro field on every registry crate fact (2527 entries), a spurious syn 2.0.118 entry (pinned by no test Cargo.lock), and 4 reordered dep lists. The build-script-only resolver never writes or reads is_proc_macro for sparse crates, and warm runs copy the dead field through verbatim, so it never self-healed. Regenerating from scratch (bazel mod deps with fresh facts) reproduces the pre-PR lockfile byte-for-byte (sha256 match), shrinking the PR's lockfile diff from 2528/2527 to zero.

stepango-xai · 2026-06-18T23:22:25Z

@dzbarsky I think we should be good to go with build-script support. I'll run some benchmarks internally to see how much benefits we'll get with build-script only comparing to proc-macro and probably follow up another PR.

dzbarsky · 2026-06-19T19:12:48Z

@stepango-xai I was also curious - when you cross-compile, are you doing so exclusively or do you target host as well? I.e. I'm wondering if something like this PR (which would make build script optimization match the target optimziation, vs always forcing -c opt like exec does) and enabling path-mapping (which can dedupe target/exec configurations if the flags are all the same) would help? (By collapsing the aws-lc-sys to a single action that is shared for target and exec, when those platforms are the same)

stepango-xai · 2026-06-19T20:17:43Z

Not exclusively cross - we build for the host platform too e.g. aarch64-apple-darwin, aarch64-unknown-linux-gnu, x86_64-unknown-linux-gnu, so it's a mix: same-platform CI (e.g. x86_64-linux exec building x86_64-linux), plus genuinely cross configs - local dev on aarch64-darwin building the linux targets, and cross-arch (arm64 <> x86_64).

I think hermeticbuild/rules_rust#24 + path-mapping and the world split would work together well:

Config duplication (same platform, exec vs target differ only by flags): Use target compilation mode for Cargo build scripts rules_rust#24 + path-mapping should collapse aws-lc-sys into one shared action and recover most of the win there without any resolver change. Agreed.
Feature over-unification (the host/exec world carrying aws-lc-rs at all): for the cross configs (exec ≠ target - different triple), there's nothing for path-mapping to dedupe, so the exec-config aws-lc-sys only disappears with the split. That's its irreducible value, and for us the darwin-exec -> linux-target path is a big chunk of local-dev builds.

dzbarsky · 2026-06-22T18:01:55Z

Cool, for now I've cut bazelbuild/bazel-central-registry#9368 which contains hermeticbuild/rules_rust#24 - this should let you enable path mapping and get exec/target dedupe on same-platform builds, assuming you have a new enough Bazel with bazelbuild/bazel#29542

For the feature split - I want to take a bit of time to prototype some alternatives and see if I can come up with something else that is a bit more streamlined, thanks for your patience!

stepango-xai · 2026-06-22T18:10:30Z

Sounds great! No problem at all. Let me test out the 0.0.93. Let me know if I can help with anything.

dzbarsky · 2026-06-22T22:49:27Z

@stepango-xai still a bit sloppy but want to give #158 a spin? Note that neither that PR nor this one will help your sqlx issue since that's a proc macro, but should help with build scripts.

For proc macros I think we need to lobby Cargo folks to make it easier to statically determine which crates are proc macros. Luckily we have some on payroll now, lemme poke them when I'm back from pat leave :)

stepango-xai · 2026-06-22T23:39:19Z

@dzbarsky thanks a lot! I'll test it out once I can make monorepo to built with 0.0.93, non-hermetic build scripts is all over the place. Some metadata that will help detecting proc macros easy would be 🔥 And congrats with new family member 🎉

dzbarsky · 2026-06-22T23:40:34Z

You can allowlist existing violations to make it easier. But yeah there's a lot of naughty crates out there

stepango-xai changed the title ~~Cargo resolver-v2 host/target feature-world split~~ [WIP] Cargo resolver-v2 host/target feature-world split Jun 17, 2026

dzbarsky mentioned this pull request Jun 17, 2026

Is it a good idea to not depend on cargo? #150

Open

stepango-xai added 2 commits June 17, 2026 17:37

chatgpt-codex-connector Bot reviewed Jun 18, 2026

View reviewed changes

Comment thread rs/extensions.bzl Outdated

Comment thread rs/rust_crate.bzl

stepango-xai and others added 2 commits June 18, 2026 14:02

Merge pull request #1 from stepango-xai/feature-worlds-build-deps-only

68d225d

Reduce to build-script-only feature-world split (no proc-macro sniffing)

stepango-xai changed the title ~~[WIP] Cargo resolver-v2 host/target feature-world split~~ Cargo resolver-v2 host/target feature-world split Jun 19, 2026

Uh oh!

Conversation

stepango-xai commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Changes

Validation

Open questions (feedback wanted)

Uh oh!

dzbarsky commented Jun 17, 2026

Uh oh!

stepango-xai commented Jun 17, 2026

Uh oh!

dzbarsky commented Jun 18, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

stepango-xai commented Jun 18, 2026

Uh oh!

dzbarsky commented Jun 18, 2026

Uh oh!

stepango-xai commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

stepango-xai commented Jun 18, 2026

Uh oh!

stepango-xai commented Jun 18, 2026

Uh oh!

dzbarsky commented Jun 19, 2026

Uh oh!

stepango-xai commented Jun 19, 2026

Uh oh!

dzbarsky commented Jun 22, 2026

Uh oh!

stepango-xai commented Jun 22, 2026

Uh oh!

dzbarsky commented Jun 22, 2026

Uh oh!

stepango-xai commented Jun 22, 2026

Uh oh!

dzbarsky commented Jun 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

stepango-xai commented Jun 17, 2026 •

edited

Loading

stepango-xai commented Jun 18, 2026 •

edited

Loading