Skip to content

Cargo resolver-v2 host/target feature-world split#157

Open
stepango-xai wants to merge 8 commits into
hermeticbuild:mainfrom
stepango-xai:feature-worlds-poc
Open

Cargo resolver-v2 host/target feature-world split#157
stepango-xai wants to merge 8 commits into
hermeticbuild:mainfrom
stepango-xai:feature-worlds-poc

Conversation

@stepango-xai

@stepango-xai stepango-xai commented Jun 17, 2026

Copy link
Copy Markdown

rules_rs resolves crate features globally per (crate, platform-triple), so a feature enabled only for a runtime/target dependency (a TLS backend, an allocator, …) leaks into every build-script subtree, dragging runtime-only work — C builds, heavy codegen — into the exec configuration that nothing in the build world actually needs. This PR resolves features in two worldstarget (normal-dep state) and host (build-dependency subtrees + their transitive deps) — matching cargo resolver = "2" semantics, so host-world (exec-config) compiles stop carrying runtime-only features.

Why

A workspace links rustls with the aws-lc-rs backend through a normal (target) dependency. Today that single unified rustls node carries aws-lc-rs into every build-script tree too, forcing the aws-lc-sys C build into the exec configuration even though no build script needs it. Cargo's resolver v2 resolves the build-dependency graph separately (here, ring-only); this PR brings that behavior to rules_rs, dropping such builds out of the exec cone.

Scope note: the split is build-script-only. Proc-macro edges do not cross worlds, so there is no archive sniffing and no network fetch at resolution time — an earlier revision's proc-macro detection was removed. The pre-existing rust_crate is_proc_macro compile attr (rule selection) is untouched.

Changes

  • rs/private/resolver.bzl (+300) — per-world feature state. Target world is eager; the host world is lazily materialized only for crates it actually reaches. Work items are package_index * 2 + world. Edge classification (cargo resolver v2): build-dependency edges cross into the host world, other edges stay in the originating world, and host-ness is sticky downward through the host subgraph. Workspace members are world boundaries (their gazelle-owned single targets must link one instance per crate). "Amphibious" handling for member [build-dependencies]: the edge stays target-world for activation/labels, but its features are also seeded into the host instance so a crate shared by a member's build script and a build-dep tree agrees across worlds. Also fixes a first-match break bug in subfeature forwarding and a _MAX_ROUNDS off-by-one.
  • Rendering (cargo_workspace_graph.bzl +339, rust_crate.bzl +93) — classify_worlds tags every crate unactivated / target_only / host_only / identical / divergent / label_divergent. Crates whose worlds diverge (or whose host-world dep labels diverge) get a sibling :<name>_host (and _bs_host for the build-script variant); generated dep edges select the world. Divergence propagates along host-world normal-dep edges (iterative Kahn, deps before dependents). Crates no world activates render a loud @platforms//:incompatible stub, and gen_binaries hub aliases are skipped for them.
  • rs/extensions.bzl (+60) / rs/private/repository_utils.bzl (+51) — wire per-world resolution into the module extension and thread the host-world attrs through spoke generation; emit a debug class-count report.
  • rs/private/cargo_workspace_graph_test.bzl (+957) — unit coverage for edge classification, world boundaries, amphibious member build-deps, divergence propagation, and the rendering classes.

This is the single resolver behavior (no opt-in flag) — a deliberate question for reviewers, see Open questions.

Validation

CheckResult
Resolver + rendering unit testsbazel test //rs/private:all45 / 45 pass
buildifier --mode=checkclean on all 6 changed .bzl
End-to-end on a real build-dep-heavy tree

test/build_dep_features depends on swc_ecma_preset_env = "34.0.0", whose graph diverges between the runtime and build-script worlds.

$ bazel build @build_dep_features//:swc_ecma_preset_env
# 344 packages / 31,109 targets configured
# Build completed successfully, 1679 total actions → libswc_ecma_preset_env.rlib

The renderer emits 8 _host world-split siblings for the crates that diverge (build-dep crates like cc compile [for tool] in the exec config):

serde_host  serde_core_host  serde_json_host  serde_derive_host
syn_host  quote_host  proc-macro2_host  precomputed-map_host
Lockfile churntest/MODULE.bazel.lock regenerates byte-identical to pre-PR (sha256 match) — net zero lockfile diff.

Open questions (feedback wanted)

TopicQuestion
Single algorithm vs flagThis replaces the resolver outright rather than gating behind an opt-in attr. Prefer a transition flag, or commit to resolver-v2 semantics as the default?
Proc-macro worldsDeliberately out of scope: proc-macro edges don't cross worlds, so no proc-macro detection is needed. Splitting proc-macro worlds too would require a detection mechanism (previously prototyped via archive sniffing, since dropped). Worth pursuing later, or is build-script scope the right stopping point?
DiagnosticsAn earlier revision emitted a worlds_report.json divergence report in the hub; dropped here to keep the surface minimal. Re-add behind a debug flag if useful?

Branch: stepango-xai:feature-worlds-poc · net +1,746 / −54 across 6 Starlark files · validated end-to-end against real crates.io crates.

Resolve crate features in two worlds — target and host (proc-macro +
build-script dependency trees and their transitive deps) — matching
cargo `resolver = "2"` semantics. Today features are unified globally per
(crate, platform-triple), so a single feature-enabled node (e.g. a TLS
backend pulled in by a target-only dependency) leaks into every
proc-macro and build-script tree, inflating the host/exec build graph.

Resolver:
- per-world feature state with lazy host-world allocation
- edge-kind world classification: build-dependency edges and edges into
  proc-macros cross into the host world; classification is sticky
  downward through the host subgraph
- in-round worklist draining and per-world optional / `dep:` /
  `dep?/feature` handling
- workspace members are world boundaries: member edges never cross
  worlds and member build scripts keep base instances
- fixes a first-match `break` bug in subfeature forwarding and a
  `_MAX_ROUNDS` off-by-one

Rendering:
- crates whose worlds diverge get `<name>_host` / `_bs_host` siblings;
  generated dep edges select the world
- per-triple world-merged base views
- `@platforms//:incompatible` stubs for never-activated crates
- `worlds_report.json` divergence report in the hub

from_cargo attrs:
- `resolver_feature_worlds = "unified" | "split"` (default "unified",
  byte-identical generated output) gates the new behavior
- `proc_macro_packages` supplies the proc-macro package set

Adds resolver and rendering unit tests for the split.
@stepango-xai stepango-xai changed the title Cargo resolver-v2 host/target feature-world split [WIP] Cargo resolver-v2 host/target feature-world split Jun 17, 2026
@dzbarsky

Copy link
Copy Markdown
Member

@stepango-xai Hi, thanks for taking a stab at this! This overunification has been a long-standing worry of mine and I'd be excited to get things split properly.

That said, I feel like we are adding a lot of code here and I would love if we can get to a shorter implementation that is easier to verify/review. One thing that might help is to not retain the unified mode; it's strictly less correct :)

Another thing to keep in mind, while the current unification inflates the build graph in terms of the union, it also decreases the build graph in terms of not duplicating crates (or at least it will once we land hermeticbuild/rules_rust#24 so we stop optimizing build scripts and path-mapping will be able to dedupe them in same-platform builds). I'm not saying it's a reason not to pursue this fix, but just mentioning for completeness :)

Anyway, I'm happy to see some interest here and look forward to reviewing a cleaner submission once you've had a chance to deslop a bit!

@stepango-xai

Copy link
Copy Markdown
Author

@dzbarsky thanks for the feedback, I was thinking about removing unified mode, but ultimately decided to keep both for backward compatibility since I was not sure if this is smth team will want to keep. I'll update the PR with simplified version.

Make the resolver-v2 feature-world split the single, always-on algorithm:
remove the legacy 'unified' resolver and the resolver_feature_worlds flag
threaded through the resolver/rendering (the dual data paths, in-round-vs-
plain iteration fork, and the preserved first-match break bug go with it).
Drop the worlds_report.json hub diagnostic and revert to the inline
_hub_repo(contents=...) form.

Split behavior is unchanged; //rs/private unit tests: 50/50 pass (the only
removed test is the unified-mode parity case).
With unified mode gone there is only one resolver, so the 'split mode' /
'split-mode' qualifiers no longer distinguish anything. Reword comments and
debug strings, rename world_split_attrs -> host_world_attrs, and drop the
redundant split_ prefix from the feature tests (kept split_lockfile_packages,
which is unrelated). The feature name 'feature-world split' stays. No behavior
change; //rs/private tests 50/50.
@dzbarsky

Copy link
Copy Markdown
Member

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 74cfd05bb9

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread rs/extensions.bzl Outdated
Comment thread rs/rust_crate.bzl
Replace the offline proc_macro_packages JSON with automatic detection: the
sparse index carries no proc-macro bit, so for each sparse+ crate download
its .crate archive in parallel (sha256-verified), extract Cargo.toml, and
read [lib] proc-macro. The bit is stamped into the crate's fact and persisted
in the facts cache, so warm runs reuse it with no re-download. Facts written
before this change lack is_proc_macro and are treated as stale (re-sniffed) to
avoid silently misclassifying registry proc-macros on a warm cache.

Shared dl-template parsing / .crate URL construction factored into
registry_utils (parse_dl_template, crate_archive_url) and reused by the spoke
crate_repository and registry_config_repository (no behavior change there).

Removes the proc_macro_packages from_cargo attr and all its plumbing.

Verification: //rs/private 50/50; new test/proc_macro_sniff integration test
builds @proc_macro_sniff//:proc-macro2_host, which exists only because
serde_derive was sniffed as a proc-macro and crossed into the host world.
@stepango-xai

Copy link
Copy Markdown
Author

That's pretty much the last change I was planning at this point, running validations internally. @dzbarsky Do let me know if you have any more ideas for improvements/simplifications or extra testing

@dzbarsky

Copy link
Copy Markdown
Member

That's pretty much the last change I was planning at this point, running validations internally. @dzbarsky Do let me know if you have any more ideas for improvements/simplifications or extra testing

BTW can you share an example that was failing for you? I'm assuming you are cross-compiling; was the failure in proc macros or build scripts? I am guessing/hoping it was in build scripts, feels like proc macros would hopefully be simple enough that they don't need platform-specific features (in which case we can remove all the proc-macro complications added here).

My concern is that, as you've discovered, there's not a good way to discover if a crate is a proc macro without downloading a lot of bytes. That would be a pretty major regression in the current performance.

An unactivated crate renders only the @platforms//:incompatible stub, so
rust_crate never creates its <binary>__bin targets. Emitting the
<name>-<version>__<binary> hub aliases anyway left them dangling. Skip them
for unactivated crates (gen_binaries on an unreached crate is a no-op).

Addresses a review comment.
@stepango-xai

stepango-xai commented Jun 18, 2026

Copy link
Copy Markdown
Author

@dzbarsky thanks — answers, plus a proposal that I think lines up with where you're steering:

The failing example. Cross-compiling (linux target; host/exec = the build platform). The expensive exec-config action was the aws-lc-sys C build. It got dragged into the host world through both kinds of edge:

  • build scripts — crates whose build.rs links tonic with a rustls/aws-lc-rs TLS feature, and
  • a proc-macro — sqlx-macrossqlx-corerustls/aws-lc-rs.

So it wasn't purely build scripts; the sqlx-macros chain was a real contributor (the headline was aws-lc-sys leaving the exec config entirely, which needed both cut).

On the proc-macro complications / the download cost. You're right that detection is the expensive part — and it's only needed for proc-macros. Build-dependency edges are already known from the manifest/index, so a build-script-only world split needs no archive sniffing at all. That suggests splitting this into:

  1. build-script-only feature-world split — no downloader.bzl/registry_utils.bzl changes, no .crate downloads, much less code; captures the -sys/build-script bloat (the majority of it).
  2. proc-macro feature-splitting as a separate, optional follow-up, only if the residual proc-macro-world bloat justifies the detection cost.

I'm happy to cut this PR down to (1) and drop the sniffing entirely — that removes most of the review surface you flagged. Does that match what you had in mind? If (2) is ever wanted, the facts cache keeps warm runs free, but I agree paying it on cold resolution isn't great.

Host world captures build-dependency subtrees only; proc-macro edges no
longer cross worlds. This removes the need to know which crates are
proc-macros at resolution time, so all archive-sniffing goes away: the
downloader / registry_utils / registry_config / crate_repository changes
revert to upstream, and the is_proc_macro resolution plumbing + the
proc_macro_sniff test workspace are removed.

Kept: build-dep crossing, workspace-member world boundaries, the amphibious
member-build-dep handling, and the full rendering layer (_host/_bs_host
siblings, classify_worlds, render_world_views, unactivated stubs). The
pre-existing rust_crate is_proc_macro *compile* attr (rule selection) is
untouched.

Net -480 lines vs the full split. //rs/private 45/45; @build_dep_features
builds end-to-end; buildifier clean.
@stepango-xai

Copy link
Copy Markdown
Author

stepango-xai and others added 2 commits June 18, 2026 14:02
Reduce to build-script-only feature-world split (no proc-macro sniffing)
9566c82 removed the is_proc_macro resolution plumbing but did not regenerate
the test lockfile, so its facts cache still carried the sniff-era residue: an
is_proc_macro field on every registry crate fact (2527 entries), a spurious
syn 2.0.118 entry (pinned by no test Cargo.lock), and 4 reordered dep lists.
The build-script-only resolver never writes or reads is_proc_macro for sparse
crates, and warm runs copy the dead field through verbatim, so it never
self-healed.

Regenerating from scratch (bazel mod deps with fresh facts) reproduces the
pre-PR lockfile byte-for-byte (sha256 match), shrinking the PR's lockfile diff
from 2528/2527 to zero.
@stepango-xai

Copy link
Copy Markdown
Author

@dzbarsky I think we should be good to go with build-script support. I'll run some benchmarks internally to see how much benefits we'll get with build-script only comparing to proc-macro and probably follow up another PR.

@stepango-xai stepango-xai changed the title [WIP] Cargo resolver-v2 host/target feature-world split Cargo resolver-v2 host/target feature-world split Jun 19, 2026
@dzbarsky

Copy link
Copy Markdown
Member

@stepango-xai I was also curious - when you cross-compile, are you doing so exclusively or do you target host as well? I.e. I'm wondering if something like this PR (which would make build script optimization match the target optimziation, vs always forcing -c opt like exec does) and enabling path-mapping (which can dedupe target/exec configurations if the flags are all the same) would help? (By collapsing the aws-lc-sys to a single action that is shared for target and exec, when those platforms are the same)

@stepango-xai

Copy link
Copy Markdown
Author

Not exclusively cross - we build for the host platform too e.g. aarch64-apple-darwin, aarch64-unknown-linux-gnu, x86_64-unknown-linux-gnu, so it's a mix: same-platform CI (e.g. x86_64-linux exec building x86_64-linux), plus genuinely cross configs - local dev on aarch64-darwin building the linux targets, and cross-arch (arm64 <> x86_64).

I think hermeticbuild/rules_rust#24 + path-mapping and the world split would work together well:

  • Config duplication (same platform, exec vs target differ only by flags): Use target compilation mode for Cargo build scripts rules_rust#24 + path-mapping should collapse aws-lc-sys into one shared action and recover most of the win there without any resolver change. Agreed.
  • Feature over-unification (the host/exec world carrying aws-lc-rs at all): for the cross configs (exec ≠ target - different triple), there's nothing for path-mapping to dedupe, so the exec-config aws-lc-sys only disappears with the split. That's its irreducible value, and for us the darwin-exec -> linux-target path is a big chunk of local-dev builds.

@dzbarsky

Copy link
Copy Markdown
Member

Cool, for now I've cut bazelbuild/bazel-central-registry#9368 which contains hermeticbuild/rules_rust#24 - this should let you enable path mapping and get exec/target dedupe on same-platform builds, assuming you have a new enough Bazel with bazelbuild/bazel#29542

For the feature split - I want to take a bit of time to prototype some alternatives and see if I can come up with something else that is a bit more streamlined, thanks for your patience!

@stepango-xai

Copy link
Copy Markdown
Author

Sounds great! No problem at all. Let me test out the 0.0.93. Let me know if I can help with anything.

@dzbarsky

Copy link
Copy Markdown
Member

@stepango-xai still a bit sloppy but want to give #158 a spin? Note that neither that PR nor this one will help your sqlx issue since that's a proc macro, but should help with build scripts.

For proc macros I think we need to lobby Cargo folks to make it easier to statically determine which crates are proc macros. Luckily we have some on payroll now, lemme poke them when I'm back from pat leave :)

@stepango-xai

Copy link
Copy Markdown
Author

@dzbarsky thanks a lot! I'll test it out once I can make monorepo to built with 0.0.93, non-hermetic build scripts is all over the place. Some metadata that will help detecting proc macros easy would be 🔥 And congrats with new family member 🎉

@dzbarsky

Copy link
Copy Markdown
Member

You can allowlist existing violations to make it easier. But yeah there's a lot of naughty crates out there

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants