Draft
docker_images() shells out to `docker images` on every call. It is
called once per DependencySet construction (~34 times during a full
mkpipeline run). Adding @cache avoids redundant subprocess calls.
Measured on dev machine:
34 uncached calls: 0.987s (29.0ms each)
1 uncached + 33 cached: 0.000s
Savings: ~0.96s
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
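The effect of `@cache` can be sketched in isolation. This is a minimal illustration, not the actual `docker_images()` implementation: `list_images` and the `calls` counter are hypothetical stand-ins for the function that shells out to `docker images`.

```python
from functools import cache

calls = 0  # counts how often the "expensive" body actually runs


@cache
def list_images() -> frozenset[str]:
    """Hypothetical stand-in for a function that shells out to `docker images`."""
    global calls
    calls += 1
    # The real function would run a subprocess here; this sketch returns fixed data.
    return frozenset({"example/image:latest"})


# Simulate 34 DependencySet constructions, each asking for the image list.
results = [list_images() for _ in range(34)]
```

Only the first call executes the body; the other 33 return the memoized value. One consequence worth keeping in mind is that a cached result will not reflect images pulled or built later in the same process.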
trim_tests_pipeline loads each composition with munge_services=True to
discover image dependencies. This triggers expensive fingerprinting and
dependency resolution for every composition. Since we only need to know
which mzbuild images a composition references (not their fingerprints),
use munge_services=False and extract image names directly from the
service configs.
Measured on dev machine (36 compositions):
munge_services=True: 5.82s
munge_services=False: 2.95s
Savings: 2.87s (2x speedup)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
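Extracting image names directly from service configs might look like the sketch below. It assumes each service config is a plain dict and that mzbuild-backed services carry an `mzbuild` key; both the helper name `referenced_mzbuild_images` and that schema are assumptions for illustration and may not match the real code.

```python
def referenced_mzbuild_images(services: dict[str, dict]) -> set[str]:
    """Collect mzbuild image names without resolving or fingerprinting them.

    Assumes (hypothetically) that an unmunged service config names its
    mzbuild image under the "mzbuild" key.
    """
    return {cfg["mzbuild"] for cfg in services.values() if "mzbuild" in cfg}


# Example unmunged service configs; names are illustrative only.
services = {
    "materialized": {"mzbuild": "materialized"},
    "postgres": {"image": "postgres:15"},
    "testdrive": {"mzbuild": "testdrive"},
}
images = referenced_mzbuild_images(services)
```

Services that use a stock image (here `postgres`) are skipped, since they contribute no mzbuild dependency.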
list-workflows only needs to enumerate workflow function names from the
mzcompose.py module. It does not need resolved image specs or
fingerprints. Pass munge_services=False to skip the expensive dependency
resolution. This is called once per CI step from the mzcompose plugin
hook.
Measured on dev machine (cluster composition):
munge_services=True: 2.454s
munge_services=False: 0.075s
Savings: 2.379s per invocation
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fetch_hashes resolves dependencies for both architectures sequentially,
each involving expensive file fingerprinting. Since the two arch builds
are completely independent, resolve them in parallel using
ThreadPoolExecutor.
Measured on dev machine:
Sequential: 8.06s
Parallel: 1.90s
Savings: 6.16s (4.2x speedup)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
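The parallel resolution can be sketched as follows. `resolve_for_arch` is a hypothetical stand-in for the per-architecture work in fetch_hashes, and the architecture names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def resolve_for_arch(arch: str) -> dict[str, str]:
    """Hypothetical stand-in for per-architecture dependency resolution."""
    return {"arch": arch, "fingerprint": f"fp-{arch}"}


arches = ["x86_64", "aarch64"]

# One worker per architecture; Executor.map() yields results in input order,
# so resolved[i] corresponds to arches[i] regardless of completion order.
with ThreadPoolExecutor(max_workers=len(arches)) as pool:
    resolved = list(pool.map(resolve_for_arch, arches))
```

Threads pay off here only to the extent that the work spends its time in subprocesses or file I/O rather than in pure-Python computation holding the GIL.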
The `mzcompose description` command only needs the module docstring,
which is available without expensive dependency resolution and
fingerprinting. Use munge_services=False to skip that work.
This is called once per CI step via the mzcompose buildkite plugin
command hook (line 100: `TEST_DESC="$(mzcompose description)"`).
Measured savings: ~2.5s per `mzcompose description` call
munge_services=True: 2.55s (full load_composition)
munge_services=False: 0.23s (full load_composition)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The `mzcompose describe` (aka `ls`/`list`) command only displays service
names, workflow names/docstrings, and the composition description. All
of this data is available without expensive dependency resolution and
fingerprinting.
This is primarily a local development speedup since describe is not
called in CI, but it makes `mzcompose ls` much more responsive.
Measured savings: ~2.5s per `mzcompose describe` call
munge_services=True: 2.78s (full load_composition)
munge_services=False: 0.25s (full load_composition)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When resolving image dependencies, each Rust crate's input files were
discovered via individual `git diff` + `git ls-files` subprocess calls.
With ~118 crates across the workspace, this meant ~236 subprocess calls
just for crate file enumeration.
Add Workspace.precompute_crate_inputs() which does a single pair of git
calls to discover all crate files at once, then partitions the results
by crate path in Python. This is called automatically at the start of
resolve_dependencies().
Measured savings for resolve_dependencies(all 41 images):
Before: 4.80s
After: 2.84s
Savings: 1.96s (41%)
Measured savings for single composition (pg-cdc, munge_services=True):
Before: 2.57s
After: 0.78s
Savings: 1.80s (70%)
This benefits every `mzcompose up` and `mzcompose run` call in CI.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
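The partitioning step, applied to the output of one batched git call, can be sketched as below. `partition_by_prefix` is a hypothetical helper, not the body of Workspace.precompute_crate_inputs(), and the file list is hard-coded in place of real git output.

```python
def partition_by_prefix(files: list[str], roots: list[str]) -> dict[str, set[str]]:
    """Assign each file to the directory root that contains it."""
    out: dict[str, set[str]] = {root: set() for root in roots}
    # Check longer roots first so a nested root wins over its parent.
    ordered = sorted(roots, key=len, reverse=True)
    for f in files:
        for root in ordered:
            # Match on "root/" rather than "root" so that "src/adapter"
            # does not claim files under "src/adapter-types".
            if f.startswith(root + "/"):
                out[root].add(f)
                break
    return out


# Stand-in for the combined output of a single batched git call.
files = [
    "src/adapter/Cargo.toml",
    "src/adapter/src/lib.rs",
    "src/adapter-types/src/lib.rs",
    "README.md",
]
parts = partition_by_prefix(files, ["src/adapter", "src/adapter-types"])
```

Files outside every root (here `README.md`) are simply dropped, which matches the goal of enumerating per-crate inputs only.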
When computing image fingerprints, each image's context files were
discovered via individual `git diff` + `git ls-files` subprocess calls.
With 41 images, this meant 82 subprocess calls just for image context
enumeration.
Add Repository._precompute_image_context_files() which does a single
pair of git calls to discover all image context files at once, then
partitions results by image path. This is called automatically at the
start of resolve_dependencies().
Combined with the crate input batching from the previous commit:
Measured savings for resolve_dependencies(all 41 images):
Before (no batching): 4.17s
After (both batched): 1.85s
Savings: 2.32s (56%)
Measured savings for single composition (pg-cdc, munge_services=True):
Before: 2.43s
After: 0.64s
Savings: 1.79s (74%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Two changes eliminate the remaining ~82 git subprocess calls from the
fingerprinting path:
1. CargoPreImage.inputs() now resolves its hardcoded inputs eagerly:
   - 'ci/builder' directory is expanded to individual files via a single
     cached expand_globs call
   - '.cargo/config' is included only if it exists
   - Result is cached with @cache since it's the same for all images
2. ResolvedImage.fingerprint() skips the expand_globs verification
   pass when precomputed data is available, since all inputs are
   already individual file paths from git.
This eliminates all per-crate and per-image git subprocess calls from
resolve_dependencies, reducing the total from ~384 calls (baseline) to
just 5 (2 for the crate batch + 2 for the image batch + 1 for
ci/builder).
Measured savings for resolve_dependencies(all 41 images):
Before (no batching): 4.15s
After (all batching): 0.40s
Savings: 3.75s (90%)
Measured savings for single composition (pg-cdc, munge_services=True):
Before: 2.23s
After: 0.26s
Savings: 1.97s (88%)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The crate and image context precomputation used str(path) directly for
both git pathspecs and the file partitioning step. This works when the
repository root is a relative path (Path(".")), but fails when it's an
absolute path (as it is when MZ_ROOT is set in CI via mzcompose).
The issue: git --relative outputs paths relative to cwd, but the
partition logic compared these relative paths against potentially
absolute image/crate paths, causing the startswith() check to fail.
This left _context_files_cache empty, triggering the "files are
unknown to git" assertion.
Fix: use path.relative_to(root) to normalize all paths before
constructing git specs and partitioning results.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
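The path normalization in this fix can be illustrated with a small sketch. `repo_relative` is a hypothetical helper and the paths are made up; `PurePosixPath` is used so the example behaves the same on any platform.

```python
from pathlib import PurePosixPath


def repo_relative(path: PurePosixPath, root: PurePosixPath) -> str:
    """Express `path` relative to the repository root."""
    return str(path.relative_to(root))


abs_root = PurePosixPath("/work/repo")  # e.g. when the root is absolute
rel_root = PurePosixPath(".")           # e.g. when the root is Path(".")

# Both root styles normalize to the same repo-relative prefix.
from_abs = repo_relative(abs_root / "src" / "adapter", abs_root)
from_rel = repo_relative(rel_root / "src" / "adapter", rel_root)

# A path in the form git prints it relative to the repository root.
git_output = "src/adapter/src/lib.rs"
matches_normalized = git_output.startswith(from_abs + "/")
matches_raw_absolute = git_output.startswith(str(abs_root / "src" / "adapter"))
```

The raw absolute path never matches git's relative output, which is the failure mode described above; the normalized prefix matches under either root style.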
Pyright's type checker requires attributes to be declared on the class.
Declare _inputs_cache on Crate and _context_files_cache on Image as
Optional[set[str]] fields, initialized to None in __init__, and replace
hasattr() checks with `is not None` comparisons.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
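A minimal sketch of the declared-attribute pattern follows. This `Crate` class is a simplified, hypothetical stand-in; only the `_inputs_cache` field name and its `Optional[set[str]]` type come from the commit message, and the fallback in `inputs()` is invented for illustration.

```python
from typing import Optional


class Crate:
    def __init__(self, name: str) -> None:
        self.name = name
        # Declared up front so the type checker knows the attribute exists;
        # None means "not precomputed yet".
        self._inputs_cache: Optional[set[str]] = None

    def inputs(self) -> set[str]:
        # `is not None` (rather than hasattr or truthiness) keeps an empty
        # precomputed set distinguishable from "no precomputed data".
        if self._inputs_cache is not None:
            return self._inputs_cache
        # Hypothetical slow path standing in for per-crate discovery.
        return {f"{self.name}/Cargo.toml"}


crate = Crate("adapter")
slow_path = crate.inputs()

crate._inputs_cache = set()  # precomputed, and legitimately empty
cached_path = crate.inputs()
```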
No description provided.