packing: replace greedy merge with statistical partitioning #107
jlebon wants to merge 2 commits into coreos:main
Code Review
This pull request replaces the existing greedy clustering algorithm for OCI layer packing with a new statistical partitioning approach inspired by rpm-ostree. The new algorithm classifies components into layers based on size outliers (using median and MAD) and stability tiers, utilizing name-based hashing for deterministic binning. Feedback identifies critical logic errors where components could be silently dropped if the layer budget is exhausted or if a stability tier is allocated zero bins. Additionally, a fix is required for the use of an unstable Rust feature in the median calculation.
Don't look too much at the code. This is still rough/raw from Opus. I still need to carefully go through it, but the results are encouraging. I made it sweep for optimal parameters for the FCOS and Silverblue sets (which will hopefully generalize well to non-Fedora data sets; I'll try to do more validation there).
2db7255 to 3a357ad
The old algorithm used a greedy merge approach (BinaryHeap-based,
minimizing TEV loss per merge) which was fundamentally unstable: small
changes in input caused completely different merge decisions, resulting
in poor layer reuse across updates.
Additionally, because of its greediness, it easily fell into a local
optimum: one giant catch-all bucket with low stability, and the safer,
more stable items in the other layers.
Replace it with a two-phase statistical partitioning approach inspired
by rpm-ostree's chunking algorithm:
Phase 1 classifies components by size using median + MAD, giving large
components (linux-firmware, kernel, firefox) their own singleton layers.
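The Phase 1 idea can be illustrated with a minimal sketch. This is not the code from this PR: the function names, the `k` multiplier, and the `median + k * MAD` threshold form are assumptions chosen for illustration, and the sizes in the usage note are made up. It uses only stable Rust (`f64::total_cmp`).

```rust
/// Median of a slice; `None` for empty input.
fn median(values: &[f64]) -> Option<f64> {
    if values.is_empty() {
        return None;
    }
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let mid = sorted.len() / 2;
    Some(if sorted.len() % 2 == 0 {
        (sorted[mid - 1] + sorted[mid]) / 2.0
    } else {
        sorted[mid]
    })
}

/// Median absolute deviation (MAD) around a given median.
fn mad(values: &[f64], med: f64) -> Option<f64> {
    let deviations: Vec<f64> = values.iter().map(|v| (v - med).abs()).collect();
    median(&deviations)
}

/// Split (name, size) pairs into (size outliers, everything else).
/// A component counts as an outlier when size > median + k * MAD.
/// `k` is a hypothetical tuning parameter, not a value from the PR.
/// Note: if the MAD is 0, every component above the median is an outlier.
fn split_size_outliers<'a>(
    components: &[(&'a str, u64)],
    k: f64,
) -> (Vec<&'a str>, Vec<&'a str>) {
    let sizes: Vec<f64> = components.iter().map(|(_, s)| *s as f64).collect();
    let Some(med) = median(&sizes) else {
        return (Vec::new(), Vec::new());
    };
    let spread = mad(&sizes, med).unwrap_or(0.0);
    let threshold = med + k * spread;

    let mut outliers = Vec::new();
    let mut rest = Vec::new();
    for (name, size) in components {
        if (*size as f64) > threshold {
            outliers.push(*name); // would get its own singleton layer
        } else {
            rest.push(*name); // handed on to Phase 2
        }
    }
    (outliers, rest)
}
```

With illustrative sizes of 400, 150, 8, 12, 1, 2 and 3 and `k = 3.0`, the median is 8 and the MAD is 6, so the threshold is 26 and only the two largest components are split out. Median and MAD are used rather than mean and standard deviation here because a few very large components would otherwise drag the threshold up with them.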
Phase 2 assigns all remaining components to bins using stability tiers
(high/mid/low via mean+stddev) and deterministic name-based hashing,
which ensures stable bin membership across builds without needing to
track prior build state.
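A minimal sketch of the Phase 2 idea follows. It is not the code from this PR: the tier cutoffs (one standard deviation either side of the mean), the choice of FNV-1a as the hash, and all function names are assumptions for illustration. A hand-rolled hash is used because the algorithm behind the standard library's `DefaultHasher` is not guaranteed to stay the same across Rust releases.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    High,
    Mid,
    Low,
}

/// 64-bit FNV-1a: a fixed, well-known hash, so the same name always
/// yields the same value regardless of build or toolchain.
fn fnv1a64(name: &str) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for byte in name.bytes() {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Population mean and standard deviation; (0.0, 0.0) for empty input.
fn mean_stddev(values: &[f64]) -> (f64, f64) {
    if values.is_empty() {
        return (0.0, 0.0);
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / n;
    (mean, variance.sqrt())
}

/// Assumed cutoffs: above mean + stddev is High, below mean - stddev is Low.
fn tier_of(stability: f64, mean: f64, stddev: f64) -> Tier {
    if stability > mean + stddev {
        Tier::High
    } else if stability < mean - stddev {
        Tier::Low
    } else {
        Tier::Mid
    }
}

/// Bin index within a tier, derived only from the component name.
/// The bin count is clamped to at least 1 so that a tier allocated
/// zero bins still places its components somewhere.
fn bin_for(name: &str, bins_in_tier: usize) -> usize {
    (fnv1a64(name) % bins_in_tier.max(1) as u64) as usize
}
```

Because the bin index depends only on the name and the tier's bin count, a component lands in the same bin on every build as long as its tier and that count do not change, which is what removes the need to carry state from a previous build.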
Also remove the stability fallback that assigned min(known)/2 to
components without stability data (xattr, bigfiles, unclaimed). These
now stay at 0.0 and are naturally handled by the stability tiers.
Benchmarked against rpm-ostree's native chunking on FCOS F43 (10
biweekly stable releases) and Silverblue F43 (10 daily builds):
FCOS: 49.2% reuse, 826 MiB avg download (rpm-ostree: 33.0%, 1.1 GiB)
Silverblue: 89.5% reuse, 543 MiB avg download (rpm-ostree: 71.8%, 1.5 GiB)
Assisted-by: OpenCode (Claude Opus 4.6)
Switch test-arch.sh and test-self.sh to use --write-manifest-to for checking the unclaimed component size. The previous approach checked the layer size via skopeo inspect, which is inaccurate since the unclaimed component may share a layer with other components.
For test-self.sh, also drop the is_chunked shortcut so that we always run chunkah and produce a manifest.
Assisted-by: OpenCode (Claude Opus 4.6)
There's plenty of other container images to look at, I think
Happy to turn on whatever is necessary to help!