fix(gpu): minimize sharemode update delay #3299
Merged
eball
approved these changes
Jun 8, 2026
aby913
added a commit
that referenced
this pull request
Jun 9, 2026
* origin/main:
  fix(release-daemon): update udev-dev package versions for ARM architecture
  files archive: volume-aware extract/compress + friendlier errors (#3303)
  helm-charts/system-apps: bump system-frontend to v1.10.45 and user-service to v0.0.99 (#3302)
  fix(gpu): minimize sharemode update delay (#3299)
  daemon: add bridge connection watcher to monitor network carrier changes (#3298)
  fix(l4): add allow header x-archive-password (#3297)
  feat(app-service): auto-resolve template app resource requirements from rendered chart (#3295)
dkeven
added a commit
that referenced
this pull request
Jun 9, 2026
eball
added a commit
that referenced
this pull request
Jun 12, 2026
Background
After refactor(scheduler): get rid of app-level GPU management HAMi#19, the share mode of all devices is managed by app-service, and node resource update events are watched by the scheduler's node informer. Although the informer receives these events immediately, the device update frequency is capped by the update logic, which delays the share mode update. This change makes the share mode update take priority over other metadata updates.
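The idea can be illustrated with a minimal Go sketch. It is not the HAMi implementation: the Device type, the DeviceCache type, the field names, the share-mode strings, and the 60-second interval are all hypothetical, chosen only to show a share-mode change being applied immediately while the rest of the device metadata stays behind a rate cap.

```go
package main

import "fmt"

// Device is a hypothetical view of one GPU's metadata as cached by a scheduler.
type Device struct {
	ID        string
	ShareMode string // hypothetical values, e.g. "exclusive" or "timeslicing"
	MemoryMiB int
}

// DeviceCache is a hypothetical cache whose full metadata refresh is rate-capped.
type DeviceCache struct {
	minIntervalSec int64            // minimum seconds between full refreshes per device
	lastFullSec    map[string]int64 // time of the last full refresh per device
	devices        map[string]Device
}

func NewDeviceCache(minIntervalSec int64) *DeviceCache {
	return &DeviceCache{
		minIntervalSec: minIntervalSec,
		lastFullSec:    make(map[string]int64),
		devices:        make(map[string]Device),
	}
}

// Apply handles one node update event observed at nowSec.
// It reports whether the full metadata was refreshed.
func (c *DeviceCache) Apply(nowSec int64, d Device) bool {
	cur, known := c.devices[d.ID]

	// Step 1: a share-mode change is applied right away, before the rate cap is checked.
	if known && cur.ShareMode != d.ShareMode {
		cur.ShareMode = d.ShareMode
		c.devices[d.ID] = cur
	}

	// Step 2: every other field is refreshed only if the cap allows it.
	if known && nowSec-c.lastFullSec[d.ID] < c.minIntervalSec {
		return false
	}
	c.devices[d.ID] = d
	c.lastFullSec[d.ID] = nowSec
	return true
}

// Get returns the cached device, if any.
func (c *DeviceCache) Get(id string) (Device, bool) {
	d, ok := c.devices[id]
	return d, ok
}

func main() {
	c := NewDeviceCache(60)
	c.Apply(0, Device{ID: "gpu0", ShareMode: "exclusive", MemoryMiB: 1000})

	// This event arrives inside the 60-second cap and changes both fields.
	full := c.Apply(10, Device{ID: "gpu0", ShareMode: "timeslicing", MemoryMiB: 2000})
	d, _ := c.Get("gpu0")
	fmt.Printf("full refresh: %v, share mode: %s, memory: %d MiB\n", full, d.ShareMode, d.MemoryMiB)
}
```

In this sketch the second event arrives while the cap is still active, so the memory figure keeps its old value until the next allowed refresh, while the share mode already reflects the new setting.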
Target Version for Merge
1.12.6, 1.12.7
Related Issues
none
PRs Involving Sub-Systems
fix(scheduler): minimize sharemode update delay HAMi#21
Other information:
none
Note
Medium Risk
Touches the GPU scheduler/device-plugin version used in production clusters; behavior change is localized to share-mode refresh timing but affects scheduling-sensitive GPU workloads.
Overview
Bumps the bundled HAMi GPU virtualization stack from v2.6.20 to v2.6.21 so Olares ships the upstream fix that prioritizes share-mode updates over other device metadata refresh, reducing delay after app-service manages share mode and the scheduler's node informer sees resource events.
infrastructure/gpu/.olares/config/gpu/hami/values.yaml sets version: "v2.6.21" (scheduler extender and device plugin images use beclab/hami with this tag).
platform/hami/.olares/Olares.yaml updates the prebuilt container list to beclab/hami:v2.6.21. No other chart or image pins change in this diff.
Reviewed by Cursor Bugbot for commit 2b97a3b.