fix(gpu): minimize sharemode update delay #3299

Merged: eball merged 1 commit into main from gpu/fix/sharemode_update_delay on Jun 8, 2026

Conversation

@dkeven dkeven commented Jun 8, 2026

Member

Note

Medium Risk
Touches the GPU scheduler/device-plugin version used in production clusters; behavior change is localized to share-mode refresh timing but affects scheduling-sensitive GPU workloads.

Overview
Bumps the bundled HAMi GPU virtualization stack from v2.6.20 to v2.6.21 so that Olares ships the upstream fix, which prioritizes share-mode updates over other device metadata refreshes. This reduces the delay in share-mode updates after app-service manages share mode and the scheduler’s node informer sees resource events.

infrastructure/gpu/.olares/config/gpu/hami/values.yaml sets version: "v2.6.21" (scheduler extender and device plugin images use beclab/hami with this tag). platform/hami/.olares/Olares.yaml updates the prebuilt container list to beclab/hami:v2.6.21. No other chart or image pins change in this diff.
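As a rough illustration of the pin described above, the changed line in `values.yaml` might look like the fragment below. This is only a sketch: the description names the `version` key and its new value, but the key's nesting, the surrounding keys, and the layout of `Olares.yaml` are not shown in this conversation and are deliberately omitted rather than guessed.

```yaml
# infrastructure/gpu/.olares/config/gpu/hami/values.yaml
# Sketch only: where this key sits in the file is an assumption;
# all other keys are omitted.
version: "v2.6.21"   # bumped from v2.6.20 per the overview

# platform/hami/.olares/Olares.yaml (structure not shown in the PR text):
# the prebuilt container list now references beclab/hami:v2.6.21
```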

Reviewed by Cursor Bugbot for commit 2b97a3b.

vercel Bot commented Jun 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| olares-docs | Ignored | Ignored | Jun 8, 2026 2:41pm |

@eball eball merged commit 8f23300 into main Jun 8, 2026
16 checks passed
aby913 added a commit that referenced this pull request Jun 9, 2026
* origin/main:
  fix(release-daemon): update udev-dev package versions for ARM architecture
  files archive: volume-aware extract/compress + friendlier errors (#3303)
  helm-charts/system-apps: bump system-frontend to v1.10.45 and user-service to v0.0.99 (#3302)
  fix(gpu): minimize sharemode update delay (#3299)
  daemon: add bridge connection watcher to monitor network carrier changes (#3298)
  fix(l4): add allow header x-archive-password (#3297)
  feat(app-service): auto-resolve template app resource requirements from rendered chart (#3295)
dkeven added a commit that referenced this pull request Jun 9, 2026
2 participants