Releases: ROCm/spur
Spur nightly (20260424 · 7abf141)
Automated nightly build from main at 2026-04-24 23:35 UTC.
Commit: 7abf141
1 commit(s) since previous nightly.
Install
curl -fsSL https://raw.githubusercontent.com/ROCm/spur/main/install.sh | bash -s -- nightly
Spur v0.2.2
Changes since v0.2.1
- #107/#108: Set supplementary groups (video, render) for non-root job processes — fixes GPU device access when running as non-root
- #109: Replace GetNode heartbeat hack with proper Heartbeat RPC — cleaner separation of read/write paths, real telemetry in heartbeats
- #110: Sync CLAUDE.md and README.md with current codebase
Spur v0.2.1
Hotfix: fix namespace wrapper mount ordering (PR #106). Removes private /tmp mount that broke job output file visibility.
Spur v0.2.0
Major release with 99 commits since v0.1.0. Highlights:
Raft-based HA (PRs #84, #93)
- Replaced K8s Lease leader election with OpenRaft consensus
- Works on both bare-metal and Kubernetes — unified HA
- Config: controller.peers = ["node1:6817", "node2:6817"]
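When sizing the peer list, it helps to recall that Raft commits an entry only after a majority of voters acknowledge it. The sketch below works through that arithmetic. It assumes every entry in controller.peers is a voting member; these notes do not say whether the local controller is counted in that list, and the function names are illustrative, not part of Spur.

```python
def raft_quorum(num_voters: int) -> int:
    """Smallest number of voters that forms a strict majority."""
    if num_voters < 1:
        raise ValueError("a cluster needs at least one voter")
    return num_voters // 2 + 1


def tolerated_failures(num_voters: int) -> int:
    """Voters that can be unreachable while a majority still exists."""
    return num_voters - raft_quorum(num_voters)


# Compare a few cluster sizes (assumption: each peer is one voter).
for size in (2, 3, 5):
    print(size, raft_quorum(size), tolerated_failures(size))
```

Under that assumption, an odd number of voters is the usual choice: going from three to four voters raises the quorum from two to three without tolerating any additional failure.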
Topology-aware scheduling (PR #80)
- topology/tree: fat-tree fabric-aware scheduling, minimize switch hops
- topology/block: rack-level job co-location
- Config: [topology] section with switch hierarchy
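To make "minimize switch hops" concrete, here is a minimal sketch that counts the switches on the path between two nodes in a tree-shaped fabric. The PARENT map, node names, and function names are hypothetical and only for illustration. This is not Spur's scheduler code, and the real [topology] schema may look quite different.

```python
from itertools import combinations

# Hypothetical two-level hierarchy: compute nodes -> leaf switches -> spine.
PARENT = {
    "n1": "leaf1", "n2": "leaf1",
    "n3": "leaf2", "n4": "leaf2",
    "leaf1": "spine", "leaf2": "spine",
}


def path_to_root(name, parent):
    """Return [name, its switch, that switch's parent, ...] up to the root."""
    path = [name]
    while name in parent:
        name = parent[name]
        path.append(name)
    return path


def switch_hops(a, b, parent):
    """Count the switches traversed between nodes a and b."""
    if a == b:
        return 0
    depth_from_b = {n: i for i, n in enumerate(path_to_root(b, parent))}
    for i, n in enumerate(path_to_root(a, parent)):
        if n in depth_from_b:
            # i links up from a plus depth_from_b[n] links up from b
            # meet at the common switch, which is counted once.
            return i + depth_from_b[n] - 1
    raise ValueError("nodes do not share a switch")


def worst_case_hops(nodes, parent):
    """Largest pairwise switch count within a candidate allocation."""
    return max((switch_hops(a, b, parent) for a, b in combinations(nodes, 2)),
               default=0)
```

A scheduler following this idea would prefer the candidate node set with the smallest worst_case_hops value, for example ["n1", "n2"] under one leaf switch over ["n1", "n3"] across the spine.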
Bare-metal job isolation (PRs #100, #102, #103, #104, #105)
- UID/GID enforcement via setuid (jobs run as submitting user)
- PID + mount namespace isolation (private /tmp, /dev/shm, process view)
- GPU device restriction via selective bind-mount
- seccomp-BPF syscall whitelist (~150 syscalls, blocks ptrace/mount/bpf)
- Landlock LSM filesystem access control (kernel 5.13+)
- [isolation] config section for operator control
- Fork bomb protection (pids.max) and OOM isolation (memory.oom.group)
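The last item names two cgroup v2 interface files. As a rough sketch of how such limits are typically applied, the helper below writes both files in a job's cgroup directory. The function name and the directory layout are assumptions for illustration; how Spur itself creates and configures cgroups is not described in these notes. On a real host the directory would live under /sys/fs/cgroup, the pids and memory controllers would need to be enabled for it, and writing would require suitable privileges.

```python
from pathlib import Path


def apply_job_limits(cgroup_dir: Path, max_pids: int, oom_group: bool = True) -> None:
    """Write process-count and OOM-group settings into a cgroup v2 directory.

    cgroup_dir is a hypothetical per-job cgroup path supplied by the caller.
    """
    if max_pids < 1:
        raise ValueError("max_pids must be a positive integer")
    # pids.max caps the number of tasks, so a fork bomb hits the limit
    # instead of exhausting the host.
    (cgroup_dir / "pids.max").write_text(f"{max_pids}\n")
    # memory.oom.group = 1 asks the kernel to kill the whole cgroup together
    # when the OOM killer selects any task in it.
    (cgroup_dir / "memory.oom.group").write_text("1\n" if oom_group else "0\n")
```

Passing the directory in as a parameter keeps the helper easy to try against a temporary directory before pointing it at a real cgroup.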
K8s operator improvements (PRs #85, #93, #98)
- hostNetwork, privileged, hostIPC, shmSize fields now applied to pods
- Extra device plugin resources (RDMA, InfiniBand)
- Cross-namespace SpurJob support
- Node state machine with auto-recovery from heartbeat timeout
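The node state machine in the last item can be pictured with a small sketch: a node is marked down when its heartbeat is overdue and returns to service on the next heartbeat. The two states, the field names, and the timeout handling here are assumptions made for illustration. The operator's actual states and transitions are not listed in these notes.

```python
from dataclasses import dataclass
from enum import Enum


class NodeState(Enum):
    UP = "up"
    DOWN = "down"


@dataclass
class Node:
    name: str
    last_heartbeat: float  # seconds on the caller's clock
    state: NodeState = NodeState.UP


def on_heartbeat(node: Node, now: float) -> None:
    """Record a heartbeat; a node that was down recovers automatically."""
    node.last_heartbeat = now
    node.state = NodeState.UP


def sweep(node: Node, now: float, timeout: float) -> None:
    """Mark the node down if no heartbeat arrived within the timeout."""
    if node.state is NodeState.UP and now - node.last_heartbeat > timeout:
        node.state = NodeState.DOWN
```

Passing the current time in explicitly, rather than reading a clock inside the functions, keeps the transitions deterministic and easy to test.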
Scheduler fixes (PR #94)
- Fixed misleading PendingReason::Priority on new jobs
- Dispatch failures now requeue instead of marking Failed
- salloc timeout + pending reason display
Binaries
- spur-v0.2.0-linux-x86_64.tar.gz: all binaries (dynamically linked, glibc 2.39+)
- spurd-static: statically linked agent (musl, works on Ubuntu 22.04+)
Spur v0.1.0
What's Changed
- Users/powderluv/k8s operator and bare metal deploy by @powderluv in #1
- Multi-node Slurm feature tests + nodelist/exclude/time-limit fixes by @powderluv in #2
- Add release workflow and curl-pipe-bash installer by @powderluv in #3
New Contributors
- @powderluv made their first contribution in #1
Full Changelog: https://github.com/powderluv/spur/commits/v0.1.0