Releases: ROCm/spur

Spur nightly (20260424 · 7abf141)

24 Apr 23:35
7abf141

Pre-release

Automated nightly build from main at 2026-04-24 23:35 UTC.

Commit: 7abf141

1 commit since the previous nightly.

Install

curl -fsSL https://raw.githubusercontent.com/ROCm/spur/main/install.sh | bash -s -- nightly

Spur v0.2.2

19 Apr 18:29
1a2c61d

Changes since v0.2.1

  • #107/#108: Set supplementary groups (video, render) for non-root job processes — fixes GPU device access when running as non-root
  • #109: Replace GetNode heartbeat hack with proper Heartbeat RPC — cleaner separation of read/write paths, real telemetry in heartbeats
  • #110: Sync CLAUDE.md and README.md with current codebase
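
One way to confirm the #107/#108 fix on a node is to run a small check as the job itself and see whether the process carries the video and render groups. The script below is only a sketch and is not part of Spur: the helper name `missing_groups` is hypothetical, and the check relies solely on the output of `id -nG`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Spur): print each required group that is
# absent from a space-separated list of group names.
missing_groups() {
  local have=" $1 " required="$2" g
  for g in $required; do
    case "$have" in
      *" $g "*) ;;               # group is present
      *) printf '%s\n' "$g" ;;   # group is missing
    esac
  done
}

# Group names of the current process, as reported by id(1).
current="$(id -nG)"
missing="$(missing_groups "$current" "video render")"

if [ -z "$missing" ]; then
  echo "video and render groups are present"
else
  echo "missing supplementary groups:" $missing
fi
```

Run as the body of a non-root job, it prints which of the two groups, if any, the job process lacks.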

Spur v0.2.1

19 Apr 01:23
253423c

Hotfix: fix namespace wrapper mount ordering (PR #106). Removes private /tmp mount that broke job output file visibility.

Spur v0.2.0

18 Apr 21:36
46a36de

Major release with 99 commits since v0.1.0. Highlights:

Raft-based HA (PRs #84, #93)

  • Replaced K8s Lease leader election with OpenRaft consensus
  • Works on both bare-metal and Kubernetes — unified HA
  • Config: controller.peers = ["node1:6817", "node2:6817"]
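
As general Raft background, not something these notes state about Spur specifically: a cluster of n voting members needs a majority of n/2 + 1 (integer division) to elect a leader and commit entries, so it tolerates (n - 1)/2 failed members. The sketch below assumes each controller counts as one voter; whether `controller.peers` includes the local node is not stated here.

```shell
#!/usr/bin/env bash
# Majority needed among N voting members (integer division).
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of members that can fail while a majority remains.
tolerated_failures() { echo $(( ($1 - 1) / 2 )); }

for n in 1 2 3 5; do
  printf 'voters=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(quorum "$n")" "$(tolerated_failures "$n")"
done
```

A two-voter cluster tolerates no failures, which is why odd sizes such as 3 or 5 are the usual choice for Raft deployments.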

Topology-aware scheduling (PR #80)

  • topology/tree — fat-tree fabric-aware scheduling, minimize switch hops
  • topology/block — rack-level job co-location
  • Config: [topology] section with switch hierarchy

Bare-metal job isolation (PRs #100, #102, #103, #104, #105)

  • UID/GID enforcement via setuid (jobs run as submitting user)
  • PID + mount namespace isolation (private /tmp, /dev/shm, process view)
  • GPU device restriction via selective bind-mount
  • seccomp-BPF syscall whitelist (~150 syscalls, blocks ptrace/mount/bpf)
  • Landlock LSM filesystem access control (kernel 5.13+)
  • [isolation] config section for operator control
  • Fork bomb protection (pids.max) and OOM isolation (memory.oom.group)
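
Because Landlock needs kernel 5.13 or newer, a node can be checked before that feature is relied on. The script below is a sketch rather than a Spur tool: `version_ge` is a hypothetical helper built on GNU `sort -V`, and the second check assumes securityfs is mounted at `/sys/kernel/security`.

```shell
#!/usr/bin/env bash
# True when version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Strip distro suffixes such as "-91-generic" from the kernel release.
kernel="$(uname -r | sed 's/[^0-9.].*//')"

if version_ge "$kernel" "5.13"; then
  echo "kernel $kernel: new enough for Landlock"
else
  echo "kernel $kernel: older than 5.13, Landlock unavailable"
fi

# A new kernel is not sufficient: Landlock must also be an active LSM.
lsm_file=/sys/kernel/security/lsm
if [ -r "$lsm_file" ]; then
  if grep -qw landlock "$lsm_file"; then
    echo "landlock is in the active LSM list"
  else
    echo "landlock is not in the active LSM list"
  fi
fi
```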

K8s operator improvements (PRs #85, #93, #98)

  • hostNetwork, privileged, hostIPC, shmSize fields now applied to pods
  • Extra device plugin resources (RDMA, InfiniBand)
  • Cross-namespace SpurJob support
  • Node state machine with auto-recovery from heartbeat timeout

Scheduler fixes (PR #94)

  • Fixed misleading PendingReason::Priority on new jobs
  • Dispatch failures now requeue instead of marking Failed
  • salloc timeout + pending reason display

Binaries

  • spur-v0.2.0-linux-x86_64.tar.gz — all binaries (dynamically linked, glibc 2.39+)
  • spurd-static — statically linked agent (musl, works on Ubuntu 22.04+)
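
The glibc requirement above can be checked on a target host before downloading. The sketch below is illustrative only: `version_ge` and `pick_artifact` are hypothetical helpers, and the script only compares the local glibc version against the 2.39 floor listed above. Note that `spurd-static` is the agent alone, not a replacement for the full tarball.

```shell
#!/usr/bin/env bash
# True when version $1 >= version $2 (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)" = "$2" ]
}

# Hypothetical helper: suggest an artifact name for a given glibc version.
pick_artifact() {
  if version_ge "$1" "2.39"; then
    echo "spur-v0.2.0-linux-x86_64.tar.gz"   # dynamically linked binaries
  else
    echo "spurd-static"                      # static agent only
  fi
}

# getconf prints e.g. "glibc 2.39" on glibc systems and nothing elsewhere.
glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
echo "detected glibc: ${glibc:-none}"
echo "suggested artifact: $(pick_artifact "${glibc:-0}")"
```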

Spur v0.1.0

19 Mar 06:04
7ac44da

What's Changed

  • Users/powderluv/k8s operator and bare metal deploy by @powderluv in #1
  • Multi-node Slurm feature tests + nodelist/exclude/time-limit fixes by @powderluv in #2
  • Add release workflow and curl-pipe-bash installer by @powderluv in #3

Full Changelog: https://github.com/powderluv/spur/commits/v0.1.0