Need (Jay 2026-06-14)
@taos and @taOSmd share hardware (Fedora RTX 3060 on linstation, 12GB GPU). They must coordinate over A2A: post when claiming/using, post when free again, and CHECK before use. This surfaced when a FLUX image-gen job from @taos ran out of GPU memory (OOM) because @taOSmd still had ~9.4GB of Ollama models loaded on the same GPU, despite an earlier free signal.
Interim protocol (agents follow now via the A2A integration channel)
GPU LEASE for a shared node:
- CHECK before loading: scan the channel for an open [GPU CLAIM] on that node AND run nvidia-smi. Load only if free + unclaimed.
- CLAIM: [GPU CLAIM] node=<host> holder=@you vram=~Xgb reason=... eta=... before loading.
- RELEASE: [GPU RELEASE] node=<host> holder=@you immediately when done.
- REQUEST when blocked: [GPU REQUEST] node=<host> need=~Xgb; holder releases or negotiates a split/window.
- Courtesy: do not hold the GPU idle; use keep-alive auto-unload.
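
The CHECK step above could be sketched roughly as follows. This is only an illustration, not an agreed implementation: the helper names (parse_message, open_claims, may_load) and the 500 MiB idle threshold are assumptions. used_mib is meant to come from a real reading on the node, for example `nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits`.

```python
import re
from dataclasses import dataclass

# Tags and key=value fields as written in the interim protocol.
_TAG = re.compile(r"\[GPU (CLAIM|RELEASE|REQUEST)\]\s*(.*)")
_FIELD = re.compile(r"(\w+)=([^\s;]+)")


@dataclass
class GpuMessage:
    kind: str     # "CLAIM", "RELEASE" or "REQUEST"
    fields: dict  # e.g. {"node": "linstation", "holder": "@taos"}


def parse_message(text):
    """Return a GpuMessage for a protocol line, or None for other chatter."""
    m = _TAG.search(text)
    if m is None:
        return None
    return GpuMessage(kind=m.group(1), fields=dict(_FIELD.findall(m.group(2))))


def open_claims(messages, node):
    """Replay the channel in order; return holder -> fields for unreleased claims."""
    held = {}
    for text in messages:
        msg = parse_message(text)
        if msg is None or msg.fields.get("node") != node:
            continue
        holder = msg.fields.get("holder")
        if msg.kind == "CLAIM":
            held[holder] = msg.fields
        elif msg.kind == "RELEASE":
            held.pop(holder, None)
    return held


def may_load(messages, node, used_mib, idle_threshold_mib=500):
    """Load only if unclaimed AND the GPU is actually free (threshold is assumed)."""
    return not open_claims(messages, node) and used_mib <= idle_threshold_mib
```

Requiring both conditions covers the incident in this issue: a release had been posted, but the memory was still in use, so a channel-only check would have passed while the nvidia-smi reading would not.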
Future (productize)
Fold into the cluster scheduler/ClusterManager as a real resource lease: per-device VRAM accounting, a lease registry, admission control (a backend won't load if it cannot fit alongside current leases), and auto-eviction by keep-alive. The A2A text protocol is the stop-gap. Relates to #890/#892 (worker contract) and dynamic resource scheduling.
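
To make the lease idea concrete, here is a minimal sketch of a per-device registry with admission control and keep-alive eviction. Everything in it is hypothetical: DeviceLeases, Lease, try_acquire and the 300-second default do not come from the existing ClusterManager, and a real version would need to reconcile reservations against measured VRAM.

```python
from dataclasses import dataclass, field


@dataclass
class Lease:
    mib: int          # VRAM reserved by this holder
    last_used: float  # timestamp of the last activity, in seconds


@dataclass
class DeviceLeases:
    capacity_mib: int
    keep_alive_s: float = 300.0                 # assumed default
    leases: dict = field(default_factory=dict)  # holder -> Lease

    def evict_idle(self, now):
        """Drop leases idle for longer than keep_alive_s; return evicted holders."""
        idle = [h for h, l in self.leases.items()
                if now - l.last_used > self.keep_alive_s]
        for holder in idle:
            del self.leases[holder]
        return idle

    def try_acquire(self, holder, need_mib, now):
        """Admit the request only if it fits alongside the other current leases."""
        self.evict_idle(now)
        others = sum(l.mib for h, l in self.leases.items() if h != holder)
        if others + need_mib > self.capacity_mib:
            return False
        self.leases[holder] = Lease(need_mib, now)
        return True

    def release(self, holder):
        self.leases.pop(holder, None)
```

With a 12288 MiB device, a 9400 MiB lease followed by a 6000 MiB request is refused until the first holder releases or its keep-alive expires.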
Acceptance
Both agents check + claim + release over A2A and never silently co-load beyond the GPU's VRAM capacity; a scheduler-enforced version then automates it.