ct-clmsn/vkmps

vkmps — Vulkan Multi-Process Service

A compute-only Vulkan MPS (Multi-Process Service) daemon, inspired by NVIDIA's MPS. Multiple clients share a single Vulkan device managed by a central daemon over Unix domain sockets.

Quick Start

# Build
cmake -S . -B build
cmake --build build -j4

# Start the server
./build/vkmps-server

# Run the compute example
./build/simple_compute

# Run all tests
cd build && ctest -V

Requirements

  • Vulkan SDK — for VK_HEADER_VERSION and libvulkan
  • glslc — included with Vulkan SDK, compiles GLSL → SPIR-V
  • g++ ≥ 10 or clang ≥ 12 (C++17 required)
  • Linux with a Vulkan-capable GPU (or llvmpipe software rasterizer for development)

Build

cmake -S . -B build
cmake --build build -j4

Targets:

| Target | Description |
| --- | --- |
| vkmps-server | The daemon binary |
| vkmps-control | CLI tool to query/control a running server |
| simple_compute | Single-client compute example (sync + async) |
| multi_process | Multi-client concurrent compute example (async) |
| test_protocol | Protocol serialization unit tests |
| test_integration | Mock-server IPC integration tests |
| test_vulkan_integration | Real-server Vulkan GPU integration tests |

Server Safety

Stale Socket / Lock File

On startup the server checks whether a socket file already exists at the configured path. If one does, the server tries to connect to it. If a live server responds, the new instance exits with an error, which prevents duplicate daemons. If no server responds, the stale socket and lock file are removed before binding.

A PID lock file (<socket>.lock) prevents a second server from starting on the same path.
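
The stale-socket check described above can be sketched as follows. This is a minimal illustration, not the server's actual code: the names `ProbeResult`, `probe_socket`, and `remove_stale_files` are hypothetical, and treating `ECONNREFUSED` as the "stale" signal is an assumption about how the probe could work on Linux.

```cpp
#include <cassert>
#include <cerrno>
#include <cstring>
#include <string>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/un.h>
#include <unistd.h>

// Outcome of probing a socket path before binding (illustrative names).
enum class ProbeResult { NoFile, LiveServer, Stale, Error };

// Try to connect to `path`. A successful connect means another server is
// listening; ECONNREFUSED on an existing file means nobody is accepting.
ProbeResult probe_socket(const std::string& path) {
    struct stat st{};
    if (::stat(path.c_str(), &st) != 0)
        return errno == ENOENT ? ProbeResult::NoFile : ProbeResult::Error;

    sockaddr_un addr{};
    if (path.size() >= sizeof(addr.sun_path)) return ProbeResult::Error;
    addr.sun_family = AF_UNIX;
    std::memcpy(addr.sun_path, path.c_str(), path.size() + 1);

    int fd = ::socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) return ProbeResult::Error;

    int rc = ::connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    int err = errno;  // save before close() can overwrite it
    ::close(fd);

    if (rc == 0) return ProbeResult::LiveServer;
    return err == ECONNREFUSED ? ProbeResult::Stale : ProbeResult::Error;
}

// Remove leftovers from a dead server so a fresh bind() can succeed.
void remove_stale_files(const std::string& path) {
    ::unlink(path.c_str());
    ::unlink((path + ".lock").c_str());
}
```

A caller would exit with an error on `LiveServer`, call `remove_stale_files` on `Stale`, and proceed to `bind()` on `NoFile`.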

Crash Signal Handlers

SIGSEGV, SIGABRT, and SIGQUIT are handled with an async-signal-safe emergency cleanup routine that:

  • Writes a diagnostic message to stderr
  • Unlinks the Unix socket file
  • Unlinks the PID lock file
  • Calls shm_unlink on the ring-logger shared memory segment
  • Exits with status 128 + signal number

Normal shutdown via SIGINT/SIGTERM performs full graceful cleanup (close client sockets, stop threads, destroy Vulkan resources, unlink socket and lock file).

SIGPIPE is silently ignored to prevent process termination from broken client connections.
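
As a rough sketch of the handler setup described in this section, the code below installs an emergency handler for the three crash signals and ignores SIGPIPE. The function and variable names are hypothetical, the diagnostic text is invented, and the ring-logger `shm_unlink` step is left out because the segment name is not given here. The handler limits itself to `write`, `unlink`, and `_exit`, which POSIX lists as async-signal-safe.

```cpp
#include <cassert>
#include <cstring>
#include <initializer_list>
#include <signal.h>
#include <unistd.h>

namespace {
// Paths are copied into fixed buffers up front so the handler never allocates.
char g_socket_path[108];
char g_lock_path[116];  // room for the socket path plus ".lock"

// Only async-signal-safe calls are used here: write, unlink, _exit.
void emergency_cleanup(int sig) {
    static const char msg[] = "vkmps: fatal signal, emergency cleanup\n";
    ssize_t ignored = ::write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)ignored;
    ::unlink(g_socket_path);
    ::unlink(g_lock_path);
    ::_exit(128 + sig);
}
}  // namespace

// Install the crash handlers and ignore SIGPIPE. Returns false on failure.
bool install_crash_handlers(const char* socket_path) {
    if (std::strlen(socket_path) >= sizeof(g_socket_path)) return false;
    std::strcpy(g_socket_path, socket_path);
    std::strcpy(g_lock_path, socket_path);
    std::strcat(g_lock_path, ".lock");

    struct sigaction sa{};
    sa.sa_handler = emergency_cleanup;
    sigemptyset(&sa.sa_mask);
    for (int sig : {SIGSEGV, SIGABRT, SIGQUIT})
        if (sigaction(sig, &sa, nullptr) != 0) return false;

    struct sigaction ign{};
    ign.sa_handler = SIG_IGN;  // broken client pipes must not kill the daemon
    sigemptyset(&ign.sa_mask);
    return sigaction(SIGPIPE, &ign, nullptr) == 0;
}
```

Calling `install_crash_handlers("/tmp/vkmps.sock")` once, before the socket is bound, would be enough for this sketch; graceful SIGINT/SIGTERM shutdown is a separate path and is not shown.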

Server Usage

Usage: ./build/vkmps-server [options]

Options:
  -s, --socket PATH   Unix socket path (default: /tmp/vkmps.sock)
  -d, --device IDX    GPU device index (default: 0)
  -l, --log           Enable file logging (default: /tmp/vkmps.log)
  -L, --log-path PATH Log file path (implies --log)
  -M, --scheduling-mode MODE
                       Scheduling mode: exclusive-throughput (default)
                       or cooperative-fairshare
  -q, --loop-slice-quota N
                       Yield threshold for EXCLUSIVE_THROUGHPUT (default: 100000)
  --instrument-submodular
                       Enable submodular yield-point selection in SPIR-V
                       instrumentation (selects loops that maximize
                       yield-per-byte under the loop-slice-quota budget)
  -h, --help          Show this help

Logging

Enable with --log (writes to /tmp/vkmps.log by default) or --log-path /path/to/log to set a custom path:

./build/vkmps-server --log -s /tmp/vkmps.sock

The log records timestamped entries for:

  • Server lifecycle (start, stop, thread state)
  • Vulkan device discovery (name, vendor, version, queues)
  • Client connections and disconnections
  • Resource operations (program registration, buffer allocation/free, data transfer)
  • Dispatch submissions, group sizes, priority scheduling
  • All failures and errors

Example log output:

[2026-05-20 20:05:18.851] [INFO] vkmps-server logging started (PID=50236)
[2026-05-20 20:05:18.877] [INFO] Vulkan device found: llvmpipe ... Vulkan 1.4.318
[2026-05-20 20:05:18.891] [INFO] Created 1 compute queue(s): [q0@1.000000]
[2026-05-20 20:05:18.892] [INFO] Server listening on /tmp/vkmps.sock
[2026-05-20 20:05:19.123] [INFO] Client 1 connected (PID=12345, name=client1, priority=MEDIUM)
[2026-05-20 20:05:19.456] [INFO] Registered program 1: compute_multiply
[2026-05-20 20:05:19.789] [INFO] Allocated buffer 1 (4096 bytes) for client 1
[2026-05-20 20:05:20.012] [INFO] Submitted dispatch 1 (4x1x1) prio=REALTIME for client 1
[2026-05-20 20:05:20.234] [INFO] Client 1 disconnected

Client API

Synchronous Dispatch

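#include <vkmps/client.h>

// spirv/size, data, pc, and output are supplied by the caller:
// the SPIR-V binary, the input bytes, a 4-byte push constant, and a result buffer.
vkmps_client_t client = vkmps_connect("/tmp/vkmps.sock");

vkmps_program_t prog = vkmps_register_program(client, "my_shader", spirv, size);
vkmps_buffer_t buf  = vkmps_alloc_buffer(client, 4096, VKMPS_BUFFER_USAGE_STORAGE);

// Upload 4096 bytes at offset 0
vkmps_write_buffer(client, buf, 0, 4096, data);

// Dispatch dimensions 64x1x1; buf is both the single input and the single output
vkmps_submission_t sub = vkmps_submit(client, prog, 64, 1, 1, &pc, 4,
                                       &buf, 1, &buf, 1);
vkmps_wait(client, sub, 0);

// Read the results back from the same buffer
vkmps_read_buffer(client, buf, 0, 4096, output);

vkmps_disconnect(client);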

Async (Non-Blocking) Dispatch

// Submit returns immediately — work runs in a background thread
vkmps_async_handle_t* handle = vkmps_submit_nb(client, prog, 64, 1, 1,
                                                &pc, 4, &buf, 1, &buf, 1);
// Do other work while dispatch is in flight...
// ...

// Block until the dispatch completes
vkmps_submit_wait(handle, 0);
vkmps_submit_handle_free(handle);

vkmps_submit_nb uses std::async internally to perform the submit + wait in a background thread, freeing the calling thread to overlap work with execution.

| Function | Description |
| --- | --- |
| vkmps_submit_nb(client, prog, gx, gy, gz, pc, pc_size, in, ni, out, no) | Non-blocking submit; returns vkmps_async_handle_t* immediately |
| vkmps_submit_wait(handle, timeout_ns) | Blocks until the dispatch completes; returns VKMPS_OK or error |
| vkmps_submit_handle_free(handle) | Releases the async handle |
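
To illustrate the std::async pattern mentioned above, here is a minimal sketch of a handle that wraps a blocking call in a background task. It is not the library's implementation: `AsyncHandle`, `blocking_submit_and_wait`, and the simplified signatures are hypothetical stand-ins, and a sleep takes the place of the real submit + wait round trip.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for the blocking "submit, then wait" round trip.
int blocking_submit_and_wait(int work_ms) {
    std::this_thread::sleep_for(std::chrono::milliseconds(work_ms));
    return 0;  // 0 plays the role of VKMPS_OK in this sketch
}

// Illustrative handle: it only owns the future of the background call.
struct AsyncHandle {
    std::future<int> result;
};

// Launch the blocking call on another thread and return immediately.
AsyncHandle* submit_nb(int work_ms) {
    return new AsyncHandle{
        std::async(std::launch::async, blocking_submit_and_wait, work_ms)};
}

// Block until the background call finishes and return its status.
// Must be called at most once per handle, since get() consumes the future.
int submit_wait(AsyncHandle* h) { return h->result.get(); }

// Release the handle. A future obtained from std::async blocks in its
// destructor until the task finishes, so freeing early does not leak work.
void submit_handle_free(AsyncHandle* h) { delete h; }
```

The usage mirrors the snippet above: call `submit_nb`, do unrelated work, then `submit_wait` followed by `submit_handle_free`.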

Client Priority

vkmps_client_t client = vkmps_connect_with_priority(path, VKMPS_PRIORITY_REALTIME);

Architecture

┌────────────┐  ┌────────────┐  ┌────────────┐
│ Client A   │  │ Client B   │  │ Client C   │
│ (process)  │  │ (process)  │  │ (process)  │
└─────┬──────┘  └─────┬──────┘  └─────┬──────┘
      │ Unix socket   │ Unix socket   │ Unix socket
      └──────┬────────┴────────┬──────┘
             │                 │
      ┌──────▼─────────────────▼──────┐
      │      vkmps-server daemon      │
      │  ┌─────────────────────────┐  │
      │  │  Submission Thread      │  │
      │  │  (VkQueue dispatch)     │  │
      │  ├─────────────────────────┤  │
      │  │  Accept + Client Threads│  │
      │  │  (protocol handling)    │  │
      │  └─────────────────────────┘  │
      │         VkDevice              │
      └───────────────────────────────┘
  • Daemon owns the VkDevice — clients are thin IPC proxies over Unix domain sockets
  • Binary protocol — fixed-header wire format, no serialization library dependency
  • Priority scheduling — 4 client priorities (LOW/MEDIUM/HIGH/REALTIME) route to up to 4 compute queues with descending priorities
  • VK_EXT_global_priority — when available, GPU-side priority queuing (REALTIME through LOW)
  • VK_ARM_scheduling_controls — when detected on ARM Mali GPUs, enables shader core count control
  • Cooperative scheduling — dispatch dimensions dynamically scaled based on submission backlog depth and client priority

Extension Support

Extensions are probed at runtime and enabled only when the underlying GPU supports them:

| Extension | Scope | Effect |
| --- | --- | --- |
| VK_EXT_global_priority | Cross-vendor | Per-queue GPU priority (REALTIME/HIGH/MEDIUM/LOW) |
| VK_ARM_scheduling_controls | ARM Mali | Shader core count control for compute dispatches |
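
The probe-then-enable step can be sketched without any Vulkan headers by working on extension names alone. In a real server the `available` list would presumably be filled from `vkEnumerateDeviceExtensionProperties`; here it is a plain vector of strings, and `select_extensions` is a hypothetical helper, not a function from this project.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Optional extensions named in this section.
const std::vector<std::string> kOptionalExtensions = {
    "VK_EXT_global_priority",
    "VK_ARM_scheduling_controls",
};

// Return the optional extensions that the device actually reports,
// in the order they appear in kOptionalExtensions.
std::vector<std::string> select_extensions(const std::vector<std::string>& available) {
    std::vector<std::string> enabled;
    for (const auto& want : kOptionalExtensions) {
        if (std::find(available.begin(), available.end(), want) != available.end())
            enabled.push_back(want);
    }
    return enabled;
}
```

The returned list is what would be passed as the enabled extension names at device creation, so a device that reports neither extension simply gets an empty list.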

Tests

cd build && ctest -V

| Test binary | Tests | What it covers |
| --- | --- | --- |
| test_protocol | 31 | Writer/Reader, every message type round-trip, edge cases, priority helpers |
| test_integration | 5 | Socketpair-based IPC: handshake, lifecycle, concurrent clients, priority round-trip, dispatch round-trip |
| test_vulkan_integration | 6 | Real-Vulkan: buffer read/write, compute dispatch, sequential dispatches, concurrent clients, large dispatch, async dispatch |

Vulkan integration tests require a Vulkan-capable GPU or driver (e.g., llvmpipe). All tests gracefully skip when no Vulkan device is available.

License

Boost Software License 1.0 — see LICENSE_1_0.txt.
