gb10

Here are 32 public repositories matching this topic...

eelbaz / dgx-spark-vllm-setup

One-command vLLM installation for NVIDIA DGX Spark with Blackwell GB10 GPUs (sm_121 architecture)

machine-learning ai deep-learning gpu cuda pytorch nvidia arm64 blackwell llm vllm llm-inference gb10 dgx-spark

Updated Oct 28, 2025
Shell

jdaln / dgx-spark-inference-stack

Serve the home! Inference stack for your NVIDIA DGX Spark, aka the Grace Blackwell AI supercomputer on your desk. Mostly vLLM-based for now, and single-Spark. For the not-so-rich buddies.

docker docker-compose cuda inference self-hosted llama model-serving mlops dgx generative-ai local-llm gb10 dgx-spark

Updated Apr 27, 2026
Shell

joeynyc / spark-doctor

Local diagnostic CLI for NVIDIA DGX Spark (GB10). Detects power caps, unified memory pressure, thermal risk, Docker/runtime issues, and validates vLLM/Ollama/llama.cpp/SGLang recipes.

cli nvidia diagnostics dgx llama-cpp vllm local-llm ollama sglang gb10 dgx-spark grace-blackwell nvidia-dgx-spark

Updated Apr 24, 2026
Python

seanGSISG / dgx-spark-sunshine-setup

Headless remote desktop to your DGX Spark in crystal-clear 4K.

remote-desktop remote-access sunshine dgx gb10 dgx-spark

Updated Apr 5, 2026
Shell

parallelArchitect / sparkview

Operator-grade GPU monitor for NVIDIA GPUs with native GB10 / DGX Spark coherent UMA support — PSI pressure, clock detection, ConnectX-7 network layer

python monitoring gpu cuda tui nvidia psi unified-memory gb10 dgx-spark

Updated Apr 23, 2026
Python

Single-file web UI for NVIDIA DGX Spark — pull Ollama models, browse and download from HuggingFace, manage LiteLLM routing, and control SGLang, vLLM, llama.cpp, LocalAI, and ComfyUI. All from one browser tab.

web ai nvidia model-deployment fastapi ai-tools llm llm-tools gb10 dgx-spark dgxspark

Updated Apr 24, 2026
Python

getainode / ainode

Turn any NVIDIA GPU into a local AI platform. Inference + fine-tuning in your browser. One command to start, automatic clustering.

open-source gpu cuda inference self-hosted distributed nvidia fine-tuning ai-platform llm vllm local-ai gb10 dgx-spark grace-blackwell

Updated Apr 25, 2026
Python

Navi-AI-Lab / nvllm

(Experimental) A high-throughput and memory-efficient inference and serving engine for LLMs with an optimized GB10 kernel

nvidia cuda-kernels cutlass local-inference vllm llm-inference qwen paged-attention self-hosted-ai gb10 sm120 nvfp4 dgx-spark fp4-quantization attention-kernel fp8-kv-cache

Updated Apr 25, 2026
Python

jxlarrea / homeassistant-voice-recipes

GPU/CUDA-accelerated voice control stack for Home Assistant. Runs on x86/x64 and ARM64 (including the NVIDIA DGX Spark). 100% Local - No Cloud, No Subscriptions.

text-to-speech x86-64 cuda gpu-acceleration home-assistant speech-to-text arm64 voice-assistant local-llm qwen3 gb10 dgx-spark

Updated Apr 24, 2026
Go

scottgl9 / sglang-spark-gb10-optimizations

SGLang optimizations for NVIDIA Spark (GB10) — SM121 Grace Blackwell

optimization marlin sglang gb10

Updated Apr 25, 2026
Python

ridanuae / dgx-spark-sglang-qwen35

Run Qwen3.5-35B-A3B on NVIDIA DGX Spark (GB10) with SGLang - Ready-to-use Docker image + complete guide

docker nvidia moe blackwell llm sglang qwen3 gb10 dgx-spark qwen35

Updated Feb 26, 2026
Shell

parallelArchitect / spark-gpu-throttle-check

Enhanced GPU throttle diagnostic for DGX Spark (GB10): NVML direct telemetry, throttle cause decoder, PCIe link monitoring, baseline drift detection, timeline capture.