
GoZeL


gozel is a pure-Go binding for the Intel oneAPI Level Zero API — enabling direct GPU/NPU compute from Go without cgo.

Built on purego and Windows syscall, gozel loads ze_loader at runtime via FFI, avoiding all C compiler dependencies. The entire API surface is auto-generated from the official Level Zero SDK headers, keeping bindings always in sync with upstream.




Projects Using gozel

We maintain a list of projects built with gozel. If your project uses this library, please open a PR to add it here!

Project | Description | Link

Features

  • No cgo — Pure Go via purego and Windows syscall. No C toolchain required at build time.
  • Auto-generated — All 200+ Level Zero functions generated directly from the official C headers, covering Core, Sysman, Tools and Runtime APIs.
  • High-level wrapper — The ze sub-package provides an idiomatic Go API over the raw bindings: typed handles, error returns, automatic resource lifetime.
  • Kernel in Go — Embed SPIR-V binaries with //go:embed, load and launch GPU kernels entirely from Go.
  • Extensible — Add new examples, wrap more APIs, or plug in your own SPIR-V pipelines.

Platform Support

gozel targets any Intel GPU that ships with a Level Zero driver, including Intel Arc, Iris Xe, and Data Center GPUs. Any device visible to ze_loader should work.

OS | Architecture | Runtime Requirement
Windows | amd64 | Intel GPU driver with ze_loader.dll
Linux | amd64 | Intel compute-runtime with libze_loader.so
More under testing...
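As a rough illustration of the table above, the loader file name can be chosen from the target OS. This is a minimal sketch, not gozel's actual lookup code: loaderLibrary is a hypothetical helper, and the file names are simply the ones listed in the table.

```go
package main

import (
	"fmt"
	"runtime"
)

// loaderLibrary is a hypothetical helper that returns the Level Zero loader
// file name for an OS, using the names from the platform table.
func loaderLibrary(goos string) (string, error) {
	switch goos {
	case "windows":
		return "ze_loader.dll", nil
	case "linux":
		return "libze_loader.so", nil
	default:
		return "", fmt.Errorf("no known Level Zero loader name for %q", goos)
	}
}

func main() {
	name, err := loaderLibrary(runtime.GOOS)
	if err != nil {
		fmt.Println("unsupported platform:", err)
		return
	}
	fmt.Println("expecting loader library:", name)
}
```

Taking the OS as a parameter instead of reading runtime.GOOS inside the helper keeps the mapping easy to test on any machine.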

Quick Start

Install

go get github.com/fumiama/gozel

Minimal Example

package main

import (
	"fmt"

	"github.com/fumiama/gozel/ze"
)

func main() {
	gpus, err := ze.InitGPUDrivers()
	if err != nil {
		panic(err)
	}
	fmt.Printf("Found %d GPU driver(s)\n", len(gpus))

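	// Check every error: a failed call would otherwise print nothing useful.
	devs, err := gpus[0].DeviceGet()
	if err != nil {
		panic(err)
	}
	for _, d := range devs {
		prop, err := d.DeviceGetProperties()
		if err != nil {
			panic(err)
		}
		fmt.Printf("  Device: %s\n", string(prop.Name[:]))
	}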

	// Found 1 GPU driver(s)
	//   Device:  Graphics
}
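The example converts the whole prop.Name array to a string, so any trailing NUL padding from the C side ends up in the output. Assuming Name is a fixed-size, NUL-padded byte array (the 256-byte length below is an assumption, and cString is a hypothetical helper, not part of gozel), a sketch of trimming it looks like this:

```go
package main

import (
	"bytes"
	"fmt"
)

// cString converts a NUL-padded byte buffer, as filled in by a C API,
// into a Go string that stops at the first NUL byte.
func cString(buf []byte) string {
	if i := bytes.IndexByte(buf, 0); i >= 0 {
		return string(buf[:i])
	}
	return string(buf)
}

func main() {
	// Stand-in for prop.Name; the length is an assumption for illustration.
	var name [256]byte
	copy(name[:], "Example GPU")

	fmt.Printf("raw length: %d\n", len(string(name[:])))
	fmt.Printf("trimmed:    %q\n", cString(name[:]))
}
```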

Vector-Add

This example adds two large float32 vectors element by element on the GPU and validates the result against the CPU.

cd examples/vadd
# go generate          # Optional, requires SYCL toolchain (see below)
go run main.go

The output looks like this:

===============  Device Basic Properties  ===============
Running on device: ID = 1 , Name = Graphics @ 4.00 GHz.
=============== Device Compute Properties ===============
Max Group Size (X, Y, Z):    (1024, 1024, 1024)
Max Group Count (X, Y, Z):   (4294967295, 4294967295, 4294967295)
Max Total Group Size:        1024
Max Shared Local Memory:     65536
Num Subgroup Sizes:          3
Subgroup Sizes:              [8 16 32 0 0 0 0 0]
=============== Computation Configuration ===============
Group Size (X, Y, Z):        (1024, 1, 1)
Group Count:                 65536
Total Elements (N):          67108864
Buffer Size:                 256 MiB
===============    Calculation Results    ===============
GPU Execution Time:          56.114500 ms
GPU Throughput:              4.78 GiB/s
===============    Validation Results    ===============
CPU Execution Time:          77.190700 ms
CPU Throughput:              3.48 GiB/s
Test Passed!!!

More Examples

Example | Description | Source
vadd | Vector addition: GPU kernel launch, memory copy, validation | examples/vadd

The ze Package — High-Level API

All examples use the ze sub-package, which wraps the raw Level Zero bindings in idiomatic Go with typed handles and methods on those handles:

// Initialize
gpus, _ := ze.InitGPUDrivers()

// Context + Device
ctx, _ := gpus[0].ContextCreate()
devs, _ := gpus[0].DeviceGet()

// Memory
hostBuf, _ := ctx.MemAllocHost(size, alignment)
devBuf, _ := ctx.MemAllocDevice(devs[0], size, alignment)
defer ctx.MemFree(hostBuf)
defer ctx.MemFree(devBuf)

// Module + Kernel
mod, _ := ctx.ModuleCreate(devs[0], spirvBytes)
krn, _ := mod.KernelCreate("my_kernel")
krn.SetArgumentValue(0, unsafe.Sizeof(uintptr(0)), unsafe.Pointer(&devBuf))
krn.SetGroupSize(256, 1, 1)

// Command submission
q, _ := ctx.CommandQueueCreate(devs[0])
cl, _ := ctx.CommandListCreate(devs[0])
cl.AppendMemoryCopy(devBuf, hostBuf, size)
cl.AppendBarrier()
cl.AppendLaunchKernel(krn, &gozel.ZeGroupCount{Groupcountx: 4, Groupcounty: 1, Groupcountz: 1})
cl.AppendBarrier()
cl.Close()
q.ExecuteCommandLists(cl)
q.Synchronize()

Compared to the raw gozel.ZeCommandListCreate(...) calls, the ze package:

  • Eliminates boilerplate descriptor initialization (struct types, Stype fields)
  • Returns Go error instead of ZeResult codes
  • Provides method syntax on handles (ctx.CommandListCreate(dev) vs standalone functions)
  • Manages handle lifetimes with defer-friendly Destroy() methods

The ze package currently covers the most common workflows. Contributions to expand coverage are very welcome — see Contributing.

🙌 Have an example to share? We'd love to grow the More Examples table; see Contributing.

Contributing

Contributions of all kinds are welcome. Some particularly impactful areas:

  • Examples — Add new examples/ demonstrating different GPU compute patterns (matrix multiply, reduction, image processing, etc.). Every example helps new users get started faster.
  • ze package coverage — The high-level wrapper doesn't cover the full Level Zero surface yet. Wrapping additional APIs (events, fences, images, sysman queries, etc.) directly benefits all downstream users.
  • Testing — Help improve test coverage across packages.
  • Documentation — Improve godoc comments, add usage guides, or translate documentation.
  • Project showcase — If you use gozel in your project, open a PR to add it to the Projects Using gozel table above.

License

This project is licensed under the GNU Affero General Public License v3.0.


Appendix I. Architecture

The FFI layer (internal/zecall) loads the Level Zero loader library at runtime, caches procedure addresses, and dispatches calls through purego or Windows syscall. Pointer arguments are kept alive for the duration of each call by the //go:uintptrescapes directive, so the garbage collector cannot reclaim them while a syscall is in flight.

gozel/
├── api.go                    # Auto-generated: registers all L0 functions at init
├── core_*.go                 # Auto-generated: Core API bindings (ze*)
├── rntm_*.go                 # Auto-generated: Runtime API bindings (zer*)
├── sysm_*.go                 # Auto-generated: Sysman API bindings (zes*)
├── tols_*.go                 # Auto-generated: Tools API bindings (zet*)
├── ze/                       # High-level idiomatic Go wrapper
│   ├── init.go               #   Driver initialization
│   ├── context.go            #   Context management
│   ├── device.go             #   Device enumeration & properties
│   ├── module.go             #   SPIR-V module loading
│   ├── kernel.go             #   Kernel creation & argument binding
│   ├── mem.go                #   Device/host memory allocation
│   └── command.go            #   Command queues, lists, barriers
├── internal/zecall/          # purego FFI layer (loads ze_loader at runtime)
├── cmd/gen/                  # Code generator: parses L0 headers → Go source
├── spec/                     # Optional L0 SDK headers for dev purpose (input to cmd/gen)
└── examples/
    ├── quick_start/          # The quick start shown in this README
    └── vadd/                 # Vector addition: SYCL kernel + Go host
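The address-caching step of the FFI layer described above can be sketched in plain Go. This is an illustration under stated assumptions, not the code in internal/zecall: procCache and resolver are hypothetical names, and the symbol lookup is injected as a function so the sketch runs without any loader library.

```go
package main

import (
	"fmt"
	"sync"
)

// resolver looks up a symbol's address in a loaded library. A real FFI layer
// would make a dlsym-style call here; in this sketch it is injected.
type resolver func(symbol string) (uintptr, error)

// procCache memoizes symbol addresses so each symbol is resolved only once.
type procCache struct {
	mu      sync.Mutex
	resolve resolver
	addrs   map[string]uintptr
}

func newProcCache(r resolver) *procCache {
	return &procCache{resolve: r, addrs: make(map[string]uintptr)}
}

// lookup returns the cached address, resolving the symbol on first use.
func (c *procCache) lookup(symbol string) (uintptr, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if addr, ok := c.addrs[symbol]; ok {
		return addr, nil
	}
	addr, err := c.resolve(symbol)
	if err != nil {
		return 0, fmt.Errorf("resolve %s: %w", symbol, err)
	}
	c.addrs[symbol] = addr
	return addr, nil
}

func main() {
	calls := 0
	cache := newProcCache(func(symbol string) (uintptr, error) {
		calls++
		return uintptr(0x1000 + calls), nil // fake address for illustration
	})
	a, _ := cache.lookup("zeInit")
	b, _ := cache.lookup("zeInit")
	fmt.Println(a == b, calls) // true 1
}
```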

Appendix II. Regenerating Bindings from Headers

gozel includes a code generator (cmd/gen) that parses the four Level Zero API headers and produces all core_*.go, rntm_*.go, sysm_*.go, tols_*.go files plus api.go.
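To give a feel for what such a generator has to do, the sketch below pulls exported function names out of header text and sorts them into the four API families by prefix. It is a simplified illustration, not the cmd/gen implementation: functionNames and apiGroup are hypothetical names, and the regular expression only assumes the ZE_APIEXPORT ... ZE_APICALL declaration pattern used in the Level Zero headers.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// declRe matches the declaration pattern used in the Level Zero C headers,
// for example "ZE_APIEXPORT ze_result_t ZE_APICALL zeInit(" split over lines.
var declRe = regexp.MustCompile(`ZE_APIEXPORT\s+\w+\s+ZE_APICALL\s+(\w+)\s*\(`)

// functionNames returns every exported function name declared in header.
func functionNames(header string) []string {
	var names []string
	for _, m := range declRe.FindAllStringSubmatch(header, -1) {
		names = append(names, m[1])
	}
	return names
}

// apiGroup maps a function name to an API family by prefix. The longer
// prefixes are tested first because all of them start with "ze".
func apiGroup(name string) string {
	switch {
	case strings.HasPrefix(name, "zes"):
		return "Sysman"
	case strings.HasPrefix(name, "zet"):
		return "Tools"
	case strings.HasPrefix(name, "zer"):
		return "Runtime"
	case strings.HasPrefix(name, "ze"):
		return "Core"
	}
	return "unknown"
}

func main() {
	// Abbreviated sample text in the style of the SDK headers.
	sample := `
ZE_APIEXPORT ze_result_t ZE_APICALL
zeInit(
    ze_init_flags_t flags
    );
ZE_APIEXPORT ze_result_t ZE_APICALL
zesInit(
    zes_init_flags_t flags
    );
`
	for _, name := range functionNames(sample) {
		fmt.Printf("%-8s %s\n", name, apiGroup(name))
	}
}
```

A real generator also needs parameter types, structs, and enums, which a single regular expression does not cover.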

From a local SDK

Place (or symlink) the Level Zero SDK under spec/:

spec/
├── include/level_zero/
│   ├── ze_api.h      # Core
│   ├── zer_api.h     # Runtime
│   ├── zes_api.h     # Sysman
│   └── zet_api.h     # Tools
└── lib/

Then run:

go run ./cmd/gen -spec ./spec

From a specific release

The generator can download the SDK directly from the level-zero releases:

go run ./cmd/gen -spec v1.28.0

Via go generate

go generate .

This invokes cmd/gen with the local spec/ directory as configured in doc.go.

Appendix III. Building SPIR-V Kernels

GPU kernels in the examples folder are written in SYCL C++ and compiled to SPIR-V for embedding into Go programs. This route is somewhat hacky. You can also use ocloc, which is common practice; its build documentation is available elsewhere. The build pipeline uses go generate directives:

main.cpp ──clang++ -fsycl──▶ device_kern.bc
                              │
                        sycl-post-link
                              │
                        ▼ device_kern_0.bc
                              │
                   clang++ -emit-llvm -S
                              │
                        ▼ device_kern.ll
                              │
                     llvm-spirv
                              │
                        ▼ main.spv          ← embedded via //go:embed

Prerequisites

Build

cd examples/vadd
go generate    # compiles main.cpp → main.spv
go run main.go # runs vector addition on GPU