Ollama 0.15.6 with SYCL backend for Intel GPU (~2x faster than Vulkan) #1
Merged
Conversation
Fix the ambiguous intel-basekit package in Dockerfile
Update Dockerfile to use Intel public ipex container
Update ipex-llm image from Intel to 2.2.0-SNAPSHOT
illlustrates -> illustrates
docs: update README.md
Update to latest ipex-llm dockerfile 20250211
…ed in Dockerfile (if no build args provided)
Dockerfile ARGs to make it easier to use latest IPEX-LLM Ollama Portable Zip
Update to ipex-llm-2.2.0b20250313
Update Intel libraries
Update default to ipex-llm v2.2.0 (guide for v2.3.0-nightly in docs)
Revised `IPEXLLM_RELEASE_REPO` value and adjusted file and path references for consistency. Updated `docker-compose.yml` with refined environment variables, device mapping, restart policies, and added necessary port bindings for better functionality and maintainability.
- level-zero v1.22.4 -> v1.28.0
- IGC v2.11.7 -> v2.28.4
- compute-runtime 25.18.33578.6 -> 26.05.37020.3
- libigdgmm 22.7.0 -> 22.9.0
- ipex-llm ollama nightly 2.3.0b20250612 -> 2.3.0b20250725
- Docker compose: disable webui auth, stateless webui volume
- README formatting and GPU model update

Co-authored-by: Cursor <cursoragent@cursor.com>
…trypoint

The IPEX-LLM bundled start-ollama.sh hardcodes OLLAMA_HOST=127.0.0.1 and OLLAMA_KEEP_ALIVE=10m, overriding docker-compose environment variables and preventing external connections through Docker port mapping.

- Add custom start-ollama.sh that honours env vars with sensible defaults
- Mount it read-only into the container
- Fix LD_LIBRARY_PATH env var syntax (: -> =)
- Add .gitignore for IDE/swap/webui data files
- Update CHANGELOG and README with fix documentation

Co-authored-by: Cursor <cursoragent@cursor.com>
… Intel GPU

Replace the IPEX-LLM portable zip (bundling a patched ollama 0.9.3 with SYCL) with the official ollama 0.15.6 release using the Vulkan backend for Intel GPU acceleration. The official ollama project does not ship a SYCL backend; Vulkan is their supported path for Intel GPUs.

- Use official ollama binary with Vulkan runner (OLLAMA_VULKAN=1)
- Strip CUDA/MLX runners from image to save space
- Add mesa-vulkan-drivers for Intel ANV Vulkan ICD
- Remove all IPEX-LLM env vars and wrapper scripts
- Simplify entrypoint to /usr/bin/ollama serve directly
- Clean up docker-compose.yml: remove IPEX build args and env vars

Tested: Intel Arc Graphics (MTL) detected, 17/17 layers offloaded to Vulkan0

Co-authored-by: Cursor <cursoragent@cursor.com>
…on Intel GPUs

Build ggml-sycl from upstream llama.cpp (commit a5bb8ba4, matching ollama's vendored ggml) using Intel oneAPI 2025.1.1 in a multi-stage Docker build.

Patch two ollama-specific API divergences via patch-sycl.py: added batch_size parameter to graph_compute, removed GGML_TENSOR_FLAG_COMPUTE skip-check that caused all compute nodes to be bypassed.

Tested: gemma3:1b — 27/27 layers on GPU, 10.2 tok/s gen, 65.3 tok/s prompt eval.

Co-authored-by: Cursor <cursoragent@cursor.com>
Rewrite README with a clear value proposition, architecture diagram, troubleshooting section, and streamlined structure. Update CHANGELOG to reflect the full history of the Vulkan-to-SYCL migration.

Co-authored-by: Cursor <cursoragent@cursor.com>
Workflow triggers on push to main/release branches, tags, PRs, and manual dispatch. Uses Docker Buildx with GHA cache for faster rebuilds. Tags images with the ollama version, git SHA, and branch/tag names.

Co-authored-by: Cursor <cursoragent@cursor.com>
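A workflow along the lines described in this commit could be sketched as follows (the file name, action versions, and tag rules here are illustrative assumptions, not taken from the PR):

```yaml
# .github/workflows/docker.yml — hypothetical sketch
name: build-and-push
on:
  push:
    branches: [main]
    tags: ['v*']
  pull_request:
  workflow_dispatch:
jobs:
  build:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/${{ github.repository }}
          tags: |
            type=ref,event=branch
            type=ref,event=tag
            type=sha
      - uses: docker/build-push-action@v6
        with:
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

The `cache-from`/`cache-to: type=gha` pair is what gives the "GHA cache for faster rebuilds" behaviour, and `metadata-action` produces the branch/tag/SHA image tags mentioned above.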
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Co-authored-by: Andrey Oblivantsev <eslider@gmail.com>
Summary
• Upgrade ollama from 0.9.3 (IPEX-LLM bundle) to official v0.15.6
• Replace the Vulkan backend with SYCL: custom-built ggml-sycl from upstream llama.cpp using Intel oneAPI 2025.1.1, delivering roughly 2x the inference speed on Intel GPUs
• Add GitHub Actions CI to build and push the Docker image to GHCR automatically
• Update Intel GPU driver stack to latest releases (Level-Zero 1.28.0, IGC 2.28.4, compute-runtime 26.05)
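The multi-stage build summarized above might look roughly like this; stage names, base-image tags, CMake flags, and paths are illustrative assumptions, not lifted from the actual Dockerfile:

```dockerfile
# Stage 1: compile libggml-sycl.so with Intel oneAPI (versions illustrative)
FROM intel/oneapi-basekit:2025.1.1-0-devel-ubuntu24.04 AS sycl-build
RUN git clone https://github.com/ggml-org/llama.cpp /src \
 && cd /src && git checkout a5bb8ba4
COPY patch-sycl.py /src/
RUN cd /src && python3 patch-sycl.py \
 && cmake -B build -DGGML_SYCL=ON \
      -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx \
 && cmake --build build -j

# Stage 2: slim runtime with the official ollama binary
FROM ubuntu:24.04
ARG OLLAMA_VERSION=v0.15.6
RUN apt-get update && apt-get install -y --no-install-recommends curl ca-certificates \
 && curl -fsSL "https://github.com/ollama/ollama/releases/download/${OLLAMA_VERSION}/ollama-linux-amd64.tgz" \
    | tar -xz -C /usr
# Drop the SYCL runner in next to ollama's bundled ggml libraries
COPY --from=sycl-build /src/build/bin/libggml-sycl.so /usr/lib/ollama/
ENTRYPOINT ["/usr/bin/ollama", "serve"]
```

The key idea is that the heavy oneAPI toolchain lives only in the build stage; the runtime stage ships just the official binary, the compiled SYCL runner, and the required runtime libraries.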
Why SYCL over Vulkan
Ollama's official release only ships a Vulkan backend for Intel GPUs. SYCL with oneAPI unlocks oneMKL, oneDNN, and Level-Zero direct access — benchmarks show +45-100% token/s on integrated Arc (MTL) and +57-83% on discrete Arc.
Tested on Intel Core Ultra 7 155H (Meteor Lake): gemma3:1b — 27/27 layers on GPU, 10.2 tok/s generation, 65.3 tok/s prompt eval.
How the SYCL build works
Ollama intentionally excludes the ggml-sycl source from its vendored ggml. This PR rebuilds it in a multi-stage Dockerfile, patching two ollama-specific API divergences via patch-sycl.py:
• graph_compute signature: ollama adds an int batch_size parameter
• GGML_TENSOR_FLAG_COMPUTE: ollama removes this flag; without the patch, all compute nodes are skipped, producing garbage output
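A patch script of that shape can be as simple as targeted text substitutions over the ggml-sycl sources. The sketch below is illustrative only: the function names, the exact source patterns, and the file layout are assumptions, and the real patch-sycl.py in this PR may work differently.

```python
import re

def patch_graph_compute(src: str) -> str:
    """Add ollama's extra batch_size parameter to the graph_compute
    signature (hypothetical pattern; the real declaration may differ)."""
    return src.replace(
        "graph_compute(ggml_backend_t backend, ggml_cgraph * cgraph)",
        "graph_compute(ggml_backend_t backend, ggml_cgraph * cgraph, int batch_size)",
    )

def drop_compute_flag_check(src: str) -> str:
    """Remove the GGML_TENSOR_FLAG_COMPUTE skip-check that ollama's
    vendored ggml lacks; left in place, it would bypass every compute node."""
    return re.sub(
        r"if \(!\(node->flags & GGML_TENSOR_FLAG_COMPUTE\)\) \{\s*continue;\s*\}",
        "",
        src,
    )

# Applying both passes to a source file:
snippet = "if (!(node->flags & GGML_TENSOR_FLAG_COMPUTE)) { continue; }"
print(drop_compute_flag_check(snippet).strip())  # prints an empty line
```

String-level patching like this is brittle against upstream churn, which is why pinning the exact llama.cpp commit (a5bb8ba4, matching ollama's vendored ggml) matters.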
Changes
Test plan
Note
Medium Risk
Reworks the container build/run stack (multi-stage oneAPI build, custom SYCL runner, updated Intel GPU runtimes), which could break runtime compatibility or device detection if versions/patching drift. No direct auth/data-path changes, but it modifies the primary deployment artifact and defaults (e.g., WebUI auth disabled).
Overview
Switches the Docker build to a SYCL-first Intel GPU stack. The Dockerfile is rewritten into a multi-stage build that compiles libggml-sycl.so with Intel oneAPI, bundles the required oneAPI runtime libraries, installs the official Ollama v0.15.6 binary, and removes unused runners (CUDA/MLX/Vulkan) while updating Intel GPU user-space drivers.

Adds SYCL compatibility patching and updates runtime defaults. Introduces patch-sycl.py to patch upstream ggml-sycl for Ollama API differences, and updates docker-compose.yml to pass the Ollama version build arg, expose 11434, increase shm_size, set SYCL/Level-Zero env vars, and disable WebUI auth while switching the WebUI to :latest with locally persisted data.

Adds automation and repo hygiene/docs. Adds a GitHub Actions workflow to build and push the image to GHCR with versioned tagging, plus new/updated README.md, CHANGELOG.md, .gitignore, and .github/FUNDING.yml, and removes legacy IPEX-LLM init/run scripts and the WSL2 compose file.

Written by Cursor Bugbot for commit e397010.