Workflow triggers on push to main/release branches, tags, PRs, and
manual dispatch. Uses Docker Buildx with GHA cache for faster rebuilds.
Tags images with ollama version, git SHA, and branch/tag names.
Co-authored-by: Cursor <cursoragent@cursor.com>
Rewrite README with clear value proposition, architecture diagram,
troubleshooting section, and streamlined structure. Update CHANGELOG
to reflect the full history of the Vulkan-to-SYCL migration.
Co-authored-by: Cursor <cursoragent@cursor.com>
Build ggml-sycl from upstream llama.cpp (commit a5bb8ba4, matching ollama's
vendored ggml) using Intel oneAPI 2025.1.1 in a multi-stage Docker build.
Patch two ollama-specific API divergences via patch-sycl.py: add the batch_size
parameter to graph_compute, and remove the GGML_TENSOR_FLAG_COMPUTE skip-check
that caused all compute nodes to be bypassed.
Tested: gemma3:1b — 27/27 layers on GPU, 10.2 tok/s gen, 65.3 tok/s prompt eval.
Co-authored-by: Cursor <cursoragent@cursor.com>
The start-ollama.sh bundled with IPEX-LLM hardcodes OLLAMA_HOST=127.0.0.1 and
OLLAMA_KEEP_ALIVE=10m, overriding the docker-compose environment variables and
preventing external connections through Docker port mapping.
- Add custom start-ollama.sh that honours env vars with sensible defaults
- Mount it read-only into the container
- Fix LD_LIBRARY_PATH env var syntax (: -> =)
- Add .gitignore for IDE/swap/webui data files
- Update CHANGELOG and README with fix documentation
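
For illustration, the env-var handling in the custom start-ollama.sh could look
roughly like the sketch below. This is not the committed script: the function
name resolve_ollama_env and the fallback values (0.0.0.0, 10m) are assumptions
chosen for the example. The point is only that POSIX ${VAR:=default} expansion
keeps a value passed in from docker-compose and falls back otherwise.

```shell
#!/bin/sh
# Illustrative sketch only. The function name and the default values
# below are assumptions, not the contents of the committed script.

resolve_ollama_env() {
    # ${VAR:=default} assigns the default only when VAR is unset or empty,
    # so values supplied by docker-compose are left untouched.
    : "${OLLAMA_HOST:=0.0.0.0}"
    : "${OLLAMA_KEEP_ALIVE:=10m}"
    export OLLAMA_HOST OLLAMA_KEEP_ALIVE
}

resolve_ollama_env
echo "OLLAMA_HOST=$OLLAMA_HOST OLLAMA_KEEP_ALIVE=$OLLAMA_KEEP_ALIVE"

# A real start script would start the server after this point.
```

With this pattern, a value set in the compose file wins over the script's
fallback, which is the opposite of the hardcoded behaviour described above.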
Co-authored-by: Cursor <cursoragent@cursor.com>