diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..f0c38d4
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,9 @@
+# IDE
+.idea/
+
+# Swap files
+*.swp
+*.swo
+
+# Open WebUI local data
+webui/
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 602f3e7..c7716d6 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,6 +2,19 @@
 
 ## 2026-02-12
 
+### Fix: Ollama not reachable from host via Docker port mapping
+
+The bundled IPEX-LLM `/start-ollama.sh` entrypoint hardcodes
+`OLLAMA_HOST='127.0.0.1:11434'` and `OLLAMA_KEEP_ALIVE=10m`, overriding any
+values set through Docker Compose environment variables.
+
+- Added a custom `start-ollama.sh` that respects environment variables
+  (`${OLLAMA_HOST:-0.0.0.0:11434}`, `${OLLAMA_KEEP_ALIVE:-24h}`) instead of
+  hardcoding them
+- Mounted the script into the container as a read-only volume
+  (`./start-ollama.sh:/start-ollama.sh:ro`)
+- Fixed `LD_LIBRARY_PATH` env var syntax in docker-compose.yml (`:` -> `=`)
+
 ### Updated Intel GPU runtime stack to latest releases
 
 - **level-zero**: v1.22.4 -> v1.28.0
diff --git a/README.md b/README.md
index fa0c9fd..06c738c 100644
--- a/README.md
+++ b/README.md
@@ -34,6 +34,17 @@ docker compose up
 
 Then launch your web browser to http://localhost:3000 to launch the web ui. Create a local OpenWeb UI credential, then click the settings icon in the top right of the screen, then select 'Models', then click 'Show', then download a model like 'llama3.1:8b-instruct-q8_0' for Intel ARC A770 16GB VRAM
 
+### Custom `start-ollama.sh` entrypoint
+
+The upstream IPEX-LLM portable zip ships a `start-ollama.sh` that hardcodes
+`OLLAMA_HOST=127.0.0.1` and `OLLAMA_KEEP_ALIVE=10m`, preventing the container
+from accepting connections via Docker port mapping and ignoring Compose
+environment overrides.
+
+This repo includes a corrected `start-ollama.sh` (mounted read-only into the
+container) that honours environment variables set in `docker-compose.yml`,
+falling back to sensible defaults (`0.0.0.0:11434`, `24h`).
+
 ### Update to the latest IPEX-LLM Portable Zip Version
 
 To update to the latest portable zip version of IPEX-LLM's Ollama, update the compose file with the build arguments shown below, using the latest `ollama-*.tgz` release from https://github.com/intel/ipex-llm/releases/tag/v2.3.0-nightly , then rebuild the image.
diff --git a/docker-compose.yml b/docker-compose.yml
index 7458265..02e1b40 100644
--- a/docker-compose.yml
+++ b/docker-compose.yml
@@ -15,10 +15,14 @@ services:
     volumes:
       - /tmp/.X11-unix:/tmp/.X11-unix
       - ollama-intel-gpu:/root/.ollama
+      - ./start-ollama.sh:/start-ollama.sh:ro
+    shm_size: "16G"
     environment:
       - ONEAPI_DEVICE_SELECTOR=level_zero:0
+      #- SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+      #- SYCL_CACHE_PERSISTENT=1
       - IPEX_LLM_NUM_CTX=16384
-      - LD_LIBRARY_PATH:/opt/intel/oneapi/compiler/2024.2/lib
+      - LD_LIBRARY_PATH=/opt/intel/oneapi/compiler/2024.2/lib
       - DISPLAY=${DISPLAY}
       - OLLAMA_DEFAULT_KEEPALIVE="6h"
       - OLLAMA_HOST=0.0.0.0
@@ -38,7 +42,8 @@ services:
   ollama-webui:
     image: ghcr.io/open-webui/open-webui:latest
     container_name: ollama-webui
-    #volumes:
+    volumes:
+      - ./webui/data:/app/backend/data
 #      - ollama-webui:/app/backend/data
     depends_on:
       - ollama-intel-gpu
diff --git a/start-ollama.sh b/start-ollama.sh
new file mode 100644
index 0000000..e444abe
--- /dev/null
+++ b/start-ollama.sh
@@ -0,0 +1,18 @@
+#!/bin/bash
+export OLLAMA_NUM_GPU=999
+export no_proxy=localhost,127.0.0.1
+export ZES_ENABLE_SYSMAN=1
+# [optional] Under most circumstances the following variable improves performance, but it can occasionally cause regressions.
+export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+
+# Use OLLAMA_HOST and OLLAMA_KEEP_ALIVE from environment (set via docker-compose),
+# falling back to sensible defaults if not set.
+export OLLAMA_HOST="${OLLAMA_HOST:-0.0.0.0:11434}"
+export OLLAMA_KEEP_ALIVE="${OLLAMA_KEEP_ALIVE:-24h}"
+
+# [optional] To run on a single GPU, limit device selection; this may improve performance.
+# export ONEAPI_DEVICE_SELECTOR=level_zero:0
+# With multiple dGPUs, select specific devices as below; this uses the first and second card.
+# export ONEAPI_DEVICE_SELECTOR="level_zero:0;level_zero:1"
+
+./ollama serve
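
The `${VAR:-default}` parameter expansion in the new `start-ollama.sh` is what lets values exported by Docker Compose take precedence while still providing fallbacks. A minimal sketch of the behaviour, runnable in any POSIX shell (variable names taken from the script above):

```shell
# When the variable is unset, the fallback after ':-' is used.
unset OLLAMA_HOST
echo "${OLLAMA_HOST:-0.0.0.0:11434}"   # prints 0.0.0.0:11434

# When the environment (e.g. docker-compose) provides a value,
# it wins over the fallback.
export OLLAMA_HOST=0.0.0.0
echo "${OLLAMA_HOST:-0.0.0.0:11434}"   # prints 0.0.0.0
```

Note that `:-` also substitutes the default when the variable is set but empty; plain `-` would substitute only when it is unset.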