Fix ollama not reachable from host due to hardcoded OLLAMA_HOST in entrypoint

The IPEX-LLM bundled start-ollama.sh hardcodes OLLAMA_HOST=127.0.0.1 and OLLAMA_KEEP_ALIVE=10m, overriding docker-compose environment variables and preventing external connections through Docker port mapping.

- Add custom start-ollama.sh that honours env vars with sensible defaults
- Mount it read-only into the container
- Fix LD_LIBRARY_PATH env var syntax (: -> =)
- Add .gitignore for IDE/swap/webui data files
- Update CHANGELOG and README with fix documentation

Co-authored-by: Cursor <cursoragent@cursor.com>
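
The fix relies on the shell's `${VAR:-default}` expansion: a value already present in the environment is kept, and the default applies only when the variable is unset or empty. A minimal sketch of that behaviour follows. The helper name `resolve_ollama_host` is hypothetical and is not part of this commit; only the expansion itself mirrors the new start-ollama.sh.

```shell
#!/bin/bash
# Hypothetical helper for illustration only; it mirrors the expansion
# used for OLLAMA_HOST in the custom start-ollama.sh.
resolve_ollama_host() {
  printf '%s\n' "${OLLAMA_HOST:-0.0.0.0:11434}"
}

# Variable unset: the default is used.
unset OLLAMA_HOST
resolve_ollama_host

# Variable set (as docker-compose would do): the environment value wins.
( OLLAMA_HOST=0.0.0.0; resolve_ollama_host )

# Variable set but empty: ":-" treats it like unset, so the default is used.
( OLLAMA_HOST=""; resolve_ollama_host )
```

Using `:-` rather than a plain assignment is what lets Compose override the value, while an empty or missing variable still ends up with a bind address reachable through Docker port mapping.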
2026-02-12 15:18:37 +00:00
parent 96913a2a18
commit 8debf2010b
5 changed files with 58 additions and 2 deletions
@@ -0,0 +1,9 @@
+# IDE
+.idea/
+
+# Swap files
+*.swp
+*.swo
+
+# Open WebUI local data
+webui/
@@ -2,6 +2,19 @@

 ## 2026-02-12

+### Fix: Ollama not reachable from host via Docker port mapping
+
+The bundled IPEX-LLM `/start-ollama.sh` entrypoint hardcodes
+`OLLAMA_HOST='127.0.0.1:11434'` and `OLLAMA_KEEP_ALIVE=10m`, overriding any
+values set through Docker Compose environment variables.
+
+- Added a custom `start-ollama.sh` that respects environment variables
+  (`${OLLAMA_HOST:-0.0.0.0:11434}`, `${OLLAMA_KEEP_ALIVE:-24h}`) instead of
+  hardcoding them
+- Mounted the script into the container as a read-only volume
+  (`./start-ollama.sh:/start-ollama.sh:ro`)
+- Fixed `LD_LIBRARY_PATH` env var syntax in docker-compose.yml (`:` -> `=`)
+
 ### Updated Intel GPU runtime stack to latest releases

 - **level-zero**: v1.22.4 -> v1.28.0
@@ -34,6 +34,17 @@ docker compose up

 Then open http://localhost:3000 in your web browser to launch the web UI. Create a local Open WebUI credential, click the settings icon in the top right of the screen, select 'Models', click 'Show', then download a model such as 'llama3.1:8b-instruct-q8_0' (a fit for an Intel Arc A770 with 16GB VRAM).

+### Custom `start-ollama.sh` entrypoint
+
+The upstream IPEX-LLM portable zip ships a `start-ollama.sh` that hardcodes
+`OLLAMA_HOST=127.0.0.1` and `OLLAMA_KEEP_ALIVE=10m`, preventing the container
+from accepting connections via Docker port mapping and ignoring Compose
+environment overrides.
+
+This repo includes a corrected `start-ollama.sh` (mounted read-only into the
+container) that honours environment variables set in `docker-compose.yml`,
+falling back to sensible defaults (`0.0.0.0:11434`, `24h`).
+
 ### Update to the latest IPEX-LLM Portable Zip Version

 To update to the latest portable zip version of IPEX-LLM's Ollama, update the compose file with the build arguments shown below, using the latest `ollama-*.tgz` release from https://github.com/intel/ipex-llm/releases/tag/v2.3.0-nightly , then rebuild the image.
@@ -15,10 +15,14 @@ services:
    volumes:
      - /tmp/.X11-unix:/tmp/.X11-unix
      - ollama-intel-gpu:/root/.ollama
+      - ./start-ollama.sh:/start-ollama.sh:ro
+    shm_size: "16G"
    environment:
      - ONEAPI_DEVICE_SELECTOR=level_zero:0
+      #- SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+      #- SYCL_CACHE_PERSISTENT=1
      - IPEX_LLM_NUM_CTX=16384
-      - LD_LIBRARY_PATH:/opt/intel/oneapi/compiler/2024.2/lib
+      - LD_LIBRARY_PATH=/opt/intel/oneapi/compiler/2024.2/lib
      - DISPLAY=${DISPLAY}
      - OLLAMA_DEFAULT_KEEPALIVE="6h"
      - OLLAMA_HOST=0.0.0.0
@@ -38,7 +42,8 @@ services:
  ollama-webui:
    image: ghcr.io/open-webui/open-webui:latest
    container_name: ollama-webui
-    #volumes:
+    volumes:
+       - ./webui/data:/app/backend/data
    #  - ollama-webui:/app/backend/data
    depends_on:
      - ollama-intel-gpu
@@ -0,0 +1,18 @@
+#!/bin/bash
+export OLLAMA_NUM_GPU=999
+export no_proxy=localhost,127.0.0.1
+export ZES_ENABLE_SYSMAN=1
+# [optional] In most cases the following environment variable improves performance, but it can occasionally degrade it.
+export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
+
+# Use OLLAMA_HOST and OLLAMA_KEEP_ALIVE from environment (set via docker-compose),
+# falling back to sensible defaults if not set.
+export OLLAMA_HOST="${OLLAMA_HOST:-0.0.0.0:11434}"
+export OLLAMA_KEEP_ALIVE="${OLLAMA_KEEP_ALIVE:-24h}"
+
+# [optional] To run on a single GPU, uncomment the line below; limiting the device selection may improve performance.
+# export ONEAPI_DEVICE_SELECTOR=level_zero:0
+# If you have more than one dGPU, you can select several of them as shown below; this example uses the first and second cards.
+# export ONEAPI_DEVICE_SELECTOR="level_zero:0;level_zero:1"
+
+./ollama serve