Fix ollama not reachable from host due to hardcoded OLLAMA_HOST in entrypoint

The IPEX-LLM bundled start-ollama.sh hardcodes OLLAMA_HOST=127.0.0.1 and
OLLAMA_KEEP_ALIVE=10m, overriding docker-compose environment variables and
preventing external connections through Docker port mapping.

- Add custom start-ollama.sh that honours env vars with sensible defaults
- Mount it read-only into the container
- Fix LD_LIBRARY_PATH env var syntax (: -> =)
- Add .gitignore for IDE/swap/webui data files
- Update CHANGELOG and README with fix documentation

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-12 15:18:37 +00:00
parent 96913a2a18
commit 8debf2010b
5 changed files with 58 additions and 2 deletions
+9
@@ -0,0 +1,9 @@
# IDE
.idea/
# Swap files
*.swp
*.swo
# Open WebUI local data
webui/
+13
@@ -2,6 +2,19 @@
## 2026-02-12
### Fix: Ollama not reachable from host via Docker port mapping
The bundled IPEX-LLM `/start-ollama.sh` entrypoint hardcodes
`OLLAMA_HOST='127.0.0.1:11434'` and `OLLAMA_KEEP_ALIVE=10m`, overriding any
values set through Docker Compose environment variables.
- Added a custom `start-ollama.sh` that respects environment variables
(`${OLLAMA_HOST:-0.0.0.0:11434}`, `${OLLAMA_KEEP_ALIVE:-24h}`) instead of
hardcoding them
- Mounted the script into the container as a read-only volume
(`./start-ollama.sh:/start-ollama.sh:ro`)
- Fixed `LD_LIBRARY_PATH` env var syntax in docker-compose.yml (`:` -> `=`)
### Updated Intel GPU runtime stack to latest releases
- **level-zero**: v1.22.4 -> v1.28.0
+11
@@ -34,6 +34,17 @@ docker compose up
Then open http://localhost:3000 in your web browser to launch the web UI. Create a local Open WebUI credential, click the settings icon in the top right of the screen, select 'Models', click 'Show', then download a model such as 'llama3.1:8b-instruct-q8_0' for an Intel Arc A770 with 16GB of VRAM.
### Custom `start-ollama.sh` entrypoint
The upstream IPEX-LLM portable zip ships a `start-ollama.sh` that hardcodes
`OLLAMA_HOST=127.0.0.1` and `OLLAMA_KEEP_ALIVE=10m`, preventing the container
from accepting connections via Docker port mapping and ignoring Compose
environment overrides.
This repo includes a corrected `start-ollama.sh` (mounted read-only into the
container) that honours environment variables set in `docker-compose.yml`,
falling back to sensible defaults (`0.0.0.0:11434`, `24h`).
### Update to the latest IPEX-LLM Portable Zip Version
To update to the latest portable zip version of IPEX-LLM's Ollama, update the compose file with the build arguments shown below, using the latest `ollama-*.tgz` release from https://github.com/intel/ipex-llm/releases/tag/v2.3.0-nightly , then rebuild the image.
+7 -2
@@ -15,10 +15,14 @@ services:
volumes:
- /tmp/.X11-unix:/tmp/.X11-unix
- ollama-intel-gpu:/root/.ollama
- ./start-ollama.sh:/start-ollama.sh:ro
shm_size: "16G"
environment:
- ONEAPI_DEVICE_SELECTOR=level_zero:0
#- SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
#- SYCL_CACHE_PERSISTENT=1
- IPEX_LLM_NUM_CTX=16384
- LD_LIBRARY_PATH:/opt/intel/oneapi/compiler/2024.2/lib
- LD_LIBRARY_PATH=/opt/intel/oneapi/compiler/2024.2/lib
- DISPLAY=${DISPLAY}
- OLLAMA_DEFAULT_KEEPALIVE="6h"
- OLLAMA_HOST=0.0.0.0
@@ -38,7 +42,8 @@ services:
ollama-webui:
image: ghcr.io/open-webui/open-webui:latest
container_name: ollama-webui
#volumes:
volumes:
- ./webui/data:/app/backend/data
# - ollama-webui:/app/backend/data
depends_on:
- ollama-intel-gpu
+18
@@ -0,0 +1,18 @@
#!/bin/bash
export OLLAMA_NUM_GPU=999
export no_proxy=localhost,127.0.0.1
export ZES_ENABLE_SYSMAN=1
# [optional] In most cases the following environment variable improves performance, but it can occasionally degrade it
export SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
# Use OLLAMA_HOST and OLLAMA_KEEP_ALIVE from environment (set via docker-compose),
# falling back to sensible defaults if not set.
export OLLAMA_HOST="${OLLAMA_HOST:-0.0.0.0:11434}"
export OLLAMA_KEEP_ALIVE="${OLLAMA_KEEP_ALIVE:-24h}"
# [optional] To run on a single GPU, use the command below to select it; this may improve performance
# export ONEAPI_DEVICE_SELECTOR=level_zero:0
# If you have more than one dGPU, you can use a setting like the one below, which selects the first and second cards.
# export ONEAPI_DEVICE_SELECTOR="level_zero:0;level_zero:1"
./ollama serve
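
The `${VAR:-default}` expansions used in `start-ollama.sh` above can be tried in isolation. The following is a minimal sketch, not part of the commit: `resolve_host` is a hypothetical helper introduced only for illustration, and the three cases are assumptions about how the script might be invoked (variable unset, set by Compose, or set to an empty string).

```shell
#!/bin/bash
# Hypothetical helper (not in the commit): prints the value the entrypoint
# would export for OLLAMA_HOST, using the same expansion as start-ollama.sh.
resolve_host() {
    # ":-" falls back to the default when the variable is unset OR empty.
    echo "${OLLAMA_HOST:-0.0.0.0:11434}"
}

# Case 1: variable not set at all, so the default applies.
unset_result="$(unset OLLAMA_HOST; resolve_host)"

# Case 2: variable supplied by the environment (as in docker-compose.yml),
# so the supplied value is kept as-is.
set_result="$(OLLAMA_HOST=0.0.0.0 resolve_host)"

# Case 3: variable set but empty, so the default applies again.
empty_result="$(OLLAMA_HOST='' resolve_host)"

echo "unset: $unset_result"   # unset: 0.0.0.0:11434
echo "set:   $set_result"     # set:   0.0.0.0
echo "empty: $empty_result"   # empty: 0.0.0.0:11434
```

By contrast, a plain assignment such as `export OLLAMA_HOST=127.0.0.1` in the entrypoint replaces whatever the container environment supplied, which is the behaviour this commit works around.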