Adding OpenAI Whisper integration
README.md
@@ -1,10 +1,14 @@
# Run Ollama, Stable Diffusion and Automatic Speech Recognition with your Intel Arc GPU

[[Blog](https://blog.eleiton.dev/posts/llm-and-genai-in-docker/)]

Effortlessly deploy a Docker-based solution that uses [Open WebUI](https://github.com/open-webui/open-webui) as your user-friendly
AI Interface and [Ollama](https://github.com/ollama/ollama) for integrating Large Language Models (LLM).

Additionally, you can run [ComfyUI](https://github.com/comfyanonymous/ComfyUI) or [SD.Next](https://github.com/vladmandic/sdnext) docker containers to
streamline Stable Diffusion capabilities.

You can also run an optional docker container with [OpenAI Whisper](https://github.com/openai/whisper) to perform Automatic Speech Recognition (ASR) tasks.

All these containers have been optimized for Intel Arc Series GPUs on Linux systems by using [Intel® Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch).
@@ -27,13 +31,17 @@ All these containers have been optimized for Intel Arc Series GPUs on Linux syst
|
|||||||
|
|
||||||
3. ComfyUI
|
3. ComfyUI
|
||||||
* The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
|
* The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
|
||||||
* Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu&version=v2.6.10%2Bxpu&os=linux%2Fwsl2&package=docker)
|
* Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu)
|
||||||
|
|
||||||
4. SD.Next
|
4. SD.Next
|
||||||
* All-in-one for AI generative image based on Automatic1111
|
* All-in-one for AI generative image based on Automatic1111
|
||||||
* Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu&version=v2.6.10%2Bxpu&os=linux%2Fwsl2&package=docker)
|
* Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu)
|
||||||
* Uses a customized version of the SD.Next [docker file](https://github.com/vladmandic/sdnext/blob/dev/configs/Dockerfile.ipex), making it compatible with the Intel Extension for Pytorch image.
|
* Uses a customized version of the SD.Next [docker file](https://github.com/vladmandic/sdnext/blob/dev/configs/Dockerfile.ipex), making it compatible with the Intel Extension for Pytorch image.
|
||||||
|
|
||||||
|
5. OpenAI Whisper
|
||||||
|
* Robust Speech Recognition via Large-Scale Weak Supervision
|
||||||
|
* Uses as the base container the official [Intel® Extension for PyTorch](* Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu)
|
||||||
|
|
||||||
## Setup
|
## Setup
|
||||||
Run the following commands to start your Ollama instance with Open WebUI
|
Run the following commands to start your Ollama instance with Open WebUI
|
||||||
```bash
|
```bash
|
||||||
@@ -54,6 +62,11 @@ For SD.Next
$ podman compose -f docker-compose.sdnext.yml up
```

If you want to run Whisper for automatic speech recognition, run this command in a different terminal:

```bash
$ podman compose -f docker-compose.whisper.yml up
```

## Validate

Run the following command to verify your Ollama instance is up and running

```bash
@@ -89,6 +102,45 @@ When using Open WebUI, you should see this partial output in your console, indic
* That's it, go back to Open WebUI main page and start chatting. Make sure to select the `Image` button to indicate you want to generate Images.

![OpenWebUI](docs/open-webui.png)

## Using Automatic Speech Recognition

* This is an example of a command to transcribe audio files:

```bash
podman exec -it whisper-ipex whisper https://www.lightbulblanguages.co.uk/resources/ge-audio/hobbies-ge.mp3 --device xpu --model small --language German --task transcribe
```

* Response:

```bash
[00:00.000 --> 00:08.000] Ich habe viele Hobbys. In meiner Freizeit mache ich sehr gerne Sport, wie zum Beispiel Wasserball oder Radfahren.
[00:08.000 --> 00:13.000] Außerdem lese ich gerne und lerne auch gerne Fremdsprachen.
[00:13.000 --> 00:19.000] Ich gehe gerne ins Kino, höre gerne Musik und treffe mich mit meinen Freunden.
[00:19.000 --> 00:22.000] Früher habe ich auch viel Basketball gespielt.
[00:22.000 --> 00:26.000] Im Frühling und im Sommer werde ich viele Radtouren machen.
[00:26.000 --> 00:29.000] Außerdem werde ich viel schwimmen gehen.
[00:29.000 --> 00:33.000] Am liebsten würde ich das natürlich im Meer machen.
```
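
Each output line pairs a `[start --> end]` window with its text. If you need the segments programmatically, a minimal sketch in plain Python (no Whisper dependency; the `MM:SS.mmm` timestamp layout is assumed from the output above and may differ for audio longer than an hour):

```python
import re

# Matches Whisper CLI output lines such as:
# [00:08.000 --> 00:13.000] Außerdem lese ich gerne ...
LINE_RE = re.compile(r"\[(\d+):(\d+\.\d+) --> (\d+):(\d+\.\d+)\]\s*(.*)")

def parse_segments(output: str):
    """Return (start_seconds, end_seconds, text) tuples from Whisper CLI output."""
    segments = []
    for line in output.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a timestamped line
        sm, ss, em, es, text = m.groups()
        start = int(sm) * 60 + float(ss)
        end = int(em) * 60 + float(es)
        segments.append((start, end, text))
    return segments

example = "[00:19.000 --> 00:22.000] Früher habe ich auch viel Basketball gespielt."
print(parse_segments(example))
# → [(19.0, 22.0, 'Früher habe ich auch viel Basketball gespielt.')]
```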

* This is an example of a command to translate audio files:

```bash
podman exec -it whisper-ipex whisper https://www.lightbulblanguages.co.uk/resources/ge-audio/hobbies-ge.mp3 --device xpu --model small --language German --task translate
```

* Response:

```bash
[00:00.000 --> 00:02.000] I have a lot of hobbies.
[00:02.000 --> 00:05.000] In my free time I like to do sports,
[00:05.000 --> 00:08.000] such as water ball or cycling.
[00:08.000 --> 00:10.000] Besides, I like to read
[00:10.000 --> 00:13.000] and also like to learn foreign languages.
[00:13.000 --> 00:15.000] I like to go to the cinema,
[00:15.000 --> 00:16.000] like to listen to music
[00:16.000 --> 00:19.000] and meet my friends.
[00:19.000 --> 00:22.000] I used to play a lot of basketball.
[00:22.000 --> 00:26.000] In spring and summer I will do a lot of cycling tours.
[00:26.000 --> 00:29.000] Besides, I will go swimming a lot.
[00:29.000 --> 00:33.000] Of course, I would prefer to do this in the sea.
```

* To use your own audio files instead of web files, place them in the `~/whisper-files` folder and access them like this:

```bash
podman exec -it whisper-ipex whisper YOUR_FILE_NAME.mp3 --device xpu --model small --task translate
```
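
The bracketed windows in the transcripts above map directly onto SRT subtitle timestamps. A minimal sketch of the conversion in plain Python (illustrative only; recent `openai-whisper` versions can also write subtitle files themselves via an output-format option):

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments) -> str:
    """Render (start_seconds, end_seconds, text) tuples as an SRT document."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)

print(to_srt([(19.0, 22.0, "Früher habe ich auch viel Basketball gespielt.")]))
```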

## Updating the containers

If there are new updates in the [ipex-llm-inference-cpp-xpu](https://hub.docker.com/r/intelanalytics/ipex-llm-inference-cpp-xpu) docker Image or in the Open WebUI docker Image, you may want to update your containers to stay up to date.
docker-compose.whisper.yml (new file)
@@ -0,0 +1,18 @@
version: '3'

services:
  whisper-ipex:
    build:
      context: whisper
      dockerfile: Dockerfile
    image: whisper-ipex:latest
    container_name: whisper-ipex
    restart: unless-stopped
    devices:
      - /dev/dri:/dev/dri
    volumes:
      - whisper-models-volume:/root/.cache/whisper
      - ~/whisper-files:/app

volumes:
  whisper-models-volume: {}
whisper/Dockerfile (new file)
@@ -0,0 +1,16 @@
FROM intel/intel-extension-for-pytorch:2.7.10-xpu

ENV USE_XETLA=OFF
ENV SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
ENV SYCL_CACHE_PERSISTENT=1

# Install required packages
RUN apt-get update && apt-get install -y ffmpeg

# Install the openai-whisper package
RUN pip install --upgrade pip && pip install -U openai-whisper

# Set the working directory to /app
WORKDIR /app

# Keep the container alive so commands can be run via `podman exec`
CMD ["tail", "-f", "/dev/null"]