Adding OpenAI Whisper integration
README.md
@@ -1,10 +1,14 @@
# Run Ollama and Stable Diffusion with your Intel Arc GPU
# Run Ollama, Stable Diffusion and Automatic Speech Recognition with your Intel Arc GPU

[[Blog](https://blog.eleiton.dev/posts/llm-and-genai-in-docker/)]

Effortlessly deploy a Docker-based solution that uses [Open WebUI](https://github.com/open-webui/open-webui) as your user-friendly
AI Interface and [Ollama](https://github.com/ollama/ollama) for integrating Large Language Models (LLM).

Additionally, you can run [ComfyUI](https://github.com/comfyanonymous/ComfyUI) or [SD.Next](https://github.com/vladmandic/sdnext) docker containers to
streamline Stable Diffusion capabilities
streamline Stable Diffusion capabilities.

You can also run an optional docker container with [OpenAI Whisper](https://github.com/openai/whisper) to perform Automatic Speech Recognition (ASR) tasks.

All these containers have been optimized for Intel Arc Series GPUs on Linux systems by using [Intel® Extension for PyTorch](https://github.com/intel/intel-extension-for-pytorch).
@@ -27,13 +31,17 @@ All these containers have been optimized for Intel Arc Series GPUs on Linux syst
3. ComfyUI
   * The most powerful and modular diffusion model GUI, api and backend with a graph/nodes interface.
   * Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu&version=v2.6.10%2Bxpu&os=linux%2Fwsl2&package=docker)
   * Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu)

4. SD.Next
   * All-in-one for AI generative image based on Automatic1111
   * Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu&version=v2.6.10%2Bxpu&os=linux%2Fwsl2&package=docker)
   * Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu)
   * Uses a customized version of the SD.Next [docker file](https://github.com/vladmandic/sdnext/blob/dev/configs/Dockerfile.ipex), making it compatible with the Intel Extension for Pytorch image.

5. OpenAI Whisper
   * Robust Speech Recognition via Large-Scale Weak Supervision
   * Uses as the base container the official [Intel® Extension for PyTorch](https://pytorch-extension.intel.com/installation?platform=gpu)

## Setup
Run the following commands to start your Ollama instance with Open WebUI
```bash
@@ -54,6 +62,11 @@ For SD.Next
$ podman compose -f docker-compose.sdnext.yml up
```

If you want to run Whisper for automatic speech recognition, run this command in a different terminal:
```bash
$ podman compose -f docker-compose.whisper.yml up
```
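To confirm the Whisper container came up, a quick status check along these lines should work (a sketch; `whisper-ipex` is the `container_name` set in docker-compose.whisper.yml):
```bash
# List running containers, filtered down to the whisper-ipex entry and its status
$ podman ps --filter name=whisper-ipex --format "{{.Names}}  {{.Status}}"
```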

## Validate
Run the following command to verify your Ollama instance is up and running
```bash
@@ -89,6 +102,45 @@ When using Open WebUI, you should see this partial output in your console, indic
* That's it, go back to Open WebUI main page and start chatting. Make sure to select the `Image` button to indicate you want to generate Images.


## Using Automatic Speech Recognition
* This is an example of a command to transcribe audio files:
```bash
podman exec -it whisper-ipex whisper https://www.lightbulblanguages.co.uk/resources/ge-audio/hobbies-ge.mp3 --device xpu --model small --language German --task transcribe
```
* Response:
```bash
[00:00.000 --> 00:08.000] Ich habe viele Hobbys. In meiner Freizeit mache ich sehr gerne Sport, wie zum Beispiel Wasserball oder Radfahren.
[00:08.000 --> 00:13.000] Außerdem lese ich gerne und lerne auch gerne Fremdsprachen.
[00:13.000 --> 00:19.000] Ich gehe gerne ins Kino, höre gerne Musik und treffe mich mit meinen Freunden.
[00:19.000 --> 00:22.000] Früher habe ich auch viel Basketball gespielt.
[00:22.000 --> 00:26.000] Im Frühling und im Sommer werde ich viele Radtouren machen.
[00:26.000 --> 00:29.000] Außerdem werde ich viel schwimmen gehen.
[00:29.000 --> 00:33.000] Am liebsten würde ich das natürlich im Meer machen.
```
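Model checkpoints are downloaded on first use and cached in the `whisper-models-volume` volume (mounted at `/root/.cache/whisper`), so trying a different size only costs one extra download. A sketch of the same transcription with a larger checkpoint, assuming the standard openai-whisper model names:
```bash
# Same audio, but with the larger "medium" checkpoint for higher accuracy (slower, more VRAM)
podman exec -it whisper-ipex whisper https://www.lightbulblanguages.co.uk/resources/ge-audio/hobbies-ge.mp3 --device xpu --model medium --language German --task transcribe
```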
* This is an example of a command to translate audio files:
```bash
podman exec -it whisper-ipex whisper https://www.lightbulblanguages.co.uk/resources/ge-audio/hobbies-ge.mp3 --device xpu --model small --language German --task translate
```
* Response:
```bash
[00:00.000 --> 00:02.000] I have a lot of hobbies.
[00:02.000 --> 00:05.000] In my free time I like to do sports,
[00:05.000 --> 00:08.000] such as water ball or cycling.
[00:08.000 --> 00:10.000] Besides, I like to read
[00:10.000 --> 00:13.000] and also like to learn foreign languages.
[00:13.000 --> 00:15.000] I like to go to the cinema,
[00:15.000 --> 00:16.000] like to listen to music
[00:16.000 --> 00:19.000] and meet my friends.
[00:19.000 --> 00:22.000] I used to play a lot of basketball.
[00:22.000 --> 00:26.000] In spring and summer I will do a lot of cycling tours.
[00:26.000 --> 00:29.000] Besides, I will go swimming a lot.
[00:29.000 --> 00:33.000] Of course, I would prefer to do this in the sea.
```
* To use your own audio files instead of web files, place them in the `~/whisper-files` folder and access them like this:
```bash
podman exec -it whisper-ipex whisper YOUR_FILE_NAME.mp3 --device xpu --model small --task translate
```
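The CLI can also write its results to files instead of only printing them. A sketch, assuming the standard openai-whisper `--output_format` and `--output_dir` flags; because `/app` is the mounted `~/whisper-files` folder, the generated files appear on the host:
```bash
# Transcribe a local file and write an .srt subtitle file back into the mounted folder
podman exec -it whisper-ipex whisper YOUR_FILE_NAME.mp3 --device xpu --model small --task transcribe --output_format srt --output_dir /app
```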

## Updating the containers
If there are new updates in the [ipex-llm-inference-cpp-xpu](https://hub.docker.com/r/intelanalytics/ipex-llm-inference-cpp-xpu) Docker image or in the Open WebUI Docker image, you may want to update your containers to stay up to date.
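For the Whisper container, a possible update sequence looks like this (a sketch; assumes your compose provider supports the usual `build` and `up` subcommands):
```bash
# Fetch a newer Intel Extension for PyTorch base image (the tag used in whisper/Dockerfile)
$ podman pull intel/intel-extension-for-pytorch:2.7.10-xpu
# Rebuild the local whisper-ipex image on top of the refreshed base
$ podman compose -f docker-compose.whisper.yml build
# Recreate the container with the updated image
$ podman compose -f docker-compose.whisper.yml up -d
```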
docker-compose.whisper.yml
@@ -0,0 +1,18 @@
version: '3'

services:
  whisper-ipex:
    build:
      context: whisper
      dockerfile: Dockerfile
    image: whisper-ipex:latest
    container_name: whisper-ipex
    restart: unless-stopped
    devices:
      - /dev/dri:/dev/dri
    volumes:
      - whisper-models-volume:/root/.cache/whisper
      - ~/whisper-files:/app

volumes:
  whisper-models-volume: {}
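Once the container is running, one way to confirm the Arc GPU is visible to PyTorch inside it is a check along these lines (a sketch; assumes the IPEX base image exposes the XPU backend as `torch.xpu`):
```bash
# Print whether the XPU (Intel GPU) backend is available inside the running container
$ podman exec -it whisper-ipex python -c "import torch, intel_extension_for_pytorch; print('XPU available:', torch.xpu.is_available())"
```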
whisper/Dockerfile
@@ -0,0 +1,16 @@
FROM intel/intel-extension-for-pytorch:2.7.10-xpu

ENV USE_XETLA=OFF
ENV SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS=1
ENV SYCL_CACHE_PERSISTENT=1

# Install required packages
RUN apt-get update && apt-get install -y ffmpeg

# Install the OpenAI Whisper package
RUN pip install --upgrade pip && pip install -U openai-whisper

# Set the working directory to /app
WORKDIR /app

CMD ["tail", "-f", "/dev/null"]