Run AI models on Docker Compose in a more secure way (no egress)

Notes

  1. This is jank.
  2. Modify setup_model.sh to change the AI model in use. If you change the model, you may need to delete the "ollama_data" folder.
  3. This setup assumes you're using an Nvidia GPU and have the Nvidia Container Toolkit set up on the host OS.
  4. Steps to run this (condensed into a shell sketch after this list):
    • Download the files from the gist to a folder.
    • Open a terminal in that folder and run docker-compose up
  5. After everything is running, the web UI should be available at http://localhost:3000
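
For reference, the steps above boil down to roughly the following shell session (a sketch; the folder name is a placeholder):

# Put docker-compose.yml and setup_model.sh from the gist into a working folder (placeholder name).
mkdir ollama-no-egress && cd ollama-no-egress

# Start the stack; the "temporary" service pulls the model first, then backend and frontend come up.
docker-compose up

# Once everything is running, browse to http://localhost:3000 for the web UI.

# If you change the model in setup_model.sh later, per note 2 you may need to clear the cached model data:
docker-compose down
rm -rf ./ollama_data
docker-compose up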
docker-compose.yml

services:
  temporary:
    image: ollama/ollama
    volumes:
      - ./ollama_data:/root/.ollama
      - ./setup_model.sh:/setup_model.sh
    entrypoint: ["/bin/bash", "-c"]
    command: ["/setup_model.sh"]

  backend:
    image: ollama/ollama
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    volumes:
      - ./ollama_data:/root/.ollama
    networks:
      - internal
    depends_on:
      temporary:
        condition: service_completed_successfully

  frontend:
    networks:
      - internal
      - default
    image: ghcr.io/open-webui/open-webui:main
    volumes:
      - ./open_webui_data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://backend:11434
    depends_on:
      backend:
        condition: service_started
    ports:
      - "3000:8080"

networks:
  internal:
    internal: true
  default:
    driver: bridge

volumes:
  ollama_data:
    driver: local
  open_webui_data:
    driver: local
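
To verify the "no egress" property, one option (a sketch; it assumes bash and coreutils are available in the ollama image, which seems safe since setup_model.sh runs under /bin/bash) is to attempt an outbound connection from the backend container. Because the backend only sits on the internal network, it should fail:

# Expect "egress blocked": containers on an "internal: true" network have no route to the outside.
docker-compose exec backend bash -c \
  'timeout 5 bash -c "</dev/tcp/example.com/443" \
     && echo "egress possible (unexpected)" \
     || echo "egress blocked (expected)"'

The frontend, by contrast, is attached to both internal and default, so it can reach the backend at http://backend:11434 while still being reachable from the host on port 3000.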
setup_model.sh

#!/bin/bash -x
# Start a temporary ollama server so the model gets pulled into ./ollama_data.
ollama serve &
ollama_pid=$!
sleep 2s
# Change this line to specify a different AI model
ollama pull deepseek-r1:7b
sleep 2s
# Stop the temporary server; the backend service starts its own copy afterwards.
kill -9 $ollama_pid
exit 0
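
The fixed sleep 2s waits are usually enough, but a slightly more defensive variant of the script (a sketch, not part of the gist) polls the server until it responds before pulling, and lets it shut down cleanly:

#!/bin/bash -x
ollama serve &
ollama_pid=$!
# Poll until the server answers instead of sleeping a fixed two seconds.
until ollama list >/dev/null 2>&1; do
  sleep 1
done
# Change this line to specify a different AI model
ollama pull deepseek-r1:7b
# A plain TERM signal lets the server exit cleanly.
kill $ollama_pid
wait $ollama_pid 2>/dev/null
# Always report success so the backend's service_completed_successfully condition is met.
exit 0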
jpsutton commented Jan 28, 2025

Changes needed for AMD GPU use (tested on my system with a Radeon RX 6600 XT):

services:
  temporary:
    image: ollama/ollama:rocm
    volumes:
      - ./ollama_data:/root/.ollama
      - ./setup_model.sh:/setup_model.sh
    entrypoint: ["/bin/bash", "-c"]
    command: ["/setup_model.sh"]

  backend:
    image: ollama/ollama:rocm
    volumes:
      - ./ollama_data:/root/.ollama
    networks:
      - internal
    depends_on:
      temporary:
        condition: service_completed_successfully
    devices:
      - /dev/kfd
      - /dev/dri
    environment:
      - HSA_OVERRIDE_GFX_VERSION=10.3.0

The HSA_OVERRIDE_GFX_VERSION env var is probably not needed on a card which is officially supported by ROCm.
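
To confirm the GPU is actually being picked up, one quick check (a sketch; the exact log wording varies between ollama releases) is to grep the backend's startup logs for ROCm/AMD GPU discovery messages:

# Look for ROCm / amdgpu discovery lines in ollama's startup output.
docker-compose logs backend | grep -iE 'rocm|amdgpu|gpu'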
