
@hguerrero
Created February 14, 2025 19:01
qwen2.5 0.5b-instruct deployment (Ollama on Kubernetes)
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: llm-storage
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 5Gi
---
kind: Deployment
apiVersion: apps/v1
metadata:
  name: llm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-server
  template:
    metadata:
      labels:
        app: llm-server
    spec:
      volumes:
        - name: llm-storage
          persistentVolumeClaim:
            claimName: llm-storage
      containers:
        - name: container
          image: 'quay.io/wcaban/ollama:latest'
          ports:
            - containerPort: 11434
              protocol: TCP
          resources: {}
          volumeMounts:
            - name: llm-storage
              mountPath: /.ollama
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
          imagePullPolicy: Always
          lifecycle:
            postStart:
              exec:
                command: ["/bin/sh", "-c", "ollama pull qwen2.5:0.5b-instruct"]
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  name: llm
spec:
  ports:
    - port: 8000
      protocol: TCP
      targetPort: 11434
  selector:
    app: llm-server
  type: ClusterIP
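Once applied, the Service exposes the Ollama API inside the cluster on port 8000, forwarding to the container's port 11434. A minimal smoke test from a workstation might look like the following sketch, assuming `kubectl` access to the target namespace and that the manifests above are saved as `llm-deployment.yaml` (the filename is an assumption); the request shape follows Ollama's standard `/api/generate` endpoint:

```shell
# Apply the PVC, Deployment, and Service (filename is illustrative)
kubectl apply -f llm-deployment.yaml

# Wait for the pod to become ready (the postStart hook pulls the model first)
kubectl wait --for=condition=available deployment/llm-server --timeout=300s

# Forward the Service's port 8000 to localhost in the background
kubectl port-forward service/llm 8000:8000 &

# Query the model through Ollama's generate endpoint
curl http://localhost:8000/api/generate -d '{
  "model": "qwen2.5:0.5b-instruct",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'
```

Note that because the Service is `ClusterIP`, the API is only reachable from inside the cluster without a port-forward, Ingress, or Route.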