Ollama lets you run powerful AI models locally on your computer, giving you complete control over your AI experience.
Privacy & Control
- Your conversations and data never leave your machine
- No concerns about data being used to train commercial models
- Perfect for sensitive work, personal projects, or confidential information
Cost & Independence
- No subscription fees, API costs, or usage limits
- Use AI as much as you want without worrying about bills
- No dependency on internet connectivity or service availability
Customization & Experimentation
- Try different models to find what works best for your needs
- Experiment with specialized models (coding, creative writing, etc.)
- Fine-tune models for your specific use cases
Performance
- No network latency, so responses can feel faster than cloud services
- No rate limiting or throttling during heavy usage
- Performance improvements depend on your hardware (CPU, GPU, RAM)
# Install Ollama app (includes GUI and CLI)
brew install --cask ollama-app
# Verify installation
ollama --version
Note: Users on other operating systems can find installation instructions at ollama.com
# Download and run a model (this pulls the model if needed)
ollama run llama3.2
# List installed models
ollama list
# Download a model without running it
ollama pull mistral
- gpt-oss (20B): Agentic model with function calling and chain-of-thought reasoning
- qwen3 (7B): Alibaba's multilingual model, excellent for diverse languages
- deepseek-r1 (7B): Advanced reasoning model for complex problem-solving
- gemma3 (27B): Google's multimodal model with vision and reasoning
- llama3.2 (3B): Fast general-purpose model with tool use capabilities
- mistral (7B): Fast and capable, excellent for most tasks
The killer feature: Ollama serves models using the OpenAI API format, which means countless tools work with it out of the box.
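To make that concrete, here is a minimal sketch of an OpenAI-format chat request aimed at Ollama, using only the Python standard library. The helper name is illustrative, and actually sending the request assumes a running Ollama server with the model pulled:

```python
import json
import urllib.request

# Ollama's OpenAI-compatible chat endpoint (default host and port)
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-format chat completion request aimed at Ollama."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Ollama ignores the key, but OpenAI-style clients send one
            "Authorization": "Bearer ollama",
        },
    )

req = build_chat_request("llama3.2", "Say hello in one word.")
# To actually send it (requires `ollama run llama3.2` beforehand):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the request body is exactly what OpenAI's API expects, any tool that lets you change the base URL can talk to Ollama instead.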
Code Editors & IDEs
- Continue (VS Code extension): AI coding assistant
- Cursor: AI-powered code editor
- Codeium: Code completion and chat
Writing & Productivity
- Raycast AI: System-wide AI assistant on Mac
- PopClip: Text manipulation with AI
- Typeface: AI writing assistant
- MacWhisper: AI transcription with local processing
Chat Interfaces
- Msty: Cross-platform desktop app with advanced features like Knowledge Stacks and split chats
- Open WebUI: Beautiful ChatGPT-like interface
- Enchanted: Native Mac app for AI chat
- LM Studio: Model management with chat interface
Development & Automation
- LangChain: Build AI applications
- AutoGen: Multi-agent conversations
- n8n: Workflow automation with AI nodes
Most tools just need you to:
- Set the host to `localhost` and the port to `11434`
- Set any API key (Ollama doesn't require one, but apps expect it)
- Choose your local model name instead of `gpt-4`

Example configuration:
- Host: `localhost`
- Port: `11434`
- API Key: `ollama` (or any text)
- Model: `llama3.2` or `mistral`
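Under the hood, those settings just combine into the base URL that an OpenAI-compatible tool calls. A minimal sketch (the helper name and config dict are illustrative, not from any specific tool):

```python
# Combine the example settings into the base URL that
# OpenAI-compatible tools call. Helper name is illustrative.
def ollama_base_url(host: str = "localhost", port: int = 11434) -> str:
    return f"http://{host}:{port}/v1"

config = {
    "base_url": ollama_base_url(),  # http://localhost:11434/v1
    "api_key": "ollama",            # placeholder; Ollama doesn't check it
    "model": "llama3.2",            # local model name instead of gpt-4
}
```

Pointing a tool at this base URL with any placeholder key is usually all the setup required.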
Instead of being locked into expensive commercial APIs, you get:
- Freedom to experiment with different models and approaches
- Privacy for your thoughts, code, and data
- Reliability that doesn't depend on external services
- Cost control - run AI as much as you want
- Access to the growing ecosystem of open-source AI tools
The OpenAI API compatibility means you can try local AI with tools you already use, then decide whether to stick with local models or use commercial services on a case-by-case basis.