Open WebUI Setup Guide: Your Own ChatGPT Alternative Running Locally
You're tired of paying OpenAI subscriptions. You want your data to stay on your servers. And you want a ChatGPT-like interface that actually looks good. Open WebUI gives you exactly that—a polished, feature-rich web interface for running local large language models through Ollama. This guide gets you from docker compose up to a fully working self-hosted AI chat interface in under 15 minutes.
What You'll Build
By the end of this guide, you'll have:
- Open WebUI running in Docker with a persistent database
- Ollama serving local LLMs (Llama, Mistral, Gemma, and more)
- GPU acceleration configured (if you have an NVIDIA or AMD card)
- Multi-user support with authentication enabled
- HTTPS access via reverse proxy for production use
- A working ChatGPT-style interface accessible from your browser
Prerequisites
Before you start, make sure you have:
- Docker 24.0+ and Docker Compose v2 installed
- At least 4GB RAM (8GB+ recommended, 16GB+ for larger models)
- At least 20GB free disk space for models and containers
- A modern CPU with AVX support (most Intel/AMD CPUs from 2011+)
- Optional: NVIDIA GPU with CUDA 12.8+ or AMD GPU with ROCm for acceleration
- Ports 3000 (Open WebUI) and 11434 (Ollama) accessible
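The prerequisites above can be checked with a short preflight script. This is a minimal sketch for a Linux host, assuming GNU `sort -V` and a readable `/proc`; the function names `version_ge` and `preflight` are illustrative and not part of Docker or Open WebUI.

```shell
#!/usr/bin/env bash
# Preflight sketch for the prerequisites above (Linux host assumed).

# version_ge A B: succeeds when version A >= version B (relies on GNU sort -V).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# preflight: prints one line per check; thresholds mirror the list above.
preflight() {
  local docker_v ram_gb disk_gb
  docker_v="$(docker version --format '{{.Server.Version}}' 2>/dev/null)"
  if [ -n "$docker_v" ] && version_ge "$docker_v" "24.0"; then
    echo "ok:   Docker $docker_v"
  else
    echo "warn: Docker 24.0+ not detected"
  fi
  if docker compose version >/dev/null 2>&1; then
    echo "ok:   Docker Compose v2"
  else
    echo "warn: 'docker compose' is not available"
  fi
  # MemTotal is reported in kB; round to the nearest GB.
  ram_gb="$(awk '/^MemTotal:/ {print int($2 / 1024 / 1024 + 0.5)}' /proc/meminfo)"
  [ "$ram_gb" -ge 4 ] && echo "ok:   ${ram_gb} GB RAM" || echo "warn: ${ram_gb} GB RAM (4 GB minimum)"
  # Free space on the root filesystem; adjust the path if Docker stores data elsewhere.
  disk_gb="$(df -Pk / | awk 'NR == 2 {print int($4 / 1024 / 1024)}')"
  [ "$disk_gb" -ge 20 ] && echo "ok:   ${disk_gb} GB free disk" || echo "warn: ${disk_gb} GB free disk (20 GB recommended)"
  grep -qw avx /proc/cpuinfo && echo "ok:   CPU reports AVX" || echo "warn: CPU does not report AVX"
}
```

Save it as, say, `preflight.sh`, then run `. ./preflight.sh && preflight`. The checks only print warnings, so nothing on the host is changed.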
Step 1: Quick Start with Docker Run
The fastest way to test Open WebUI is a single Docker command. This pulls the latest image and starts the container with persistent storage:
docker run -d \
  -p 3000:8080 \
  --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
Visit http://localhost:3000 and you'll see the Open WebUI login screen. The first user to register becomes the admin automatically.
Note: This quick start assumes Ollama is running separately on your host at http://host.docker.internal:11434. If you don't have Ollama installed yet, skip to Step 2 for the full Docker Compose setup.
Step 2: Production Docker Compose with Ollama
For a proper setup, run both Ollama and Open WebUI together via Docker Compose. This ensures they can communicate over an internal Docker network and both have persistent storage.
Directory Structure
open-webui-project/
├── docker-compose.yml
└── .env
The docker-compose.yml File
services:
  ollama:
    image: ollama/ollama:${OLLAMA_DOCKER_TAG:-latest}
    container_name: ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    environment:
      - OLLAMA_ORIGINS=*
    restart: unless-stopped
    networks:
      - ai-net
  open-webui:
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG:-main}
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
      - WEBUI_AUTH=True
      - ENABLE_SIGNUP=True
      - DEFAULT_MODELS=llama3.2
      - DEFAULT_USER_ROLE=user
    depends_on:
      - ollama
    restart: unless-stopped
    networks:
      - ai-net
volumes:
  ollama:
  open-webui:
networks:
  ai-net:
    driver: bridge
The .env File
# Generate a secret key: openssl rand -hex 32
WEBUI_SECRET_KEY=your-64-character-hex-secret-key-here
# Optional: pin specific versions
OLLAMA_DOCKER_TAG=latest
WEBUI_DOCKER_TAG=main
Deploy the Stack
cd open-webui-project
docker compose up -d
Both services will start. On the first run Docker pulls both images, and Open WebUI then connects to Ollama over the internal Docker network.
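Containers can take a little while to become responsive after `docker compose up -d`, so a small polling helper is useful before opening the browser. The sketch below assumes `curl` is installed on the host and that Open WebUI answers on a `/health` path (an assumption worth confirming against your version); `/api/version` is Ollama's version endpoint. The names `wait_for_url` and `check_stack` are illustrative.

```shell
#!/usr/bin/env bash
# wait_for_url URL [ATTEMPTS] [DELAY]: polls URL until curl gets a successful
# response, giving up after ATTEMPTS tries spaced DELAY seconds apart.
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl -fsS -o /dev/null --max-time 5 "$url" 2>/dev/null; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# check_stack: probes both published ports from the Compose file.
check_stack() {
  wait_for_url "http://localhost:11434/api/version" && echo "Ollama is up"
  # /health is assumed to be Open WebUI's health endpoint.
  wait_for_url "http://localhost:3000/health" && echo "Open WebUI is up"
}
```

Calling `check_stack` after deployment prints one line per service that responds; if a line is missing, `docker compose logs` for that service is the next place to look.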
Step 3: Download Your First Model
With Ollama running, pull a model and start chatting. Here are some popular options:
# Llama 3.2 (3B params, fast, good for most tasks)
docker exec ollama ollama pull llama3.2
# Mistral 7B (strong reasoning)
docker exec ollama ollama pull mistral
# Gemma 2 9B (Google's model, great performance)
docker exec ollama ollama pull gemma2:9b
# DeepSeek Coder (for coding tasks)
docker exec ollama ollama pull deepseek-coder:6.7b
Model sizes vary from 2GB to 30GB+. Check your disk space before pulling large models. Once downloaded, they appear automatically in the Open WebUI model selector.
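The disk space check mentioned above can be scripted so that a pull is refused when the disk is too full. This is a sketch only: it assumes Docker's default Linux data root `/var/lib/docker` (override it with the hypothetical `DOCKER_ROOT` variable), and the helper names are made up for illustration.

```shell
#!/usr/bin/env bash
# free_gb PATH: whole gigabytes available on the filesystem holding PATH.
free_gb() {
  df -Pk "$1" | awk 'NR == 2 {print int($4 / 1024 / 1024)}'
}

# has_free_space PATH GB: succeeds when at least GB gigabytes are free.
has_free_space() {
  [ "$(free_gb "$1")" -ge "$2" ]
}

# pull_model MODEL NEEDED_GB: pulls through the ollama container only when
# the disk has room. NEEDED_GB is your own estimate, not a value Ollama reports.
pull_model() {
  local model="$1" needed_gb="$2" root="${DOCKER_ROOT:-/var/lib/docker}"
  if has_free_space "$root" "$needed_gb"; then
    docker exec ollama ollama pull "$model"
  else
    echo "Not enough free space for $model (need ${needed_gb} GB)" >&2
    return 1
  fi
}
```

For example, `pull_model gemma2:9b 10` pulls the model only if at least 10 GB are free; the 10 is an arbitrary headroom figure chosen for illustration.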
Step 4: Configure GPU Acceleration
Running LLMs on CPU works but is slow. If you have a GPU, enable acceleration for dramatically faster responses.
NVIDIA GPU (CUDA)
First, install the NVIDIA Container Toolkit on your host:
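# Ubuntu/Debian (uses a keyring file, since apt-key is deprecated)
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update && sudo apt install -y nvidia-container-toolkit
# Register the NVIDIA runtime with Docker, then restart the daemon
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker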
Then update the ollama service in your Compose file to pass the GPU through. The standard ollama/ollama image already includes CUDA support, so the image itself does not change:
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    environment:
      - OLLAMA_ORIGINS=*
    restart: unless-stopped
    networks:
      - ai-net
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
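After recreating the stack with the GPU settings above, it can help to confirm that the container sees the card and that a loaded model is actually using it. The sketch below assumes `nvidia-smi` is available inside the container (the NVIDIA runtime normally injects it) and that `ollama ps` prints a PROCESSOR column containing the word GPU for accelerated models; treat that output format as an assumption. `uses_gpu` and `check_gpu` are illustrative names.

```shell
#!/usr/bin/env bash
# uses_gpu: reads `ollama ps` output on stdin and succeeds when any listed
# model (the header line is skipped) mentions GPU.
uses_gpu() {
  awk 'NR > 1 && /GPU/ {found = 1} END {exit !found}'
}

# check_gpu: run this after sending one chat message, because `ollama ps`
# only lists models that are currently loaded.
check_gpu() {
  docker exec ollama nvidia-smi || return 1
  if docker exec ollama ollama ps | uses_gpu; then
    echo "A loaded model is running on the GPU"
  else
    echo "No loaded model reports GPU use"
  fi
}
```

If `nvidia-smi` fails inside the container, the problem is usually on the host side (driver or container toolkit) rather than in the Compose file.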
AMD GPU (ROCm)
Use the ROCm-enabled Ollama image:
services:
  ollama:
    image: ollama/ollama:rocm
    container_name: ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    restart: unless-stopped
    networks:
      - ai-net
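The two device paths in the Compose file above must exist on the host before the container can use them. A quick check, sketched here with an illustrative function name, is to test for both nodes; the optional argument exists only so the function can be tried against a scratch directory.

```shell
#!/usr/bin/env bash
# rocm_devices_present [ROOT]: succeeds when the device nodes passed through
# in the Compose file exist. ROOT defaults to "" (the real filesystem).
rocm_devices_present() {
  local root="${1:-}"
  [ -e "$root/dev/kfd" ] && [ -d "$root/dev/dri" ]
}

if rocm_devices_present; then
  echo "ROCm device nodes found"
else
  echo "Missing /dev/kfd or /dev/dri: check the host's AMD GPU driver" >&2
fi
```

If the nodes are missing, the container will fail to start with a device error, so it is worth fixing the host driver installation first.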
Step 5: Secure Your Instance
Before exposing Open WebUI to the internet, lock it down properly.
Enable Authentication
Authentication is enabled by default (WEBUI_AUTH=True). The first user to register becomes the admin. To disable signups after creating your admin account:
environment:
  - ENABLE_SIGNUP=False
  - WEBUI_AUTH=True
Set a Persistent Secret Key
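Without WEBUI_SECRET_KEY, Open WebUI generates a new key whenever the container is recreated (for example, after an update), which invalidates existing sessions and logs everyone out. Generate a persistent one: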
openssl rand -hex 32
Copy the output into your .env file as WEBUI_SECRET_KEY.
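The copy step can be automated. The following sketch writes a fresh key into a given env file, replacing any existing WEBUI_SECRET_KEY line and leaving other variables alone; `set_secret_key` is a hypothetical helper name, and it assumes `openssl` and `mktemp` are installed.

```shell
#!/usr/bin/env bash
# set_secret_key ENV_FILE: writes a new WEBUI_SECRET_KEY into ENV_FILE,
# replacing an existing entry and keeping every other line.
set_secret_key() {
  local env_file="$1" key tmp
  key="$(openssl rand -hex 32)" || return 1
  tmp="$(mktemp)" || return 1
  if [ -f "$env_file" ]; then
    # Carry over everything except the old key line.
    grep -v '^WEBUI_SECRET_KEY=' "$env_file" > "$tmp"
  fi
  printf 'WEBUI_SECRET_KEY=%s\n' "$key" >> "$tmp"
  mv "$tmp" "$env_file"
  # The file now holds a secret, so restrict it to the owner.
  chmod 600 "$env_file"
}
```

Run `set_secret_key .env` from the project directory, then `docker compose up -d` to apply it. Replacing a key that is already in use will most likely sign existing users out, so this is best done once, before anyone registers.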
HTTPS with Reverse Proxy
For production, put Open WebUI behind Nginx, Traefik, or Caddy with valid TLS certificates. Here's a minimal Nginx config:
server {
    listen 443 ssl http2;
    server_name ai.yourdomain.com;
    ssl_certificate /etc/letsencrypt/live/ai.yourdomain.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/ai.yourdomain.com/privkey.pem;
    location / {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
    # WebSocket support required for Open WebUI
    location /ws {
        proxy_pass http://localhost:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
Step 6: Connect External APIs (Optional)
Open WebUI isn't limited to local models. You can also connect OpenAI, Anthropic, or other API providers alongside your local Ollama models.
Add OpenAI API as a Backend
In the Open WebUI admin panel, go to Admin Settings → Connections and add your OpenAI API key. Your users can then choose between local and cloud models in the chat interface.
Or set it via environment variable:
environment:
  - OPENAI_API_KEY=sk-your-openai-key-here
  - OPENAI_API_BASE_URL=https://api.openai.com/v1
Tips and Troubleshooting
WebSocket connection errors
Open WebUI requires WebSocket support. If you see connection errors behind a reverse proxy, ensure your proxy forwards WebSocket upgrade headers:
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
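To see whether the proxy really forwards the upgrade, a WebSocket handshake can be sent by hand with curl and the status line inspected. This is a rough sketch: the helper names are illustrative, and the socket URL used in the usage note (`/ws/socket.io/` with Socket.IO query parameters) is an assumption about Open WebUI's endpoint that may differ between versions.

```shell
#!/usr/bin/env bash
# is_ws_upgrade: reads raw HTTP response headers on stdin and succeeds when
# the status line is "101 Switching Protocols".
is_ws_upgrade() {
  head -n 1 | grep -q '^HTTP/1\.1 101'
}

# ws_probe URL: sends a WebSocket handshake and prints the response headers.
# --http1.1 matters because the Upgrade mechanism does not exist in HTTP/2.
ws_probe() {
  curl -sS -i -N --http1.1 --max-time 5 \
    -H 'Connection: Upgrade' \
    -H 'Upgrade: websocket' \
    -H 'Sec-WebSocket-Version: 13' \
    -H "Sec-WebSocket-Key: $(openssl rand -base64 16)" \
    "$1"
}
```

A possible invocation is `ws_probe 'https://ai.yourdomain.com/ws/socket.io/?EIO=4&transport=websocket' | is_ws_upgrade && echo "upgrade OK"`. Any status other than 101 suggests the proxy is dropping the Upgrade or Connection header.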
Ollama connection refused
If Open WebUI can't reach Ollama, verify the container can communicate:
docker exec open-webui curl http://ollama:11434/api/tags
If this fails, check that both containers are on the same Docker network and Ollama is healthy.
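The "same Docker network" condition can be checked directly by listing the networks each container is attached to and looking for an overlap. The sketch below uses `docker inspect` with a Go template; the function names are illustrative, and it assumes the container names `open-webui` and `ollama` from the Compose file.

```shell
#!/usr/bin/env bash
# container_networks NAME: prints the Docker networks a container is attached to.
container_networks() {
  docker inspect -f '{{range $k, $v := .NetworkSettings.Networks}}{{$k}} {{end}}' "$1"
}

# shared_networks LIST_A LIST_B: prints names present in both
# whitespace-separated lists; fails when there is no overlap.
shared_networks() {
  local a b found=1
  for a in $1; do
    for b in $2; do
      if [ "$a" = "$b" ]; then
        echo "$a"
        found=0
      fi
    done
  done
  return "$found"
}

# diagnose_link: reports whether the two containers share a network.
diagnose_link() {
  local shared
  if shared="$(shared_networks "$(container_networks open-webui)" "$(container_networks ollama)")"; then
    echo "Shared network(s): $shared"
  else
    echo "No shared network: attach both services to the same one"
  fi
}
```

With the Compose file from this guide, `diagnose_link` would be expected to report a network whose name ends in `ai-net`, prefixed by the Compose project name.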
Model downloads are slow
Ollama downloads models from its registry. Speed depends on your internet connection. For air-gapped environments, you can pre-download models and mount them as a volume:
# On a machine with internet access
docker exec ollama ollama pull llama3.2
docker cp ollama:/root/.ollama/models /path/to/backup
# On the air-gapped machine, mount the models in the ollama service of docker-compose.yml
volumes:
  - /path/to/backup:/root/.ollama/models
Out of memory when loading models
LLMs are memory-hungry. If Ollama crashes on model load:
- Use a smaller model (try llama3.2:1b instead of llama3.2)
- Enable swap space as a buffer
- Close other applications to free RAM
- Quantized models (Q4, Q5) use less memory at the cost of some accuracy
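To get a feel for why quantization helps, a rough estimate of weight memory is parameters times bits per weight divided by 8. The sketch below applies that arithmetic; the extra 20% allowance for context cache and runtime overhead is an assumed rule of thumb, not a measured figure, and `estimate_gb` is an illustrative name.

```shell
#!/usr/bin/env bash
# estimate_gb PARAMS_BILLIONS BITS_PER_WEIGHT: rough memory estimate in GB.
# weights = params * bits / 8; the 1.2 factor is an assumed ~20% allowance
# for context cache and runtime overhead.
estimate_gb() {
  awk -v p="$1" -v b="$2" 'BEGIN {printf "%.1f\n", p * b / 8 * 1.2}'
}

estimate_gb 7 4    # 7B model at 4-bit:  prints 4.2
estimate_gb 7 16   # 7B model at 16-bit: prints 16.8
```

Under these assumptions the same 7B model needs roughly a quarter of the memory at 4-bit compared with 16-bit, which is often the difference between fitting in RAM and crashing on load.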
Updating Open WebUI
To update to the latest version:
docker compose pull
docker compose up -d
Or run Watchtower once to pull the new image and recreate the open-webui container:
docker run --rm \
  --volume /var/run/docker.sock:/var/run/docker.sock \
  nickfedor/watchtower \
  --run-once open-webui
Reset admin password
If you lose admin access, reset the database volume (this deletes all users and chats):
docker compose down
docker volume rm open-webui-project_open-webui
docker compose up -d
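Because removing the volume is irreversible, it may be worth archiving it first. The sketch below copies a named volume into a timestamped tarball using a throwaway container; the `alpine` image and the helper names are illustrative choices, and the volume name assumes the project directory used in this guide.

```shell
#!/usr/bin/env bash
# backup_name VOLUME: archive file name with a UTC timestamp.
backup_name() {
  printf '%s-%s.tar.gz\n' "$1" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# backup_volume VOLUME [DEST_DIR]: archives a named Docker volume into
# DEST_DIR (default: current directory) and prints the archive path.
backup_volume() {
  local volume="$1" dest="${2:-$PWD}" name
  name="$(backup_name "$volume")"
  docker run --rm \
    -v "$volume":/data:ro \
    -v "$dest":/backup \
    alpine tar czf "/backup/$name" -C /data . &&
    echo "$dest/$name"
}
```

Run `docker compose down` first, then `backup_volume open-webui-project_open-webui`, and only then remove the volume. The archive can later be unpacked into a fresh volume the same way in reverse.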
Next Steps
You now have a self-hosted ChatGPT alternative running locally. From here, you can:
- Experiment with different models (code, reasoning, multilingual)
- Set up document RAG by uploading files for context-aware chat
- Configure user roles and permissions for team access
- Integrate external APIs alongside local models
- Enable voice input and image generation pipelines
- Fine-tune models on your own data