Why Self-Host Your AI Chat Interface?
ChatGPT is convenient. It's also a black box that logs your conversations, throttles you on a free tier, and charges you for API access the moment you get serious. If you're building internal tools, handling sensitive data, or just tired of the dependency, there's a cleaner path: run the whole stack yourself.
Open WebUI gives you a polished, ChatGPT-style interface that connects to local models via Ollama — or to any OpenAI-compatible API. You own the data. You control the models. You set the rules.
This Open WebUI setup guide walks you through everything: prerequisites, installation, model management, HTTPS with Nginx, and the troubleshooting fixes you'll actually need.
Prerequisites
Before you start, make sure you have the following in place. Skipping any of these is the most common reason setups fail.
Hardware
- CPU-only: Any modern x86-64 server or VPS with at least 8 GB RAM. Expect slow inference — fine for light personal use.
- GPU (recommended): NVIDIA GPU with 8 GB+ VRAM for comfortable 7B model inference. 16 GB+ VRAM opens up 13B models. CUDA 12.x drivers required.
- Disk: At least 20 GB free. Model weights are large — a 7B Q4 model is ~4 GB; a 13B Q4 is ~8 GB.
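The disk figures above follow from simple arithmetic: parameter count times bits per weight, divided by eight. Here is a rough sketch, assuming about 4.5 bits per weight for a Q4 quantization. That figure is an approximation, real model files also carry some metadata, and estimate_model_gb is a hypothetical helper name.

```shell
# Rough on-disk size of a quantized model, in GB.
# $1 = parameters in billions, $2 = bits per weight (assumed ~4.5 for Q4).
estimate_model_gb() {
  awk -v params="$1" -v bits="$2" 'BEGIN { printf "%.1f\n", params * bits / 8 }'
}

estimate_model_gb 7 4.5    # 7B at Q4
estimate_model_gb 13 4.5   # 13B at Q4
```

The two calls print 3.9 and 7.3, which is in line with the ~4 GB and ~8 GB figures above once file overhead is added.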
Software
- Ubuntu 22.04 / 24.04 (or any Linux distro with systemd). This guide uses Ubuntu.
- Docker Engine 24+ and Docker Compose v2
- Ollama (we'll install it below)
- A domain name pointing to your server if you want HTTPS (optional but strongly recommended for anything beyond localhost)
Verify Docker is Ready
docker --version
docker compose version
Both commands should return version strings. If Docker isn't installed yet, follow the official Docker install guide.
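If you want to check the hardware minimums in one pass as well, a small preflight helper might look like the sketch below. The function names are hypothetical, the thresholds are the ones listed under Hardware, and checking free space on / is an assumption (Docker and Ollama may store data on a different filesystem on your server).

```shell
# Minimum values taken from the prerequisites above.
MIN_RAM_GB=8
MIN_DISK_GB=20

# Succeeds when the first integer is at least the second.
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# Total RAM in GB from /proc/meminfo (Linux only). Rounded to the nearest GB
# because the kernel reports slightly less than the installed amount.
total_ram_gb() {
  awk '/^MemTotal:/ { printf "%.0f\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# Free space in whole GB on the filesystem that holds the given path.
free_disk_gb() {
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

# Print one warning per failed check; print nothing when everything passes.
preflight() {
  meets_minimum "$(total_ram_gb)" "$MIN_RAM_GB" || echo "WARN: less than ${MIN_RAM_GB} GB RAM"
  meets_minimum "$(free_disk_gb /)" "$MIN_DISK_GB" || echo "WARN: less than ${MIN_DISK_GB} GB free on /"
  command -v docker >/dev/null 2>&1 || echo "WARN: docker not found in PATH"
}
```

Paste the functions into a shell session and run preflight. No output means all three checks passed.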
Step 1 — Install Ollama
Ollama is the local model runtime that Open WebUI talks to. It downloads and stores models (usually already quantized), loads them for inference, and serves an HTTP API on port 11434.
Install with the Official Script
curl -fsSL https://ollama.com/install.sh | sh
The script detects your GPU and installs the appropriate CUDA/ROCm runtime automatically. Once it finishes, Ollama runs as a systemd service.
Verify Ollama is Running
systemctl status ollama
curl http://localhost:11434/api/version
You should see {"version":"..."} from the API. If the service isn't running, start it manually:
sudo systemctl enable --now ollama
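Right after an install or restart, the API can take a few seconds to start answering. In a provisioning script it helps to wait for it instead of failing on the first request. A minimal sketch, where wait_for_url is a hypothetical helper and the retry counts are arbitrary defaults:

```shell
# Poll a URL until it answers or the attempts run out.
# $1 = URL, $2 = max attempts (default 10), $3 = seconds between attempts (default 2).
wait_for_url() {
  url=$1
  tries=${2:-10}
  delay=${3:-2}
  i=1
  while [ "$i" -le "$tries" ]; do
    # -f makes curl fail on HTTP errors; --max-time bounds each attempt.
    if curl -fsS --max-time 5 "$url" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example:
#   wait_for_url http://localhost:11434/api/version 15 2 && echo "Ollama is up"
```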
Pull Your First Model
Pull a model before connecting Open WebUI — it makes the first-run experience much smoother. llama3.2 is a good starting point (3B, fast, capable):
# Fast, lightweight — good for testing
ollama pull llama3.2
# More capable — needs ~8 GB VRAM or patient CPU inference
ollama pull llama3.1:8b
# List downloaded models
ollama list
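For scripted setups, it can be handy to pull a model only when it is not already downloaded. The sketch below parses the first column of ollama list. The helper names are hypothetical, and it assumes a name without a tag is stored as NAME:latest.

```shell
# Succeeds when the given model already appears in `ollama list`.
# A name without a tag is compared as NAME:latest.
model_present() {
  case "$1" in
    *:*) want=$1 ;;
    *)   want="$1:latest" ;;
  esac
  # Skip the header row, compare the NAME column, exit 0 only on a match.
  ollama list | awk -v want="$want" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Pull only when the model is missing.
ensure_model() {
  model_present "$1" || ollama pull "$1"
}

# Example:
#   ensure_model llama3.2
```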
Step 2 — Deploy Open WebUI with Docker
Open WebUI ships as a single Docker image. There are two install paths depending on whether Ollama is running on the same host or a separate machine.
Option A: Ollama and Open WebUI on the Same Host (Most Common)
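Use --add-host=host.docker.internal:host-gateway so the container can resolve host.docker.internal to the host's Docker bridge gateway address. A default Linux install of Ollama listens only on 127.0.0.1, so if the container gets a "connection refused" error, set OLLAMA_HOST=0.0.0.0 in Ollama's systemd override (the same change shown under Option B) and keep port 11434 closed to the outside with a firewall: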
docker run -d \
-p 3000:8080 \
--add-host=host.docker.internal:host-gateway \
-e OLLAMA_BASE_URL=http://host.docker.internal:11434 \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
Option B: Remote Ollama Instance
If Ollama runs on a different server (e.g., a GPU box on your LAN), point directly at its IP:
docker run -d \
-p 3000:8080 \
-e OLLAMA_BASE_URL=http://192.168.1.50:11434 \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:main
For this to work, Ollama on the remote host must listen on all interfaces. Edit its systemd override:
# On the Ollama host:
sudo systemctl edit ollama
Add these lines in the editor that opens:
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
Then reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart ollama
Open the Interface
Navigate to http://your-server-ip:3000 in your browser. The first account you create becomes the admin. Create it immediately — anyone who reaches the URL first becomes admin.
Step 3 — Docker Compose Setup (Production-Ready)
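The single docker run command is fine for testing. For anything you'll actually run day-to-day, use Docker Compose so you can manage config in version control. If the container from Step 2 is still running, remove it first with docker rm -f open-webui, because the Compose service reuses the same container name and port. The Compose file below also uses its own named volume, so accounts and chats created in the Step 2 container are not carried over.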
Create a project directory and a docker-compose.yml:
mkdir ~/open-webui && cd ~/open-webui
# docker-compose.yml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: always
    ports:
      - "127.0.0.1:3000:8080" # bind to localhost only; Nginx handles public access
    volumes:
      - open-webui-data:/app/backend/data
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      - OLLAMA_BASE_URL=http://host.docker.internal:11434
      - WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY} # set in .env
      - CORS_ALLOW_ORIGIN=https://ai.yourdomain.com
      # Optional: connect to OpenAI as well
      # - OPENAI_API_KEY=${OPENAI_API_KEY}

volumes:
  open-webui-data:
Create a .env file alongside it:
# .env
WEBUI_SECRET_KEY=change-this-to-a-long-random-string-now
# OPENAI_API_KEY=sk-...
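Rather than typing a secret by hand, you can generate one. This sketch prints 32 random bytes as hex, using openssl when it is available and /dev/urandom otherwise. The name generate_secret is a hypothetical helper, not part of Open WebUI.

```shell
# Print 64 hex characters (32 random bytes).
generate_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    # Fallback when openssl is unavailable.
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
    echo
  fi
}

# Example (overwrites any existing .env in the current directory):
#   printf 'WEBUI_SECRET_KEY=%s\n' "$(generate_secret)" > .env && chmod 600 .env
```

Restricting the file to mode 600 keeps the key readable only by the account that runs Docker Compose.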
Start it:
docker compose up -d
docker compose logs -f open-webui
Step 4 — HTTPS with Nginx Reverse Proxy
Running Open WebUI over plain HTTP is fine on localhost. The moment it's reachable from the internet, you need HTTPS. Here's a minimal Nginx config with Let's Encrypt via Certbot.
Install Nginx and Certbot
sudo apt update
sudo apt install -y nginx certbot python3-certbot-nginx
Create the Nginx Site Config
sudo nano /etc/nginx/sites-available/open-webui
Paste the following — replace ai.yourdomain.com with your actual domain:
server {
    listen 80;
    server_name ai.yourdomain.com;

    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
        client_max_body_size 50M;
    }
}
Enable the site and issue a certificate:
sudo ln -s /etc/nginx/sites-available/open-webui /etc/nginx/sites-enabled/
sudo nginx -t
sudo systemctl reload nginx
# Issue and auto-configure HTTPS certificate
sudo certbot --nginx -d ai.yourdomain.com
Certbot rewrites the Nginx config to handle HTTPS and sets up auto-renewal. Your interface is now available at https://ai.yourdomain.com.
Note: The proxy_read_timeout 300s setting is important. Local model inference can be slow, and without a generous timeout Nginx will cut off long responses mid-stream.
Step 5 — Managing Models and Configuration
With everything running, here's what to do inside Open WebUI to get the most out of it.
Pulling Models from the Admin Panel
Go to Admin Settings → Connections → Ollama → Manage. You can type any model name from the Ollama library and pull it directly from the UI — no terminal needed.
Recommended starting models:
- llama3.2:3b (fast, low VRAM, solid for general Q&A)
- llama3.1:8b (better reasoning, needs ~8 GB VRAM)
- mistral:7b (strong coding and instruction following)
- nomic-embed-text (required for RAG document search; pull this too)
Connecting OpenAI (Optional)
Open WebUI can proxy OpenAI models alongside local ones. Go to Admin Settings → Connections → OpenAI API and enter your API key. Your users get one interface for both local and cloud models.
Enabling RAG (Document Uploads)
RAG lets users upload PDFs, docs, and web pages and query them in chat. To enable it properly, set the embedding engine in Admin Settings → Documents. For fully local operation, set the embedding engine to Ollama and pick nomic-embed-text as the embedding model.
User Management
By default, Open WebUI allows anyone who reaches the URL to sign up. For a private deployment, go to Admin Settings → Users and set Default User Role to Pending — new accounts require admin approval before they can log in.
Step 6 — Tips, Gotchas, and Troubleshooting
This is the section most guides skip. These are the actual problems you'll hit.
Problem: Open WebUI Can't Connect to Ollama
The most common issue. The container can't reach the host's localhost — it has its own network namespace.
Fix: Make sure you used --add-host=host.docker.internal:host-gateway and set OLLAMA_BASE_URL=http://host.docker.internal:11434. Verify Ollama is listening:
# On the host:
curl http://localhost:11434/api/version
# If that works, test from inside the container:
docker exec -it open-webui curl http://host.docker.internal:11434/api/version
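The host-side and container-side checks can be combined into one helper that reports which side is failing. This is a sketch, assuming the container is named open-webui as in the commands above. The name diagnose_ollama is hypothetical.

```shell
# Return 0 when Ollama answers from both the host and the container,
# 1 when the host check fails, 2 when only the container check fails.
diagnose_ollama() {
  if ! curl -fsS --max-time 5 http://localhost:11434/api/version >/dev/null 2>&1; then
    echo "host: Ollama is not answering on localhost:11434 (is the service running?)"
    return 1
  fi
  if ! docker exec open-webui curl -fsS --max-time 5 \
      http://host.docker.internal:11434/api/version >/dev/null 2>&1; then
    echo "container: cannot reach host.docker.internal:11434 (check --add-host and OLLAMA_HOST)"
    return 2
  fi
  echo "ok: Ollama is reachable from the host and from the container"
}

# Example:
#   diagnose_ollama
```

A host failure points at the Ollama service itself. A container-only failure points at Docker networking or the address Ollama is bound to.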
Problem: Responses Time Out or Get Cut Off
Usually a proxy timeout. Increase proxy_read_timeout in Nginx (already covered above). If you're not behind Nginx, confirm the container itself isn't being killed — check logs:
docker logs open-webui --tail 50 -f
Problem: Out of Memory During Inference
You're trying to run a model that's too large for your VRAM or RAM. Options:
- Switch to a smaller quantization: ollama pull llama3.1:8b-instruct-q4_0 instead of the default Q4_K_M
- Reduce context length in Ollama's Modelfile
- Use a smaller model entirely
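Reducing the context length means creating a derived model whose Modelfile sets the num_ctx parameter. A minimal sketch follows. The helper name, the 2048-token value, and the derived model name llama3.1-8b-ctx2k are illustrative assumptions.

```shell
# Write a Modelfile that derives a smaller-context variant of an existing model.
# $1 = base model, $2 = context length in tokens, $3 = output path.
write_small_ctx_modelfile() {
  cat > "$3" <<EOF
FROM $1
PARAMETER num_ctx $2
EOF
}

# Example:
#   write_small_ctx_modelfile llama3.1:8b 2048 Modelfile
#   ollama create llama3.1-8b-ctx2k -f Modelfile
#   ollama run llama3.1-8b-ctx2k
```

A smaller context window shrinks the memory Ollama reserves per request. The trade-off is that the model sees less of a long conversation.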
Problem: WebSocket Errors in the Browser Console
Open WebUI uses WebSockets for streaming. If your Nginx config is missing the Upgrade and Connection headers, streaming breaks. The Nginx config above includes them — double-check they're present.
Problem: CORS Errors After Putting a Proxy in Front
Set CORS_ALLOW_ORIGIN to your public domain in the container's environment variables. In the Docker Compose config above, it's already in the template — just update the domain.
Tip: Keep Open WebUI Updated
The project moves fast. Pull the latest image regularly:
docker compose pull
docker compose up -d
Your data is in a named volume (open-webui-data), so it persists across image upgrades.
Tip: Back Up Your Volume
Everything — conversations, settings, user accounts — lives in the open-webui-data volume. Back it up before major upgrades:
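# Compose prefixes volume names with the project name (the directory name by default).
# Confirm the exact name with: docker volume ls
docker run --rm \
-v open-webui_open-webui-data:/data:ro \
-v "$(pwd)":/backup \
alpine tar czf /backup/open-webui-backup-$(date +%Y%m%d).tar.gz -C /data .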
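A backup is only useful if it can be read back. The sketch below checks that an archive is non-empty and that tar can list it end to end. The name verify_backup is a hypothetical helper, and passing this check does not prove the data inside is complete.

```shell
# Succeeds when the archive is non-empty, lists without errors,
# and contains at least one entry.
verify_backup() {
  [ -s "$1" ] || return 1
  listing=$(tar tzf "$1" 2>/dev/null) || return 1
  [ -n "$listing" ]
}

# Example:
#   verify_backup "open-webui-backup-$(date +%Y%m%d).tar.gz" && echo "backup is readable"
```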
What You've Built
At this point you have:
- A self-hosted ChatGPT-style interface running on your own server
- Local LLM inference via Ollama — no data sent to third parties
- HTTPS access from any device via Nginx
- A production-ready Docker Compose setup you can version-control and redeploy anywhere
- Optional OpenAI API passthrough for when you want cloud models in the same UI
This setup handles personal use easily. For small teams it works fine too, as long as you manage user roles and keep Ollama behind a firewall.
Need Enterprise-Grade Deployment?
Single-server setups have limits. If you're looking at multi-GPU inference clusters, SSO integration, role-based access for larger teams, or private cloud deployments with compliance requirements — that's a different conversation.
The Sysbrix team has deployed and hardened AI infrastructure across production environments. We can help you architect something that scales and stays secure.