Open WebUI Setup Guide: Run Your Own ChatGPT-Style Interface for Any LLM
ChatGPT is convenient until you think about what you're sending to it. Customer data, internal documents, proprietary code — all of it leaving your network with every message. Open WebUI gives you a ChatGPT-quality interface that runs entirely on your own server, connects to local models via Ollama, supports OpenAI and any compatible API, handles multi-user access with accounts and roles, and never sends a single byte of your conversations to a third party. This Open WebUI setup guide gets you from zero to a fully operational private AI interface for your team.
Prerequisites
- A Linux server or local machine (Ubuntu 20.04+ recommended)
- Docker Engine and Docker Compose v2 installed
- At least 2GB RAM — more if running local models via Ollama on the same machine
- Ollama installed and running (for local models) — or an OpenAI/compatible API key for cloud models
- Port 3000 available for Open WebUI
- A domain name for team access (optional for local use, recommended for production)
Verify your starting point:
docker --version
docker compose version
free -h
# If using Ollama — confirm it's running
curl -s http://localhost:11434/api/tags | jq '[.models[].name]'
# Check port 3000 is free
sudo ss -tlnp | grep ':3000 '
What Is Open WebUI and Why Use It?
Open WebUI (formerly Ollama WebUI) is a feature-rich, self-hosted web interface for LLMs. It started as a frontend for Ollama but has grown into a full-featured AI platform that supports virtually any LLM backend.
What You Get Out of the Box
- ChatGPT-style conversation UI — conversation history, markdown rendering, code highlighting, and copy buttons
- Multi-model support — switch between models mid-conversation; Ollama, OpenAI, and any OpenAI-compatible API (including proxies that front Anthropic, Google, and others) all work simultaneously
- Multi-user accounts — user registration, admin/user roles, and per-user conversation history
- Document RAG — upload files to conversations and query them using built-in retrieval
- Web search integration — connect SearXNG or other search engines for real-time web search grounding
- Image generation — DALL-E and Stable Diffusion integration from the same interface
- Voice input/output — speech-to-text and text-to-speech built in
- Custom system prompts and personas — save and reuse prompts across sessions
- Pipelines — filter and transform messages with custom Python middleware before they reach the LLM
Who It's For
Open WebUI is the right choice when you want a polished, daily-driver AI interface that non-technical users can actually use — without sending their prompts to OpenAI. It's particularly strong for teams that run local models via Ollama and want a shared interface with proper user management.
Installing Open WebUI
Option 1: With Bundled Ollama (Simplest All-in-One)
If you don't have Ollama installed yet and want everything in one container:
# CPU only — Ollama + Open WebUI bundled
docker run -d \
-p 3000:8080 \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:ollama
# With NVIDIA GPU support
docker run -d \
--gpus all \
-p 3000:8080 \
-v ollama:/root/.ollama \
-v open-webui:/app/backend/data \
--name open-webui \
--restart always \
ghcr.io/open-webui/open-webui:cuda
Option 2: Docker Compose with External Ollama (Recommended)
If Ollama is already running on the host, connect Open WebUI to it via Docker Compose. This is the cleaner setup for production — each service owns its own lifecycle:
# docker-compose.yml
services:
open-webui:
image: ghcr.io/open-webui/open-webui:main
container_name: open-webui
restart: unless-stopped
ports:
- "3000:8080"
environment:
# Ollama running on host — use bridge IP on Linux
- OLLAMA_BASE_URL=http://172.17.0.1:11434
# Optional: pre-configure OpenAI-compatible API
- OPENAI_API_BASE_URL=${OPENAI_API_BASE_URL:-}
- OPENAI_API_KEY=${OPENAI_API_KEY:-}
# Auth settings
- WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
- WEBUI_NAME=My Private AI
# Disable signup after first user (set to false to lock down)
- ENABLE_SIGNUP=true
# Default locale
- DEFAULT_LOCALE=en
volumes:
- open_webui_data:/app/backend/data
extra_hosts:
- "host.docker.internal:host-gateway"
volumes:
open_webui_data:
Create your .env file:
# .env
# Generate with: openssl rand -hex 32
WEBUI_SECRET_KEY=your-secret-key-here
# Optional: OpenAI or LiteLLM proxy
OPENAI_API_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY=sk-your-openai-key
# Or point at a LiteLLM proxy for unified model access:
# OPENAI_API_BASE_URL=http://litellm:4000/v1
# OPENAI_API_KEY=sk-litellm-your-key
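If you'd rather not paste the secret by hand, a small script can generate it and write the .env file with owner-only permissions so the key never sits in your shell history. A minimal sketch (variable names match the Compose file above; extend the heredoc with your optional API settings):

```shell
#!/bin/sh
# Generate a strong secret and write .env readable only by the owner.
umask 077
WEBUI_SECRET_KEY=$(openssl rand -hex 32)
cat > .env <<EOF
WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}
EOF
# Sanity checks: the key is 64 hex characters, the file exists and is private
echo "key length: ${#WEBUI_SECRET_KEY}"
ls -l .env
```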
Start the stack:
docker compose up -d
docker compose logs -f open-webui
Wait for Application startup complete in the logs. Open http://localhost:3000 — the first account you create automatically becomes the admin.
Configuring Models and Connections
Connecting to Ollama
Once logged in, go to Admin Panel → Settings → Connections. The Ollama URL should already be populated from the environment variable. Click the refresh icon next to it and every model currently pulled in Ollama should appear in the model list.
If models aren't showing up, verify the Ollama URL is reachable from inside the container:
# Test Ollama reachability from inside the Open WebUI container
docker exec open-webui curl -s http://172.17.0.1:11434/api/tags | python3 -m json.tool
# If that fails, find the correct bridge IP:
ip addr show docker0 | grep 'inet ' | awk '{print $2}' | cut -d/ -f1
# Update OLLAMA_BASE_URL in .env with this IP, then restart:
docker compose up -d --force-recreate open-webui
Pulling Models from the UI
You can pull Ollama models directly from the Open WebUI interface without touching the terminal. Go to Admin Panel → Settings → Models → Pull a model from Ollama.com. Type a model name (e.g., llama3.2, mistral, codellama) and click pull. The download progress appears in real time.
Adding OpenAI or Compatible APIs
In Admin Panel → Settings → Connections → OpenAI API, add your API base URL and key. Open WebUI will list all available models from that API alongside your Ollama models in the model selector. Users can switch between local and cloud models per conversation.
This is where connecting Open WebUI to a LiteLLM proxy shines — one base URL and one key gives users access to GPT-4o, Claude, Gemini, and local models all from the same dropdown.
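If you go the LiteLLM route, the proxy maps many providers onto a single OpenAI-compatible endpoint. A minimal config sketch — the model names and master key below are illustrative, so check the LiteLLM docs for your exact providers:

```yaml
# litellm-config.yaml (illustrative)
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet-20240620
      api_key: os.environ/ANTHROPIC_API_KEY
general_settings:
  master_key: sk-litellm-your-key   # the key Open WebUI presents to the proxy
```

In Open WebUI, set the OpenAI API base URL to the proxy (e.g. http://litellm:4000/v1) and use the master key — both models then show up in the same selector.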
User Management and Access Control
Managing User Accounts
Open WebUI has a proper user management system. As admin, go to Admin Panel → Users to:
- View all registered accounts and their last activity
- Promote users to admin or demote to regular user
- Suspend accounts to revoke access without deleting history
- Delete accounts and all associated conversations
Controlling Registration
By default, anyone who reaches your Open WebUI URL can register. For a private team deployment, lock this down after your team has registered:
# Option 1: Disable signup entirely via environment variable
# Add to docker-compose.yml environment:
- ENABLE_SIGNUP=false
# Option 2: Require admin approval for new signups
# In Admin Panel → Settings → General:
# Set "Default User Role" to "pending"
# New users can register but can't use the app until an admin approves them
# Option 3: Delegate authentication to a reverse proxy (e.g. oauth2-proxy)
- ENABLE_SIGNUP=true
- WEBUI_AUTH_TRUSTED_EMAIL_HEADER=X-Forwarded-Email # trust the proxy's authenticated email
Per-Model Access Control
Admins can restrict which models specific users or groups can access. Go to Admin Panel → Models — you can hide models from regular users, set usage limits, and customize the model name and description shown in the UI. Useful for hiding expensive cloud models from general users while keeping them available for specific accounts.
Serving Open WebUI Over HTTPS
For team access, put Open WebUI behind Nginx with a proper domain and SSL certificate:
server {
listen 80;
server_name ai.yourdomain.com;
return 301 https://$host$request_uri;
}
server {
listen 443 ssl http2;
server_name ai.yourdomain.com;
ssl_certificate /etc/letsencrypt/live/ai.yourdomain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/ai.yourdomain.com/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
# File uploads for RAG
client_max_body_size 100M;
location / {
proxy_pass http://localhost:3000;
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection 'upgrade';
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
proxy_cache_bypass $http_upgrade;
# Required for streaming LLM responses
proxy_buffering off;
proxy_read_timeout 300s;
}
}
sudo nginx -t && sudo systemctl reload nginx
sudo certbot --nginx -d ai.yourdomain.com
# Verify HTTPS is working and streaming responses work
curl -I https://ai.yourdomain.com
# Should return 200 with valid SSL headers
After setting up HTTPS, update Open WebUI's base URL in Admin Panel → Settings → General → WebUI URL to https://ai.yourdomain.com. This ensures links, OAuth callbacks, and the PWA manifest all use the correct URL.
Tips, Gotchas, and Troubleshooting
Streaming Responses Cut Off Mid-Generation
If responses stop generating partway through, the proxy is killing the long-lived connection. The Nginx config above already handles this with proxy_buffering off and proxy_read_timeout 300s. Also check that no upstream firewall or load balancer has aggressive timeout rules. Test streaming directly against the container to isolate the issue:
# Test streaming directly against Open WebUI (bypassing Nginx)
# Generate an API key first under Settings → Account in the UI
curl -N -X POST http://localhost:3000/api/chat/completions \
-H 'Content-Type: application/json' \
-H 'Authorization: Bearer YOUR_API_KEY' \
-d '{
"model": "llama3.2",
"messages": [{"role": "user", "content": "Count to 50 slowly"}],
"stream": true
}'
# If this streams fine but the browser cuts off, the issue is in Nginx
"Failed to Connect to Ollama" After Working Fine
The Docker bridge IP can change if Docker restarts. Pin it by using host.docker.internal with the extra_hosts mapping (already in the Compose file above) instead of a hardcoded IP:
# Update OLLAMA_BASE_URL to use the stable hostname:
OLLAMA_BASE_URL=http://host.docker.internal:11434
# The extra_hosts mapping in Compose resolves this to the host IP:
extra_hosts:
- "host.docker.internal:host-gateway"
# Restart after updating:
docker compose up -d --force-recreate open-webui
Documents Not Being Used in RAG Conversations
Open WebUI's built-in RAG requires an embedding model to be configured. Go to Admin Panel → Settings → Documents and set an embedding model. For local use, Ollama embedding models work — pull nomic-embed-text and select it as the embedding model. Without this configured, uploaded files are stored but not indexed for retrieval.
# Pull a local embedding model for RAG
ollama pull nomic-embed-text
# Verify it's available
curl http://localhost:11434/api/tags | jq '[.models[].name]'
# Then in Open WebUI:
# Admin Panel → Settings → Documents → Embedding Model
# Select: ollama/nomic-embed-text
Updating Open WebUI
docker compose pull open-webui
docker compose up -d open-webui
# Watch startup — database migrations run automatically
docker compose logs -f open-webui
# Verify version after update
docker inspect open-webui | jq '.[0].Config.Image'
All user accounts, conversation history, uploaded documents, and custom settings persist in the open_webui_data volume and survive updates.
Pro Tips
- Use Pipelines for content filtering and logging — Open WebUI Pipelines let you intercept every message with custom Python before it reaches the LLM. Use this for PII detection, content moderation, audit logging, or cost tracking per user.
- Create Model Presets for common use cases — go to Workspace → Models to create named presets that combine a base model with a system prompt, temperature, and other settings. Give your team presets like Code Assistant, Document Summarizer, and Customer Support so they get the right configuration without thinking about it.
- Enable the PWA for a native-app feel — Open WebUI is a Progressive Web App. Tell your team to install it from their browser (Add to Home Screen on mobile, Install App in Chrome desktop) for a full-screen, app-like experience.
- Set conversation tags and archive aggressively — conversation history grows fast in a shared instance. Encourage your team to tag and archive conversations so the sidebar stays useful.
- Back up the data volume regularly — it contains all conversation history, user accounts, uploaded documents, and settings. A daily docker run --rm -v open_webui_data:/data -v /backup:/backup alpine tar czf /backup/open-webui-$(date +%Y-%m-%d).tar.gz /data is enough.
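To automate that backup, a cron entry works well. A sketch assuming a root-writable /backup directory (note that % must be escaped as \% inside crontabs, and /etc/cron.d files require the user field):

```shell
# /etc/cron.d/open-webui-backup (runs daily at 02:30)
30 2 * * * root docker run --rm -v open_webui_data:/data -v /backup:/backup alpine tar czf /backup/open-webui-$(date +\%Y-\%m-\%d).tar.gz /data
```

Pair it with a retention policy (e.g. delete archives older than 30 days) so /backup doesn't grow without bound.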
Wrapping Up
A complete Open WebUI setup gives your team a private, polished AI interface that matches ChatGPT's usability without sending a single message outside your infrastructure. Local models via Ollama mean conversations stay completely on-premise. Cloud model APIs give you access to GPT-4o or Claude when you need them — without giving up control of who uses what and what they're sending.
Deploy it with Docker Compose, configure your Ollama connection and at least one embedding model for RAG, put it behind Nginx with HTTPS, lock down registration after your team has signed up, and you have a daily-driver AI tool that scales from one person to a full company without changing the architecture.
Need a Private AI Platform Built for Your Team?
Deploying Open WebUI for a team of developers is one thing — deploying it for a whole company with SSO, audit trails, departmental model access controls, and on-premise GPU infrastructure is another. The sysbrix team designs and deploys private AI platforms that enterprise teams can actually rely on.
Talk to Us →