Teams often start with cloud BI quickly, then discover they need stricter data residency, clearer access controls, and predictable operating costs. This guide shows how to deploy Metabase with Docker Compose + Caddy on Ubuntu in a way you can run safely in production. The target reader is an operations engineer or platform owner responsible for uptime, security posture, and handover quality. By the end, you will have a hardened deployment pattern, secrets strategy, backup runbook, and practical troubleshooting playbook your team can actually use on-call.
Architecture and flow overview
The stack uses three core components: PostgreSQL for application state, Metabase as the analytics UI/query layer, and Caddy as the reverse proxy at the edge. Keeping these responsibilities separated gives you cleaner failure isolation and easier upgrades. PostgreSQL persistence stays on a named volume, Metabase runs statelessly from a pinned container image, and the reverse proxy owns TLS termination and browser-facing headers.
Data flow is straightforward but important to understand operationally. A browser request lands on Caddy, TLS is terminated, and traffic is proxied internally to Metabase. Metabase reads and writes configuration data to PostgreSQL while querying your upstream business databases through configured connections. This means backup priorities are clear: protect the Metabase app database first, then document connection metadata and environment variables as part of recovery readiness.
From a reliability standpoint, Docker Compose is enough for many SMB and mid-market analytics deployments when paired with disciplined runbooks. You can enforce controlled upgrades, monitor logs, and maintain fast rollback paths without introducing orchestration complexity too early. If your team later needs multi-host scheduling or autoscaling, this baseline still transitions cleanly to Kubernetes with the same service boundaries.
Prerequisites
Before deployment, validate ownership and access assumptions. The majority of production incidents in self-hosted analytics platforms are not image bugs; they are DNS drift, expired certificates, weak secrets, missing backups, or undocumented network rules. Treat this section as a gate checklist rather than a suggestion list.
- Ubuntu 22.04/24.04 server with at least 2 vCPU, 4 to 8 GB RAM, and stable disk IOPS.
- A DNS record (for example analytics.example.com) pointing to your server.
- Shell access with sudo privileges and a change window for initial rollout.
- A secrets policy: generated high-entropy values, no plaintext secrets in git repos.
- Outbound network access from Metabase to your source databases and SMTP provider.
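Part of this gate checklist can be scripted. The sketch below is one possible preflight, assuming a Linux host with /proc/meminfo, nproc, and getent available. The helper names (mem_total_mib, preflight) and the thresholds are illustrative assumptions, not requirements published by Metabase.

```shell
#!/usr/bin/env bash
# Preflight sketch for the checklist above. Thresholds and helper names are
# illustrative assumptions.

# Read /proc/meminfo-style text on stdin and print total memory in MiB.
mem_total_mib() {
  awk '/^MemTotal:/ { print int($2 / 1024) }'
}

# preflight HOSTNAME MIN_MIB -- warn for each unmet prerequisite and
# return non-zero if any check failed.
preflight() {
  local host="$1" min_mib="$2" failed=0 mem cpus
  mem="$(mem_total_mib < /proc/meminfo)"
  cpus="$(nproc)"
  if [ "$cpus" -lt 2 ]; then
    echo "WARN: only $cpus vCPU available" >&2; failed=1
  fi
  if [ "$mem" -lt "$min_mib" ]; then
    echo "WARN: only $mem MiB RAM available" >&2; failed=1
  fi
  if ! getent hosts "$host" > /dev/null; then
    echo "WARN: $host does not resolve" >&2; failed=1
  fi
  return "$failed"
}

# Usage: a nominal 4 GB VM reports somewhat less than 4096 MiB, so leave headroom.
#   preflight analytics.example.com 3500
```

This only covers the checks a script can make from the server itself. Firewall rules, the secrets policy, and outbound access to source databases still need a manual review.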
Step-by-step deployment
1) Prepare host packages and Docker engine
Install Docker Engine and Compose plugin from Docker's official repository so you can pin and upgrade predictably. Avoid mixing distro-provided Docker packages with upstream packages unless your internal baseline explicitly requires that. After installation, add your operator user to the docker group and re-login to avoid repeated sudo during routine operations.
sudo apt-get update && sudo apt-get install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update && sudo apt-get install -y docker-ce docker-ce-cli containerd.io docker-compose-plugin
sudo usermod -aG docker $USER
2) Create project directory and secrets file
Use a single project directory so ownership, backups, and permission reviews stay simple. Store runtime secrets in a local .env file with restricted permissions. Never hardcode these values in docker-compose.yml, and never commit real values to source control.
# /opt/metabase/.env
POSTGRES_PASSWORD=replace-with-64-char-random-string
MB_ENCRYPTION_SECRET_KEY=replace-with-another-64-char-random-string
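One way to produce those values is sketched below. It assumes only standard tools (head, od, tr) and reads from /dev/urandom. The helper names gen_secret and write_env_file are hypothetical, and the function deliberately refuses to overwrite an existing file, because replacing MB_ENCRYPTION_SECRET_KEY on a running installation would leave previously encrypted values unreadable.

```shell
#!/usr/bin/env bash
# Sketch: generate high-entropy values and write them to a .env file
# readable only by its owner. Helper names are illustrative.

# Print 64 hex characters (32 random bytes) from the kernel CSPRNG.
gen_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# write_env_file DIR -- create DIR/.env with fresh secrets; never overwrite.
write_env_file() {
  local dir="$1" env_file="$1/.env"
  if [ -e "$env_file" ]; then
    echo "refusing to overwrite $env_file" >&2
    return 1
  fi
  (
    umask 077   # the file is created with mode 0600
    {
      printf 'POSTGRES_PASSWORD=%s\n' "$(gen_secret)"
      printf 'MB_ENCRYPTION_SECRET_KEY=%s\n' "$(gen_secret)"
    } > "$env_file"
  )
}

# Usage (assumes the project directory used in this guide):
#   sudo mkdir -p /opt/metabase && sudo chown "$USER" /opt/metabase
#   write_env_file /opt/metabase
```

Hex output avoids characters that need quoting in .env files or connection strings, at the cost of a longer string for the same entropy.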
3) Define Compose services
Compose should encode restart policy, health checks, network segmentation, and explicit image tags. Those decisions prevent silent drift and reduce recovery ambiguity. The example below includes PostgreSQL health checking so Metabase startup waits for database readiness, which avoids noisy boot loops and partial initialization behavior.
version: "3.9"
services:
  db:
    image: postgres:16-alpine
    container_name: metabase-db
    restart: unless-stopped
    environment:
      POSTGRES_DB: metabase
      POSTGRES_USER: metabase
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - db_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U metabase"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks: [backend]
  metabase:
    image: metabase/metabase:v0.49.15
    container_name: metabase
    restart: unless-stopped
    environment:
      MB_DB_TYPE: postgres
      MB_DB_DBNAME: metabase
      MB_DB_PORT: 5432
      MB_DB_USER: metabase
      MB_DB_PASS: ${POSTGRES_PASSWORD}
      MB_DB_HOST: db
      MB_SITE_URL: https://analytics.example.com
      MB_ENCRYPTION_SECRET_KEY: ${MB_ENCRYPTION_SECRET_KEY}
      JAVA_TIMEZONE: America/Chicago
    depends_on:
      db:
        condition: service_healthy
    networks: [backend]
  caddy:
    image: caddy:2.8
    container_name: metabase-caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
    depends_on:
      - metabase
    networks: [backend]
volumes:
  db_data:
  caddy_data:
  caddy_config:
networks:
  backend:
    driver: bridge
4) Configure reverse proxy
The reverse proxy layer is where user-facing reliability and security become visible: TLS behavior, request forwarding, and headers. Keep the config minimal but intentional. Start with a small, auditable configuration and only add complexity after you observe real traffic patterns.
# /opt/metabase/Caddyfile
analytics.example.com {
    encode gzip zstd
    reverse_proxy metabase:3000
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        X-Content-Type-Options "nosniff"
        X-Frame-Options "SAMEORIGIN"
        Referrer-Policy "strict-origin-when-cross-origin"
    }
}
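Before the first start, it can help to let the tools parse both files so syntax mistakes surface early. The sketch below assumes the /opt/metabase layout and the caddy:2.8 image used in this guide. The helper names and the DOCKER override variable are illustrative additions, there so the commands can be dry-run with DOCKER=echo.

```shell
#!/usr/bin/env bash
# Sketch: syntax-check the Caddyfile and the Compose file before starting anything.
# Set DOCKER=echo to print the commands instead of running them.

# validate_caddyfile DIR -- run `caddy validate` against DIR/Caddyfile in a throwaway container.
validate_caddyfile() {
  local dir="$1"
  "${DOCKER:-docker}" run --rm \
    -v "$dir/Caddyfile:/etc/caddy/Caddyfile:ro" \
    caddy:2.8 caddy validate --config /etc/caddy/Caddyfile --adapter caddyfile
}

# validate_compose DIR -- parse the Compose file and resolve variables from DIR/.env.
validate_compose() {
  local dir="$1"
  ( cd "$dir" && "${DOCKER:-docker}" compose --env-file .env config --quiet )
}

# Usage:
#   validate_caddyfile /opt/metabase && validate_compose /opt/metabase
```

Both checks are static. They catch malformed configuration but say nothing about DNS, certificates, or whether Metabase can reach its database.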
5) Start services and validate baseline health
Pull images explicitly before startup so deployment logs are clean and deterministic. Once containers are up, verify service status and check Metabase logs for successful migrations. If migrations fail, stop and fix immediately rather than retrying blindly, because repeated startup attempts can obscure the original error context.
cd /opt/metabase
docker compose --env-file .env pull
docker compose --env-file .env up -d
docker compose ps
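Metabase can take a while to finish migrations on first boot, so a bounded wait is often more useful than a single check. The sketch below polls Metabase's /api/health endpoint through Caddy, assuming DNS and TLS for analytics.example.com already work. The retry and metabase_healthy helpers, the attempt count, and the delay are illustrative choices.

```shell
#!/usr/bin/env bash
# Sketch: wait for Metabase to report healthy, with a bounded number of attempts.

# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Succeeds once the health endpoint answers with status "ok".
metabase_healthy() {
  curl -fsS --max-time 5 "https://analytics.example.com/api/health" \
    | grep -q '"status":"ok"'
}

# Usage: wait up to roughly five minutes, then fall back to the logs.
#   retry 30 10 metabase_healthy || docker compose logs --tail=120 metabase
```

If the wait times out, read the logs before restarting anything, in line with the advice above about preserving the original error context.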
Configuration and secret-handling best practices
For production, move from ad-hoc secrets to a managed process quickly. At minimum, rotate POSTGRES_PASSWORD and MB_ENCRYPTION_SECRET_KEY on a documented schedule and after staffing changes. Long-lived static credentials are one of the most common weak points in self-hosted analytics footprints.
Use a separate read-only database account for each analytics source where possible. This limits blast radius if a BI credential is exposed and simplifies audit posture. Also enforce network-level egress controls so Metabase can only reach approved database hosts and service endpoints.
Enable SMTP and alerting early, not after launch. Password reset workflows and incident notifications are critical operational dependencies. A deployment that works only in daylight hours with one operator online is not production-ready.
Plan upgrade policy before your first upgrade. Pin image versions in Compose, define test/stage/prod promotion steps, and require a rollback checkpoint (database backup + previous image digest). This single discipline prevents most avoidable downtime during maintenance windows.
Verification checklist
Use an explicit verification sequence to avoid false confidence. A green browser load is not enough. Confirm TLS responses, container health, and application logs together so you can distinguish transient startup behavior from stable service health.
# service health
curl -I https://analytics.example.com
# container status
docker compose ps
# check application logs for migration completion
docker compose logs --tail=120 metabase
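The curl check can be extended to confirm that the headers configured in the Caddyfile are actually served. This is a minimal sketch: has_header and check_headers are hypothetical helper names, and the header list simply mirrors the four set in this guide's Caddyfile.

```shell
#!/usr/bin/env bash
# Sketch: confirm the response headers set in the Caddyfile reach the browser.

# has_header NAME -- read raw response headers on stdin; match the name case-insensitively
# (HTTP/2 responses show header names in lowercase).
has_header() {
  grep -qi "^$1:"
}

# check_headers URL -- fetch headers once, then test each expected name.
# Returns 2 if the request itself fails, 1 if any header is missing.
check_headers() {
  local url="$1" headers name missing=0
  headers="$(curl -sI --max-time 10 "$url")" || return 2
  for name in Strict-Transport-Security X-Content-Type-Options X-Frame-Options Referrer-Policy; do
    if ! printf '%s\n' "$headers" | has_header "$name"; then
      echo "missing header: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage:
#   check_headers https://analytics.example.com
```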
After initial login, complete these acceptance checks:
- Create one sample dashboard and verify rendering latency is acceptable.
- Test at least one SQL/native query and one GUI-generated query path.
- Validate role-based access with a non-admin test user.
- Confirm email delivery for invites/password reset.
- Record baseline metrics (CPU, memory, response time) for future comparison.
Common issues and fixes
Metabase loops on startup with DB connection errors
Root cause is usually incorrect env values, DNS resolution in the Docker network, or DB service not healthy. Confirm MB_DB_HOST=db and verify PostgreSQL health check status. If credentials were changed, update both database and Metabase env values together and restart in a controlled sequence.
TLS works intermittently or certificate issuance fails
Check DNS A/AAAA records and firewall ports first. Most cert issues are edge reachability problems, not proxy bugs. Ensure 80/443 are open and not already bound by another process. If using cloud firewalls, validate both inbound policy and provider-level security groups.
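A quick way to see whether something already holds a port is to inspect the host's listening sockets. The sketch below parses the output of ss -ltnH (from iproute2), where the fourth column is the local address and port. The listening_on helper is an illustrative name. Run it before starting the stack, since Caddy itself is expected to hold 80 and 443 afterwards.

```shell
#!/usr/bin/env bash
# Sketch: detect whether any process is already listening on a given TCP port.

# listening_on PORT -- read `ss -ltnH` output on stdin; succeed if the local
# address column ends with ":PORT".
listening_on() {
  awk -v p=":$1" '
    { n = length($4) - length(p)
      if (n >= 0 && substr($4, n + 1) == p) found = 1 }
    END { exit !found }
  '
}

# Usage:
#   for port in 80 443; do
#     if ss -ltnH | listening_on "$port"; then
#       echo "port $port is already bound; run: sudo ss -ltnp to find the owner"
#     fi
#   done
```

A free port on the host does not prove it is reachable from the internet, so cloud firewall and security group rules still need to be checked separately.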
Dashboard performance degrades during business hours
Start with database query plans and indexing, not container limits alone. Metabase surfaces query behavior but cannot compensate for unindexed warehouse tables. Apply sensible caching where applicable and enforce dashboard design standards to avoid expensive unbounded queries.
Backups exist but restores fail under pressure
Backup success without restore testing is a known anti-pattern. Schedule quarterly restore drills in a disposable environment and verify user login, dashboard integrity, and database connectivity after restore. Keep recovery-time targets realistic and documented.
#!/usr/bin/env bash
set -euo pipefail
TS=$(date +%F-%H%M)
mkdir -p /var/backups/metabase
docker exec metabase-db pg_dump -U metabase metabase | gzip > /var/backups/metabase/metabase-${TS}.sql.gz
find /var/backups/metabase -type f -mtime +14 -delete
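A restore drill can start with a cheap integrity check and then load the dump into a scratch database. The sketch below makes two assumptions worth stating: the scratch database name metabase_restore_test is made up for illustration, and restoring inside the existing metabase-db container is a shortcut. A full drill should use a separate disposable host, as described above.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a dump, then restore it into a throwaway database.

# verify_backup FILE -- non-empty, valid gzip stream, and looks like plain pg_dump output.
verify_backup() {
  local file="$1"
  [ -s "$file" ] || { echo "empty or missing: $file" >&2; return 1; }
  gzip -t "$file" 2>/dev/null || { echo "corrupt gzip: $file" >&2; return 1; }
  gzip -dc "$file" | head -n 20 | grep -q 'PostgreSQL database dump' \
    || { echo "not a pg_dump file: $file" >&2; return 1; }
}

# restore_drill FILE -- load the dump into a scratch database and drop it afterwards.
restore_drill() {
  local file="$1" scratch="metabase_restore_test" status=0
  verify_backup "$file" || return 1
  docker exec metabase-db createdb -U metabase "$scratch" || return 1
  gzip -dc "$file" \
    | docker exec -i metabase-db psql -U metabase -d "$scratch" -q -v ON_ERROR_STOP=1 > /dev/null \
    || status=1
  docker exec metabase-db dropdb -U metabase "$scratch"
  return "$status"
}

# Usage (substitute a real dump file name):
#   restore_drill /var/backups/metabase/<dump-file>.sql.gz
```

A clean restore only proves the dump is loadable. Verifying user login and dashboard integrity still requires pointing a test Metabase instance at the restored database.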
FAQ
Can I run Metabase without PostgreSQL in production?
It is technically possible to start with an embedded app database in some contexts, but production should use PostgreSQL for reliability, backup consistency, and safer upgrades.
How often should I upgrade Metabase?
Upgrade minor versions on a regular monthly cadence, and apply releases that address security advisories sooner. Always stage upgrades first and include rollback checkpoints.
What is the minimum backup policy for this stack?
At minimum: daily logical dumps, 14-day retention, encrypted off-host copy, and quarterly restore drills. If analytics is business-critical, increase frequency and shorten recovery objectives.
Should I expose Metabase directly instead of using Caddy?
No. Keeping a reverse proxy in front gives cleaner TLS management, better header controls, and easier future integration with SSO/WAF tooling.
How do I handle secrets rotation with minimal downtime?
Rotate one secret domain at a time (database credentials, then app encryption keys when applicable), document sequence, and perform changes during defined windows with backups ready.
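For the database credential, one detail is easy to miss: the postgres image applies POSTGRES_PASSWORD only when it initializes an empty data directory, so editing .env alone does not change the password of an existing role. The sketch below changes the role password first and then updates .env. The helper names are hypothetical, and the encryption key is deliberately left out because swapping MB_ENCRYPTION_SECRET_KEY in .env alone would leave previously encrypted values unreadable.

```shell
#!/usr/bin/env bash
# Sketch: rotate the Metabase application-database password.

# set_env_value FILE KEY VALUE -- replace KEY's line (or append it) via a temp file
# and rename, so the file is never left half-written. Intended for simple values
# such as hex strings; awk -v would interpret backslashes in VALUE.
set_env_value() {
  local file="$1" key="$2" value="$3" tmp
  tmp="$(mktemp "$file.XXXXXX")" || return 1   # mktemp creates the file with mode 0600
  awk -v k="$key" -v v="$value" -F= '
    $1 == k { print k "=" v; done = 1; next }
    { print }
    END { if (!done) print k "=" v }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# rotate_db_password -- take a backup before running this.
rotate_db_password() {
  local new
  new="$(head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n')"
  docker exec metabase-db psql -U metabase -d metabase \
    -c "ALTER ROLE metabase PASSWORD '$new'" || return 1
  set_env_value /opt/metabase/.env POSTGRES_PASSWORD "$new" || return 1
  # Compose recreates services whose environment changed, so expect a short outage.
  ( cd /opt/metabase && docker compose --env-file .env up -d )
}
```

Expect a brief interruption while the containers are recreated, and keep the previous .env in your secrets store until the new credential is confirmed working.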
When should we move from Compose to Kubernetes?
Move when you outgrow single-host operations: multi-team ownership, strict multi-environment policy controls, or high availability requirements that Compose cannot satisfy cleanly.
Internal links
If you are building a broader self-hosted platform, these related guides are useful next steps:
- Production Guide: Deploy Outline with Docker Compose, Caddy, PostgreSQL, and Redis on Ubuntu
- Production Guide: Deploy Kestra with Docker Compose + Traefik + PostgreSQL on Ubuntu
- Deploy Grafana with Docker Compose and Traefik on Ubuntu: Production-Ready Observability Guide
Talk to us
Need help deploying a production-ready Metabase analytics platform, integrating SSO, or building secure backup and upgrade runbooks for your team? We can help with architecture, hardening, migration, and operational readiness.