If your team is shipping multiple services and you need fast, cost-efficient log search without committing to a heavyweight observability stack, OpenObserve is a practical option. In this guide, we deploy OpenObserve in a production-oriented setup using Docker Compose, Traefik, and ClickHouse on Ubuntu. The goal is not just to get a green container status, but to build an implementation you can operate confidently: predictable networking, TLS termination, secret handling, retention controls, backups, and recovery checks.
We use a realistic flow: Traefik terminates HTTPS and routes traffic to OpenObserve, OpenObserve persists and indexes data in ClickHouse, and you ingest logs from common agents over OTLP/HTTP. Along the way, we include verification commands and failure-mode troubleshooting so you can move from “it starts” to “it survives production incidents.”
Architecture and flow overview
The deployment uses three layers:
- Edge layer (Traefik): Handles HTTPS certificates, HTTP-to-HTTPS redirects, and upstream routing.
- Application layer (OpenObserve): Ingestion endpoint, stream/query engine, dashboarding, and alerting workflows.
- Data layer (ClickHouse): Durable storage and fast analytical queries for high-cardinality log data.
Request path for the UI is:
Client browser → Traefik (443) → OpenObserve UI/API container → ClickHouse backend
Ingestion path for application logs is:
Application/collector → OpenObserve ingest endpoint (OTLP/HTTP) → ClickHouse storage
Why this layout works well in production:
- Traefik gives a clean TLS and routing boundary with simple label-based config.
- ClickHouse keeps query latency stable as data grows, especially for analytical log queries.
- Docker Compose keeps operational overhead low while preserving clean service separation.
# Expected request flow
# Browser -> Traefik :443 -> OpenObserve :5080 -> ClickHouse :8123/:9000
# Collector -> OpenObserve ingest endpoint -> ClickHouse
Prerequisites
- Ubuntu 22.04+ host with 4 vCPU / 8 GB RAM minimum (8 vCPU / 16 GB recommended for sustained ingest).
- Docker Engine and Docker Compose plugin installed.
- DNS A/AAAA record for your OpenObserve hostname (for example, logs.example.com) pointing to the server.
- Ports 80/443 reachable from the internet.
- A strong admin password for OpenObserve and secure credentials for ClickHouse.
Before deployment, create a dedicated service user (optional but recommended), update base packages, and verify clock sync. Time drift causes confusing ingest/query behavior in log systems.
sudo apt update && sudo apt -y upgrade
sudo timedatectl set-ntp true
sudo timedatectl status
# Docker quick check
docker --version
docker compose version
Step-by-step deployment
1) Create project layout and external network
Use a dedicated directory so config, env files, and Compose definitions stay versionable. We also create a shared Docker network used by Traefik and backend services.
sudo mkdir -p /opt/openobserve/{traefik,clickhouse,data,config}
sudo chown -R $USER:$USER /opt/openobserve
cd /opt/openobserve
docker network inspect edge_net >/dev/null 2>&1 || docker network create edge_net
2) Create environment file
Place secrets in .env and keep the file readable only by privileged operators. Avoid committing it to Git. Rotate secrets on handoff or incident response events.
cat > /opt/openobserve/.env <<'EOF'
DOMAIN=logs.example.com
[email protected]
TRAEFIK_DASH_AUTH=admin:$2y$05$replace_with_htpasswd_hash
[email protected]
OO_ADMIN_PASSWORD=replace-with-strong-password
CLICKHOUSE_DB=openobserve
CLICKHOUSE_USER=oo_user
CLICKHOUSE_PASSWORD=replace-with-long-random-secret
TZ=UTC
EOF
chmod 600 /opt/openobserve/.env
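The TRAEFIK_DASH_AUTH value above expects an htpasswd-style bcrypt hash, and the password fields should be long random values rather than hand-typed strings. One way to generate both (assuming apache2-utils is available for htpasswd):

```shell
# Install htpasswd (ships with apache2-utils on Ubuntu)
sudo apt -y install apache2-utils
# Print a bcrypt user:hash pair suitable for TRAEFIK_DASH_AUTH
htpasswd -nbB admin 'choose-a-strong-dashboard-password'
# Print a long random secret for OO_ADMIN_PASSWORD / CLICKHOUSE_PASSWORD
openssl rand -base64 32
```

Paste the generated values into /opt/openobserve/.env; note that any `$` characters in the bcrypt hash must survive unescaped in the env file.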
3) Create Docker Compose stack
The stack includes Traefik for ingress, ClickHouse for storage, and OpenObserve for ingest/query/UI. Traefik labels define routing and TLS. We intentionally keep each service explicit so debugging is straightforward. Save the file as /opt/openobserve/docker-compose.yml so docker compose picks it up from the project directory.
services:
  traefik:
    image: traefik:v3.1
    container_name: traefik
    command:
      - --api.dashboard=true
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.le.acme.email=${ACME_EMAIL}
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.le.acme.httpchallenge=true
      - --certificatesresolvers.le.acme.httpchallenge.entrypoint=web
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./traefik:/letsencrypt
    networks: [edge_net]

  clickhouse:
    image: clickhouse/clickhouse-server:24.8
    container_name: clickhouse
    environment:
      - CLICKHOUSE_DB=${CLICKHOUSE_DB}
      - CLICKHOUSE_USER=${CLICKHOUSE_USER}
      - CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
      - TZ=${TZ}
    volumes:
      - ./clickhouse:/var/lib/clickhouse
    ulimits:
      nofile: 262144
    networks: [edge_net]

  openobserve:
    image: public.ecr.aws/zinclabs/openobserve:latest  # pin a specific tag for production
    container_name: openobserve
    depends_on: [clickhouse]
    environment:
      - ZO_ROOT_USER_EMAIL=${OO_ADMIN_EMAIL}
      - ZO_ROOT_USER_PASSWORD=${OO_ADMIN_PASSWORD}
      - ZO_LOCAL_MODE=true  # single-node mode; cluster mode requires a separate meta store
      - ZO_CLICKHOUSE_HOST=clickhouse
      - ZO_CLICKHOUSE_PORT=8123
      - ZO_CLICKHOUSE_USER=${CLICKHOUSE_USER}
      - ZO_CLICKHOUSE_PASSWORD=${CLICKHOUSE_PASSWORD}
      - ZO_CLICKHOUSE_DATABASE=${CLICKHOUSE_DB}
      - TZ=${TZ}
    labels:
      - traefik.enable=true
      - traefik.http.routers.openobserve.rule=Host(`${DOMAIN}`)
      - traefik.http.routers.openobserve.entrypoints=websecure
      - traefik.http.routers.openobserve.tls=true
      - traefik.http.routers.openobserve.tls.certresolver=le
      - traefik.http.services.openobserve.loadbalancer.server.port=5080
      - traefik.http.routers.openobserve-http.rule=Host(`${DOMAIN}`)
      - traefik.http.routers.openobserve-http.entrypoints=web
      - traefik.http.routers.openobserve-http.middlewares=redirect-to-https
      - traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https
    volumes:
      - ./data:/data
      - ./config:/config
    networks: [edge_net]

networks:
  edge_net:
    external: true
4) Launch and check container health
Bring up the stack and verify that all services are running before testing TLS and login.
cd /opt/openobserve
docker compose up -d
docker compose ps
docker compose logs --tail=80 traefik
docker compose logs --tail=80 openobserve
docker compose logs --tail=80 clickhouse
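Before testing TLS and login, it helps to wait until the application actually answers on its internal port. A hedged sketch that polls from a throwaway curl container on the shared network, so no tools need to exist inside the openobserve image (the /healthz path is an assumption; confirm it against your OpenObserve version):

```shell
# Poll OpenObserve's internal health endpoint for up to ~60 seconds
for i in $(seq 1 12); do
  if docker run --rm --network edge_net curlimages/curl -fsS http://openobserve:5080/healthz >/dev/null; then
    echo "openobserve healthy"
    break
  fi
  echo "waiting... ($i/12)"
  sleep 5
done
```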
5) Create a first stream and test ingest/query
After first login, create an organization/stream in OpenObserve. Send sample JSON events via HTTP, then query them in the UI. Validate timestamps, fields, and parsing behavior before connecting real workloads.
set -a; . /opt/openobserve/.env; set +a
curl -u "${OO_ADMIN_EMAIL}:${OO_ADMIN_PASSWORD}" \
  -H "Content-Type: application/json" \
  -X POST "https://${DOMAIN}/api/default/default/_json" \
  -d '[{"level":"info","service":"payments-api","message":"startup ok","env":"prod"}]'
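To exercise parsing with more than one event, you can batch newline-delimited JSON. A sketch using OpenObserve's _multi NDJSON endpoint; the stream name and field values are illustrative, and the .env file is sourced so the variables resolve in your shell:

```shell
# Send three hypothetical events as NDJSON (one JSON object per line)
set -a; . /opt/openobserve/.env; set +a
for i in 1 2 3; do
  printf '{"level":"info","service":"payments-api","message":"test event %s"}\n' "$i"
done | curl -u "${OO_ADMIN_EMAIL}:${OO_ADMIN_PASSWORD}" \
  -H "Content-Type: application/json" \
  -X POST "https://${DOMAIN}/api/default/default/_multi" --data-binary @-
```

Query the stream afterwards and confirm all three events arrived with sensible timestamps before pointing real workloads at the endpoint.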
Configuration and secrets handling best practices
Production reliability depends heavily on safe defaults and disciplined secret handling. Keep these practices in place:
- Least privilege: Use separate credentials for ClickHouse and avoid sharing root-level credentials across systems.
- Network boundaries: Keep ClickHouse private on internal Docker networking; do not publish database ports unless truly needed.
- Secret storage: Store .env in restricted paths with 600 permissions; rotate secrets at regular intervals.
- Data retention: Define stream-level retention early to prevent runaway storage costs.
- Backup policy: Snapshot ClickHouse and OpenObserve data directories on a schedule; test restore monthly.
- Audit and alerting: Create baseline alerts for ingestion drops, query failures, and disk usage thresholds.
For teams with strict compliance requirements, move secrets from local env files into an external secrets manager and inject them at runtime. Even in Compose deployments, this materially lowers leakage risk during handoffs and incident debugging.
# Example backup (local snapshot pattern)
cd /opt/openobserve
sudo tar -czf /var/backups/openobserve-data-$(date +%F).tgz data clickhouse config
# Example restore validation (dry run listing)
tar -tzf /var/backups/openobserve-data-$(date +%F).tgz | head
Verification checklist
- HTTPS certificate is valid and auto-renew path exists in Traefik logs.
- OpenObserve login succeeds with non-default admin credentials.
- Sample ingestion appears in the expected stream within seconds.
- Query latency remains acceptable under representative log volume.
- ClickHouse disk growth aligns with retention expectations.
- Backup archive can be listed and restore drill is documented.
# Endpoint checks (source /opt/openobserve/.env first so ${DOMAIN} is set)
curl -I "https://${DOMAIN}"
# Quick service-level diagnostics
docker compose ps
docker stats --no-stream
docker compose logs --tail=120 openobserve | grep -Ei "error|warn|panic" || true
docker compose logs --tail=120 clickhouse | grep -Ei "error|exception" || true
Common issues and fixes
TLS certificate not issued
Most failures are DNS mismatch, blocked port 80, or stale challenge data. Confirm A/AAAA records, inbound firewall rules, and Traefik ACME settings. Restart Traefik after fixing DNS and watch logs for fresh challenge attempts.
OpenObserve starts but cannot query data
Usually a backend connectivity or credential mismatch to ClickHouse. Validate ZO_CLICKHOUSE_* values and test from the OpenObserve container to ClickHouse over internal DNS.
Ingestion works intermittently
Check timestamp formatting from clients, stream-level schema drift, and backpressure from undersized disk I/O. Add ingest buffering and increase resource reservations if sustained bursts are expected.
Disk fills too quickly
Set stream retention aggressively at first, then relax as you understand actual query needs. Logging everything forever is expensive and rarely useful; archive long-tail data to object storage instead.
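A small guard script can catch disk pressure before ingestion stalls. A sketch with an assumed 85% threshold on the project volume; wire it into cron or your existing alerting:

```shell
# Warn when the volume holding OpenObserve data crosses a usage threshold
THRESHOLD=85  # percent; tune to your growth rate and alerting latency
usage=$(df --output=pcent /opt/openobserve | tail -1 | tr -dc '0-9')
if [ "$usage" -ge "$THRESHOLD" ]; then
  echo "WARNING: /opt/openobserve volume at ${usage}% - review stream retention"
fi
```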
# Network and DNS checks inside containers
docker exec -it openobserve getent hosts clickhouse
docker exec -it openobserve sh -c 'wget -qO- http://clickhouse:8123/ping'
# Confirm domain resolution from host
getent hosts ${DOMAIN}
FAQ
1) Can I run OpenObserve without ClickHouse?
Yes, but for production-scale workloads, a dedicated analytical backend is strongly preferred for predictable query performance and operational headroom.
2) Is Docker Compose enough for production?
For many small and mid-sized teams, yes—if you include proper backups, monitoring, resource planning, and change control. Compose is often operationally simpler than a premature Kubernetes migration.
3) How should I size this stack initially?
Start with ingest rate, retention target, and query concurrency. For pilot deployments, 8 vCPU and 16 GB RAM is a practical baseline, then scale based on real usage metrics.
4) How do I secure admin access?
Use strong unique credentials, restrict dashboard access with IP allowlists/VPN where possible, and rotate secrets after team or environment changes.
5) What is the safest way to onboard application logs?
Use a collector (for example, OpenTelemetry Collector) with buffering and retry enabled. Avoid direct ad-hoc ingestion from every app instance without backpressure controls.
6) How often should I test restore?
At least monthly. Backup files are only useful if you can restore within your recovery objective in a repeatable runbook.
7) Can I migrate this to Kubernetes later?
Yes. Keep your configuration modular, externalize secrets, and document ingress/storage assumptions so migration is incremental rather than disruptive.
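Expanding on the collector question above (FAQ 5), a minimal OpenTelemetry Collector logs pipeline might look like the following sketch. The endpoint path, credentials placeholder, and ports are assumptions to adapt to your deployment:

```yaml
# Sketch: OTLP/HTTP in, OpenObserve out; batching and retry give basic backpressure handling
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch: {}  # groups events into batches before export
exporters:
  otlphttp:
    endpoint: https://logs.example.com/api/default  # org-scoped base path (assumed)
    headers:
      Authorization: "Basic <base64 of email:password>"  # placeholder credential
    retry_on_failure:
      enabled: true  # retry transient failures instead of dropping logs
service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```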
Related guides
- Deploy Grafana Loki with Docker Compose + Traefik + S3
- Deploy Redash with Docker Compose + Nginx + PostgreSQL
- Deploy NetBird with Kubernetes + Helm + cert-manager
Talk to us
If you want this implemented with hardened defaults, observability, and tested recovery playbooks, our team can help.