
Production Guide: Deploy OpenObserve with Kubernetes + Helm + cert-manager + ingress-nginx on Ubuntu

A production-oriented, operations-first deployment guide with secure defaults, observability, upgrade strategy, and recovery runbooks.

Introduction: real-world use case

Many teams adopt centralized observability only after a costly outage exposes blind spots in logs, metrics, and traces. In a growing platform, one failed rollout can impact checkout, notifications, and background workers at the same time. You need a single operational view that works for engineers on-call at 2 AM and for platform owners planning capacity two quarters ahead. This guide shows how to deploy OpenObserve in a production-ready Kubernetes stack on Ubuntu, using Helm for repeatability, cert-manager for TLS automation, and ingress-nginx for controlled traffic entry. We focus on practical operations: secure defaults, predictable upgrades, controlled storage growth, and incident-ready verification steps.

The deployment path here is built for teams that already run mixed workloads in Kubernetes and need an observability platform that can be deployed quickly without sacrificing reliability. We will configure namespaces, secrets, persistent storage, ingress policy, and health checks, then prove the setup through synthetic log generation and query validation. You will finish with a defensible baseline you can hand over to SRE, platform, or DevOps teams with minimal ambiguity.

Architecture and flow overview

At a high level, traffic enters through ingress-nginx, TLS certificates are issued and renewed by cert-manager, and OpenObserve runs in a dedicated namespace with persistent storage. Agents and applications send logs over HTTP ingestion endpoints. Query and dashboard access is exposed through a single hostname protected by TLS. Kubernetes handles scheduling and self-healing, while your storage class determines durability and throughput.

  • Control plane: Helm release lifecycle, Kubernetes RBAC, namespace boundaries.
  • Data path: application/agent logs → OpenObserve ingest API → persisted storage.
  • Security path: TLS termination at ingress, secret-based credentials, restricted service exposure.
  • Operations path: probes, rolling updates, backup/export strategy, and runbook-driven troubleshooting.

Prerequisites

Before deployment, ensure your Ubuntu nodes are healthy, synchronized with NTP, and running a supported Kubernetes version. You should have cluster-admin privileges for initial setup. For production, keep kubeconfig access limited to your platform team and avoid shared admin credentials.

  • Ubuntu-based Kubernetes cluster (single-node for pilot or multi-node for production).
  • kubectl and Helm installed on an operator workstation.
  • ingress-nginx running and reachable from your DNS zone.
  • cert-manager installed and issuer strategy selected (Let's Encrypt or private CA).
  • A DNS record for your OpenObserve hostname (for example, observe.example.com).
  • StorageClass suitable for persistent write/read workloads.
kubectl version   # note: the --short flag was removed in recent kubectl releases
helm version
kubectl get nodes -o wide
kubectl get storageclass
kubectl get pods -n ingress-nginx
kubectl get pods -n cert-manager
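Because the prerequisites call for NTP-synchronized nodes, it is worth checking clock sync explicitly before deploying: skewed clocks break TLS validation and distort cross-service log timelines. A minimal check, assuming systemd-based Ubuntu nodes with `timedatectl` available:

```shell
# Run on each Ubuntu node: confirm the clock is NTP-synchronized.
timedatectl show --property=NTP --property=NTPSynchronized
# Expect NTP=yes and NTPSynchronized=yes; if not, enable sync with:
# sudo timedatectl set-ntp true
```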


Step-by-step deployment

1) Create namespace and baseline labels

Isolating observability components in their own namespace simplifies policy, resource controls, and disaster-recovery procedures. Labels help with cost allocation and fleet visibility.

kubectl create namespace openobserve
kubectl label namespace openobserve app.kubernetes.io/part-of=observability
kubectl label namespace openobserve owner=platform-team
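If you want hard limits at the namespace boundary in addition to pod-level requests, a ResourceQuota is one option. The figures below are illustrative placeholders, not sizing advice; derive real values from your own capacity plan.

```shell
# Optional: cap the namespace so ingestion bursts cannot starve other tenants.
# Quota values are placeholders; size them from measured usage.
cat > openobserve-quota.yaml <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: openobserve-quota
  namespace: openobserve
spec:
  hard:
    requests.cpu: "4"
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    persistentvolumeclaims: "5"
EOF
```

Apply it with `kubectl apply -f openobserve-quota.yaml` and keep the file alongside your other version-controlled manifests.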


2) Add and update Helm repositories

Always pin chart versions in production to avoid accidental behavior changes on future installs.

helm repo add openobserve https://openobserve.github.io/helm-charts
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm repo add jetstack https://charts.jetstack.io
helm repo update
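To act on the version-pinning advice above, list the chart versions available in the repository and record the one you deploy. A sketch:

```shell
# List available chart versions, newest first, then pin one explicitly.
helm search repo openobserve/openobserve --versions | head -5
# Record the chosen version in your deploy script or CI, e.g.:
# helm upgrade --install ... --version <pinned-chart-version>
```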


3) Create values file with production defaults

Use declarative values so every environment can be recreated from source control. Avoid ad-hoc "kubectl edit" as your primary configuration mechanism.

cat > openobserve-values.yaml <<'EOF'
replicaCount: 2
resources:
  requests:
    cpu: 500m
    memory: 1Gi
  limits:
    cpu: "2"
    memory: 4Gi
persistence:
  enabled: true
  size: 200Gi
  storageClass: fast-ssd
service:
  type: ClusterIP
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: observe.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: openobserve-tls
      hosts:
        - observe.example.com
extraEnv:
  - name: ZO_ROOT_USER_EMAIL
    valueFrom:
      secretKeyRef:
        name: openobserve-admin
        key: email
  - name: ZO_ROOT_USER_PASSWORD
    valueFrom:
      secretKeyRef:
        name: openobserve-admin
        key: password
EOF
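Before the first install, you can render the chart locally to catch values errors without touching the cluster. A quick pre-flight sketch:

```shell
# Render the chart with your values; any templating error surfaces here.
helm template openobserve openobserve/openobserve \
  --namespace openobserve \
  --values openobserve-values.yaml > /tmp/openobserve-rendered.yaml
# Optionally validate the rendered manifests client-side:
# kubectl apply --dry-run=client -f /tmp/openobserve-rendered.yaml
```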


4) Create secrets securely

Never hardcode admin credentials in Helm values committed to public repositories. Generate strong credentials and rotate on a schedule.

kubectl -n openobserve create secret generic openobserve-admin \
  --from-literal=email="admin@example.com" \
  --from-literal=password="$(openssl rand -base64 32)"
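You can sanity-check the generated credential locally, and later confirm the secret landed, without ever printing the password. A minimal sketch:

```shell
# 32 random bytes encode to exactly 44 base64 characters.
pw="$(openssl rand -base64 32)"
test "${#pw}" -eq 44 && echo "password length OK"
# Verify the secret exists by decoding only the email key:
# kubectl -n openobserve get secret openobserve-admin -o jsonpath='{.data.email}' | base64 -d
```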


5) Install or upgrade OpenObserve

Use helm upgrade --install so the same command supports both first deploy and iterative rollouts.

helm upgrade --install openobserve openobserve/openobserve \
  --namespace openobserve \
  --values openobserve-values.yaml \
  --history-max 10 \
  --wait --timeout 10m
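Because the same command drives every rollout, keep the rollback path equally scripted. If a release misbehaves, inspect history and revert:

```shell
# Show release revisions, then roll back to a known-good one.
helm -n openobserve history openobserve
# helm -n openobserve rollback openobserve <REVISION> --wait
```

The `--history-max 10` flag above bounds how many revisions Helm retains for rollback.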


6) Configure cert-manager issuer and ingress TLS

Certificate automation eliminates operational drift and expired cert incidents. Keep issuer definitions version-controlled.

cat > clusterissuer-letsencrypt.yaml <<'EOF'
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@example.com
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
kubectl apply -f clusterissuer-letsencrypt.yaml
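After applying the issuer, confirm it has registered with the ACME server before relying on it. A quick readiness check:

```shell
# A healthy issuer reports Ready=True once the ACME account is registered.
kubectl get clusterissuer letsencrypt-prod -o wide
kubectl describe clusterissuer letsencrypt-prod
```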


7) Apply ingress annotation policy

Lock down ingress behavior to reduce accidental public exposure and enforce HTTPS.

kubectl -n openobserve annotate ingress openobserve \
  cert-manager.io/cluster-issuer=letsencrypt-prod --overwrite
kubectl -n openobserve annotate ingress openobserve \
  nginx.ingress.kubernetes.io/force-ssl-redirect="true" --overwrite
kubectl -n openobserve annotate ingress openobserve \
  nginx.ingress.kubernetes.io/proxy-body-size="20m" --overwrite
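Once the issuer annotation is in place, you can watch issuance end to end. A healthy flow progresses from CertificateRequest through Order and Challenge to a Ready certificate:

```shell
# Follow the issuance chain for the TLS secret named in the Helm values.
kubectl -n openobserve get certificate,certificaterequest,order,challenge
kubectl -n openobserve wait certificate/openobserve-tls \
  --for=condition=Ready --timeout=5m
```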


Configuration and secrets handling best practices

Store Helm values and Kubernetes manifests in a private Git repository with branch protections. Use secret managers (Vault, External Secrets, or sealed-secrets workflows) in mature environments. Treat admin credentials as short-lived bootstrap credentials and move to SSO or delegated auth once your topology supports it. Define explicit CPU and memory requests to avoid noisy-neighbor impact during ingestion spikes.

For high-ingestion teams, configure retention intentionally instead of relying on defaults. Tie retention windows to compliance requirements and incident forensics needs. If legal retention is longer than hot storage limits, design an export pipeline to object storage and document restore procedures before you need them.
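One way to make retention explicit is through the deployment's environment variables. The sketch below appends to the `extraEnv` list in the values file from step 3; `ZO_COMPACT_DATA_RETENTION_DAYS` is a commonly referenced OpenObserve setting, but verify the exact variable name and semantics against the OpenObserve documentation for your deployed version.

```shell
# Illustrative: pin a global retention window instead of relying on defaults.
# Confirm the variable name against your OpenObserve version's docs.
cat >> openobserve-values.yaml <<'EOF'
  - name: ZO_COMPACT_DATA_RETENTION_DAYS
    value: "30"
EOF
```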

Verification checklist

Verification should prove readiness across control-plane health, data ingestion, query response, TLS validity, and persistence behavior after pod restarts.

kubectl -n openobserve get pods
kubectl -n openobserve get svc,ingress
kubectl -n openobserve describe certificate
curl -I https://observe.example.com
kubectl -n openobserve logs deploy/openobserve --tail=100


# Synthetic ingestion check (example endpoint)
curl -u "admin@example.com:YOUR_PASSWORD" \
  -H "Content-Type: application/json" \
  -d '[{"level":"info","service":"checkout","message":"synthetic event","env":"prod"}]' \
  https://observe.example.com/api/default/default/_json
</curl command>

# Query back to validate index/search path in UI/API

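The query-back step can also be scripted against the search API. The endpoint path and the microsecond timestamp convention below are assumptions based on common OpenObserve API shapes; confirm both against your instance's API documentation before wiring this into automation.

```shell
# Query the synthetic event back over a 10-minute window (times in microseconds).
# Endpoint and body shape may differ across OpenObserve versions.
now_us=$(( $(date +%s) * 1000000 ))
curl -s -u "admin@example.com:YOUR_PASSWORD" \
  -H "Content-Type: application/json" \
  -d "{\"query\":{\"sql\":\"SELECT * FROM \\\"default\\\" WHERE service='checkout'\",\"start_time\":$(( now_us - 600000000 )),\"end_time\":${now_us},\"from\":0,\"size\":10}}" \
  https://observe.example.com/api/default/_search
```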

After ingestion, confirm events are searchable in the expected stream and that dashboards load without latency spikes. Restart one pod intentionally and verify data durability and service continuity.
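The restart drill described above can be made repeatable with a rollout restart rather than deleting pods by hand:

```shell
# Durability drill: restart all pods, wait for readiness, then re-check data.
kubectl -n openobserve rollout restart deploy/openobserve
kubectl -n openobserve rollout status deploy/openobserve --timeout=5m
# Repeat the synthetic ingestion check and confirm earlier events still return.
```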

Common issues and fixes

Ingress returns 404 or default backend

Validate ingress class, host rules, and DNS record alignment. Ensure the ingress resource references the same class used by your controller.

TLS certificate not issuing

Check cert-manager logs, challenge resources, and ACME reachability. Most failures are DNS mismatch or blocked HTTP-01 challenge paths.
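A practical triage order is to trace from the certificate down to the ACME challenge, then check controller logs. A sketch:

```shell
# Walk the issuance chain for clues (Events sections are usually decisive).
kubectl -n openobserve describe certificate openobserve-tls
kubectl -n openobserve describe challenge
kubectl -n cert-manager logs deploy/cert-manager --tail=100
```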

Pods pending due to storage

Review StorageClass defaults and PVC events. In cloud-backed clusters, quota or zone constraints frequently block PVC binding.
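The PVC events usually name the blocking condition directly. Commands to surface them:

```shell
# Pending PVCs: the Events section typically states the quota or zone problem.
kubectl -n openobserve get pvc
kubectl -n openobserve describe pvc
kubectl get storageclass -o wide
```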

High memory usage during ingestion bursts

Increase memory requests/limits, tune batch sizes at the agent side, and scale replicas. Validate node capacity before changing pod limits.

Slow query performance after growth

Review retention and stream partitioning strategy. Archive stale data and keep frequently queried windows in faster tiers.

Credential leakage risk in pipelines

Move credentials out of plaintext CI variables, enforce secret scanning in repositories, and rotate immediately if accidental exposure occurs.

FAQ

Can I start with a single replica for a pilot?

Yes. For pilot environments, one replica is acceptable. For production, run at least two replicas and validate failover behavior before go-live.

Do I need cert-manager if I already use a reverse proxy?

You can terminate TLS upstream, but cert-manager keeps certificate lifecycle inside Kubernetes and reduces manual renewal risk.

How should I size persistent storage initially?

Estimate daily ingestion volume, retention target, and compression behavior. Start with headroom of at least 30–40% and monitor growth weekly.

What is the safest way to rotate admin credentials?

Create a second admin account or controlled maintenance window, update secret values, restart workloads safely, and verify login before revoking old credentials.
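A hypothetical rotation sketch, assuming the secret and deployment names used earlier in this guide; the dry-run-then-apply pattern upserts the secret without deleting it first:

```shell
# Stage a new password, upsert the secret, then restart and verify login.
new_pw="$(openssl rand -base64 32)"
kubectl -n openobserve create secret generic openobserve-admin \
  --from-literal=email="admin@example.com" \
  --from-literal=password="$new_pw" \
  --dry-run=client -o yaml | kubectl apply -f -
kubectl -n openobserve rollout restart deploy/openobserve
# Confirm login with the new credential before discarding the old one.
```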

How do I integrate with existing log shippers?

Use standard ingestion endpoints and test with a staging stream first. Validate field mappings and timestamp formats before switching production traffic.

What backup approach works best for this stack?

Use scheduled PVC snapshots where available plus periodic export of critical streams/configuration metadata. Test restore quarterly, not only backup creation.

Can I run this in an air-gapped environment?

Yes, with mirrored container registries, internal PKI, and offline chart artifacts. Document image update workflow to keep patching practical.
