Production Guide: Deploy Kestra with Docker Compose + Nginx + PostgreSQL on Ubuntu

A production-oriented deployment with reverse proxy, TLS, backups, hardening, and operational runbooks.

Teams often adopt workflow orchestration only after ad-hoc cron jobs, shell scripts, and one-off automations start to fail under real operational pressure. Kestra gives engineering and operations teams a central control plane for scheduled workflows, event-driven pipelines, and repeatable automation. In a production environment, however, the orchestration service itself becomes critical infrastructure: if it is unstable, the business processes connected to it become unstable too.

This guide walks through a production-first deployment pattern on Ubuntu using Docker Compose for service lifecycle management, PostgreSQL for durable state, and Nginx as the public edge. Instead of focusing only on “it runs on my machine,” we focus on what matters in day-two operations: secure secret handling, controlled upgrades, predictable backups, health verification, failure diagnosis, and recovery steps your team can run during incidents.

By the end, you will have a hardened baseline that supports staging-to-production promotion, minimizes downtime during upgrades, and gives operators a clear checklist for verification after every change window.

Architecture and flow overview

The architecture follows a simple but robust split: Nginx handles internet exposure and TLS, while the orchestration application and database stay behind localhost/container networking. This isolates internal services from direct internet access and gives you one clean control point for rate limits, request logging, and future web application firewall integration.

PostgreSQL runs in a dedicated container with persistent storage mounted from the host. Application state, execution metadata, and workflow history all live there, which makes backup and restore predictable. A separate application storage directory is mounted for logs and task artifacts. This separation is important because it lets you tune retention policies per data type and avoid accidental deletion during routine cleanup.

Operationally, this model supports predictable maintenance windows: you can back up the database, pull new images, perform health checks, and roll back by version if needed. Teams that standardize these controls early generally avoid the most common self-hosting failures: undocumented upgrades, hidden secret sprawl, and backups that are “configured” but never tested.

Internet
   |
   v
Nginx (TLS termination, rate limiting, request logs)
   |
   v
Kestra standalone container (web UI, API, executor, worker)
   |
   +--> PostgreSQL (persistent volume, daily backups)

Prerequisites

Before deploying, confirm that you control the domain's DNS, have shell access with sudo privileges, and can reach the server publicly under that domain so that Certbot can validate it and issue a TLS certificate. If your organization requires outbound proxies, private package mirrors, or centralized certificate management, align those controls first so your deployment does not diverge from policy after go-live.

  • Ubuntu 22.04/24.04 VM (4 vCPU, 8 GB RAM minimum for small teams)
  • DNS record pointing your domain (e.g., automations.example.com) to the server
  • Docker Engine + Docker Compose plugin installed
  • Ports 80/443 open on firewall/security group
  • SMTP credentials for notification emails (optional but recommended)
  • A secure location for environment secrets (vault or restricted file permissions)
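The prerequisites above can be checked from the shell before you start. This is a minimal sketch, not an official tool: the helper names `have`, `mem_total_gb`, and `preflight` are made up for illustration, and the 8 GB threshold and the domain argument simply mirror the list.

```shell
#!/bin/sh
# Preflight sketch: helper names are illustrative; thresholds mirror the list above.

# have <command>: succeeds if the command is on PATH
have() { command -v "$1" >/dev/null 2>&1; }

# mem_total_gb [meminfo-file]: total RAM in GiB, rounded to the nearest integer
mem_total_gb() {
  awk '/^MemTotal:/ { printf "%d\n", ($2 / 1048576) + 0.5 }' "${1:-/proc/meminfo}"
}

# preflight <domain>: prints one line per check; exits non-zero if any check fails
preflight() (
  domain=$1
  rc=0
  for cmd in docker curl; do
    if have "$cmd"; then echo "ok    $cmd found"; else echo "FAIL  $cmd missing"; rc=1; fi
  done
  if docker compose version >/dev/null 2>&1; then
    echo "ok    docker compose plugin"
  else
    echo "FAIL  docker compose plugin"; rc=1
  fi
  if [ "$(mem_total_gb)" -ge 8 ]; then echo "ok    memory"; else echo "FAIL  less than 8 GB RAM"; rc=1; fi
  if getent hosts "$domain" >/dev/null; then
    echo "ok    $domain resolves"
  else
    echo "FAIL  $domain does not resolve"; rc=1
  fi
  exit "$rc"
)

# Example:
#   preflight automations.example.com
```

The check only confirms that the name resolves from the server itself; whether ports 80/443 are reachable from the internet still has to be verified from outside the firewall.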

Step-by-step deployment

1) Prepare host packages and runtime

Install Docker, Compose, Nginx, and Certbot in one change window. Keeping edge and runtime dependencies explicit prevents drift when handoffs occur between operators.

sudo apt update && sudo apt upgrade -y
sudo apt install -y ca-certificates curl gnupg lsb-release
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install -y docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin nginx certbot python3-certbot-nginx
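sudo systemctl enable --now docker nginx
# Let the current user run docker without sudo; log out and back in for the group change to apply
sudo usermod -aG docker "$USER"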

2) Create filesystem layout and seed secrets

Use dedicated directories under /opt/kestra so backup jobs, retention rules, and permissions are easy to reason about. Generate secrets now; do not postpone this step.

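sudo mkdir -p /opt/kestra/{app,postgres,backups,logs}
sudo chown -R "$USER":"$USER" /opt/kestra
cd /opt/kestra
# Each command prints one random value; use them in place of the CHANGE_ME placeholders in .env
openssl rand -base64 36
openssl rand -hex 32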

cat > /opt/kestra/.env <<'EOF'
DOMAIN=automations.example.com
POSTGRES_DB=kestra
POSTGRES_USER=kestra
POSTGRES_PASSWORD=CHANGE_ME_STRONG_PASSWORD
KESTRA_SECRET_KEY=CHANGE_ME_LONG_RANDOM_SECRET
TZ=UTC
EOF
chmod 600 /opt/kestra/.env
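To avoid pasting generated values by hand (and leaving them in shell history), the placeholders can be filled in by a small helper. This is a sketch under stated assumptions: `set_env_var` is a hypothetical function written for this guide, not part of Docker or Kestra, and it rewrites the file rather than editing it in place, so key order may change.

```shell
#!/bin/sh
# set_env_var <file> <KEY> <value>: replace or append KEY=value.
# Illustrative helper; the result is mode 600 because mktemp creates it that way.
set_env_var() (
  file=$1 key=$2 value=$3
  umask 077                                     # anything created here is owner-only
  [ -f "$file" ] || : > "$file"
  tmp=$(mktemp "${file}.XXXXXX") || exit 1
  grep -v "^${key}=" "$file" > "$tmp" || true   # grep exits 1 when no lines remain; that is fine
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
)

# Example (run once, before the first `docker compose up`):
#   set_env_var /opt/kestra/.env POSTGRES_PASSWORD "$(openssl rand -hex 32)"
#   set_env_var /opt/kestra/.env KESTRA_SECRET_KEY "$(openssl rand -base64 36)"
```

Set the database password before PostgreSQL initializes its data directory; the official postgres image only applies POSTGRES_PASSWORD on first start, so changing it later in .env does not change the role's password.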

3) Define services with Docker Compose

The compose manifest encodes dependency order, restart policies, and data mounts. Pin image tags in regulated environments; use latest only if you have strict canary and rollback discipline.

cat > /opt/kestra/docker-compose.yml <<'EOF'
services:
  postgres:
    image: postgres:16
    restart: unless-stopped
    env_file: .env
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - ./postgres:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 10

  kestra:
    image: kestra/kestra:latest  # replace latest with a specific release tag for production
    restart: unless-stopped
    env_file: .env
    depends_on:
      postgres:
        condition: service_healthy
    command: server standalone
    environment:
      KESTRA_CONFIGURATION: |
        datasources:
          postgres:
            url: jdbc:postgresql://postgres:5432/${POSTGRES_DB}
            driverClassName: org.postgresql.Driver
            username: ${POSTGRES_USER}
            password: ${POSTGRES_PASSWORD}
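        kestra:
          # Public URL used in links that Kestra generates; DOMAIN comes from .env
          url: https://${DOMAIN}/
          secret: ${KESTRA_SECRET_KEY}
          # Store flows, executions, and queues in PostgreSQL
          repository:
            type: postgres
          queue:
            type: postgres
          # Keep task artifacts on the mounted ./app directory
          storage:
            type: local
            local:
              basePath: /app/storage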
    ports:
      - "127.0.0.1:8080:8080"
    volumes:
      - ./app:/app/storage
      - ./logs:/app/logs
EOF
cd /opt/kestra && docker compose pull && docker compose up -d
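Because `docker compose up -d` returns before the application has finished starting, it helps to wait for the UI explicitly before moving on. The following is a minimal sketch: `retry` is a hypothetical helper, and the example assumes the UI answers unauthenticated requests on `/` with a non-error status (adjust the URL if your setup requires a login first).

```shell
#!/bin/sh
# retry <attempts> <delay-seconds> <command...>
# Runs the command until it succeeds or the attempts are used up.
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example: wait up to about 5 minutes for the UI on the loopback port published above.
#   retry 60 5 curl -fsS -o /dev/null http://127.0.0.1:8080/
```

If the wait times out, `docker compose logs --tail=100 kestra` usually shows whether the failure is a configuration error or a database connection problem.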

4) Publish through Nginx and enable TLS

Expose only Nginx to the internet. The application stays bound to localhost on port 8080, reducing direct attack surface. After cert issuance, monitor renewal timers and keep email contacts valid.

sudo tee /etc/nginx/sites-available/kestra > /dev/null <<'EOF'
server {
  listen 80;
  server_name automations.example.com;

  location / {
    proxy_pass http://127.0.0.1:8080;
    proxy_set_header Host $host;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_read_timeout 300;
  }
}
EOF
sudo ln -sf /etc/nginx/sites-available/kestra /etc/nginx/sites-enabled/kestra
sudo nginx -t && sudo systemctl reload nginx
# Replace the placeholder address with a monitored mailbox; --redirect adds the HTTP-to-HTTPS redirect
sudo certbot --nginx -d automations.example.com --non-interactive --agree-tos --redirect -m admin@example.com
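Monitoring renewal can start with two small helpers that report how many days remain on the certificate currently being served. This is a sketch, assuming GNU `date` (as shipped with Ubuntu) and the `openssl` CLI; the function names are illustrative.

```shell
#!/bin/sh
# cert_not_after <host>: print the notAfter date of the certificate served on port 443
cert_not_after() {
  openssl s_client -connect "$1:443" -servername "$1" </dev/null 2>/dev/null \
    | openssl x509 -noout -enddate | cut -d= -f2
}

# days_between <start> <end>: whole days between two dates (GNU date syntax)
days_between() (
  start=$(date -u -d "$1" +%s) || exit 1
  end=$(date -u -d "$2" +%s) || exit 1
  echo $(( (end - start) / 86400 ))
)

# Example:
#   sudo certbot renew --dry-run                 # exercises renewal without replacing the certificate
#   systemctl list-timers | grep -i certbot      # confirms a renewal timer is scheduled, if the package installed one
#   days_between now "$(cert_not_after automations.example.com)"
```

Feeding the day count into whatever alerting you already run gives an early warning that does not depend on Certbot's own notification emails.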

Configuration and secrets handling best practices

Keep secrets in .env with restrictive permissions and rotate them on a schedule tied to your security policy. For mature environments, migrate secret material to Vault, AWS Secrets Manager, or another centralized backend and inject at runtime. Avoid embedding credentials in compose YAML or shell history.

Operationally, treat configuration as code: commit redacted templates, maintain a separate secure variable store, and record every runtime change through pull requests or ticketed runbooks. This gives auditability and removes the “tribal knowledge” problem that causes outages when key personnel are unavailable.

For notification channels (email, Slack, webhooks), start with non-production endpoints and validate failure behavior. You want explicit, testable alert routes before connecting workflows that trigger customer-facing processes or financial operations.

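cat > /opt/kestra/backup-postgres.sh <<'EOF'
#!/usr/bin/env bash
# Nightly PostgreSQL dump with 14-day retention
set -euo pipefail
cd /opt/kestra
source .env
TS=$(date +%F-%H%M%S)
OUT="/opt/kestra/backups/kestra-${TS}.sql.gz"
# Dump to a temporary name first so a failed run never leaves a file that looks like a valid backup
trap 'rm -f "${OUT}.partial"' EXIT
docker compose exec -T postgres pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB" | gzip > "${OUT}.partial"
mv "${OUT}.partial" "$OUT"
find /opt/kestra/backups -type f -name "*.sql.gz" -mtime +14 -delete
EOF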
chmod +x /opt/kestra/backup-postgres.sh
(crontab -l 2>/dev/null; echo "30 2 * * * /opt/kestra/backup-postgres.sh") | crontab -
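A backup is only useful once a restore has been rehearsed. The sketch below restores the newest dump into a scratch database inside the same PostgreSQL container and lists the restored tables. The function names and the scratch database name `kestra_restore_test` are assumptions made for this example; the file name pattern matches the backup script above.

```shell
#!/bin/sh
# latest_backup [dir]: print the newest dump. The timestamps in the file names
# (YYYY-MM-DD-HHMMSS) sort chronologically as plain text.
latest_backup() (
  dir=${1:-/opt/kestra/backups}
  find "$dir" -maxdepth 1 -type f -name 'kestra-*.sql.gz' | sort | tail -n 1
)

# restore_drill: load the newest dump into a throwaway database and list its tables.
# POSTGRES_USER is expanded inside the container, where the compose file sets it.
restore_drill() (
  set -eu
  cd /opt/kestra
  dump=$(latest_backup /opt/kestra/backups)
  [ -n "$dump" ] || { echo "no backup found" >&2; exit 1; }
  gzip -t "$dump"                      # fail early on a truncated or corrupt archive
  docker compose exec -T postgres sh -c \
    'dropdb --if-exists -U "$POSTGRES_USER" kestra_restore_test && createdb -U "$POSTGRES_USER" kestra_restore_test'
  gunzip -c "$dump" | docker compose exec -T postgres sh -c \
    'psql -q -v ON_ERROR_STOP=1 -U "$POSTGRES_USER" -d kestra_restore_test'
  docker compose exec -T postgres sh -c \
    'psql -U "$POSTGRES_USER" -d kestra_restore_test -c "\dt"'
)

# Example:
#   restore_drill
```

The drill never touches the live database, so it can run on the production host; drop the scratch database afterwards if disk space is tight.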

Verification checklist

Verification should happen after initial deployment and after every upgrade. Start with process health, then functional checks, then data checks. Do not stop at “container is up.” Validate that workflows execute and persist results correctly.

  • All containers are healthy and restart automatically after reboot.
  • HTTPS endpoint serves a valid certificate and redirects HTTP to HTTPS.
  • Database connectivity works from the app container and query latency is acceptable.
  • A sample workflow executes successfully and writes expected metadata.
  • Backups are generated and the restoration procedure is documented and tested.

cd /opt/kestra
docker compose ps
docker compose logs --tail=100 kestra
curl -I https://automations.example.com
curl -sI http://automations.example.com | head -n 1   # should answer with a redirect to HTTPS
# POSTGRES_USER and POSTGRES_DB are expanded inside the container, so no local variables are needed
docker compose exec -T postgres sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -c "SELECT now();"'
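It is also worth confirming that only the edge is reachable from outside, as the architecture intends. The filter below is a small sketch: `public_listeners` is a hypothetical helper that reads `ss -Htln` output and prints every listening socket that is not bound to a loopback address.

```shell
#!/bin/sh
# public_listeners: read `ss -Htln` output on stdin and print the local
# address:port of listening sockets that are NOT bound to loopback.
public_listeners() {
  awk '$4 !~ /^(127\.|\[::1\])/ { print $4 }'
}

# Example:
#   ss -Htln | public_listeners
```

On a host set up as described, the output would be expected to show ports 80 and 443 (plus SSH and anything else you run deliberately), while 8080 should be absent because the compose file publishes it on 127.0.0.1 only.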

Common issues and fixes

Container restarts in a loop

Usually caused by invalid env vars, bad database credentials, or stale schema expectations after image upgrades. Inspect service logs, compare against the previous known-good compose file, and roll back quickly if needed.

TLS certificate issuance fails

Check DNS propagation and ensure port 80 is reachable from the public internet. If your cloud firewall blocks 80 by default, certbot HTTP-01 challenges will fail even if HTTPS appears open.

Workflow latency spikes during business hours

Check for host CPU saturation and PostgreSQL I/O pressure. Increase resource limits, tune worker concurrency, and move to managed PostgreSQL if sustained growth outpaces single-node performance.

After upgrade, UI loads but jobs fail

This often signals a plugin/runtime incompatibility or a migration mismatch. Follow a staged rollout: back up, pull images, run smoke tests, and keep a rollback package ready.

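cd /opt/kestra
./backup-postgres.sh                            # fresh dump before any change
cp docker-compose.yml docker-compose.yml.prev   # keep the known-good manifest
# apply new compose / env changes
docker compose pull
docker compose up -d                            # recreates only services whose image or config changed
docker compose ps
# Run the next two commands ONLY if health checks fail: restore the previous compose file and restart.
# This brings back the earlier image only if the manifest pins a version tag instead of latest.
# cp docker-compose.yml.prev docker-compose.yml
# docker compose up -d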

FAQ

Can I run this without PostgreSQL and use an embedded database?

For production, you should not. PostgreSQL gives durability, better concurrency, and safer backup/restore patterns. Embedded databases can be acceptable for local experiments only.

How often should I back up the database?

At minimum daily for low-change environments; for high-frequency workflow updates, use hourly snapshots plus point-in-time recovery. Run a restore drill at least monthly.

Should I separate worker and web services?

Yes, as load grows. Start with a single service for simplicity, then split out worker capacity when queue depth and execution variance increase.

How do I rotate secrets without downtime?

Introduce dual-valid windows where possible: update backend credentials, deploy new app secrets, restart services in controlled order, then revoke old credentials after verification.

What monitoring should I add first?

Container health, host resource metrics, HTTP latency/error rates, queue depth, and failed workflow counts. Alert on sustained failure rates rather than one-off transient errors.
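Alerting on sustained failures rather than single blips can be sketched with a counter of consecutive failed probes. This is an illustrative helper, not a replacement for a monitoring system: `record_check`, the state file path, and the threshold of 3 are all assumptions for the example.

```shell
#!/bin/sh
# record_check <state-file> <ok|fail> [threshold]
# Tracks consecutive failures in a small state file and prints ALERT while the
# streak is at or above the threshold (default 3), otherwise OK.
record_check() (
  state=$1 result=$2 threshold=${3:-3}
  count=0
  [ -s "$state" ] && count=$(cat "$state")
  if [ "$result" = ok ]; then count=0; else count=$((count + 1)); fi
  echo "$count" > "$state"
  if [ "$count" -ge "$threshold" ]; then echo ALERT; else echo OK; fi
)

# Example probe, suitable for a cron entry (the state path is a placeholder):
#   if curl -fsS -o /dev/null https://automations.example.com; then r=ok; else r=fail; fi
#   record_check /var/tmp/kestra-probe.count "$r" 3
```

A single successful probe resets the streak, so one transient error never pages anyone, while an outage spanning three probe intervals does.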

When should I move to Kubernetes?

Move when you need multi-node scheduling, automated failover, and standardized platform governance. If your current Compose setup meets SLOs, optimize operations before adding orchestration complexity.
