Production Guide: Deploy Langfuse with Docker Compose + Caddy + PostgreSQL on Ubuntu

A production-grade deployment playbook with TLS, persistence, secrets hygiene, observability hooks, and repeatable operations.

Teams often start with ad-hoc LLM tracing and quickly outgrow spreadsheets, scattered logs, and one-off dashboards. A product squad ships a retrieval upgrade, support reports answer-quality drift, and suddenly everyone needs evidence: which prompt version regressed, which model change increased latency, and which customer segments are most affected. In practice, this is where Langfuse becomes operationally critical. This guide provides a production-oriented pattern on Ubuntu using Docker Compose, PostgreSQL for durable state, and Caddy for automatic HTTPS, so engineering and AI teams can trust telemetry behind release decisions.

Architecture and flow overview

The stack separates edge routing, app runtime, and durable data. Caddy handles TLS lifecycle and forwards traffic to the application container. PostgreSQL stores traces, sessions, and operational metadata in persistent volumes. Configuration and secrets are externalized for safe rotation, and backup drills are tested regularly. This architecture favors reliability and recoverability over quick-start shortcuts.

  • Edge: Caddy with auto certificates and secure headers.
  • App: Langfuse service in an isolated Docker network.
  • Data: PostgreSQL with persistent volume and health checks.
  • Ops: Logging, backup/restore routines, upgrade checklists.

Prerequisites

  • Ubuntu 22.04/24.04 with 2+ vCPU and 4+ GB RAM.
  • DNS A record for your domain, such as langfuse.example.com.
  • Sudo access and firewall policy allowing 22/80/443.
  • Docker Engine and Docker Compose plugin.
  • Encrypted off-host location for backup retention.
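
The CPU and memory minimums above can be checked before installing anything. The following is a minimal sketch: `meets_minimums`, `MIN_CPUS`, and `MIN_MEM_MB` are hypothetical names, and the memory floor is an assumption set slightly below 4096 MB because a nominal 4 GB VM usually reports less.

```shell
# Sketch: compare detected resources with the minimums listed above.
# MIN_CPUS and MIN_MEM_MB are illustrative assumptions, not Langfuse limits.
MIN_CPUS=2
MIN_MEM_MB=3500

# Succeeds only when both values meet the minimums.
meets_minimums() (
  cpus=$1
  mem_mb=$2
  [ "$cpus" -ge "$MIN_CPUS" ] && [ "$mem_mb" -ge "$MIN_MEM_MB" ]
)

# Usage on the target host (Linux):
#   cpus=$(nproc)
#   mem_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
#   meets_minimums "$cpus" "$mem_mb" && echo "host OK" || echo "host below minimums"
```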

Step-by-step deployment

1) Host baseline and runtime checks

Patch the host, tighten network exposure, and verify container tooling before creating the application structure.

sudo apt update && sudo apt -y upgrade
sudo apt install -y ca-certificates curl gnupg ufw jq
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw --force enable

docker --version
docker compose version
df -h

2) Project layout and secrets

Create a deterministic directory layout. Keep runtime secrets in files with restrictive permissions and never commit them to version control.

sudo mkdir -p /opt/langfuse/{compose,secrets,backups,caddy}
sudo chown -R $USER:$USER /opt/langfuse
cd /opt/langfuse

cat > secrets/langfuse.env << 'EOF'
SALT=replace-with-random-hex
NEXTAUTH_SECRET=replace-with-long-random-string
LANGFUSE_INIT_ORG_ID=main-org
LANGFUSE_INIT_PROJECT_ID=production
POSTGRES_DB=langfuse
POSTGRES_USER=langfuse
POSTGRES_PASSWORD=replace-with-strong-password
EOF
chmod 600 secrets/langfuse.env
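
The placeholder values in the env file need to be replaced with random material before first start. One possible way to generate it is sketched below; `gen_secret` is a hypothetical helper name, and the 32-byte length is an assumption rather than a Langfuse requirement.

```shell
# Sketch: generate a 64-character hex secret (32 random bytes).
# Prefers openssl and falls back to /dev/urandom when openssl is absent.
gen_secret() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -hex 32
  else
    od -An -tx1 -N32 /dev/urandom | tr -d ' \n'
    echo
  fi
}

# Usage: print one value per secret and paste it over a placeholder.
#   gen_secret   # salt
#   gen_secret   # NextAuth secret
#   gen_secret   # PostgreSQL password
```

Hex output is a convenient choice for the database password because it contains no characters that would need escaping inside a connection URL.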

3) Compose services for app, DB, and proxy

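Define health checks, keep the app and database off published ports, and pin image versions to make operations predictable. The tags below pin only the major version; for deterministic rollbacks, replace them with an exact release tag or image digest that you have validated in staging.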

cd /opt/langfuse/compose
cat > docker-compose.yml << 'EOF'
services:
  postgres:
    image: postgres:16
    container_name: langfuse-postgres
    env_file:
      - ../secrets/langfuse.env
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - pg_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"]
      interval: 10s
      timeout: 5s
      retries: 10
    restart: unless-stopped

  langfuse:
    image: langfuse/langfuse:2
    container_name: langfuse-app
    env_file:
      - ../secrets/langfuse.env
    environment:
      DATABASE_URL: postgresql://${POSTGRES_USER}:${POSTGRES_PASSWORD}@postgres:5432/${POSTGRES_DB}
      NEXTAUTH_URL: https://langfuse.example.com
    depends_on:
      postgres:
        condition: service_healthy
    expose:
      - "3000"
    restart: unless-stopped

  caddy:
    image: caddy:2
    container_name: langfuse-caddy
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ../caddy/Caddyfile:/etc/caddy/Caddyfile:ro
      - caddy_data:/data
      - caddy_config:/config
      # Keep access logs across container recreation.
      - caddy_logs:/var/log/caddy
    depends_on:
      - langfuse
    restart: unless-stopped

volumes:
  pg_data:
  caddy_data:
  caddy_config:
  caddy_logs:
EOF
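
Because the compose file interpolates values from secrets/langfuse.env, a quick check before the first start can catch leftover template values or loose permissions. This is a sketch under assumptions: `check_env_file` is a hypothetical helper, it looks for the literal `replace-with` prefix used in the template, and it relies on GNU `stat` as shipped with Ubuntu.

```shell
# Sketch: fail while the env file still holds template values or is
# readable by other users.
check_env_file() (
  file=$1
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  if grep -q 'replace-with' "$file"; then
    echo "placeholder values remain in $file" >&2
    return 1
  fi
  # GNU stat: %a prints the octal permission bits.
  mode=$(stat -c '%a' "$file")
  [ "$mode" = "600" ] || { echo "expected mode 600, found $mode" >&2; return 1; }
)

# Usage:
#   check_env_file /opt/langfuse/secrets/langfuse.env && echo "env file OK"
```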

4) Caddy reverse proxy configuration

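Use Caddy for automatic certificate management and set baseline security headers. Note that the HSTS value below is not the most conservative choice: includeSubDomains extends the policy to every subdomain of this host, and preload only has an effect when the registrable domain itself is submitted to the browser preload list. Drop both directives unless that is intended.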

cat > /opt/langfuse/caddy/Caddyfile << 'EOF'
langfuse.example.com {
  encode zstd gzip
  reverse_proxy langfuse:3000

  header {
    Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
    X-Content-Type-Options "nosniff"
    X-Frame-Options "SAMEORIGIN"
    Referrer-Policy "strict-origin-when-cross-origin"
  }

  log {
    output file /var/log/caddy/langfuse-access.log
    format json
  }
}
EOF
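
Once the stack is serving traffic (step 5), the header block above can be spot-checked from any client. The sketch below reads response headers on standard input and reports which of the four configured headers are absent; `check_security_headers` is a hypothetical helper name.

```shell
# Sketch: read HTTP response headers on stdin and report missing security
# headers. Matching is case-insensitive because HTTP/2 responses carry
# lowercase header names.
check_security_headers() (
  headers=$(cat)
  missing=0
  for name in strict-transport-security x-content-type-options \
              x-frame-options referrer-policy; do
    if ! printf '%s\n' "$headers" | grep -qi "^$name:"; then
      echo "missing header: $name" >&2
      missing=1
    fi
  done
  return "$missing"
)

# Usage:
#   curl -sI https://langfuse.example.com | check_security_headers && echo "headers OK"
```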

5) Launch and verify

Bring the stack up and validate readiness from both container and endpoint perspectives. Use logs for first-pass triage.

cd /opt/langfuse/compose
docker compose --env-file ../secrets/langfuse.env up -d

docker compose ps
docker compose logs --tail=80 postgres
docker compose logs --tail=80 langfuse
docker compose logs --tail=80 caddy
curl -I https://langfuse.example.com
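
The first request right after startup can fail while the certificate is being issued or the app is still booting. A small retry loop makes the endpoint check less timing-sensitive. This is a generic sketch, not part of Langfuse or Caddy; `retry` is a hypothetical helper, and the attempt count and delay in the usage line are arbitrary.

```shell
# Sketch: run a command until it succeeds or the attempts run out.
# Usage: retry <attempts> <delay_seconds> <command...>
retry() (
  attempts=$1
  delay=$2
  shift 2
  n=1
  while ! "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "gave up after $n attempts: $*" >&2
      return 1
    fi
    n=$((n + 1))
    sleep "$delay"
  done
)

# Usage: wait up to about a minute for a non-error HTTPS response.
#   retry 30 2 curl -fsS -o /dev/null https://langfuse.example.com
```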

Configuration and secrets handling

Operational discipline matters more than initial setup speed. Keep separate credentials per environment, document ownership for secret rotation, and avoid exposing values in shell history or CI logs. When rotating credentials, stage changes in a maintenance window with a rollback plan: rotate database user password, update environment file, restart dependent services, and run smoke tests. For compliance-oriented teams, route secret creation through a vault-backed workflow and enforce MFA for operators with production access.
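
The "update environment file" step of a rotation can be scripted so that the file is rewritten atomically and keeps its restrictive mode. The following is a sketch: `set_env_value` is a hypothetical helper, and the rotation outline in the comments is one possible ordering, not a documented Langfuse procedure. Keep in mind that the official postgres image reads POSTGRES_PASSWORD only when it initializes an empty data directory, so the role password must also be changed inside the database.

```shell
# Sketch: replace (or append) one KEY=value line in an env file.
# The value reaches awk through the environment, so characters such as
# '&', '/', or '\' are written literally.
set_env_value() (
  file=$1
  key=$2
  value=$3
  tmp=$(mktemp "$file.XXXXXX") || return 1
  chmod 600 "$tmp"
  if KEY=$key VALUE=$value awk '
      index($0, ENVIRON["KEY"] "=") == 1 {
        print ENVIRON["KEY"] "=" ENVIRON["VALUE"]; found = 1; next
      }
      { print }
      END { if (!found) print ENVIRON["KEY"] "=" ENVIRON["VALUE"] }
    ' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
)

# Rotation outline (from /opt/langfuse/compose, in a maintenance window):
#   1. Change the role password inside PostgreSQL (ALTER ROLE ... PASSWORD ...).
#   2. set_env_value ../secrets/langfuse.env POSTGRES_PASSWORD "$new_password"
#   3. docker compose --env-file ../secrets/langfuse.env up -d
#   4. Re-run the smoke tests before discarding the old credential.
```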

For long-lived deployments, define explicit configuration governance: who can change proxy headers, who can alter retention, and what checks must pass before image upgrades. This prevents accidental drift where one emergency fix creates hidden risk months later. Add a lightweight change-management checklist for every production edit: objective, blast radius, rollback command, verification output, and owner sign-off. These small process controls dramatically reduce incident time-to-mitigation because operators are never guessing what changed.

It is also useful to separate application and platform ownership. Application teams can own prompt schemas, project setup, and observability dashboards, while platform teams own network hardening, backup integrity, and upgrade orchestration. Clear boundaries prevent duplication and ensure critical controls are not forgotten during rapid release cycles.

Verification checklist

  • HTTPS endpoint returns a valid certificate and expected hostname.
  • PostgreSQL health checks remain healthy after restarts.
  • Application login and trace ingestion work end-to-end.
  • Container restart policy is active for all critical services.
  • Backup file is generated and recoverable in test restore.
  • Logs contain enough request context for incident debugging.

Backup and recovery operations

A production guide is incomplete without proven recovery. At minimum, schedule daily logical dumps and retain a rolling history aligned with your business RPO. Pair this with periodic restore drills in a non-production environment. Measure restore time and document constraints such as expected downtime and storage requirements. If your organization uses object storage with lifecycle policies, encrypt archives before upload and ensure restore keys are controlled by a different admin boundary than runtime credentials.

mkdir -p /opt/langfuse/backups
cat > /opt/langfuse/backups/backup.sh << 'EOF'
#!/usr/bin/env bash
set -euo pipefail
TS=$(date +%F-%H%M%S)
cd /opt/langfuse/compose
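# POSTGRES_USER and POSTGRES_DB are defined inside the container (via env_file),
# not on the host, so expand them there; on the host, set -u would abort.
docker compose --env-file ../secrets/langfuse.env exec -T postgres \
  sh -c 'pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB"' \
  | gzip > "/opt/langfuse/backups/langfuse-$TS.sql.gz"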
EOF
chmod +x /opt/langfuse/backups/backup.sh

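# Restore drill (run from /opt/langfuse/compose against a non-production database):
# zcat /opt/langfuse/backups/langfuse-YYYY-MM-DD-HHMMSS.sql.gz | docker compose --env-file ../secrets/langfuse.env exec -T postgres sh -c 'psql -U "$POSTGRES_USER" "$POSTGRES_DB"'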
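
A rolling history also needs pruning and a basic integrity check. The sketch below does both for the archive naming used above. `prune_backups` is a hypothetical helper, and the 14-day default is an assumption to replace with a value derived from your RPO and retention policy; it relies on `find` options available on Ubuntu.

```shell
# Sketch: flag corrupt archives, then delete archives past the retention window.
# Usage: prune_backups <backup_dir> [days]
prune_backups() (
  dir=$1
  days=${2:-14}
  # gzip -t tests archive integrity without extracting anything.
  for f in "$dir"/langfuse-*.sql.gz; do
    [ -e "$f" ] || continue
    gzip -t "$f" 2>/dev/null || echo "corrupt archive: $f" >&2
  done
  # -mtime +N matches files whose age in whole days exceeds N.
  find "$dir" -maxdepth 1 -type f -name 'langfuse-*.sql.gz' \
    -mtime +"$days" -delete
)

# Example crontab entry for the daily dump (02:15 is an arbitrary time):
#   15 2 * * * /opt/langfuse/backups/backup.sh
```

A passing `gzip -t` only shows that the archive is intact; it does not replace a restore drill.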

Common issues and fixes

TLS certificate not issued

Verify DNS points to the right host, ensure ports 80/443 are reachable publicly, and check Caddy logs for ACME challenges.

Application starts but authentication is unstable

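Confirm that the NextAuth secret and salt are set once and kept stable across deploys, and that NEXTAUTH_URL matches the public HTTPS URL. Regenerating the secret on each deploy invalidates existing sessions and shows up as intermittent login failures.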

Unexpected 502 errors

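A 502 from Caddy means it received no usable response from langfuse:3000. Check whether the app is still starting or is restart-looping, and review resource limits and memory pressure on the host. Compose waits for a healthy PostgreSQL before starting the app, but Caddy starts as soon as the app container exists, so brief 502s right after a deploy are expected; persistent ones are not.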

Slow query performance after growth

Review PostgreSQL indexing strategy and resource tuning. Consider managed DB if operational overhead becomes significant.

Backups exist but restores fail under pressure

Run regular restore drills, pin compatible PostgreSQL versions, and document exact runbooks in your incident handbook.
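
One way to catch a version mismatch before a restore is to read the server version recorded in the dump. This is a sketch that assumes plain-format dumps as produced by the backup script above; `dump_pg_major` is a hypothetical helper that relies on the "Dumped from database version" comment pg_dump writes near the top of such dumps.

```shell
# Sketch: print the PostgreSQL major version recorded in a gzip-compressed
# plain SQL dump. Prints nothing if the header comment is not found.
dump_pg_major() (
  archive=$1
  gzip -dc "$archive" | head -n 20 \
    | sed -n 's/^-- Dumped from database version \([0-9][0-9]*\).*/\1/p' \
    | head -n 1
)

# Usage: compare the dump's major version with the target server.
#   dump_pg_major /opt/langfuse/backups/langfuse-YYYY-MM-DD-HHMMSS.sql.gz
#   docker compose --env-file ../secrets/langfuse.env exec -T postgres \
#     sh -c 'psql -U "$POSTGRES_USER" -d "$POSTGRES_DB" -At -c "SHOW server_version"'
```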

Configuration drift across environments

Codify compose and proxy configuration in source control and require PR review for production changes.

FAQ

1) Is a single-server deployment acceptable for production?

For early-stage workloads, yes, if you have tested backups, monitoring, and a clear upgrade path. Scale out as reliability targets tighten.

2) Should I pin image versions or follow latest tags?

Pin versions. Promote upgrades through staging first, then production after validation. This keeps rollback deterministic.

3) How frequently should I test restores?

At least monthly, and after major schema or version changes. Untested backups are an operational risk, not a safeguard.

4) What are the most important security controls?

Strong secrets, least privilege, encrypted transport, limited network exposure, and auditable admin workflows.

5) Can I replace local PostgreSQL with a managed service later?

Yes. Keep connection settings externalized and migration runbooks documented to reduce cutover risk.

6) How do I monitor this stack effectively?

Combine synthetic uptime checks, service health metrics, and structured logs with actionable alert thresholds.

7) What is the safest way to rotate credentials?

Rotate one secret domain at a time, verify each step, and keep rollback credentials available until full validation passes.
