Production Guide: Deploy Langfuse with Kubernetes, Helm, and Ingress-NGINX on Ubuntu

Production-first deployment with secure secrets handling, verification, and practical troubleshooting.

AI platform teams often discover too late that prompt telemetry and model behavior are not consistently visible across environments. Langfuse helps teams track traces, evaluate outputs, and investigate regressions before small quality drops become expensive incidents. This production guide focuses on deploying Langfuse on Kubernetes with Helm and Ingress-NGINX in a way that remains stable during upgrades, secret rotation, and growth.

The deployment pattern below is intended for practical operations. It balances quick setup with production controls: namespace boundaries, TLS-ready ingress, secret hygiene, rollout validation, and repeatable troubleshooting. The goal is not only to bring the service online, but to keep it reliable during real traffic and on-call events.

Architecture and flow overview

  • Kubernetes schedules Langfuse pods with defined resources.
  • Helm manages release lifecycle and rollback.
  • Ingress-NGINX routes external traffic into cluster services.
  • Secrets deliver sensitive values without embedding them in source.
  • Verification checks confirm readiness before handoff.

Operationally, traffic enters ingress, routes to the service, and lands on healthy pods. Engineering teams then validate traces, latency, and logs after each release to keep behavior predictable.

Prerequisites

  • Ubuntu admin host with kubectl and helm.
  • Reachable Kubernetes cluster.
  • Ingress-NGINX already running.
  • TLS certificate workflow prepared.
  • Dedicated namespace for observability workloads.
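
The prerequisites above can be checked from the admin host before starting. The following is a minimal sketch, not a definitive preflight: the function names `missing_tools` and `check_prereqs` are hypothetical, and the cluster checks assume your kubeconfig already points at the target cluster.

```shell
# Print the name of every tool from the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the admin host first, then the cluster; returns non-zero on failure.
check_prereqs() {
  local missing
  missing="$(missing_tools kubectl helm)"
  if [ -n "$missing" ]; then
    echo "missing tools: $missing" >&2
    return 1
  fi
  kubectl cluster-info >/dev/null || return 1   # is the cluster reachable?
  kubectl get ingressclass                      # expect a class such as "nginx"
}
```

Run `check_prereqs` once per environment; an empty ingress class list suggests Ingress-NGINX is not installed or not registered yet.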

Step-by-step deployment

1) Create namespace

Keep this workload isolated for safer permissions and cleaner operations.

kubectl create namespace observability
kubectl config set-context --current --namespace=observability
kubectl get namespace observability

2) Create application secrets

Store sensitive values in Kubernetes secrets and rotate them after initial validation.

cat > langfuse-secrets.env <<'EOF'
LANGFUSE_SALT=replace-with-random-value
NEXTAUTH_SECRET=replace-with-session-secret
DATABASE_URL=postgresql://langfuse:replace-with-db-password@replace-with-db-host:5432/langfuse
EOF
kubectl create secret generic langfuse-secrets --namespace observability --from-env-file=langfuse-secrets.env
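
The placeholder values for the salt and session secret need to be replaced with random strings. One possible way to generate them with standard Ubuntu tools is sketched below; the helper name `gen_hex` and the 32-byte length are assumptions, not requirements stated by this guide.

```shell
# Emit N random bytes as lowercase hex (2*N characters).
gen_hex() {
  head -c "$1" /dev/urandom | od -An -v -tx1 | tr -d ' \n'
}

# Two independent 64-character values for the env file.
LANGFUSE_SALT="$(gen_hex 32)"
NEXTAUTH_SECRET="$(gen_hex 32)"
```

Paste the generated values into langfuse-secrets.env, and delete that file from the admin host once the Kubernetes secret exists so the plaintext does not linger on disk.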

3) Add and refresh Helm repository

Use official chart sources and refresh metadata before installation.

helm repo add langfuse https://langfuse.github.io/langfuse-k8s
helm repo update

4) Create values file

Define ingress class, resources, autoscaling, and secret references in one reviewed config.

cat > values-langfuse.yaml <<'EOF'
fullnameOverride: langfuse
ingress:
  enabled: true
  className: nginx
resources:
  requests:
    cpu: 250m
    memory: 512Mi
  limits:
    cpu: "1"
    memory: 2Gi
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 6
  targetCPUUtilizationPercentage: 70
envFrom:
  - secretRef:
      name: langfuse-secrets
EOF
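
Chart value keys differ between chart versions, so it can help to compare the file above with the chart's own defaults and render the release locally before installing. This is a sketch under assumptions: the function name `preflight_values` and the output files `chart-defaults.yaml` and `rendered.yaml` are hypothetical, and the Helm repository is assumed to be added already.

```shell
# Dump the chart's default values, then render the release client-side
# with our overrides. Nothing is applied to the cluster.
preflight_values() {
  helm show values langfuse/langfuse > chart-defaults.yaml &&
  helm template langfuse langfuse/langfuse \
    --namespace observability \
    --values values-langfuse.yaml > rendered.yaml
}
```

Review `chart-defaults.yaml` for the exact key names the chart expects, and confirm in `rendered.yaml` that the ingress class, resource limits, and secret reference appear as intended.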

5) Install or upgrade the release

The --wait flag blocks until the release's resources report ready and fails the command, and therefore the CI job, if they are not ready within the --timeout window.

helm upgrade --install langfuse langfuse/langfuse --namespace observability --values values-langfuse.yaml --wait --timeout 15m

6) Validate objects and rollout status

Check pods, service, and ingress status before exposing the endpoint widely.

kubectl get pods -n observability
kubectl get svc,ingress -n observability
kubectl rollout status deploy/langfuse -n observability --timeout=300s

Configuration and secrets handling

Outages at this stage commonly trace back to configuration drift or poor secret lifecycle management. Keep environment-specific values separate, enforce strict RBAC on secret reads, and use explicit runbooks for emergency rotation.

When adopting GitOps, commit only non-sensitive chart values and inject secrets at runtime through your approved secret manager. This preserves auditability without exposing credentials.

  • Rotate session and database secrets on schedule.
  • Use environment-specific secret objects.
  • Review who can read secrets in namespace RBAC.
  • Document rollback steps that include secret state.

Verification

Run these checks after each deployment and upgrade:

  • Pods are ready with no restart loops.
  • Ingress routes to service successfully.
  • Logs show successful app startup.
  • Sample traces appear in Langfuse dashboard.
  • Resource usage stays inside limits.

kubectl get pods -n observability
kubectl logs deploy/langfuse -n observability --tail=120
kubectl top pods -n observability || true
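
The ingress routing check can also be scripted from outside the cluster. This sketch assumes Langfuse serves its health endpoint at `/api/public/health` and that the ingress is reachable at a base URL you supply; the function name `check_health` and the host in the usage note are hypothetical.

```shell
# Return 0 if the health endpoint behind the ingress answers with HTTP 2xx.
# $1 is the base URL of the deployment, without a trailing slash.
check_health() {
  curl --fail --silent --show-error --max-time 10 "$1/api/public/health" >/dev/null
}
```

A call such as `check_health https://langfuse.example.com && echo healthy` fits into a post-deploy CI step; a non-zero exit points to the ingress, the service, or the pods.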

Common issues and fixes

Pods crash after deployment

Validate secret keys and database URL format. Most startup failures map directly to missing or malformed environment variables.
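
The key and URL format check can be run against the env file before the secret is created. The sketch below is deliberately stricter than PostgreSQL URL syntax requires: it insists on an explicit user, password, host, port, and database name. The function name `validate_env_file` is hypothetical.

```shell
# Check that an env file defines every required key with a non-empty value
# and that DATABASE_URL looks like postgresql://user:password@host:port/dbname.
validate_env_file() {
  local file="$1" key status=0
  for key in LANGFUSE_SALT NEXTAUTH_SECRET DATABASE_URL; do
    if ! grep -Eq "^${key}=.+" "$file"; then
      echo "missing or empty: $key" >&2
      status=1
    fi
  done
  if ! grep -Eq '^DATABASE_URL=postgres(ql)?://[^:@/]+:[^@/]+@[^:/@]+:[0-9]+/[^/?]+' "$file"; then
    echo "DATABASE_URL does not match postgresql://user:password@host:port/dbname" >&2
    status=1
  fi
  return "$status"
}
```

Run `validate_env_file langfuse-secrets.env` before `kubectl create secret`. Passwords containing `@` or `/` must be percent-encoded to pass this check and to parse correctly as a URL.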

Ingress returns unexpected responses

Confirm ingress class and host rules, then inspect controller logs for route mismatch.
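
A sketch of that inspection sequence follows. The controller namespace `ingress-nginx` and deployment name `ingress-nginx-controller` are assumptions based on a default Ingress-NGINX install and may differ in your cluster; the function name `debug_ingress` is hypothetical.

```shell
# Show ingress rules, the endpoints behind the service, and controller logs.
debug_ingress() {
  kubectl get ingress --namespace observability -o wide
  kubectl describe ingress --namespace observability
  kubectl get endpoints --namespace observability
  # Assumption: default ingress-nginx namespace and deployment name.
  kubectl logs --namespace ingress-nginx deploy/ingress-nginx-controller --tail=100
}
```

An endpoints entry with no addresses usually means the service selector matches no ready pods, which the controller reports as a 503 rather than a routing error.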

High latency under load

Increase resource requests/limits and review autoscaling thresholds with real traffic patterns.
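
Before changing limits, it helps to see what the autoscaler and the pods are doing right now. This is a minimal sketch that assumes metrics-server is installed, since `kubectl top` depends on it; the function name `inspect_scaling` is hypothetical.

```shell
# Show current vs. target utilization, recent scaling events, and
# per-container CPU and memory usage.
inspect_scaling() {
  kubectl get hpa --namespace observability
  kubectl describe hpa --namespace observability
  kubectl top pods --namespace observability --containers
}
```

If the autoscaler is already at maxReplicas while CPU stays above the 70% target, raising the replica ceiling or the CPU request is a more likely fix than tuning the threshold.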

Upgrade regressions

Use Helm rollback immediately, then test the same upgrade path in staging with controlled traffic.
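
The rollback itself can be sketched as follows. The function name `rollback_release` and the 10-minute timeout are assumptions; when no revision is passed, Helm rolls back to the previous release.

```shell
# List recent revisions, then roll back.
# $1 is an optional revision number from the history output.
rollback_release() {
  helm history langfuse --namespace observability --max 10 &&
  helm rollback langfuse ${1:+"$1"} --namespace observability --wait --timeout 10m
}
```

Call `rollback_release` to return to the previous revision, or `rollback_release 3` to target a specific one. A rollback restores chart-managed objects only, so secrets created outside the chart keep their current values.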

Certificate or TLS issues

Verify certificate secret placement and renewal workflow before declaring rollout complete.
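
One way to verify placement and expiry is to read the certificate straight out of the TLS secret. The sketch assumes a `kubernetes.io/tls` secret in the observability namespace; the function name `cert_expiry` and the secret name in the usage note are hypothetical.

```shell
# Print subject and expiry date of the certificate stored in a TLS secret.
# $1 is the secret name.
cert_expiry() {
  kubectl get secret "$1" --namespace observability \
    -o jsonpath='{.data.tls\.crt}' |
    base64 -d |
    openssl x509 -noout -subject -enddate
}
```

For example, `cert_expiry langfuse-tls` should print a subject matching the ingress host and a `notAfter` date comfortably in the future. An error from kubectl here means the secret is missing from the namespace the ingress reads from.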

Metrics visibility gaps

Ensure app instrumentation and trace ingestion paths are enabled and validated by a smoke workload.

FAQ

Can this run on a small cluster initially?

Yes. Start small for validation, then increase replicas and resource limits for production workloads.

Do I have to use Helm?

You can deploy with plain manifests, but Helm simplifies repeatability, upgrades, and rollback for most teams.

How often should secrets be rotated?

At least quarterly, and immediately after any suspected exposure or staffing change.

What backup model works best?

Daily backups plus verified restore tests on a regular schedule keep recovery trustworthy.

How do I reduce downtime during upgrades?

Use multiple replicas, readiness probes, and controlled rolling updates.

Can I integrate this with GitOps pipelines?

Yes. Keep non-secret values in Git and inject secrets at runtime from your secret manager.

Production readiness is less about a single successful install and more about consistency over time. Treat each deployment as an operational event: validate release health, monitor key service indicators, and capture lessons learned in a runbook your team can execute under pressure. This approach reduces firefighting and keeps observability reliable as usage grows.