Identity and access control often becomes the hidden failure point in growing SaaS stacks. Teams start with ad-hoc local users across Grafana, internal admin panels, Git services, and deployment tools, then discover too late that onboarding is inconsistent, offboarding is risky, and auditability is weak. This guide shows a production-oriented way to deploy Keycloak with Docker Compose + Traefik + PostgreSQL on Ubuntu so you can centralize authentication, enforce stronger policies, and keep operations manageable as your team scales.
The workflow is written for operators who need practical reliability, not just a quick demo. We will pin image versions, isolate secrets, run TLS termination at Traefik, validate end-to-end behavior, and define backup plus rollback habits you can trust during incidents. By the end, you will have a hardened Keycloak deployment with predictable day-2 operations and clear runbooks.
Architecture and flow overview
This pattern separates concerns cleanly: PostgreSQL stores Keycloak state, Keycloak handles identity/auth flows, and Traefik handles HTTPS, certificate issuance, and edge routing. Docker Compose provides reproducible service lifecycle management on a single Ubuntu host (or VM). This keeps operational complexity lower than Kubernetes while still being production-capable for many internal and SMB workloads.
- Traefik: Public ingress, TLS certificates via ACME, security headers, request routing.
- Keycloak: Identity provider, realms, clients, users, MFA policies, SSO endpoints.
- PostgreSQL: Durable relational store for realms, sessions, and configuration state.
- Docker network: Private east-west traffic; only Traefik exposes ports publicly.
Request path: user hits https://sso.example.com → Traefik terminates TLS and routes to Keycloak container → Keycloak reads/writes state in PostgreSQL → auth responses flow back through Traefik. Secrets never appear in Compose files directly; they are injected from a locked-down environment file.
Prerequisites
- Ubuntu 22.04/24.04 host with sudo access.
- DNS A/AAAA record for your identity domain (example: sso.example.com) pointing to the host.
- Open ports 80/443 to the host.
- Docker Engine + Docker Compose plugin installed.
- An email for ACME certificate registration.
- A strong password manager and secure location for secret backups.
# Verify Docker and Compose
sudo docker --version
sudo docker compose version
# Recommended: confirm time sync for TLS/session correctness
timedatectl status
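The tool prerequisites can also be checked in one pass. This is a minimal sketch rather than part of the original workflow: the helper name require_cmds is hypothetical, and the tool list in the usage line is an assumption based on the commands this guide runs later.

```shell
# Hypothetical helper: report which required commands are missing from PATH.
# Returns 0 when every command is found, 1 otherwise.
require_cmds() {
  local missing=0 cmd
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "missing: $cmd" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (tool list is an assumption drawn from later steps in this guide):
# require_cmds docker curl openssl jq && echo "all prerequisites found"
```

Running this before the deployment steps surfaces a missing jq or openssl early, instead of halfway through verification.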
Step-by-step deployment
1) Prepare folders, network, and least-privilege file permissions
Create a dedicated project directory with explicit ownership and strict file modes. Keep operational files together so backups and audits are predictable.
sudo mkdir -p /opt/keycloak/{traefik,keycloak,postgres,data}
sudo chown -R $USER:$USER /opt/keycloak
cd /opt/keycloak
# Shared network for Traefik and the app containers (required: the Compose file below declares it as external)
docker network create edge || true
2) Create environment secrets file
Never hardcode credentials in docker-compose.yml. Use a dedicated .env with restrictive permissions and rotate values on a schedule.
cd /opt/keycloak
cat > .env <<'EOF'
DOMAIN=sso.example.com
ACME_EMAIL=REPLACE_WITH_YOUR_EMAIL
POSTGRES_DB=keycloak
POSTGRES_USER=keycloak
POSTGRES_PASSWORD=REPLACE_WITH_LONG_RANDOM
# KC_DB_USERNAME / KC_DB_PASSWORD must match POSTGRES_USER / POSTGRES_PASSWORD above
KC_DB_USERNAME=keycloak
KC_DB_PASSWORD=REPLACE_WITH_LONG_RANDOM
KC_BOOTSTRAP_ADMIN_USERNAME=admin
KC_BOOTSTRAP_ADMIN_PASSWORD=REPLACE_WITH_LONG_RANDOM
EOF
chmod 600 .env
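To fill in the REPLACE_WITH_LONG_RANDOM placeholders, something like the following sketch may help. The helper names gen_secret and env_is_filled are hypothetical, and the 48-character hex format is an assumption chosen because it avoids characters that need quoting. Note that POSTGRES_PASSWORD and KC_DB_PASSWORD refer to the same database user, so they should receive the same generated value.

```shell
# Hypothetical helpers for filling in and checking /opt/keycloak/.env.

# Print a 48-character hex secret read from the kernel's random source.
gen_secret() {
  head -c 24 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Succeed only if the env file is readable and has no REPLACE_WITH_ placeholders left.
env_is_filled() {
  [ -r "$1" ] && ! grep -q 'REPLACE_WITH_' "$1"
}

# Usage:
# gen_secret; echo               # run once per distinct password, paste into .env
# env_is_filled /opt/keycloak/.env || echo "placeholders remain in .env" >&2
```

Checking for leftover placeholders before the first launch matters because PostgreSQL only applies POSTGRES_PASSWORD when it initializes an empty data directory.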
3) Define Compose stack
The stack below uses version-pinned images and health checks to reduce surprise failures during upgrades. We explicitly wait for database readiness and keep restart policies conservative.
cat > /opt/keycloak/docker-compose.yml <<'EOF'
services:
  traefik:
    image: traefik:v3.0
    container_name: traefik
    command:
      - --api.dashboard=true
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false
      - --entrypoints.web.address=:80
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.le.acme.tlschallenge=true
      - --certificatesresolvers.le.acme.email=${ACME_EMAIL}
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /opt/keycloak/traefik:/letsencrypt
    networks:
      - edge
    restart: unless-stopped

  postgres:
    image: postgres:16
    container_name: keycloak-postgres
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - /opt/keycloak/data/postgres:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER} -d ${POSTGRES_DB}"]
      interval: 10s
      timeout: 5s
      retries: 12
    networks:
      - edge
    restart: unless-stopped

  keycloak:
    image: quay.io/keycloak/keycloak:25.0
    container_name: keycloak
    # --proxy-headers replaces the deprecated --proxy=edge option; Traefik sends X-Forwarded-* headers
    command: start --proxy-headers=xforwarded --hostname=${DOMAIN} --http-enabled=true
    environment:
      KC_DB: postgres
      KC_DB_URL: jdbc:postgresql://postgres:5432/${POSTGRES_DB}
      KC_DB_USERNAME: ${KC_DB_USERNAME}
      KC_DB_PASSWORD: ${KC_DB_PASSWORD}
      # Keycloak 25.x reads KEYCLOAK_ADMIN / KEYCLOAK_ADMIN_PASSWORD;
      # 26+ renames these to KC_BOOTSTRAP_ADMIN_USERNAME / KC_BOOTSTRAP_ADMIN_PASSWORD
      KEYCLOAK_ADMIN: ${KC_BOOTSTRAP_ADMIN_USERNAME}
      KEYCLOAK_ADMIN_PASSWORD: ${KC_BOOTSTRAP_ADMIN_PASSWORD}
      KC_HEALTH_ENABLED: "true"
      KC_METRICS_ENABLED: "true"
    depends_on:
      postgres:
        condition: service_healthy
    labels:
      - traefik.enable=true
      - traefik.http.routers.keycloak.rule=Host(`${DOMAIN}`)
      - traefik.http.routers.keycloak.entrypoints=websecure
      - traefik.http.routers.keycloak.tls=true
      - traefik.http.routers.keycloak.tls.certresolver=le
      - traefik.http.services.keycloak.loadbalancer.server.port=8080
      - traefik.http.middlewares.secureHeaders.headers.contentTypeNosniff=true
      - traefik.http.middlewares.secureHeaders.headers.frameDeny=true
      - traefik.http.middlewares.secureHeaders.headers.referrerPolicy=no-referrer
      - traefik.http.routers.keycloak.middlewares=secureHeaders@docker
    networks:
      - edge
    restart: unless-stopped

networks:
  edge:
    external: true
EOF
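Because the Compose file passes separate variables to PostgreSQL (POSTGRES_*) and to Keycloak (KC_DB_*), a mismatch between them in .env shows up only as a database authentication failure at startup. A rough pre-launch check could look like this; env_get and db_creds_match are hypothetical helper names, and the parser assumes simple unquoted KEY=VALUE lines like the ones in this guide.

```shell
# Hypothetical helper: print the value of a key from a simple KEY=VALUE env file.
# $1 = key, $2 = file; prints the text after the first '=' on the first matching line.
env_get() {
  sed -n "s/^$1=//p" "$2" | head -n 1
}

# Succeed only if Keycloak's DB credentials equal the ones PostgreSQL is initialized with.
db_creds_match() {
  [ "$(env_get POSTGRES_USER "$1")" = "$(env_get KC_DB_USERNAME "$1")" ] &&
    [ "$(env_get POSTGRES_PASSWORD "$1")" = "$(env_get KC_DB_PASSWORD "$1")" ]
}

# Usage:
# db_creds_match /opt/keycloak/.env || echo "DB credentials differ in .env" >&2
# docker compose --env-file .env config --quiet && echo "compose file parses"
```

The second usage line relies on docker compose config, which resolves variable interpolation and reports YAML or schema errors without starting anything.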
4) Launch and inspect service health
Bring up the stack in detached mode, then inspect logs and health indicators before first login. Early verification catches DNS, certificate, and database misconfigurations quickly.
cd /opt/keycloak
docker compose --env-file .env up -d
docker compose ps
docker compose logs --tail=100 postgres
docker compose logs --tail=100 keycloak
docker compose logs --tail=100 traefik
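Reading logs by eye works for a first launch, but a scripted wait is easier to reuse in runbooks. The sketch below is an illustration under assumptions: retry and is_healthy are hypothetical helper names, and is_healthy only works for containers that define a Docker health check, which in this stack means keycloak-postgres.

```shell
# Hypothetical helper: run a command until it succeeds, up to a maximum number of attempts.
# $1 = max attempts, $2 = seconds to sleep between attempts, remaining args = command
retry() {
  local attempts=$1 delay=$2 i=1
  shift 2
  until "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
}

# Succeed when the named container's Docker health check reports "healthy".
is_healthy() {
  [ "$(docker inspect --format '{{.State.Health.Status}}' "$1" 2>/dev/null)" = "healthy" ]
}

# Usage:
# retry 30 5 is_healthy keycloak-postgres && echo "postgres is healthy"
```

A bounded retry fails loudly after a known time instead of hanging, which is the behavior you want during an incident.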
5) First-login hardening checklist
Immediately after first admin login, avoid leaving default realm behavior unchanged. Production incidents are often caused by weak session policies and forgotten bootstrap credentials.
# Quick connectivity checks from the host
curl -I https://sso.example.com
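# Keycloak 25 serves /health/* on its management port (9000), which this stack does not route through Traefik,
# so probe a public endpoint and expect HTTP 200
curl -fsS -o /dev/null -w '%{http_code}\n' https://sso.example.com/realms/master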
# Optional: validate certificate chain details
echo | openssl s_client -connect sso.example.com:443 -servername sso.example.com 2>/dev/null | openssl x509 -noout -issuer -subject -dates
In the Keycloak admin console, complete these hardening tasks: disable unused flows, set realm password policies (length, complexity, expiration if needed), enable brute-force detection, configure OTP for privileged groups, shorten token lifetimes for high-risk apps, and create least-privilege admin roles. If possible, use a break-glass account stored offline and audited monthly.
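Some of these settings can also be applied from the command line with kcadm.sh, the admin CLI bundled in the Keycloak image, which makes them repeatable across realms. Treat the following as a sketch under assumptions: the wrapper names kcadm and harden_realm, the realm name "example", and the specific values (14-character minimum, 300-second access tokens) are all illustrative and not recommendations from this guide.

```shell
# Hypothetical wrapper around Keycloak's bundled admin CLI (kcadm.sh).
# With DRY_RUN=1 it only prints the command it would run.
kcadm() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker exec keycloak /opt/keycloak/bin/kcadm.sh $*"
  else
    docker exec keycloak /opt/keycloak/bin/kcadm.sh "$@"
  fi
}

# Apply a few hardening settings to one realm. All values are illustrative assumptions.
# $1 = realm name
harden_realm() {
  kcadm update "realms/$1" \
    -s bruteForceProtected=true \
    -s 'passwordPolicy=length(14)' \
    -s accessTokenLifespan=300
}

# Usage: authenticate once interactively (prompts for the admin password), then apply.
# docker exec -it keycloak /opt/keycloak/bin/kcadm.sh config credentials \
#   --server http://localhost:8080 --realm master --user admin
# DRY_RUN=1 harden_realm example   # preview the command
# harden_realm example             # apply it
```

Previewing with DRY_RUN=1 first is a cheap safeguard, since realm updates take effect immediately for all users of that realm.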
Configuration and secrets handling best practices
For production, treat identity as critical infrastructure. That means disciplined secret handling, immutable deploy patterns, and controlled drift. Prefer rotating credentials with documented runbooks over one-time setup assumptions.
- Secret scope: Use unique DB and admin credentials per environment (dev/stage/prod), never shared.
- Storage: Keep secrets in a vault or encrypted store; local .env should be a short-lived cache only.
- Rotation: Rotate bootstrap admin and DB passwords at regular intervals and after staff changes.
- Backups: Backup PostgreSQL with retention policy plus encrypted offsite copy.
- Upgrade discipline: Test Keycloak upgrades in staging with representative clients and SSO flows.
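# Example nightly PostgreSQL backup (host cron)
set -eu
mkdir -p /opt/keycloak/backups
out=/opt/keycloak/backups/keycloak-$(date +%F-%H%M).sql.gz
# Expand POSTGRES_USER/POSTGRES_DB inside the container, where Compose sets them;
# they are not defined in cron's environment on the host
docker exec keycloak-postgres \
  sh -c 'pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB"' > "$out.tmp"
# Compress only after pg_dump succeeded, so a failed dump never leaves a valid-looking backup
gzip -c "$out.tmp" > "$out"
rm -f "$out.tmp"
# Keep last 14 days
find /opt/keycloak/backups -type f -name 'keycloak-*.sql.gz' -mtime +14 -delete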
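A backup file existing on disk does not prove it is usable. As a lightweight check between full restore drills, the sketch below tests that a dump is an intact gzip archive with non-empty contents. The helper name verify_backup is hypothetical, and this check is no substitute for actually restoring into a scratch database.

```shell
# Hypothetical helper: succeed only if the file is an intact gzip archive
# whose decompressed contents include at least one non-empty line.
verify_backup() {
  gzip -t "$1" 2>/dev/null || return 1
  gzip -dc "$1" 2>/dev/null | grep -q .
}

# Usage: check the newest backup produced by the nightly job.
# newest=$(ls -1t /opt/keycloak/backups/keycloak-*.sql.gz | head -n 1)
# verify_backup "$newest" || echo "backup $newest failed verification" >&2
```

The content check matters because an empty dump piped through gzip still yields a structurally valid archive.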
# Image update pattern: containers are recreated, so expect brief downtime and schedule a maintenance window
cd /opt/keycloak
docker compose pull keycloak traefik postgres
docker compose up -d
# If migration fails, rollback to pinned prior image tags and redeploy
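Rolling back to a prior tag means editing the image line in docker-compose.yml and redeploying. One way to script that edit is sketched below; set_image_tag is a hypothetical helper, and PREVIOUS_TAG in the usage lines stands for whatever tag you recorded before the upgrade. Keep in mind that an older Keycloak may not start against a database schema that a newer version has already migrated, so a rollback usually also involves restoring the pre-upgrade dump.

```shell
# Hypothetical helper: rewrite the tag of one image in a Compose file.
# $1 = compose file, $2 = image repository (without tag), $3 = new tag
# A copy of the original file is kept next to it with a .bak suffix.
set_image_tag() {
  sed -i.bak "s|^\([[:space:]]*image:[[:space:]]*$2\):.*|\1:$3|" "$1"
}

# Usage (PREVIOUS_TAG is a placeholder for the tag recorded before the upgrade):
# set_image_tag /opt/keycloak/docker-compose.yml quay.io/keycloak/keycloak "$PREVIOUS_TAG"
# docker compose --env-file .env up -d keycloak
```

Only the line whose repository matches is changed, so the Traefik and PostgreSQL pins stay untouched.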
Verification checklist
Do not treat "containers are running" as success. Verify behavior from user, app, and operator perspectives.
- Open https://sso.example.com and confirm valid TLS certificate.
- Login to admin console and verify realm settings persist after container restart.
- Create a test realm, client, and user; complete a full login flow.
- Confirm token issuance and expected claims for your app integration.
- Restart stack and re-test auth path to confirm persistence.
- Run backup + restore drill in staging at least once before production cutover.
# Functional smoke checks
curl -fsS https://sso.example.com/realms/master/.well-known/openid-configuration | jq '.issuer,.token_endpoint'
# Restart simulation
cd /opt/keycloak
docker compose restart keycloak
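# Keycloak can take longer than a fixed sleep to start, so retry a public endpoint until it answers
curl -fsS --retry 12 --retry-delay 5 --retry-all-errors -o /dev/null https://sso.example.com/realms/master && echo "keycloak is serving requests"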
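To confirm token issuance and inspect claims, as the checklist asks, you can decode the payload of an access token on the host. The following is a sketch under assumptions: jwt_payload is a hypothetical helper, ADMIN_PASSWORD is a variable you would set yourself, and the usage lines request a token from the master realm's built-in admin-cli client purely as a smoke test. The helper only decodes the token and does not verify its signature.

```shell
# Hypothetical helper: print the decoded JSON payload (claims) of a JWT.
# Decoding only; the signature is NOT verified.
jwt_payload() {
  local p
  # Take the middle segment and map the base64url alphabet back to standard base64
  p=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so base64 -d accepts the input
  case $(( ${#p} % 4 )) in
    2) p="$p==" ;;
    3) p="$p=" ;;
  esac
  printf '%s' "$p" | base64 -d
}

# Usage (ADMIN_PASSWORD is an assumed variable holding the admin password):
# TOKEN=$(curl -fsS -d grant_type=password -d client_id=admin-cli \
#   -d username=admin --data-urlencode "password=$ADMIN_PASSWORD" \
#   https://sso.example.com/realms/master/protocol/openid-connect/token | jq -r .access_token)
# jwt_payload "$TOKEN" | jq '.iss, .azp, .exp'
```

The iss claim should match the issuer reported by the OpenID configuration endpoint; a mismatch usually points at a hostname or proxy header problem.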
Common issues and fixes
Issue: Traefik serves 404 for the Keycloak hostname
Cause: Router rule mismatch or wrong domain in .env. Fix: confirm DOMAIN value, labels, and DNS. Run docker compose logs traefik and verify the router is created for your host rule.
Issue: Login loops or invalid redirect URI errors
Cause: Client redirect URI mismatch in Keycloak or stale app config. Fix: explicitly set exact callback URLs and web origins for each app; avoid broad wildcards unless required by controlled environments.
Issue: Slow admin console or intermittent DB connection failures
Cause: Under-provisioned host resources, noisy neighbors, or untuned PostgreSQL defaults. Fix: inspect host CPU, memory, and disk I/O; tune PostgreSQL shared_buffers and work_mem conservatively; and isolate the workload on a dedicated VM if needed.
Issue: Certificates not issuing
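Cause: Port 443 blocked (the tlschallenge resolver in this stack validates over port 443 using TLS-ALPN-01; port 80 only matters if you switch to the HTTP challenge), DNS not propagated, or ACME rate limits after repeated failures. Fix: validate DNS, ensure inbound port 443 reaches Traefik, and temporarily test against the Let's Encrypt staging endpoint before retrying production issuance.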
# Fast troubleshooting bundle
cd /opt/keycloak
docker compose ps
docker compose logs --tail=200 traefik keycloak postgres
nslookup sso.example.com
sudo ss -tulpen | grep -E ':(80|443)[[:space:]]'
FAQ
Can I run Keycloak without PostgreSQL in production?
You can use embedded or alternative options for testing, but production should use PostgreSQL for durability, better operational tooling, and safer upgrades. It also simplifies backup and restore processes that security/compliance teams expect.
Is Docker Compose enough for enterprise identity workloads?
For many teams, yes, especially for internal SSO, moderate traffic, and predictable growth. Compose offers simpler operations. If you need multi-region HA, advanced autoscaling, or strict platform governance, Kubernetes may be a better long-term fit.
How should I handle admin accounts securely?
Create named admin users with role scoping, enforce MFA, disable shared credentials, and keep one break-glass account in secure offline storage. Log and review privileged actions regularly. Rotate bootstrap credentials immediately after deployment.
What is the safest way to upgrade Keycloak versions?
Pin versions, clone production config in staging, run migration and integration tests against real client apps, snapshot the database, then execute a planned maintenance window with rollback tags pre-defined. Never do untested in-place major upgrades.
Do I need Redis for this setup?
Not for a basic single-node deployment. Keycloak can operate with PostgreSQL only. Introduce additional components only when your measured scale, performance, or architecture requirements justify that complexity.
How often should I run backup restore drills?
At least quarterly for small teams and monthly for critical production systems. A backup without restore validation is only a hypothesis. Practice recovery steps until they are routine and time-bounded.