Apache Superset is a strong fit when a team has grown past ad hoc spreadsheets but is not ready to buy a heavy proprietary BI suite. A practical deployment has to do more than start the web container: it needs persistent PostgreSQL metadata, Redis-backed caching, a TLS reverse proxy, controlled admin onboarding, backup routines, and a clear verification path for analysts who depend on dashboards during business hours. This guide follows the current SysBrix Guides pattern and keeps the stack understandable for a small operations team.
The real-world use case is an internal analytics portal for sales, finance, and operations teams. Superset connects to governed databases, lets analysts build charts, and gives leadership a shared place for dashboards without exposing database credentials to every browser. The deployment below uses Docker Compose on Ubuntu, Caddy for automatic HTTPS, PostgreSQL for Superset metadata, Redis for cache/task coordination, and a worker process for reports and background jobs.
Architecture and flow overview
Traffic enters through Caddy on ports 80 and 443, terminates TLS, and forwards only local requests to the Superset web service on 127.0.0.1:8088. Superset stores users, roles, dashboards, saved queries, and database connection definitions in PostgreSQL. Redis supports caching and asynchronous work so dashboard loads do not force every request through the database layer. A separate worker container handles scheduled reports and background tasks, which keeps the web process responsive during exports or alert generation.
The operational boundary is simple: Caddy is the only public-facing service, Docker networks keep PostgreSQL and Redis private, and the host stores backups under /opt/superset/backups. This pattern is intentionally boring. It is easier to audit, easier to restore, and easier to hand off than a single all-in-one container where secrets, data, and proxy configuration are mixed together.
Prerequisites
- Ubuntu 22.04 or 24.04 with Docker Engine and the Docker Compose plugin installed.
- A DNS record such as bi.example.com pointing to the server.
- Ports 80 and 443 open to the internet; database and Redis ports should remain private.
- At least 2 vCPU and 4 GB RAM for a small team; 4 vCPU and 8 GB RAM is safer for many dashboards.
- A tested backup destination outside the VM, such as object storage or a backup server.
Step-by-step deployment
Start by creating a predictable directory layout. Keeping configuration, backups, and logs under one application path makes future upgrades and incident response much easier.
mkdir -p /opt/superset/{caddy,pythonpath,backups,logs}
cd /opt/superset
chmod 700 /opt/superset/backups
Create the environment file next. The secret key must remain stable after initial deployment because Superset uses it for session and encryption-related behavior. Generate it once, store it in your password manager, and do not rotate it casually during a normal container upgrade.
python3 - <<'PY'
import secrets
print(secrets.token_urlsafe(64))
PY
cat > .env <<'EOF'
SUPERSET_SECRET_KEY=replace-with-64-random-characters
POSTGRES_DB=superset
POSTGRES_USER=superset
POSTGRES_PASSWORD=replace-with-strong-db-password
REDIS_PASSWORD=replace-with-strong-redis-password
SUPERSET_DOMAIN=bi.example.com
[email protected]
ADMIN_PASSWORD=replace-with-temporary-admin-password
EOF
chmod 600 .env
Add the Compose definition and save it as /opt/superset/docker-compose.yml. This keeps the public web port bound to localhost only; Caddy is responsible for internet traffic and certificate management. PostgreSQL and Redis are private Docker services and should not be published on the host.
services:
  db:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: ${POSTGRES_DB}
      POSTGRES_USER: ${POSTGRES_USER}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
    volumes:
      - db_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U $$POSTGRES_USER -d $$POSTGRES_DB"]
      interval: 10s
      timeout: 5s
      retries: 10
  redis:
    image: redis:7-alpine
    restart: unless-stopped
    command: ["redis-server", "--requirepass", "${REDIS_PASSWORD}"]
    volumes:
      - redis_data:/data
  superset:
    image: apache/superset:latest  # pin a specific release tag in production
    restart: unless-stopped
    env_file: .env
    depends_on:
      db:
        condition: service_healthy
      redis:
        condition: service_started
    volumes:
      - ./pythonpath:/app/pythonpath
      - ./logs:/app/superset_home/logs
    ports:
      - "127.0.0.1:8088:8088"
    command: ["/bin/sh", "-c", "superset db upgrade && superset init && gunicorn -w 4 -k gevent --timeout 120 -b 0.0.0.0:8088 'superset.app:create_app()'"]
  worker:
    image: apache/superset:latest  # keep in lockstep with the superset service
    restart: unless-stopped
    env_file: .env
    depends_on:
      - db
      - redis
    volumes:
      - ./pythonpath:/app/pythonpath
    command: ["celery", "--app=superset.tasks.celery_app:app", "worker", "--pool=prefork", "-O", "fair", "-c", "2"]

volumes:
  db_data:
  redis_data:
Add Superset configuration. The settings below enable proxy awareness, secure cookies, Redis caching, and common production features. Treat this file as application configuration, not as a dumping ground for passwords; secrets should stay in .env or a real secret manager.
cat > pythonpath/superset_config.py <<'PY'
import os

SECRET_KEY = os.environ["SUPERSET_SECRET_KEY"]
SQLALCHEMY_DATABASE_URI = f"postgresql+psycopg2://{os.environ['POSTGRES_USER']}:{os.environ['POSTGRES_PASSWORD']}@db:5432/{os.environ['POSTGRES_DB']}"

CACHE_CONFIG = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_DEFAULT_TIMEOUT": 300,
    "CACHE_REDIS_HOST": "redis",
    "CACHE_REDIS_PORT": 6379,
    "CACHE_REDIS_PASSWORD": os.environ["REDIS_PASSWORD"],
    "CACHE_KEY_PREFIX": "superset_cache_",
}
DATA_CACHE_CONFIG = CACHE_CONFIG

ENABLE_PROXY_FIX = True
SESSION_COOKIE_SECURE = True
SESSION_COOKIE_SAMESITE = "Lax"
WTF_CSRF_ENABLED = True

FEATURE_FLAGS = {
    "ALERT_REPORTS": True,
    "DASHBOARD_RBAC": True,
}
PY
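Before restarting anything, it is worth confirming that the configuration file actually loads with the environment variables you defined. A minimal sketch of that check follows; the inlined config source is a trimmed stand-in for pythonpath/superset_config.py, and the throwaway environment values are test fixtures, not real secrets:

```python
import os
import runpy
import tempfile
import textwrap

# Trimmed stand-in for the real superset_config.py written above.
CONFIG_SOURCE = textwrap.dedent("""\
    import os
    SECRET_KEY = os.environ["SUPERSET_SECRET_KEY"]
    SQLALCHEMY_DATABASE_URI = (
        "postgresql+psycopg2://"
        f"{os.environ['POSTGRES_USER']}:{os.environ['POSTGRES_PASSWORD']}"
        f"@db:5432/{os.environ['POSTGRES_DB']}"
    )
    SESSION_COOKIE_SECURE = True
""")

def check_config(source, env):
    """Exec a config file body with the given env vars and assert key settings."""
    os.environ.update(env)
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(source)
        path = f.name
    cfg = runpy.run_path(path)  # returns the module globals as a dict
    os.unlink(path)
    assert cfg["SECRET_KEY"], "SECRET_KEY must be non-empty"
    assert cfg["SQLALCHEMY_DATABASE_URI"].startswith("postgresql+psycopg2://")
    assert cfg["SESSION_COOKIE_SECURE"] is True
    return cfg

cfg = check_config(CONFIG_SOURCE, {
    "SUPERSET_SECRET_KEY": "dummy-for-test",
    "POSTGRES_USER": "superset",
    "POSTGRES_PASSWORD": "pw",
    "POSTGRES_DB": "superset",
})
```

The same idea works against the real file: run it through `runpy.run_path` with the production environment loaded and fail fast on missing variables, instead of discovering a KeyError in the container logs.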
Configure Caddy to terminate TLS. The Caddyfile below uses the {$SUPERSET_DOMAIN} environment placeholder, so make sure that variable is available to the Caddy process, or replace the placeholder with the literal hostname. If Caddy runs directly on the host, keep this Caddyfile under the same application directory and use your preferred systemd service or containerized Caddy pattern. The important part is that external traffic reaches Caddy first, not the Superset container.
cat > caddy/Caddyfile <<'EOF'
{$SUPERSET_DOMAIN} {
    encode zstd gzip
    reverse_proxy 127.0.0.1:8088
    header {
        Strict-Transport-Security "max-age=31536000; includeSubDomains; preload"
        X-Content-Type-Options "nosniff"
        Referrer-Policy "strict-origin-when-cross-origin"
    }
}
EOF
Initialize the database, create the first administrator, and start the full stack. After the first login, immediately create named personal accounts and disable shared use of the bootstrap admin password.
set -a; . ./.env; set +a  # export .env values so $ADMIN_EMAIL and $ADMIN_PASSWORD expand in this shell
docker compose up -d db redis
docker compose run --rm superset superset fab create-admin --username admin --firstname Superset --lastname Admin --email "$ADMIN_EMAIL" --password "$ADMIN_PASSWORD"
docker compose up -d
Configuration and secrets handling best practices
Production BI systems usually fail from weak operational practices rather than from missing features. Keep database passwords, Redis passwords, and the Superset secret key out of Git. Limit access to /opt/superset/.env, and document who can retrieve break-glass credentials. For larger environments, replace the local .env file with a deployment pipeline that injects secrets from a vault.
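A cheap way to enforce this discipline is a pre-deploy audit that refuses to proceed while the .env file is missing required keys or still contains the "replace-with" placeholders from the template above. A minimal sketch, assuming the variable names used in this guide's .env file:

```python
# Keys this guide's .env file is expected to define.
REQUIRED = {
    "SUPERSET_SECRET_KEY", "POSTGRES_DB", "POSTGRES_USER",
    "POSTGRES_PASSWORD", "REDIS_PASSWORD", "SUPERSET_DOMAIN",
}

def audit_env(text):
    """Return a list of problems found in a .env body: missing keys and
    placeholder values that were never replaced with real secrets."""
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            pairs[key.strip()] = value.strip()
    problems = [f"missing: {k}" for k in sorted(REQUIRED - pairs.keys())]
    problems += [f"placeholder: {k}" for k, v in sorted(pairs.items())
                 if v.startswith("replace-with")]
    return problems

# Demo: one placeholder left in place, everything else missing.
issues = audit_env("SUPERSET_SECRET_KEY=replace-with-64-random-characters\n")
```

Running this against /opt/superset/.env in a deploy script (and exiting non-zero when `issues` is non-empty) catches the most common failure mode: a template value shipped to production.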
Use Superset roles deliberately. Give analysts only the database and dashboard permissions they need, separate dashboard authors from administrators, and review permissions after team changes. If Superset connects to production databases, create read-only database users and consider views that expose only approved columns. For sensitive analytics, put Superset behind SSO or a private access gateway rather than relying only on local passwords.
Plan upgrades as application changes, not as casual container pulls. Check the release notes, snapshot PostgreSQL, test dashboards, and keep rollback instructions nearby. Superset metadata is valuable business state; protect it the same way you protect the databases that feed the dashboards.
Verification checklist
Verify from both the server and a real browser. The container health endpoint can pass while TLS, cookies, or reverse-proxy headers are still wrong for users.
docker compose ps
docker compose logs --tail=80 superset
curl -I https://bi.example.com/health
curl -s http://127.0.0.1:8088/health
- Confirm the login page loads at the public domain over HTTPS.
- Create a non-admin test user and confirm role restrictions behave as expected.
- Add one read-only database connection and build a small test chart.
- Restart the stack and verify dashboards remain available after containers recreate.
- Check that the browser sees secure cookies and no mixed-content warnings.
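For automated checks, a one-shot curl is fragile right after a deploy because containers can take a while to run migrations and warm up. A small retry loop around the health endpoint is more honest; the sketch below takes the probe as a callable so the same logic works with urllib against /health or with a stub in tests:

```python
import time

def wait_for_health(fetch, attempts=10, delay=2.0):
    """Poll a health endpoint until it returns HTTP 200 or attempts run out.

    `fetch` is any callable returning a status code; connection errors
    (raised as OSError by urllib) count as a failed attempt.
    """
    for _ in range(attempts):
        try:
            status = fetch()
        except OSError:
            status = None  # service not accepting connections yet
        if status == 200:
            return True
        time.sleep(delay)  # back off before retrying; the app may still be booting
    return False

# Stubbed probe simulating a container that warms up after two failures.
responses = iter([503, 503, 200])
assert wait_for_health(lambda: next(responses), attempts=5, delay=0) is True
```

In a deploy pipeline the callable would wrap something like `urllib.request.urlopen("https://bi.example.com/health").status`, and a False return should fail the deployment rather than log a warning.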
Backups and recovery
The minimum backup is a PostgreSQL dump of Superset metadata. That dump includes dashboards, saved queries, roles, and connection definitions. It does not back up the external data warehouses Superset queries, so document those separately.
cat > backup-superset.sh <<'EOF'
#!/usr/bin/env bash
set -euo pipefail
cd /opt/superset
set -a; . ./.env; set +a  # load DB credentials so $POSTGRES_USER and $POSTGRES_DB expand below
stamp=$(date +%Y%m%d-%H%M%S)
docker compose exec -T db pg_dump -U "$POSTGRES_USER" "$POSTGRES_DB" | gzip > "backups/superset-$stamp.sql.gz"
find backups -name 'superset-*.sql.gz' -mtime +14 -delete
EOF
chmod 700 backup-superset.sh
./backup-superset.sh
Copy backup files off the host and test restores regularly. A backup that has never been restored is only a hopeful file. For a small environment, a weekly restore into a staging VM is enough to catch missing credentials, broken scripts, and incompatible container versions.
gunzip -c backups/superset-YYYYMMDD-HHMMSS.sql.gz | docker compose exec -T db psql -U superset superset
docker compose restart superset worker
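Backups also fail silently: the cron entry breaks, the disk fills, and the newest dump quietly ages past usefulness. A freshness check that alerts when the latest dump is older than expected closes that gap. A minimal sketch, assuming the naming pattern from the backup script above (the temporary directory stands in for /opt/superset/backups):

```python
import tempfile
import time
from pathlib import Path

def newest_backup_age_seconds(backup_dir, pattern="superset-*.sql.gz"):
    """Return the age in seconds of the newest matching backup, or None if none exist."""
    files = sorted(Path(backup_dir).glob(pattern), key=lambda p: p.stat().st_mtime)
    if not files:
        return None
    return time.time() - files[-1].stat().st_mtime

# Demo against a temporary directory standing in for the real backup path.
with tempfile.TemporaryDirectory() as d:
    Path(d, "superset-20250101-000000.sql.gz").touch()
    age = newest_backup_age_seconds(d)   # just created, so near zero
    empty = Path(d, "empty")
    empty.mkdir()
    no_backups = newest_backup_age_seconds(empty)  # None: nothing to restore from
```

Wired into monitoring, "age is None or age > 26 hours" for a daily backup is a reasonable alert threshold; the extra margin avoids paging on a slow-running job.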
Common issues and fixes
Login redirects loop after enabling HTTPS. Confirm ENABLE_PROXY_FIX = True, that Caddy forwards the request cleanly, and that users access only the HTTPS hostname. Clear old cookies after changing cookie settings.
Dashboards are slow under light usage. Check whether charts are repeatedly querying a large production table without caching. Add Redis caching, use database views, and set reasonable dashboard refresh intervals instead of letting every user force a live query.
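The caching behavior worth understanding here is cache-aside with a TTL: the first request for a chart runs the real query and stores the result, and later requests within the timeout are served from Redis without touching the warehouse. A minimal in-memory sketch of that pattern (an illustration of the idea, not Superset's internal implementation; `TTLCache` and its names are made up for this example):

```python
import time

class TTLCache:
    """Toy cache-aside store mirroring what Redis does for chart results."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.store = {}  # key -> (value, stored_at)

    def get_or_compute(self, key, compute):
        hit = self.store.get(key)
        if hit is not None and self.clock() - hit[1] < self.ttl:
            return hit[0]  # fresh hit: no database query at all
        value = compute()  # miss or expired: run the real query once
        self.store[key] = (value, self.clock())
        return value

# A fake "expensive query" that counts how often it actually executes.
calls = []
cache = TTLCache(ttl_seconds=300)
for _ in range(5):
    cache.get_or_compute("sales_by_region", lambda: calls.append(1) or "rows")
```

Five requests, one real query: that ratio is what the CACHE_DEFAULT_TIMEOUT of 300 seconds in superset_config.py buys you, and why a dashboard refresh interval shorter than the TTL mostly wastes warehouse capacity.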
Workers start but reports never send. Confirm the worker can reach Redis and that alert/report feature flags are enabled. Reports also need browser dependencies in some deployment patterns, so validate with a simple scheduled report before relying on it for executives.
Database connections work for admins only. Superset permissions are role-based. Grant datasource and database access to the intended role, then test with a non-admin user instead of assuming dashboards inherit admin visibility.
Container upgrades break startup. Roll back to the previous image tag, restore the metadata dump if migrations partially ran, and test the target version in staging. Avoid using floating tags in regulated environments.
FAQ
Can I use SQLite for Superset metadata?
SQLite is fine for a local demo, but it is not appropriate for a production BI portal. Use PostgreSQL so metadata survives upgrades, supports concurrency, and can be backed up cleanly.
Should Superset connect directly to production databases?
It can, but use read-only credentials, network restrictions, and views or replicas where possible. Superset should not hold broad write-capable credentials for operational databases.
How many workers do I need?
Start with one small worker for reports and background tasks. Increase worker concurrency only after observing queue delays, CPU usage, and memory pressure during dashboard exports.
What should I monitor first?
Monitor container restarts, HTTP 5xx responses, PostgreSQL disk usage, Redis availability, backup success, and dashboard response time. Add application logs to your central logging system early.
How do I rotate the admin password?
Create named administrator accounts, sign in with one of them, and update or disable the bootstrap admin account. Do not continue using a shared password after the deployment is validated.
Can I put Superset behind Cloudflare or another access layer?
Yes. Keep Caddy as the local TLS/reverse-proxy layer if it fits your standard pattern, then place your access gateway in front. Re-test redirects, secure cookies, and large dashboard responses after adding the gateway.
What is the safest upgrade process?
Pin image versions, take a metadata backup, run the new version in staging, execute database migrations there, and validate critical dashboards. Only then schedule a production upgrade window.
Related guides
- Production Guide: Deploy ntfy with Docker Compose + Caddy + Auth + Attachments on Ubuntu
- Production Guide: Deploy Meilisearch with Docker Compose + Caddy + Master Key on Ubuntu
- Production Guide: Deploy PocketBase with systemd + NGINX + SQLite + UFW on Ubuntu
Talk to us
If you want this implemented with hardened defaults, observability, and tested recovery playbooks, our team can help.