Self-Host Grafana: Complete Production Setup Guide

Why Self-Host Grafana?

Monitoring is non-negotiable for any production system. You need to know when things break before your users do. Grafana is the best open-source dashboard and visualisation platform available — it connects to Prometheus, Loki, InfluxDB, Elasticsearch, CloudWatch, and dozens more data sources, then turns that data into dashboards that actually make sense.

The managed Grafana Cloud is excellent, but self-hosting gives you full control: no per-user pricing, no data egress limits, no vendor lock-in, and the ability to keep all your metrics and logs inside your own infrastructure. If you're running a multi-service stack, a Kubernetes cluster, or just want to monitor a few VPSs without paying per metric, self-hosting is the right call.

This guide walks you through self-hosting Grafana end to end: Docker deployment, Prometheus metrics, data source and dashboard provisioning as code, HTTPS with Nginx, alerting, and the troubleshooting fixes you'll actually need. For deeper production patterns, we also have dedicated guides covering specific stack combinations: How to Deploy Grafana in Production with Docker Compose + systemd, Production Guide: Deploy Grafana Loki + Promtail with Docker Compose + Traefik + Let's Encrypt on Ubuntu, and Production Guide: Deploy Grafana + Prometheus with Docker Compose + Nginx on Ubuntu.

Prerequisites

Get these in place before you start. Missing pieces cause confusing failure modes.

Server Requirements

CPU: 2 cores minimum, 4+ recommended for production workloads
RAM: 4 GB minimum, 8 GB+ if running Prometheus and Loki alongside Grafana
Disk: 20 GB SSD minimum, 50 GB+ recommended for Prometheus time-series retention
OS: Ubuntu 22.04 or 24.04 (this guide uses Ubuntu)

Software

Docker Engine 24+ and Docker Compose v2
A domain name with an A record pointed at your server (for HTTPS)
Ports 80, 443, and 3000 available

Verify Docker

docker --version
docker compose version
# Both should return version strings — Docker 24+, Compose v2+

If Docker isn't installed, follow the official Docker install guide first.
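The version floor above can also be checked mechanically. This is a minimal sketch, assuming a sort that supports -V (GNU coreutils and BusyBox both do); version_ge and check_docker_versions are hypothetical helper names, not Docker commands.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is >= version B.
# Relies on `sort -V` (version sort) to order the two strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_docker_versions: compares the installed versions with this guide's minimums.
# Nothing runs until you call it; it needs a working Docker installation.
check_docker_versions() {
  engine="$(docker version --format '{{.Server.Version}}')" || return 1
  compose="$(docker compose version --short)" || return 1
  version_ge "$engine" "24.0.0" || { echo "Docker Engine $engine is older than 24"; return 1; }
  # Some Compose builds print a leading "v"; strip it before comparing.
  version_ge "${compose#v}" "2.0.0" || { echo "Compose $compose is older than v2"; return 1; }
  echo "OK: Docker Engine $engine, Compose $compose"
}
```

Paste the functions into a shell session and run check_docker_versions. A non-zero exit status means one of the two minimums is not met.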

Step 1 — Deploy Grafana with Docker

Grafana's official Docker image is the fastest way to get running. For production, use Docker Compose with persistent volumes and environment configuration.

Quick Start (Single Container)

docker run -d \
  -p 3000:3000 \
  --name=grafana \
  -v grafana-storage:/var/lib/grafana \
  grafana/grafana-enterprise

This pulls the latest Grafana Enterprise image (free to use, and it includes all OSS features) and mounts a named volume for data persistence. Access it at http://your-server-ip:3000. The default login is admin / admin, and Grafana prompts you to change the password on first login.

Production Docker Compose Setup

Create a project directory and compose file:

mkdir ~/grafana && cd ~/grafana

Create docker-compose.yml:

services:
  grafana:
    image: grafana/grafana-enterprise:latest
    container_name: grafana
    restart: unless-stopped
    ports:
      - "127.0.0.1:3000:3000"   # localhost only — Nginx handles public traffic
    volumes:
      - grafana-data:/var/lib/grafana
      - ./provisioning:/etc/grafana/provisioning
    environment:
      - GF_SECURITY_ADMIN_USER=${GF_ADMIN_USER:-admin}
      - GF_SECURITY_ADMIN_PASSWORD=${GF_ADMIN_PASSWORD}
      - GF_USERS_ALLOW_SIGN_UP=false
      - GF_SERVER_ROOT_URL=https://grafana.yourdomain.com
      - GF_SECURITY_CSRF_ADDITIONAL_HEADERS=X-Forwarded-Host
      - GF_SECURITY_CSRF_TRUSTED_ORIGINS=grafana.yourdomain.com   # host name only, no scheme
    networks:
      - monitoring

volumes:
  grafana-data:

networks:
  monitoring:
    driver: bridge

Create the .env file:

# .env — keep this out of version control
GF_ADMIN_USER=admin
GF_ADMIN_PASSWORD=your-strong-password-here
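A hand-typed placeholder password is easy to leave in place by accident. As a rough sketch, the .env file can instead be generated with a random password. write_env_file is a hypothetical helper, and the use of /dev/urandom and base64 assumes a typical Linux host.

```shell
#!/bin/sh
# write_env_file DIR: creates DIR/.env with a random admin password.
# Refuses to overwrite an existing file so a working password is never clobbered.
write_env_file() {
  env_file="$1/.env"
  [ -e "$env_file" ] && { echo "$env_file already exists" >&2; return 1; }
  # 24 random bytes, base64-encoded; drop characters that are awkward to quote
  password="$(head -c 24 /dev/urandom | base64 | tr -d '/+=\n')"
  old_umask="$(umask)"
  umask 077                       # the new file is readable by its owner only
  printf 'GF_ADMIN_USER=admin\nGF_ADMIN_PASSWORD=%s\n' "$password" > "$env_file"
  umask "$old_umask"
}
```

From the project directory, run write_env_file . and then add .env to your .gitignore. Read the generated password from the file when you first log in.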

Start the stack and follow the logs:

docker compose up -d
docker compose logs -f grafana

Once logs show "HTTP Server Listen" on port 3000, Grafana is ready. The provisioning volume mount lets you configure data sources and dashboards as code — covered in Step 3.
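Instead of watching the logs, a script can poll Grafana's /api/health endpoint until it responds. The sketch below assumes curl is installed on the host; wait_for_http is a hypothetical helper name.

```shell
#!/bin/sh
# wait_for_http URL [TIMEOUT_SECONDS]: polls URL about once per second until
# curl gets a non-error HTTP response or the timeout (default 60) runs out.
wait_for_http() {
  url="$1"
  remaining="${2:-60}"
  while [ "$remaining" -gt 0 ]; do
    # -f makes curl fail on HTTP 4xx/5xx, -s silences progress output
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    remaining=$((remaining - 1))
    sleep 1
  done
  echo "timed out waiting for $url" >&2
  return 1
}
```

For the Compose setup above, wait_for_http http://127.0.0.1:3000/api/health 60 returns as soon as Grafana answers, which is handy in deploy scripts.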

Step 2 — Add Prometheus for Metrics

Grafana without data is just an empty UI. Prometheus is the standard time-series database for metrics collection. We'll add it to the same Docker Compose stack.

Prometheus Configuration

Create the config directory and Prometheus config:

mkdir -p prometheus

Create prometheus/prometheus.yml:

global:
  scrape_interval: 15s
  evaluation_interval: 15s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']

  - job_name: 'node-exporter'
    static_configs:
      - targets: ['node-exporter:9100']

  - job_name: 'grafana'
    static_configs:
      - targets: ['grafana:3000']
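A typo in prometheus.yml only surfaces when the container starts, so it can help to validate the file first. The sketch below runs promtool from the same prom/prometheus image; check_prometheus_config is a hypothetical helper name, and the directory argument is assumed to be an absolute path.

```shell
#!/bin/sh
# check_prometheus_config DIR: runs `promtool check config` against
# DIR/prometheus.yml inside a throwaway container. DIR must be an absolute path
# because it is used as a bind mount.
check_prometheus_config() {
  docker run --rm \
    -v "$1:/etc/prometheus:ro" \
    --entrypoint promtool \
    prom/prometheus:latest \
    check config /etc/prometheus/prometheus.yml
}
```

From the project directory, run check_prometheus_config "$(pwd)/prometheus". promtool exits non-zero and prints the offending line when the file is invalid.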

Update Docker Compose

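Add Prometheus and Node Exporter under the existing services: key in your docker-compose.yml, and extend the top-level volumes: block with prometheus-data. The networks: block stays as it is: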

  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    restart: unless-stopped
    volumes:
      - ./prometheus:/etc/prometheus
      - prometheus-data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=30d'
      - '--web.enable-lifecycle'
    ports:
      - "127.0.0.1:9090:9090"
    networks:
      - monitoring

  node-exporter:
    image: prom/node-exporter:latest
    container_name: node-exporter
    restart: unless-stopped
    volumes:
      - /proc:/host/proc:ro
      - /sys:/host/sys:ro
      - /:/rootfs:ro
    command:
      - '--path.procfs=/host/proc'
      - '--path.sysfs=/host/sys'
      - '--path.rootfs=/rootfs'
      - '--collector.filesystem.mount-points-exclude=^/(sys|proc|dev|host|etc)($$|/)'
    ports:
      - "127.0.0.1:9100:9100"
    networks:
      - monitoring

volumes:
  grafana-data:
  prometheus-data:

Apply the changes and check that the containers are running:

docker compose up -d
docker compose ps

All three services should show as "Up". Prometheus is now scraping metrics from itself, Grafana, and the host machine via Node Exporter.
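A container that is "Up" is not necessarily being scraped successfully. One way to confirm is to read the target list from the Prometheus HTTP API. The sketch below counts targets that are not healthy using plain grep, on the assumption that the API returns compact JSON; count_unhealthy_targets is a hypothetical helper, and jq would be a sturdier choice if you have it.

```shell
#!/bin/sh
# count_unhealthy_targets: reads a /api/v1/targets JSON response on stdin and
# prints how many targets report a health value other than "up".
count_unhealthy_targets() {
  # First grep isolates every "health" field; second counts the ones that are not "up".
  grep -o '"health":"[a-z]*"' | grep -vc '"health":"up"' || true
}
```

Run curl -s http://127.0.0.1:9090/api/v1/targets | count_unhealthy_targets. A result of 0 means every target is up. Targets that have not been scraped yet report "unknown" and are counted as unhealthy, so wait one scrape interval after starting the stack.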

Step 3 — Provision Data Sources and Dashboards as Code

Manually clicking through the UI to add data sources and import dashboards doesn't scale. Grafana's provisioning system lets you define everything in YAML and JSON files that live in version control.

Provision a Prometheus Data Source

Create the provisioning directory structure:

mkdir -p provisioning/datasources provisioning/dashboards

Create provisioning/datasources/prometheus.yml:

apiVersion: 1

datasources:
  - name: Prometheus
    uid: prometheus          # fixed UID so alert rules can reference this data source
    type: prometheus
    access: proxy
    url: http://prometheus:9090
    isDefault: true
    editable: false

Provision a Dashboard

Download the Node Exporter Full dashboard JSON (a popular community dashboard, ID 1860 on grafana.com) and place it in the provisioning folder:

# Download the popular Node Exporter Full dashboard
curl -L \
  "https://grafana.com/api/dashboards/1860/revisions/latest/download" \
  -o provisioning/dashboards/node-exporter.json

Create provisioning/dashboards/dashboards.yml to tell Grafana where to find dashboard files:

apiVersion: 1

providers:
  - name: 'default'
    orgId: 1
    folder: ''
    type: file
    disableDeletion: false
    editable: true
    options:
      path: /etc/grafana/provisioning/dashboards

# Restart Grafana to pick up provisioning changes
docker compose restart grafana

After restart, navigate to Dashboards → Browse — you'll see the Node Exporter Full dashboard pre-loaded and connected to your Prometheus data source.
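The alert rule in Step 5 refers to the data source by its UID (datasourceUid: prometheus), so it is worth checking which UID Grafana actually assigned. The sketch below asks the HTTP API for the data source by name. extract_uid and datasource_uid are hypothetical helpers, the sed expression assumes Grafana's compact JSON output, and the admin credentials are assumed to be exported in your shell.

```shell
#!/bin/sh
# extract_uid: reads one data source JSON object on stdin and prints its "uid".
extract_uid() {
  sed -n 's/.*"uid":"\([^"]*\)".*/\1/p'
}

# datasource_uid NAME: looks the data source up by name through the HTTP API.
# Assumes GF_ADMIN_USER and GF_ADMIN_PASSWORD are set in the environment.
datasource_uid() {
  curl -fs -u "$GF_ADMIN_USER:$GF_ADMIN_PASSWORD" \
    "http://127.0.0.1:3000/api/datasources/name/$1" | extract_uid
}
```

Running datasource_uid Prometheus prints the UID to use in alert rules. If the value is a random string rather than prometheus, the provisioning file did not set an explicit uid.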

Step 4 — HTTPS with Nginx and Let's Encrypt

Running Grafana over HTTP is fine for local testing. For anything production-facing, you need HTTPS. Grafana's own docs recommend a reverse proxy for TLS termination.

Install Nginx and Certbot

sudo apt update
sudo apt install -y nginx certbot python3-certbot-nginx

Nginx Site Configuration

Create /etc/nginx/sites-available/grafana and replace grafana.yourdomain.com with your actual domain:

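# Define $connection_upgrade for the WebSocket headers used below
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

server {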
    listen 80;
    server_name grafana.yourdomain.com;

    location / {
        proxy_pass         http://127.0.0.1:3000;
        proxy_http_version 1.1;

        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
        proxy_set_header   X-Forwarded-Host $host;

        proxy_read_timeout 300s;
        proxy_send_timeout 300s;

        # WebSocket support for Grafana Live
        proxy_set_header   Upgrade $http_upgrade;
        proxy_set_header   Connection $connection_upgrade;
    }
}

Enable the site, test the configuration, and reload Nginx:

sudo ln -s /etc/nginx/sites-available/grafana /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

# Issue TLS certificate
sudo certbot --nginx -d grafana.yourdomain.com
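Once the certificate is issued, you can confirm what the server is actually presenting. The sketch below reads the expiry date of the live certificate with openssl; cert_expiry is a hypothetical helper name, and the host is whatever domain you passed to Certbot.

```shell
#!/bin/sh
# cert_expiry HOST: connects to HOST on port 443 and prints the notAfter date
# of the certificate the server presents.
cert_expiry() {
  # `echo |` closes stdin so s_client exits after the handshake
  echo | openssl s_client -connect "$1:443" -servername "$1" 2>/dev/null \
    | openssl x509 -noout -enddate
}
```

cert_expiry grafana.yourdomain.com prints a line starting with notAfter=. To test renewal without waiting for expiry, sudo certbot renew --dry-run simulates the renewal against Let's Encrypt's staging environment.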

After Certbot completes, confirm that GF_SERVER_ROOT_URL in docker-compose.yml uses the HTTPS URL of your domain, then recreate the Grafana container so the setting takes effect:

docker compose down
docker compose up -d

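Note: Grafana builds the absolute URLs in alert notifications and invite links from GF_SERVER_ROOT_URL. If it is unset or wrong, links in email notifications point to localhost:3000 instead of your public domain. The Host and X-Forwarded-* headers pass the original host and scheme through the proxy; Grafana's origin check on login and API requests depends on the Host header matching the browser's origin.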

Step 5 — Alerts and Notifications

Dashboards are for humans. Alerts are for catching problems while you sleep. Grafana's built-in alerting can evaluate rules against any data source and send notifications via email, Slack, PagerDuty, or webhook.

Configure SMTP for Email Alerts

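Add these variables to the grafana service's environment block in docker-compose.yml, and define SMTP_PASSWORD in your .env file:

      - GF_SMTP_ENABLED=true
      - GF_SMTP_HOST=smtp.yourprovider.com:587
      - GF_SMTP_USER=alerts@yourdomain.com          # placeholder address
      - GF_SMTP_PASSWORD=${SMTP_PASSWORD}
      - GF_SMTP_FROM_ADDRESS=alerts@yourdomain.com  # placeholder address
      - GF_SMTP_FROM_NAME=Grafana Alerts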

Create an Alert Rule

In the Grafana UI, go to Alerting → Alert rules → New alert rule. A simple starting point: alert when CPU usage exceeds 80% for 5 minutes.

Query: 100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
Condition: IS ABOVE 80
Evaluation: every 1m for 5m
Contact point: your email or Slack webhook
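Before saving the rule, it can help to run the expression directly against Prometheus and look at the values it returns. The sketch below uses the instant-query endpoint of the Prometheus HTTP API on the localhost port published in docker-compose.yml; promql_query is a hypothetical helper name.

```shell
#!/bin/sh
# promql_query EXPR: runs an instant query against the local Prometheus and
# prints the raw JSON response.
promql_query() {
  # -G sends the URL-encoded data as a GET query string
  curl -fsG "http://127.0.0.1:9090/api/v1/query" --data-urlencode "query=$1"
}
```

For example, promql_query '100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)' should return one sample per instance with the current CPU usage percentage. An empty result list means the alert rule would have no data to evaluate.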

Provision Alert Rules as Code (Grafana 9+)

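Alert rules can also be provisioned via YAML. Create the provisioning/alerting directory (mkdir -p provisioning/alerting), then create provisioning/alerting/alerts.yml. The datasourceUid of the query must match the uid of your Prometheus data source: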

apiVersion: 1

groups:
  - orgId: 1
    name: infrastructure
    folder: Infrastructure
    interval: 60s
    rules:
      - uid: high-cpu-usage
        title: High CPU Usage
        condition: B
        data:
          - refId: A
            relativeTimeRange:
              from: 300
              to: 0
            datasourceUid: prometheus
            model:
              expr: '100 - (avg by(instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)'
              instant: true    # the threshold expression needs a single value per series
              refId: A
          - refId: B
            relativeTimeRange:
              from: 0
              to: 0
            datasourceUid: __expr__
            model:
              type: threshold
              expression: A
              conditions:
                - evaluator:
                    type: gt
                    params:
                      - 80
        noDataState: NoData
        execErrState: Error
        for: 5m
        annotations:
          summary: "CPU usage is above 80%"
        labels:
          severity: warning

Step 6 — Troubleshooting and Production Tips

These are the issues you'll actually hit. Most have fast fixes.

Problem: "Bad Gateway" or "Connection Refused"

Grafana container isn't running, or Nginx can't reach it on port 3000.

docker compose ps grafana
docker compose logs grafana --tail 50

# Verify Grafana is listening on the expected port
docker exec grafana netstat -tlnp | grep 3000

Common cause: Grafana failed to start because the provisioning YAML has a syntax error. Check logs for "failed to provision" messages.

Problem: Data Source Shows "Health Check Failed"

The data source URL is wrong, or the target service isn't reachable from Grafana's container.

# From inside the Grafana container, test connectivity
docker exec -it grafana wget -qO- http://prometheus:9090/-/healthy

In Docker Compose, use service names (prometheus, loki) as hostnames — not localhost. Each container has its own network namespace.

Problem: Dashboards Not Appearing After Provisioning

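Check that each dashboard JSON has a valid, unique uid field. Duplicate UIDs make provisioned dashboards conflict with each other, so look for duplicate-UID warnings in the Grafana logs. Also verify that the dashboard provider YAML points to the correct path, and rule out JSON syntax errors: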

# Check for JSON syntax errors in dashboard files
for f in provisioning/dashboards/*.json; do
  python3 -m json.tool "$f" > /dev/null && echo "OK: $f" || echo "INVALID: $f"
done
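The syntax check does not catch two valid files that share a UID. The sketch below lists UIDs that appear in more than one file, reusing python3 for JSON parsing as the loop above does; duplicate_dashboard_uids is a hypothetical helper name.

```shell
#!/bin/sh
# duplicate_dashboard_uids DIR: prints every top-level "uid" that appears in more
# than one dashboard JSON file in DIR. Prints nothing when all UIDs are unique.
duplicate_dashboard_uids() {
  for f in "$1"/*.json; do
    [ -e "$f" ] || continue
    # Print the top-level uid, or an empty line when the field is missing
    python3 -c 'import json, sys; print(json.load(open(sys.argv[1])).get("uid") or "")' "$f"
  done | grep -v '^$' | sort | uniq -d
}
```

Run duplicate_dashboard_uids provisioning/dashboards. Any UID it prints should be changed in one of the files that use it before you restart Grafana.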

Problem: Alerts Not Sending

Check the alert rule evaluation state in Alerting → Alert rules. If it shows "Error" or "NoData", the query is failing. If it shows "Firing" but no notification was sent, check the contact point configuration and Grafana's SMTP settings in the logs:

docker compose logs grafana | grep -i "alert\|smtp\|notification"

Tip: Back Up Grafana Data Before Upgrades

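All dashboards, data source configs, users, and alert rules created in the UI live in the grafana-data volume. Compose prefixes volume names with the project name (the directory name by default), so the actual volume is typically grafana_grafana-data; confirm with docker volume ls. Back it up before major upgrades:

mkdir -p backups
docker run --rm \
  -v grafana_grafana-data:/data:ro \
  -v "$(pwd)/backups:/backup" \
  alpine tar czf "/backup/grafana-$(date +%Y%m%d).tar.gz" -C /data .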
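A backup is only useful if it contains what you expect. If the volume name was wrong, for instance, the archive will be nearly empty. The sketch below checks that an archive is readable and includes the SQLite database; verify_backup is a hypothetical helper, and it assumes the default SQLite setup where the database file is named grafana.db.

```shell
#!/bin/sh
# verify_backup ARCHIVE: succeeds when the tarball can be listed and contains
# a file named grafana.db.
verify_backup() {
  tar -tzf "$1" 2>/dev/null | grep -q 'grafana\.db$'
}
```

For example: verify_backup backups/grafana-20250101.tar.gz && echo "backup looks good", with the file name adjusted to the archive you just created.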

Tip: Use Grafana's Database Export for Migration

Grafana stores everything in SQLite by default (in /var/lib/grafana/grafana.db). For migrations or disaster recovery, copy this file, ideally after stopping Grafana (docker compose stop grafana) so the copy is not taken mid-write:

docker cp grafana:/var/lib/grafana/grafana.db ./grafana-backup.db

Tip: Enable Anonymous Viewing for Public Dashboards

If you want to share dashboards without requiring login, enable anonymous access. Be careful: this gives every visitor Viewer access to all dashboards in the default organization, so use it only for non-sensitive metrics:

      - GF_AUTH_ANONYMOUS_ENABLED=true
      - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer

What You've Built

At the end of this guide, you have:

Grafana running in Docker with persistent storage and environment-based configuration
Prometheus collecting metrics from itself, Grafana, and the host via Node Exporter
Provisioning set up for data sources and dashboards as code — version-controllable, repeatable
HTTPS via Nginx with Let's Encrypt auto-renewal
Alert rules for catching infrastructure issues before they become outages
A Docker Compose stack that's portable across environments

This is a solid foundation for monitoring a single server or small cluster. For larger deployments, you'll want to expand into the full Grafana stack: Loki for logs, Tempo for traces, and Mimir for long-term metric storage. Our dedicated guides cover those exact setups: Grafana Loki + Promtail with Traefik and Grafana + Prometheus with Nginx.

Need Enterprise-Grade Monitoring?

A single-server Grafana stack handles a lot. When you're monitoring multi-node clusters, Kubernetes, or need high-availability alerting with on-call rotation, the architecture gets more complex. You'll need distributed Prometheus (Mimir), log aggregation at scale (Loki with S3 backend), and careful capacity planning.

The Sysbrix team builds and maintains production monitoring infrastructure for teams who need visibility without the ops overhead. If you're evaluating whether self-hosted monitoring makes sense for your scale, we're happy to walk through it.
