Deploy VictoriaMetrics Single Server with systemd + Nginx TLS on Ubuntu (Production Guide)
A practical, production-grade setup for fast, low-footprint metrics storage with secure remote ingestion.
Introduction: a real-world use case
When teams outgrow basic host metrics and need dependable long-term trend analysis, they often want something lighter than a full distributed TSDB cluster. VictoriaMetrics is a strong fit for this stage: compact footprint, fast ingestion, and good query performance on modest hardware. In this guide, we deploy VictoriaMetrics on Ubuntu using a native binary under systemd and front it with Nginx TLS so external writes and reads are controlled and auditable.
This walkthrough is production-oriented. Instead of just making the process run, we focus on safe defaults, service hardening, explicit retention controls, and practical failure-recovery routines. The goal is to give operators a setup that is easy to maintain six months from now, including clear verification checkpoints and common-issue playbooks.
Architecture and flow overview
The service listens on localhost only. Nginx handles HTTPS and selectively proxies API paths. This separation keeps direct internet traffic away from the metrics process. Prometheus-compatible clients send remote_write traffic to the public endpoint, while queries are proxied through controlled routes. We also keep disk paths explicit and isolate the process with systemd hardening so the blast radius is reduced if something goes wrong.
- VictoriaMetrics process and storage at /var/lib/victoriametrics
- systemd for restart policy, startup order, and security boundaries
- Nginx TLS frontend with endpoint-level controls
- Firewall and basic-auth controls for write endpoints
Prerequisites
- Ubuntu 22.04/24.04 server (2 vCPU, 4 GB RAM minimum)
- DNS record for your metrics hostname
- sudo privileges
- Open ports 80/443 and SSH access
- A change window long enough for verification tests
Step-by-step deployment
1) Prepare host baseline
Start with OS updates and required packages. Keep this host single-purpose if possible so resource behavior remains predictable under load.
sudo apt update && sudo apt -y upgrade
sudo apt -y install curl tar nginx ufw jq apache2-utils
sudo useradd --system --no-create-home --shell /usr/sbin/nologin victoriametrics
sudo mkdir -p /var/lib/victoriametrics
sudo chown -R victoriametrics:victoriametrics /var/lib/victoriametrics
2) Install pinned VictoriaMetrics release
Pinning a version avoids surprise regressions. Upgrade deliberately after staging tests.
VM_VERSION="v1.102.1"
curl -L -o /tmp/vm.tar.gz "https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/${VM_VERSION}/victoria-metrics-linux-amd64-${VM_VERSION}.tar.gz"
tar -xzf /tmp/vm.tar.gz -C /tmp
sudo install -m 0755 /tmp/victoria-metrics-prod /usr/local/bin/victoria-metrics
/usr/local/bin/victoria-metrics --version
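Optionally verify the archive's integrity before installing. Below is a minimal, self-contained illustration using a stand-in file; in practice, substitute /tmp/vm.tar.gz and the checksum published for your pinned release (checksum asset names vary across releases, so treat this as a sketch rather than the exact upstream workflow):

```shell
# Stand-in file so the snippet is runnable anywhere; use /tmp/vm.tar.gz in practice
printf 'example archive bytes' > /tmp/vm_demo.tar.gz
# Replace this computed value with the checksum published for the release
EXPECTED="$(sha256sum /tmp/vm_demo.tar.gz | awk '{print $1}')"
# sha256sum -c prints "<file>: OK" when the hash matches
echo "${EXPECTED}  /tmp/vm_demo.tar.gz" | sha256sum -c -
```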
3) Configure systemd service with guardrails
We set restarts, file limits, and restrictive filesystem permissions. Retention is bounded with -retentionPeriod=6; without a suffix, the value is counted in months. The service binds to localhost so only the reverse proxy can expose it.
sudo tee /etc/systemd/system/victoriametrics.service >/dev/null <<'EOF'
[Unit]
Description=VictoriaMetrics
After=network-online.target
Wants=network-online.target
[Service]
User=victoriametrics
Group=victoriametrics
ExecStart=/usr/local/bin/victoria-metrics \
-storageDataPath=/var/lib/victoriametrics \
-retentionPeriod=6 \
-httpListenAddr=127.0.0.1:8428
Restart=always
RestartSec=5
LimitNOFILE=65535
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/victoriametrics
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now victoriametrics
sudo systemctl status victoriametrics --no-pager
4) Configure Nginx TLS and path controls
Expose only explicit routes and deny everything else by default so accidental endpoint exposure is minimized. This config assumes a Let's Encrypt certificate already exists at the paths below (for example, issued with certbot before enabling the site); adjust the paths for your CA.
sudo tee /etc/nginx/sites-available/vm.conf >/dev/null <<'EOF'
server {
listen 80;
server_name metrics.example.com;
return 301 https://$host$request_uri;
}
server {
listen 443 ssl http2;
server_name metrics.example.com;
ssl_certificate /etc/letsencrypt/live/metrics.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/metrics.example.com/privkey.pem;
location /api/v1/query {
proxy_pass http://127.0.0.1:8428;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto https;
}
location /api/v1/write {
auth_basic "Restricted";
auth_basic_user_file /etc/nginx/.vm_htpasswd;
proxy_pass http://127.0.0.1:8428;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto https;
}
location / { return 404; }
}
EOF
sudo ln -sf /etc/nginx/sites-available/vm.conf /etc/nginx/sites-enabled/vm.conf
sudo nginx -t && sudo systemctl reload nginx
5) Protect write access and firewall surface
Restrict ingestion endpoints with basic auth and firewall policy. If available, combine with IP allowlisting.
sudo htpasswd -cB /etc/nginx/.vm_htpasswd ingest_user
sudo chmod 640 /etc/nginx/.vm_htpasswd
sudo chown root:www-data /etc/nginx/.vm_htpasswd
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw --force enable
Configuration and secrets handling
Store endpoint credentials in your secrets manager, not in Git, shell startup files, or ticket comments. Rotate credentials regularly and after team changes. For production change control, keep a parameter inventory documenting retention window, resource limits, and write-access policy. The inventory becomes your baseline during incidents and audits.
Set conservative retention first and observe real usage. Many teams overestimate needed retention in early phases and then struggle with disk pressure. Start with a bounded period, trend the data footprint for at least a week, and scale storage based on measured growth rather than assumptions.
If multiple clients remote_write to this node, define ownership and budget for metric cardinality per team. Enforce label hygiene early; otherwise storage growth and query latency can degrade quickly. Add a regular review for top cardinality sources and drop low-value labels before they become a chronic reliability problem.
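The cardinality review above can be scripted against the Prometheus-compatible /api/v1/status/tsdb endpoint, which VictoriaMetrics serves on its local listen address. The sketch below runs against a canned sample response so it is self-contained; on a live host, replace the heredoc with `curl -s http://127.0.0.1:8428/api/v1/status/tsdb`:

```shell
# Canned sample of a /api/v1/status/tsdb response (illustrative names and values)
cat > /tmp/tsdb_status.json <<'EOF'
{"status":"success","data":{"seriesCountByMetricName":[
  {"name":"http_requests_total","value":120000},
  {"name":"node_cpu_seconds_total","value":64000}
]}}
EOF
# Print series count and metric name, one per line, to spot cardinality offenders
jq -r '.data.seriesCountByMetricName[] | "\(.value)\t\(.name)"' /tmp/tsdb_status.json
```

Feeding the top offenders into a weekly review keeps label hygiene a routine task rather than an incident response.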
Verification
Use repeatable checks after deployment and after every upgrade. Capture outputs in your operations record.
# Local service checks
systemctl is-active victoriametrics
curl -s http://127.0.0.1:8428/health
# Public endpoint checks
curl -I https://metrics.example.com/api/v1/query
# Optional test insert (local; /api/v1/write expects snappy-compressed remote_write
# protobuf, so use the text import endpoint for quick manual tests)
echo 'deploy_test_metric 1' | curl -s --data-binary @- "http://127.0.0.1:8428/api/v1/import/prometheus"
# Confirm the public write endpoint rejects unauthenticated requests (expect 401)
curl -s -o /dev/null -w '%{http_code}\n' -X POST "https://metrics.example.com/api/v1/write"
Then inspect logs for at least one normal traffic cycle:
journalctl -u victoriametrics -n 100 --no-pager
sudo tail -n 100 /var/log/nginx/access.log
sudo tail -n 100 /var/log/nginx/error.log
Common issues/fixes
TLS is valid but requests still fail
Check DNS resolution and ensure Nginx is serving the expected hostname. Misaligned server_name values are a common root cause.
Writes return 401 or 403
Validate htpasswd credentials, verify the client is targeting the write endpoint, and confirm there is no stale proxy cache.
Unexpected disk growth
Audit metric cardinality, reduce noisy labels, and shorten retention while you remediate the source of growth.
Service repeatedly restarts
Read systemd logs and verify ownership/permissions under /var/lib/victoriametrics. Also confirm binary path and startup flags are valid after upgrades.
Slow query response under burst traffic
Review CPU steal, I/O wait, and high-cardinality queries. Consider query guardrails and dedicated resources if mixed workloads share the host.
FAQ
1) Is this setup enough for production?
For many teams, yes. It suits single-node production where simplicity and operational speed are priorities, and you can evolve to a clustered architecture later.
2) Why use systemd instead of a container here?
Native systemd deployments are straightforward on dedicated Ubuntu hosts and reduce moving parts. For platform teams standardized on containers, both approaches are valid.
3) How often should credentials be rotated?
At minimum quarterly, and immediately after incident response or staffing changes affecting operational access.
4) What is the most common scaling pitfall?
Unchecked metric cardinality. Teams often add dynamic labels that multiply series count and storage unexpectedly.
5) How do I plan backup and restore?
Take regular snapshots (VictoriaMetrics exposes a snapshot API at /snapshot/create on the local listen address, and vmbackup can ship snapshots to object storage), keep enough restore points for your recovery objectives, and test restore drills in an isolated environment at least quarterly.
6) Can I migrate gradually from existing Prometheus storage?
Yes. A phased migration with dual-write during validation is common and reduces rollback risk.
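On the Prometheus side, dual-write can be sketched with a remote_write block like the following (hedged example: metrics.example.com and ingest_user are this guide's placeholder values, and the password_file path is hypothetical):

```yaml
# prometheus.yml fragment: keep local storage while also writing to VictoriaMetrics
remote_write:
  - url: https://metrics.example.com/api/v1/write
    basic_auth:
      username: ingest_user
      # Reference a file rather than embedding the secret in the config
      password_file: /etc/prometheus/vm_write_password
```

Once dashboards validated against both backends agree, you can retire the old storage path.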
7) Should query and write endpoints share the same auth policy?
Not always. Keep write endpoints tightly controlled, and apply least privilege to query users based on operational needs.
Related guides
- How to Deploy Netdata on Kubernetes with Helm for Production-Grade Monitoring
- Headscale Self-Hosted VPN: Production Docker Compose + Caddy + OIDC Deployment Guide
- Deploy Uptime Kuma with Docker Compose and Caddy on Ubuntu (Production Guide)
Talk to us
If you want help implementing production observability, hardening metric pipelines, or planning a phased migration, we can help you design and operate it safely.