Deploy VictoriaMetrics Single Server with systemd + Nginx TLS on Ubuntu (Production Guide)
A practical, production-grade setup for fast, low-footprint metrics storage with secure remote ingestion.
Introduction: a real-world use case
When teams outgrow basic host metrics and need dependable long-term trend analysis, they often want something lighter than a full distributed TSDB cluster. VictoriaMetrics is a strong fit for this stage: compact footprint, fast ingestion, and good query performance on modest hardware. In this guide, we deploy VictoriaMetrics on Ubuntu using a native binary under systemd and front it with Nginx TLS so external writes and reads are controlled and auditable.
This walkthrough is production-oriented. Instead of just making the process run, we focus on safe defaults, service hardening, explicit retention controls, and practical failure-recovery routines. The goal is to give operators a setup that is easy to maintain six months from now, including clear verification checkpoints and common-issue playbooks.
Architecture and flow overview
The service listens on localhost only. Nginx handles HTTPS and selectively proxies API paths. This separation keeps direct internet traffic away from the metrics process. Prometheus-compatible clients send remote_write traffic to the public endpoint, while queries are proxied through controlled routes. We also keep disk paths explicit and isolate the process with systemd hardening so the blast radius is reduced if something goes wrong.
- VictoriaMetrics process and storage at /var/lib/victoriametrics
- systemd for restart policy, startup order, and security boundaries
- Nginx TLS frontend with endpoint-level controls
- Firewall and basic-auth controls for write endpoints
Prerequisites
- Ubuntu 22.04/24.04 server (2 vCPU, 4 GB RAM minimum)
- DNS record for your metrics hostname
- sudo privileges
- Open ports 80/443 and SSH access
- A change window long enough for verification tests
Step-by-step deployment
1) Prepare host baseline
Start with OS updates and required packages. Keep this host single-purpose if possible so resource behavior remains predictable under load.
sudo apt update && sudo apt -y upgrade
sudo apt -y install curl tar nginx ufw jq apache2-utils
sudo useradd --system --no-create-home --shell /usr/sbin/nologin victoriametrics
sudo mkdir -p /var/lib/victoriametrics
sudo chown -R victoriametrics:victoriametrics /var/lib/victoriametrics
2) Install pinned VictoriaMetrics release
Pinning a version avoids surprise regressions. Upgrade deliberately after staging tests.
VM_VERSION="v1.102.1"
curl -L -o /tmp/vm.tar.gz "https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/${VM_VERSION}/victoria-metrics-linux-amd64-${VM_VERSION}.tar.gz"
tar -xzf /tmp/vm.tar.gz -C /tmp
sudo install -m 0755 /tmp/victoria-metrics-prod /usr/local/bin/victoria-metrics
/usr/local/bin/victoria-metrics --version
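Optionally verify the archive's integrity before installing. Below is a minimal, self-contained illustration using a stand-in file; in practice, substitute /tmp/vm.tar.gz and the checksum published for your pinned release (checksum asset names vary across releases, so treat this as a sketch rather than the exact upstream workflow):

```shell
# Stand-in file so the snippet is runnable anywhere; use /tmp/vm.tar.gz in practice
printf 'example archive bytes' > /tmp/vm_demo.tar.gz
# Replace this computed value with the checksum published for the release
EXPECTED="$(sha256sum /tmp/vm_demo.tar.gz | awk '{print $1}')"
# sha256sum -c prints "<file>: OK" when the hash matches
echo "${EXPECTED}  /tmp/vm_demo.tar.gz" | sha256sum -c -
```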
3) Configure systemd service with guardrails
We set restarts, file limits, and restrictive filesystem permissions. Retention is bounded with -retentionPeriod=6; without a suffix, the value is counted in months. The service binds to localhost so only the reverse proxy can expose it.
sudo tee /etc/systemd/system/victoriametrics.service >/dev/null <<'EOF'
[Unit]
Description=VictoriaMetrics
After=network-online.target
Wants=network-online.target
[Service]
User=victoriametrics
Group=victoriametrics
ExecStart=/usr/local/bin/victoria-metrics \
-storageDataPath=/var/lib/victoriametrics \
-retentionPeriod=6 \
-httpListenAddr=127.0.0.1:8428
Restart=always
RestartSec=5
LimitNOFILE=65535
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/victoriametrics
[Install]
WantedBy=multi-user.target
EOF
sudo systemctl daemon-reload
sudo systemctl enable --now victoriametrics
sudo systemctl status victoriametrics --no-pager
4) Configure Nginx TLS and path controls
Expose only explicit routes and deny everything else by default so accidental endpoint exposure is minimized. This config assumes a Let's Encrypt certificate already exists at the paths below (for example, issued with certbot before enabling the site); adjust the paths for your CA.
sudo tee /etc/nginx/sites-available/vm.conf >/dev/null <<'EOF'
server {
listen 80;
server_name metrics.example.com;
return 301 https://$host$request_uri;
}
server {
listen 443 ssl http2;
server_name metrics.example.com;
ssl_certificate /etc/letsencrypt/live/metrics.example.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/metrics.example.com/privkey.pem;
location /api/v1/query {
proxy_pass http://127.0.0.1:8428;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto https;
}
location /api/v1/write {
auth_basic "Restricted";
auth_basic_user_file /etc/nginx/.vm_htpasswd;
proxy_pass http://127.0.0.1:8428;
proxy_set_header Host $host;
proxy_set_header X-Forwarded-Proto https;
}
location / { return 404; }
}
EOF
sudo ln -sf /etc/nginx/sites-available/vm.conf /etc/nginx/sites-enabled/vm.conf
sudo nginx -t && sudo systemctl reload nginx
5) Protect write access and firewall surface
Restrict ingestion endpoints with basic auth and firewall policy. If available, combine with IP allowlisting.
sudo htpasswd -cB /etc/nginx/.vm_htpasswd ingest_user
sudo chmod 640 /etc/nginx/.vm_htpasswd
sudo chown root:www-data /etc/nginx/.vm_htpasswd
sudo ufw allow OpenSSH
sudo ufw allow 80/tcp
sudo ufw allow 443/tcp
sudo ufw --force enable
Configuration and secrets handling
Store endpoint credentials in your secrets manager, not in Git, shell startup files, or ticket comments. Rotate credentials regularly and after team changes. For production change control, keep a parameter inventory documenting retention window, resource limits, and write-access policy. The inventory becomes your baseline during incidents and audits.
Set conservative retention first and observe real usage. Many teams overestimate needed retention in early phases and then struggle with disk pressure. Start with a bounded period, trend the data footprint for at least a week, and scale storage based on measured growth rather than assumptions.
If multiple clients remote_write to this node, define ownership and budget for metric cardinality per team. Enforce label hygiene early; otherwise storage growth and query latency can degrade quickly. Add a regular review for top cardinality sources and drop low-value labels before they become a chronic reliability problem.
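The cardinality review above can be scripted against the Prometheus-compatible /api/v1/status/tsdb endpoint, which VictoriaMetrics serves on its local listen address. The sketch below runs against a canned sample response so it is self-contained; on a live host, replace the heredoc with `curl -s http://127.0.0.1:8428/api/v1/status/tsdb`:

```shell
# Canned sample of a /api/v1/status/tsdb response (illustrative names and values)
cat > /tmp/tsdb_status.json <<'EOF'
{"status":"success","data":{"seriesCountByMetricName":[
  {"name":"http_requests_total","value":120000},
  {"name":"node_cpu_seconds_total","value":64000}
]}}
EOF
# Print series count and metric name, one per line, to spot cardinality offenders
jq -r '.data.seriesCountByMetricName[] | "\(.value)\t\(.name)"' /tmp/tsdb_status.json
```

Feeding the top offenders into a weekly review keeps label hygiene a routine task rather than an incident response.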
Verification
Use repeatable checks after deployment and after every upgrade. Capture outputs in your operations record.
# Local service checks
systemctl is-active victoriametrics
curl -s http://127.0.0.1:8428/health
# Public endpoint checks
curl -I https://metrics.example.com/api/v1/query
# Optional test insert (local; /api/v1/write expects snappy-compressed remote_write
# protobuf, so use the text import endpoint for quick manual tests)
echo 'deploy_test_metric 1' | curl -s --data-binary @- "http://127.0.0.1:8428/api/v1/import/prometheus"
# Confirm the public write endpoint rejects unauthenticated requests (expect 401)
curl -s -o /dev/null -w '%{http_code}\n' -X POST "https://metrics.example.com/api/v1/write"
Then inspect logs for at least one normal traffic cycle:
journalctl -u victoriametrics -n 100 --no-pager
sudo tail -n 100 /var/log/nginx/access.log
sudo tail -n 100 /var/log/nginx/error.log
Common issues/fixes
TLS is valid but requests still fail
Check DNS resolution and ensure Nginx is serving the expected hostname. Misaligned server_name values are a common root cause.
Writes return 401 or 403
Validate htpasswd credentials, verify the client is targeting the write endpoint, and confirm there is no stale proxy cache.
Unexpected disk growth
Audit metric cardinality, reduce noisy labels, and shorten retention while you remediate the source of growth.
Service repeatedly restarts
Read systemd logs and verify ownership/permissions under /var/lib/victoriametrics. Also confirm binary path and startup flags are valid after upgrades.
Slow query response under burst traffic
Review CPU steal, I/O wait, and high-cardinality queries. Consider query guardrails and dedicated resources if mixed workloads share the host.
FAQ
1) Is this setup enough for production?
For many teams, yes. It suits single-node production where simplicity and operational speed are priorities, and you can evolve to a clustered architecture later.
2) Why use systemd instead of a container here?
Native systemd deployments are straightforward on dedicated Ubuntu hosts and reduce moving parts. For platform teams standardized on containers, both approaches are valid.
3) How often should credentials be rotated?
At minimum quarterly, and immediately after incident response or staffing changes affecting operational access.
4) What is the most common scaling pitfall?
Unchecked metric cardinality. Teams often add dynamic labels that multiply series count and storage unexpectedly.
5) How do I plan backup and restore?
Take regular snapshots (VictoriaMetrics exposes a snapshot API at /snapshot/create on the local listen address, and vmbackup can ship snapshots to object storage), keep enough restore points for your recovery objectives, and test restore drills in an isolated environment at least quarterly.
6) Can I migrate gradually from existing Prometheus storage?
Yes. A phased migration with dual-write during validation is common and reduces rollback risk.
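On the Prometheus side, dual-write can be sketched with a remote_write block like the following (hedged example: metrics.example.com and ingest_user are this guide's placeholder values, and the password_file path is hypothetical):

```yaml
# prometheus.yml fragment: keep local storage while also writing to VictoriaMetrics
remote_write:
  - url: https://metrics.example.com/api/v1/write
    basic_auth:
      username: ingest_user
      # Reference a file rather than embedding the secret in the config
      password_file: /etc/prometheus/vm_write_password
```

Once dashboards validated against both backends agree, you can retire the old storage path.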
7) Should query and write endpoints share the same auth policy?
Not always. Keep write endpoints tightly controlled, and apply least privilege to query users based on operational needs.
Related guides
- How to Deploy Netdata on Kubernetes with Helm for Production-Grade Monitoring
- Headscale Self-Hosted VPN: Production Docker Compose + Caddy + OIDC Deployment Guide
- Deploy Uptime Kuma with Docker Compose and Caddy on Ubuntu (Production Guide)
Talk to us
If you want help implementing production observability, hardening metric pipelines, or planning a phased migration, we can help you design and operate it safely.