When remote employees, contractors, and multi-cloud workloads all need private access to internal services, a traditional VPN quickly becomes a bottleneck. Teams start with a single concentrator, then discover recurring pain: brittle client onboarding, difficult key rotation, weak device visibility, and flat network exposure that violates least-privilege principles. In this guide, we deploy a production-ready NetBird control plane on Kubernetes using Helm and cert-manager so you can deliver zero-trust network access with auditable policy, identity-aware enrollment, and encrypted WireGuard connectivity.
The deployment pattern in this tutorial is designed for operations teams that need predictable upgrades and clear rollback paths. We use separate namespaces, explicit secrets, TLS automation, and health checks so the stack is manageable in day-2 operations. By the end, you will have a hardened baseline you can adapt for staging and production environments, plus practical troubleshooting notes for the failures most teams hit first.
Architecture and flow overview
This implementation uses Kubernetes as the control-plane substrate and Helm as the release manager. cert-manager issues and renews TLS certificates via ACME, while your ingress controller terminates HTTPS and routes traffic to the NetBird dashboard and management endpoints. Users authenticate through your configured identity provider, enroll clients, and receive policy-defined access to private resources over WireGuard tunnels.
- Kubernetes cluster: runs NetBird components with controlled rollout and observability hooks.
- Helm chart values: keeps environment-specific overrides in versioned YAML.
- cert-manager + ClusterIssuer: automates trusted certificates and renewal.
- Ingress + DNS: exposes secure endpoints at predictable hostnames.
- Secrets: stores sensitive values outside application manifests.
Prerequisites
- Kubernetes 1.27+ cluster with working ingress controller.
- kubectl and helm installed on your admin workstation.
- DNS control for a domain such as vpn.example.com.
- An ACME email for cert-manager issuance.
- A secure secret-management workflow (Vault, SOPS, or sealed secrets).
- Outbound internet access from cluster nodes for image pulls and ACME API calls, plus inbound HTTP (port 80) to the ingress controller so HTTP-01 challenges can be validated.
Step-by-step deployment
1) Create namespace and baseline labels
Create a dedicated namespace so resource quotas, policies, and troubleshooting remain scoped. Label it for ownership and environment filtering.
kubectl create namespace netbird
kubectl label namespace netbird app=netbird env=prod owner=platform-team
2) Install cert-manager (if not already installed)
cert-manager is a hard requirement for automated TLS. If your platform team already operates it cluster-wide, skip this step and reuse the existing ClusterIssuer policy.
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm upgrade --install cert-manager jetstack/cert-manager --namespace cert-manager --create-namespace --set crds.enabled=true
3) Create a production ClusterIssuer
Use Let's Encrypt production only after validating with staging in non-production clusters. Keep your notification email and challenge solver explicit.
cat <<'EOF' | kubectl apply -f -
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod
spec:
  acme:
    email: <ACME_EMAIL>
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-prod
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
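Because the production issuer should only be used after a staging run, a staging ClusterIssuer could look like the sketch below. The issuer name letsencrypt-staging, the file path, and the <ACME_EMAIL> placeholder are assumptions for illustration; the server URL is Let's Encrypt's public staging endpoint.

```shell
# Sketch: write a staging ClusterIssuer manifest to disk for review.
# The issuer name and file path are illustrative choices.
mkdir -p deploy/cert-manager
cat <<'EOF' > deploy/cert-manager/clusterissuer-staging.yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging
spec:
  acme:
    email: <ACME_EMAIL>   # placeholder: replace with your ACME contact address
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-staging
    solvers:
      - http01:
          ingress:
            class: nginx
EOF
# Apply it once the email placeholder is filled in:
#   kubectl apply -f deploy/cert-manager/clusterissuer-staging.yaml
```

Staging certificates are not trusted by browsers, so this issuer is only useful for confirming that the challenge flow works. In a non-production values file, the cert-manager.io/cluster-issuer annotation would point at letsencrypt-staging instead of letsencrypt-prod.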
4) Add Helm repository and prepare values
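Store values in Git and keep non-secret settings separate from secret material. The snippet below shows the core production knobs: ingress hostname, TLS issuer, replica count, resource requests and limits, and a PodDisruptionBudget. Exact key names depend on the chart version, so compare them against the output of helm show values netbird/netbird before applying.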
helm repo add netbird https://netbirdio.github.io/helm-charts
helm repo update
mkdir -p deploy/netbird
cat <<'EOF' > deploy/netbird/values-prod.yaml
ingress:
  enabled: true
  className: nginx
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
  hosts:
    - host: vpn.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: netbird-tls
      hosts:
        - vpn.example.com
replicaCount: 2
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 1000m
    memory: 1Gi
podDisruptionBudget:
  enabled: true
  minAvailable: 1
EOF
5) Create secrets safely
Never commit literal credentials. Create Kubernetes secrets from your secret store export during CI/CD deployment. Rotate all bootstrap secrets after first successful login.
kubectl -n netbird create secret generic netbird-secrets --from-literal=NETBIRD_DOMAIN=vpn.example.com [email protected] --from-literal=NETBIRD_ENCRYPTION_KEY='replace-with-32-byte-random'
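The 'replace-with-32-byte-random' placeholder has to come from somewhere. One way to produce such a value is sketched below; it assumes the chart accepts a base64-encoded 32-byte key, which should be confirmed against the chart's documentation.

```shell
# Sketch: generate 32 random bytes and base64-encode them (44 characters).
# Whether the chart expects base64 or another encoding is an assumption.
NETBIRD_ENCRYPTION_KEY="$(head -c 32 /dev/urandom | base64)"

# Print only the length so the key itself never reaches terminal logs.
echo "generated key length: ${#NETBIRD_ENCRYPTION_KEY}"
```

The variable can then be passed as --from-literal=NETBIRD_ENCRYPTION_KEY="$NETBIRD_ENCRYPTION_KEY" in place of the literal placeholder, ideally inside a CI job that reads the value from your secret store rather than generating it on every run.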
6) Install or upgrade NetBird release
Use helm upgrade --install for idempotent deployments. Keep release naming stable for clean rollback and audit trails.
helm upgrade --install netbird netbird/netbird --namespace netbird -f deploy/netbird/values-prod.yaml --wait --timeout 10m
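For reproducible releases, the same command can be wrapped so that a chart version is always supplied. The helper name deploy_netbird and the HELM override variable are hypothetical; --version is a standard helm upgrade flag.

```shell
# HELM can be overridden (for example HELM=echo) to preview the command.
HELM="${HELM:-helm}"

# deploy_netbird <chart-version>: hypothetical helper, not part of the chart.
deploy_netbird() {
  chart_version="${1:-}"
  if [ -z "$chart_version" ]; then
    echo "usage: deploy_netbird <chart-version>" >&2
    return 1
  fi
  # Same release name, namespace, and values file as the step above,
  # with the chart version pinned explicitly.
  "$HELM" upgrade --install netbird netbird/netbird \
    --namespace netbird \
    --version "$chart_version" \
    -f deploy/netbird/values-prod.yaml \
    --wait --timeout 10m
}

# List the versions available in the repository before choosing one:
#   helm search repo netbird/netbird --versions
```

Refusing to run without a version keeps an unreviewed chart release from being pulled in by a routine redeploy.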
7) Validate rollout and endpoints
Confirm pods are healthy, certificates are issued, and ingress is serving expected hostnames before onboarding users.
kubectl -n netbird get pods -o wide
kubectl -n netbird get ingress
kubectl -n netbird get certificate
kubectl -n netbird describe certificate netbird-tls
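In a pipeline, the manual checks above can be turned into a blocking gate with kubectl wait. This is a sketch: the helper name wait_for_netbird and the KUBECTL override are hypothetical, and it assumes the chart runs its components as Deployments rather than StatefulSets.

```shell
# KUBECTL can be pointed at a stub command for dry runs.
KUBECTL="${KUBECTL:-kubectl}"

# wait_for_netbird <namespace> <certificate-name>: hypothetical helper.
wait_for_netbird() {
  ns="${1:-netbird}"
  cert="${2:-netbird-tls}"
  # Every Deployment must report Available, then the Certificate must be Ready.
  "$KUBECTL" -n "$ns" wait --for=condition=Available deployment --all --timeout=300s &&
    "$KUBECTL" -n "$ns" wait --for=condition=Ready "certificate/$cert" --timeout=300s
}
```

Calling wait_for_netbird netbird netbird-tls returns a non-zero status if either condition is not met within the timeout, which lets a deployment job fail before users are onboarded.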
Configuration and secrets handling best practices
Production NetBird operations improve significantly when configuration and secrets are treated as separate concerns. Keep chart values under version control, then inject secrets at deploy time through a secure pipeline. This gives you reproducible releases without exposing credential material in Git history.
- Use environment overlays (values-dev.yaml, values-stage.yaml, values-prod.yaml).
- Encrypt sensitive manifests with SOPS or use external secret operators.
- Pin chart versions and track upgrade notes in your runbook.
- Rotate encryption keys and admin bootstrap credentials on a fixed schedule.
- Restrict namespace RBAC so only platform admins can read secret objects.
For identity, enforce SSO-backed authentication and avoid shared admin accounts. For networking, apply NetworkPolicy so control-plane components only talk to required services. This reduces lateral movement risk if any pod is compromised.
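As a starting point for the NetworkPolicy recommendation, the sketch below limits inbound traffic to pods in the netbird namespace. The policy name, the file path, and the ingress-nginx namespace are assumptions; components exposed outside the ingress controller (for example through a LoadBalancer service) would need additional allow rules, and the cluster's CNI must enforce NetworkPolicy for this to have any effect.

```shell
# Sketch: write an ingress-restricting NetworkPolicy to disk for review.
mkdir -p deploy/netbird
cat <<'EOF' > deploy/netbird/networkpolicy.yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: netbird-restrict-ingress
  namespace: netbird
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # pods in the same namespace
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx  # assumed controller namespace
EOF
# Review, then apply:
#   kubectl apply -f deploy/netbird/networkpolicy.yaml
```

This only covers inbound traffic. Egress rules are left open here because the control plane's outbound dependencies (identity provider, database, ACME) vary by environment.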
Verification checklist
- Dashboard endpoint resolves and serves a valid certificate chain.
- Management API health endpoint returns 200 OK.
- At least two replicas are running for critical control-plane components.
- Client enrollment works with your identity provider and expected group mapping.
- Policy rules enforce least privilege between peer groups.
- Audit logs are collected centrally (SIEM/observability stack).
curl -I https://vpn.example.com
kubectl -n netbird logs deploy/netbird-management --tail=100
kubectl -n netbird get events --sort-by=.metadata.creationTimestamp | tail -n 30
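The replica item in the checklist can also be checked mechanically. The sketch below is one way to do it; the helper names and the KUBECTL override are hypothetical, and it assumes the components run as Deployments.

```shell
# KUBECTL can be pointed at a stub command for dry runs.
KUBECTL="${KUBECTL:-kubectl}"

# list_ready_replicas <namespace>: prints one "name readyReplicas" line per
# Deployment. The count is empty when no replica is ready.
list_ready_replicas() {
  "$KUBECTL" -n "${1:-netbird}" get deployments \
    -o jsonpath='{range .items[*]}{.metadata.name}{" "}{.status.readyReplicas}{"\n"}{end}'
}

# under_replicated <min>: reads "name readyReplicas" lines on stdin and prints
# the names whose ready count is below <min>. An empty count is treated as 0.
under_replicated() {
  awk -v min="${1:-2}" '($2 + 0) < (min + 0) { print $1 }'
}
```

Running list_ready_replicas netbird | under_replicated 2 prints nothing when every Deployment has at least two ready replicas, so any output can be treated as a failed check.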
Common issues and fixes
ACME challenge fails repeatedly
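Check for an ingress class mismatch and incomplete DNS propagation. Most failures come from a solver class value that does not match the ingress controller, stale A records, or port 80 being unreachable from the internet during the HTTP-01 challenge.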
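cert-manager records the state of each issuance attempt in CertificateRequest, Order, and Challenge resources, and the failure reason usually appears there. A minimal sketch for inspecting them follows; the helper name acme_debug and the KUBECTL override are hypothetical.

```shell
# KUBECTL can be pointed at a stub command for dry runs.
KUBECTL="${KUBECTL:-kubectl}"

# acme_debug <namespace>: hypothetical helper that surfaces ACME issuance state.
acme_debug() {
  ns="${1:-netbird}"
  # Overview of the issuance chain created for the Certificate.
  "$KUBECTL" -n "$ns" get certificaterequests,orders,challenges
  # Challenge events and status usually name the failing check.
  "$KUBECTL" -n "$ns" describe challenges
}
```

Challenge resources are deleted once they succeed, so an empty list after a successful issuance is expected.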
Pods restart after deployment
Usually resource limits are too low or secret keys are malformed. Validate secret keys and increase memory limits for management components.
Users can authenticate but cannot reach private services
Review NetBird policy definitions and peer group assignments. Also verify destination service firewalls allow WireGuard CIDR ranges.
Certificate is valid but browser warns intermittently
Confirm your load balancer is not serving an old default cert on one backend and ensure TLS secret names are identical across ingress resources.
FAQ
1) Can I run NetBird with a single-node Kubernetes cluster?
Yes, for labs. Production should run on a multi-node cluster with at least two replicas for critical services.
2) Which ingress controller is best for this setup?
Nginx, Traefik, and HAProxy can work. Choose the controller your platform team already monitors and operates reliably.
3) How often should I rotate NetBird encryption keys?
Set a formal quarterly or semiannual rotation policy, and always rotate immediately after suspected credential exposure.
4) Do I need a managed database?
Not strictly, but managed data services often reduce operational risk and simplify backup/restore objectives.
5) What is the minimum logging I should capture?
Ingress access logs, management service logs, auth events, and Kubernetes events at deployment and incident windows.
6) How do I roll back safely after a failed chart upgrade?
Use helm rollback to the prior revision, then re-run health checks before re-opening user enrollment.
7) How should I separate staging and production access?
Use distinct hostnames, identities/groups, and policy sets; never share production admin credentials with non-production clusters.
helm -n netbird history netbird
helm -n netbird rollback netbird <REVISION_NUMBER>
kubectl -n netbird get pods
Talk to us
If you want help designing a zero-trust rollout, hardening your Kubernetes control plane, or building an operations runbook your team can actually execute during incidents, we can help.