If your team needs fast search over docs, incident notes, and product catalogs, self-hosting a search engine such as Meilisearch can reduce latency, improve data control, and cut recurring SaaS costs. But many deployments fail in production due to weak secret handling, missing backup policies, and the lack of a rollout strategy.
This guide shows a production-oriented deployment on Kubernetes using Helm. You will deploy a hardened stack, keep credentials out of source control, validate each layer, and prepare for day-2 operations such as upgrades, snapshots, and troubleshooting under load.
A common real-world pattern is an internal operations portal where teams search playbooks, postmortems, and customer docs. Centralized search can remove friction, but only if availability, access boundaries, and backup objectives are designed up front.
Architecture and Flow Overview
- Ingress: TLS termination and routing to the search service.
- Search service: Meilisearch API + index storage on persistent volume.
- Secret layer: Kubernetes Secret for API key and bootstrap values.
- Observability: readiness/liveness probes and basic service metrics.
- Backup path: snapshot export job to object storage on schedule.
Request flow: client request → ingress → service → Meilisearch pod → persistent volume. Admin/API calls are protected by an API key and restricted network policy. For high-change workloads, keep ingestion workers separate from query services so indexing spikes do not degrade user-facing searches.
Capacity planning should include expected document volume, average document size, update frequency, and query concurrency. These parameters influence CPU, RAM, and disk provisioning more than cluster size alone.
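As a starting point, a back-of-envelope sizing sketch like the following can turn those parameters into disk numbers. The 3x disk multiplier and the sample figures are illustrative assumptions, not Meilisearch guarantees; validate against a staging load test with your real corpus.

```shell
DOCS=2000000        # expected document count (assumption)
AVG_KB=4            # average document size in KB (assumption)
RAW_GB=$(( DOCS * AVG_KB / 1024 / 1024 ))   # raw corpus size in GB
DISK_GB=$(( RAW_GB * 3 + 10 ))              # rough index overhead + update buffers + headroom
echo "raw=${RAW_GB}GB provision_disk=${DISK_GB}GB"
```

Re-run the estimate whenever document volume or average size changes materially, and compare it against actual disk usage after the first full ingest.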
Prerequisites
- Kubernetes cluster (v1.26+ recommended) with a default StorageClass.
- kubectl access with namespace-admin privileges.
- Helm v3.12+ installed locally.
- A DNS name for the service (for example, search.example.com).
- Cert-manager or an existing wildcard TLS certificate.
- Object storage bucket for snapshots (S3-compatible is fine).
- CI/CD runner capable of Helm deploy and secret injection.
Step-by-Step Deployment
1) Create namespace and baseline policy
kubectl create namespace meilisearch
cat <<'YAML' | kubectl apply -f -
apiVersion: v1
kind: LimitRange
metadata:
  name: meili-limits
  namespace: meilisearch
spec:
  limits:
    - type: Container
      default:          # placeholder defaults; tune to your workload
        cpu: "1"
        memory: 2Gi
      defaultRequest:
        cpu: 250m
        memory: 512Mi
YAML
kubectl label namespace meilisearch app=meilisearch --overwrite
2) Add Helm repo and inspect chart values
helm repo add meilisearch https://meilisearch.github.io/meilisearch-kubernetes
helm repo update
helm search repo meilisearch/meilisearch --versions | head -n 5
helm show values meilisearch/meilisearch > values.reference.yaml
3) Create secrets without committing plaintext
export MEILI_MASTER_KEY="$(openssl rand -hex 32)"
kubectl -n meilisearch create secret generic meili-secrets --from-literal=MEILI_MASTER_KEY="$MEILI_MASTER_KEY" --dry-run=client -o yaml | kubectl apply -f -
unset MEILI_MASTER_KEY
4) Prepare hardened values file
cat > values.prod.yaml <<'YAML'
replicaCount: 2
image:
  tag: v1.11
persistence:
  enabled: true
  size: 200Gi
resources:
  requests:
    cpu: "500m"
    memory: "1Gi"
  limits:
    cpu: "2"
    memory: "4Gi"
envFrom:
  - secretRef:
      name: meili-secrets
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: search.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: search-tls
      hosts:
        - search.example.com
livenessProbe:
  httpGet:
    path: /health
    port: 7700
readinessProbe:
  httpGet:
    path: /health
    port: 7700
YAML
5) Install the release and validate rollout
helm upgrade --install meilisearch meilisearch/meilisearch -n meilisearch -f values.prod.yaml
kubectl -n meilisearch rollout status deploy/meilisearch --timeout=5m
kubectl -n meilisearch get pods -o wide
6) Restrict network access
cat > netpol.yaml <<'YAML'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: meili-allow-ingress-only
  namespace: meilisearch
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              ingress: allowed
      ports:
        - protocol: TCP
          port: 7700
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
YAML
kubectl apply -f netpol.yaml
7) Seed a staging index and verify query behavior
# Re-read the master key from the Secret (it was unset locally in step 3).
MEILI_MASTER_KEY="$(kubectl -n meilisearch get secret meili-secrets -o jsonpath='{.data.MEILI_MASTER_KEY}' | base64 -d)"
# Document addition is asynchronous: the POST returns a task; poll /tasks/<uid> until it succeeds before querying.
curl -X POST 'https://search.example.com/indexes/docs/documents' -H 'Content-Type: application/json' -H "Authorization: Bearer $MEILI_MASTER_KEY" --data-binary @sample-docs.json
curl -sS 'https://search.example.com/indexes/docs/search' -H 'Content-Type: application/json' -H "Authorization: Bearer $MEILI_MASTER_KEY" --data '{"q":"incident response"}' | jq .
Configuration and Secret-Handling Best Practices
Keep production values in a dedicated file and never store sensitive keys in Git. Use a secret manager integration if available (External Secrets Operator, Vault, or cloud KMS). Rotate master keys on a schedule and after personnel changes. If you use CI/CD, inject secrets at deploy time and avoid writing them to logs.
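If you run the External Secrets Operator, a resource like the following keeps the master key out of Git entirely by syncing it from an external store into the `meili-secrets` Secret. The SecretStore name and remote key path below are placeholders for your environment.

```shell
cat > externalsecret.yaml <<'YAML'
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: meili-secrets
  namespace: meilisearch
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: vault-backend          # placeholder SecretStore name
    kind: SecretStore
  target:
    name: meili-secrets          # Kubernetes Secret the chart consumes
  data:
    - secretKey: MEILI_MASTER_KEY
      remoteRef:
        key: search/meilisearch  # placeholder path in the external store
YAML
```

Apply it with `kubectl apply -f externalsecret.yaml`; the operator then owns the Secret's lifecycle, including rotation on `refreshInterval`.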
For index settings, define typo tolerance, ranking rules, and searchable attributes explicitly per index. Do not rely on defaults across environments. Maintain a migration checklist for schema/index setting changes and apply them in staging first.
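A hypothetical explicit settings file for the `docs` index might look like this; the attribute names are placeholders for your schema, and the curl step only runs when `MEILI_URL` and `MEILI_MASTER_KEY` are exported. Apply it in staging first.

```shell
cat > docs-settings.json <<'JSON'
{
  "searchableAttributes": ["title", "body", "tags"],
  "rankingRules": ["words", "typo", "proximity", "attribute", "sort", "exactness"],
  "typoTolerance": { "minWordSizeForTypos": { "oneTypo": 5, "twoTypos": 9 } }
}
JSON
# Push the settings only when the target environment is configured.
if [ -n "${MEILI_URL:-}" ]; then
  curl -X PATCH "$MEILI_URL/indexes/docs/settings" \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $MEILI_MASTER_KEY" \
    --data-binary @docs-settings.json
fi
```

Keeping the file in version control (it contains no secrets) gives you a reviewable record of per-index settings across environments.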
When exposing the API externally, enforce TLS, rate limits, and an allowlist for admin endpoints. Use separate API keys for ingestion and query traffic so revocation can be scoped without full outage risk. Keep dashboards for request rate, latency buckets, and non-2xx responses so regressions surface before customers notice impact.
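Scoped keys can be minted through the Meilisearch /keys endpoint. The key names and index scope below are illustrative, and the curl loop is guarded so it only runs when `MEILI_URL` and `MEILI_MASTER_KEY` are set.

```shell
# Query-only key for frontends.
cat > query-key.json <<'JSON'
{ "name": "frontend-query", "actions": ["search"], "indexes": ["docs"], "expiresAt": null }
JSON
# Ingestion key for pipelines; cannot serve search traffic.
cat > ingest-key.json <<'JSON'
{ "name": "pipeline-ingest", "actions": ["documents.add", "documents.delete"], "indexes": ["docs"], "expiresAt": null }
JSON
if [ -n "${MEILI_URL:-}" ]; then
  for f in query-key.json ingest-key.json; do
    curl -X POST "$MEILI_URL/keys" -H 'Content-Type: application/json' \
      -H "Authorization: Bearer $MEILI_MASTER_KEY" --data-binary @"$f"
  done
fi
```

Because each key carries its own `actions` and `indexes`, revoking the ingestion key leaves query traffic untouched, and vice versa.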
In regulated environments, attach retention and purge rules to index categories. For example, operational logs might need shorter retention than contractual or compliance documents. Document these controls in your runbook and review quarterly.
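For the scheduled backup path, one sketch is a CronJob that triggers Meilisearch's POST /snapshots endpoint on a schedule. The service DNS name and curl image tag are assumptions for this setup; shipping the resulting snapshot files from the persistent volume to your object storage bucket is a separate step that depends on how the volume is mounted.

```shell
cat > snapshot-cron.yaml <<'YAML'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: meili-snapshot
  namespace: meilisearch
spec:
  schedule: "0 * * * *"            # hourly; align with your RPO
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: trigger-snapshot
              image: curlimages/curl:8.5.0   # assumed image tag
              envFrom:
                - secretRef:
                    name: meili-secrets
              command: ["sh", "-c"]
              args:
                - >
                  curl -fsS -X POST
                  -H "Authorization: Bearer $MEILI_MASTER_KEY"
                  http://meilisearch.meilisearch.svc:7700/snapshots
YAML
```

Apply with `kubectl apply -f snapshot-cron.yaml` and confirm the first run creates a snapshot file on the data volume before relying on the schedule.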
Verification
kubectl -n meilisearch get deploy,po,svc,ingress
kubectl -n meilisearch logs deploy/meilisearch --tail=100
curl -sS https://search.example.com/health
curl -sS -H "Authorization: Bearer $MEILI_MASTER_KEY" https://search.example.com/version
Expected results: deployment available, probes healthy, TLS valid, and authenticated version endpoint returning JSON. Next, run synthetic queries from the same network segment your application uses. Record baseline p50/p95 latency so future changes can be evaluated against objective performance thresholds.
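A minimal way to turn collected samples into p50/p95 numbers is a nearest-rank percentile over a file of per-request latencies. The sample values below are stand-ins; in practice, fill latencies.txt with something like `for i in $(seq 1 100); do curl -s -o /dev/null -w '%{time_total}\n' https://search.example.com/health; done > latencies.txt`.

```shell
# percentile <p> <file>: nearest-rank percentile of one-number-per-line input.
percentile() {
  sort -n "$2" | awk -v p="$1" '{a[NR]=$1} END {idx=int((NR*p+99)/100); print a[idx]}'
}
# Stand-in samples so the function can be demonstrated without a live service.
printf '0.12\n0.09\n0.31\n0.10\n0.11\n' > latencies.txt
echo "p50=$(percentile 50 latencies.txt) p95=$(percentile 95 latencies.txt)"
```

Store the resulting baseline alongside the values file so post-upgrade runs can be compared against it.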
For reliability testing, execute a controlled pod restart and ensure traffic remains available. Confirm rolling updates complete without failed probes and verify autoscaler behavior if configured.
Common Issues and Fixes
Pods restart during indexing spikes
Increase memory limits and tune batch size in ingestion jobs. Check OOMKilled events and profile your largest documents. Consider sharding ingest jobs by index.
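One way to bound per-request memory during ingest is to split a large NDJSON export into fixed-size batches before pushing. The file name and batch size here are assumptions; the `printf` line is a stand-in for a real export, and the upload loop only runs when `MEILI_URL` is set.

```shell
# Stand-in for a real NDJSON export of documents.
printf '{"id":%d}\n' 1 2 3 4 5 > sample-docs.ndjson
# Split into 2-document batches: batch-00.ndjson, batch-01.ndjson, ...
split -l 2 -d --additional-suffix=.ndjson sample-docs.ndjson batch-
if [ -n "${MEILI_URL:-}" ]; then
  for f in batch-*.ndjson; do
    curl -X POST "$MEILI_URL/indexes/docs/documents" \
      -H 'Content-Type: application/x-ndjson' \
      -H "Authorization: Bearer $MEILI_MASTER_KEY" --data-binary @"$f"
  done
fi
```

Tune the line count per batch against your average document size so a single request stays well under the pod's memory limit.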
High query latency at peak traffic
Raise CPU requests, ensure anti-affinity spreads pods, and verify node IOPS. Add an in-memory cache in front of frequent read patterns.
Ingress returns 502/504 intermittently
Validate upstream service endpoints and probe timeouts. Ensure readiness checks are not too strict during warm-up and adjust ingress proxy timeout settings.
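With the nginx ingress class used above, proxy timeouts are set per-Ingress via annotations. The values below are starting points, not recommendations; tune them against your observed warm-up and slow-query behavior.

```shell
cat > ingress-patch.yaml <<'YAML'
metadata:
  annotations:
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "10"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "60"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "60"
YAML
```

Apply with `kubectl -n meilisearch patch ingress meilisearch --patch-file ingress-patch.yaml` (the ingress name follows the Helm release name in this setup).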
Snapshot restore is slow
Run restore in maintenance window, provision temporary higher IOPS storage, and validate object store throughput. Keep restore runbook tested monthly.
Unexpected auth failures after key rotation
Check application-side secret refresh. If clients cache old keys, deploy a two-key transition window and retire the old key only after telemetry confirms migration completion.
FAQ
1) Can I start with one replica and scale later?
Yes. Start with one replica for low traffic, but define persistence and backup from day one so scale-out is operationally safe.
2) Do I need a service mesh for this setup?
No. It is optional. For many teams, ingress + network policies + TLS are enough. Add mesh only if you already operate it successfully.
3) How often should I rotate API keys?
At least quarterly, and immediately after security incidents or team-role transitions. Automate issuance and revocation workflows.
4) What is the best backup cadence?
Use hourly incremental snapshots for high-change datasets and daily full snapshots. Validate restore speed and integrity regularly.
5) How do I reduce downtime during upgrades?
Use rolling updates, preflight checks, and canary traffic if possible. Always test chart/value changes in staging with production-like data volume.
6) Which metrics matter most first?
P95 query latency, ingestion throughput, error rate, pod restarts, and storage saturation. Alert on sustained threshold breaches, not single spikes.
7) Is exposing admin APIs publicly acceptable?
Avoid it. Prefer private network access via VPN/bastion. If external access is unavoidable, enforce strict IP allowlists and short-lived credentials.
Talk to us
If you want help designing a resilient Meilisearch deployment, hardening access, or building an automated backup/restore pipeline, our team can help.