Dify AI Platform Setup: Build and Ship AI Apps Without the Infrastructure Nightmare
Most AI app frameworks make you choose between power and simplicity. Dify doesn't. It's an open-source LLM application platform that gives you a visual workflow builder, RAG pipelines, prompt engineering tools, API access, and a full agent framework — all self-hostable on your own infrastructure. This guide walks you through a complete Dify AI platform setup: from first docker compose up to a working AI app connected to your LLM of choice.
Prerequisites
Before you start, make sure you have the following ready:
- A Linux server or local machine (Ubuntu 20.04+ recommended)
- Docker Engine and Docker Compose v2 installed
- At least 4GB RAM — Dify runs several services concurrently
- At least 20GB free disk space for models, vector data, and logs
- An API key for at least one LLM provider (OpenAI, Anthropic, etc.) or a local model via Ollama
- Ports 80 and 443 available (or any custom port you choose)
Verify your environment:
docker --version
docker compose version
free -h
df -h /
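If you script your provisioning, the same checks can be done numerically. This is a Linux-only sketch (it reads /proc/meminfo and uses GNU coreutils); the thresholds mirror the list above:

```shell
# Preflight: report RAM and free disk in GB against Dify's minimums.
ram_kb=$(grep MemTotal /proc/meminfo | awk '{print $2}')
ram_gb=$(( ram_kb / 1024 / 1024 ))
disk_gb=$(df -BG --output=avail / | tail -1 | tr -dc '0-9')
echo "RAM: ${ram_gb} GB (need >= 4), free disk: ${disk_gb} GB (need >= 20)"
```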
What Is Dify and Why Self-Host It?
Dify is an open-source LLMOps platform. Think of it as the backend and UI layer your AI app needs but that you'd otherwise spend months building yourself. It handles prompt management, RAG (retrieval-augmented generation), multi-step workflows, API exposure, and observability — all from a clean web interface.
Core Features
- Visual workflow builder — chain LLM calls, tools, conditionals, and data sources without writing glue code
- RAG pipelines — upload documents, index them into a vector store, and query them with natural language
- Multi-model support — OpenAI, Anthropic, Mistral, Cohere, HuggingFace, Ollama, and more
- Agent framework — build tool-using agents with function calling
- API-first — every app you build is instantly accessible via a REST API
- Prompt IDE — version, test, and iterate on prompts with built-in A/B tooling
Why Self-Host?
The cloud version of Dify is fine for prototypes. But if you're processing sensitive data, need to control which models you use, want to avoid per-seat pricing at scale, or simply want your AI stack to live next to your other infrastructure — self-hosting is the move.
Deploying Dify with Docker Compose
Clone the Repo and Configure the Environment
Dify ships with an official Docker Compose setup. Clone it and copy the example env file:
git clone https://github.com/langgenius/dify.git
cd dify/docker
cp .env.example .env
Open .env and set at minimum these values:
# .env (key settings)
# Secret key — change this to a random string
SECRET_KEY=your-random-secret-key-here
# Console URL — where the Dify UI will be served
CONSOLE_WEB_URL=https://dify.yourdomain.com
CONSOLE_API_URL=https://dify.yourdomain.com
# App URL — same if single-domain
APP_WEB_URL=https://dify.yourdomain.com
# Postgres credentials
DB_USERNAME=dify
DB_PASSWORD=strongdbpassword
DB_DATABASE=dify
# Redis password
REDIS_PASSWORD=strongredispassword
Generate a secure secret key with:
openssl rand -hex 32
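To avoid copy-paste mistakes, you can generate the key and patch it into the env file in one step. The sketch below works on a demo copy so it is safe to run anywhere; in dify/docker you would target the real .env instead (GNU sed assumed, so macOS users need `sed -i ''`):

```shell
# Demo copy carrying the placeholder from .env.example; target .env in practice
printf 'SECRET_KEY=your-random-secret-key-here\n' > .env.demo

# Generate a 64-character hex key and swap it in with an in-place edit
NEW_KEY="$(openssl rand -hex 32)"
sed -i "s/^SECRET_KEY=.*/SECRET_KEY=${NEW_KEY}/" .env.demo

# Confirm the placeholder is gone
grep '^SECRET_KEY=' .env.demo
```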
Start All Services
Dify's Compose stack includes the API server, web frontend, worker, PostgreSQL, Redis, Weaviate (vector DB), and Nginx. Bring it all up:
docker compose up -d
docker compose ps
First boot takes a few minutes while database migrations run. Watch the API service logs until you see it's healthy:
docker compose logs -f api
Once it's up, open http://localhost (or your domain). You'll be prompted to create an admin account on first visit. Do that, then log in to the Dify console.
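If you script the deployment, a small readiness poll beats guessing when the migrations have finished. This is a generic sketch rather than a Dify-specific health API; point it at whatever URL your stack serves once it is up (for example your console URL):

```shell
# Poll a URL until it answers successfully, with a bounded number of attempts.
wait_ready() {
  url=$1
  tries=${2:-30}              # default: 30 attempts, ~2s apart
  for _ in $(seq 1 "$tries"); do
    # -f: treat HTTP errors as failures, -sS: quiet but keep real errors
    if curl -fsS -o /dev/null "$url"; then
      echo "ready"
      return 0
    fi
    sleep 2
  done
  echo "timeout"
  return 1
}

# Example: wait_ready http://localhost 60
```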
Connecting LLM Providers and Models
Adding a Cloud LLM Provider
In the Dify console, go to Settings → Model Provider. You'll see a list of supported providers. Click the one you want, paste your API key, and save. Dify immediately validates the key and lists available models.
For OpenAI, you can also set a custom base URL here — useful if you're using Azure OpenAI Service or an OpenAI-compatible proxy.
Using Local Models via Ollama
If you want to run models locally without sending data to any external API, use Ollama. Install it on the same host and pull a model:
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull a model (e.g., Llama 3)
ollama pull llama3
# Verify it's running
curl http://localhost:11434/api/tags
In Dify's Model Provider settings, add Ollama and set the base URL to http://host.docker.internal:11434. On Linux, where host.docker.internal may not resolve by default, use the Docker bridge IP instead (typically 172.17.0.1; see the troubleshooting section below). Dify will discover all locally available Ollama models automatically.
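The /api/tags response is JSON; to pull out just the model names, for instance in a pre-flight script, pipe it through python3. The echoed payload below is an illustrative stand-in for the real curl output, but it follows the `{"models":[{"name":...}]}` shape the tags endpoint returns:

```shell
# Stand-in for: curl -s http://localhost:11434/api/tags
tags='{"models":[{"name":"llama3:latest"},{"name":"mistral:7b"}]}'

echo "$tags" | python3 -c '
import json, sys
# Print one model name per line
for model in json.load(sys.stdin).get("models", []):
    print(model["name"])
'
```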
Setting a Default Model
Under Settings → Model Provider → System Model Settings, set your default inference model, embedding model, and reranking model. These defaults apply across all apps unless overridden per-app. Getting the embedding model right is especially important for RAG quality.
Building Your First AI App
Chatbot App
The fastest path to a working AI app in Dify:
- Go to Studio → Create App → Chatbot
- Give it a name and pick a model
- Write a system prompt in the Prompt IDE
- Click Publish
Your chatbot is now live with a shareable URL and a REST API endpoint. Test it immediately from the built-in debugger before touching any code.
RAG App with a Knowledge Base
To build an app that answers questions from your own documents:
- Go to Knowledge → Create Knowledge Base
- Upload your documents (PDF, Markdown, TXT, Notion, web scrape — all supported)
- Choose a chunking strategy and your embedding model, then index
- In your app settings, enable Context and attach your knowledge base
Dify handles embedding, storage in Weaviate, and retrieval automatically. You control chunk size, overlap, and retrieval top-K from the UI.
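Chunk size and overlap are easier to reason about with a toy example. The sketch below slices characters rather than tokens (Dify's real splitter is token-aware), but the overlap mechanics are the same:

```shell
python3 - <<'EOF'
# Character-based stand-in for token chunking: size 12, overlap 4.
text = "The quick brown fox jumps over the lazy dog."
size, overlap = 12, 4
step = size - overlap          # each chunk starts 8 chars after the last
chunks = [text[i:i + size] for i in range(0, len(text), step)]
for c in chunks:
    print(repr(c))
# The last 4 characters of each chunk reappear at the start of the next
# one -- that shared context is what overlap buys you at retrieval time.
EOF
```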
Workflow App
For multi-step AI pipelines — classification → lookup → generation → formatting — use the Workflow app type. The visual canvas lets you drag in nodes for LLM calls, HTTP requests, code execution, conditionals, and variable transformations. Each node's output feeds the next. It's closer to a proper DAG builder than a simple chatbot config.
Exposing Dify Apps via API
Every Dify app gets its own API key and endpoint. Here's how to call a published chatbot app from your backend:
curl -X POST https://dify.yourdomain.com/v1/chat-messages \
-H 'Authorization: Bearer YOUR_APP_API_KEY' \
-H 'Content-Type: application/json' \
-d '{
"inputs": {},
"query": "What is our refund policy?",
"response_mode": "blocking",
"conversation_id": "",
"user": "user-1234"
}'
Switch response_mode to streaming for SSE-based streaming responses — better for chat UIs where you want tokens to appear as they're generated:
curl -X POST https://dify.yourdomain.com/v1/chat-messages \
-H 'Authorization: Bearer YOUR_APP_API_KEY' \
-H 'Content-Type: application/json' \
-H 'Accept: text/event-stream' \
-d '{
"inputs": {},
"query": "Summarize the Q3 report",
"response_mode": "streaming",
"conversation_id": "",
"user": "user-5678"
}'
The API reference for your specific app is always available inside Dify under App → API Access. It's auto-generated based on your app's input variables and configuration.
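In blocking mode the response body is a single JSON object; in current Dify versions it includes fields such as answer and conversation_id (check App → API Access for your version's exact schema). Here is a sample parse, using an illustrative payload in place of a live response:

```shell
# Illustrative stand-in for the JSON a blocking chat-messages call returns
resp='{"answer":"Refunds are accepted within 30 days of purchase.","conversation_id":"abc-123"}'

echo "$resp" | python3 -c '
import json, sys
data = json.load(sys.stdin)
print("answer:", data["answer"])
# Send conversation_id back on the next request to continue the same chat
print("conversation_id:", data["conversation_id"])
'
```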
Tips, Gotchas, and Troubleshooting
Services Won't Start or Crash Immediately
Check which service is failing and read its logs:
docker compose ps
docker compose logs api --tail 50
docker compose logs worker --tail 50
docker compose logs db --tail 30
The most common causes: wrong database password in .env, insufficient RAM causing the worker to OOM, or a port conflict on 80/443.
Vector DB Connection Errors
Dify uses Weaviate by default. If you see vector store errors, make sure the Weaviate container is healthy. The readiness check below assumes Weaviate's port 8080 is published to the host; if it isn't, inspect the container logs instead (docker compose logs weaviate):
docker compose ps weaviate
curl http://localhost:8080/v1/.well-known/ready
If you'd rather use a different vector store, Dify also supports Qdrant, Milvus, pgvector, and others. Set VECTOR_STORE in your .env and configure the corresponding connection variables.
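For example, switching to pgvector looks roughly like this in .env. The variable names follow dify/docker's .env.example at the time of writing; verify them against your checkout before applying:

```shell
# .env: vector store selection (values are illustrative)
VECTOR_STORE=pgvector
PGVECTOR_HOST=pgvector
PGVECTOR_PORT=5432
PGVECTOR_USER=postgres
PGVECTOR_PASSWORD=change-me
PGVECTOR_DATABASE=dify
```

After changing the vector store, restart the stack with docker compose up -d so the API and worker pick up the new settings.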
Upgrading Dify
Pull the latest code and restart — migrations run automatically on startup:
cd dify/docker
git pull origin main
docker compose pull
docker compose up -d
Always check the release notes before upgrading — breaking changes to the .env schema are called out explicitly.
Ollama Not Reachable from Dify Containers
On Linux, host.docker.internal doesn't resolve by default. Use your host's actual IP on the Docker bridge network instead:
# Find your host IP on the docker bridge
ip addr show docker0 | grep 'inet ' | awk '{print $2}' | cut -d/ -f1
# Typically: 172.17.0.1
# Use that IP in Dify's Ollama base URL:
# http://172.17.0.1:11434
Pro Tips
- Version your prompts — Dify's Prompt IDE keeps full history. Use it. A prompt regression can silently tank your app's quality.
- Use conversation variables in workflows to carry state across turns without reinventing session management.
- Monitor token usage per app under Monitoring → Logs & Annotations. It's the fastest way to catch runaway prompt costs before they hit your bill.
- Chunk size matters for RAG — smaller chunks (256–512 tokens) improve precision; larger chunks improve context. Test both against your actual queries before going to production.
- Back up your PostgreSQL volume regularly — all your app configs, knowledge bases, and conversation history live there.
Wrapping Up
A complete Dify AI platform setup takes under an hour and gives you infrastructure that most teams spend months building from scratch. You get RAG pipelines, workflow automation, multi-model support, and a clean API layer — all running on your own servers, under your own control.
Start with one chatbot or RAG app, get a feel for the workflow builder, then start replacing the ad-hoc LLM scripts in your codebase with proper Dify-managed apps. The difference in observability and maintainability alone is worth the switch.
Building AI Apps at Scale?
If you're moving beyond a single Dify instance into multi-tenant deployments, custom model integrations, or production-grade AI infrastructure — the sysbrix team can design and implement it with you. We work hands-on with self-hosted AI stacks from architecture through to launch.
📚 More Dify Guides on Sysbrix