Dify: Visual Agent Workflows

Most of Us Don’t Want to Write LangChain Spaghetti

You’ve got an LLM. You want to feed it some documents, let it call a few tools, maybe add memory so it remembers what you asked yesterday. And then you want to ship it without writing 500 lines of callback hell, stream handlers, and token-counting utilities.

That’s Dify.

Dify is an open-source (Apache 2.0) LLM application builder you can self-host on a single machine. It gives you a visual workflow editor, knowledge base management, built-in RAG, agent orchestration, and tool integration — all without touching Python. You can publish your app as a chat interface, embed it in a website, or expose it as an API. No vendor lock-in. No monthly SaaS subscription.

If you’ve heard of Flowise or Langflow and thought “close, but not quite,” Dify is what you’ve been waiting for.

What Makes Dify Different

There are a few visual LLM builders out there. Flowise and Langflow are solid, but they’re… rough around the edges. Dify feels like someone actually shipped a product instead of gluing together open-source libraries and calling it a day.

Dify vs. Flowise:

Dify has proper RBAC (role-based access control), workspace isolation, and team management. Flowise is single-user or bolted-on multi-tenancy.
Dify’s workflow editor is more mature — better error messages, undo/redo, node organization.
Dify has an app “store” where you can publish workflows as reusable apps. Flowise doesn’t have this.
Flowise is lighter weight if you just want a quick prototype. Dify is heavier but production-ready.

Dify vs. Langflow:

Langflow is newer and leans harder on LangChain abstractions. Dify is its own thing.
Dify’s knowledge base and RAG features are more polished.
Langflow is good if you’re already deep in the LangChain ecosystem. Dify is better if you want to escape it.

Dify vs. writing it yourself in Python:

You save weeks. Seriously. Building a chat app with proper authentication, versioning, deployment, and monitoring takes time.
Dify handles upgrades. You push a new container version, workflows stay intact.
You get built-in observability (logs, token counting, latency) without adding instrumentation code.
You can hand it off to a non-engineer. They can tweak prompts, add tools, adjust retrieval settings — all through the UI.

The Core Pieces: Workflow Canvas

Dify’s brain is the workflow editor. It’s a directed acyclic graph (DAG) where you drag nodes around and wire them together.

Node types:

LLM: Call OpenAI, Anthropic, Ollama, vLLM, Bedrock, or any OpenAI-compatible API.
Knowledge Retriever: Embed a document chunk and search your knowledge base (more on this in a bit).
Code: Write Python or JavaScript inline. Dify runs it in an isolated sandbox container.
Tool: HTTP calls, webhooks, function calling. Wire up external APIs.
Conditional: Branch logic. “If user input contains ‘danger’, skip the tool call.”
Loop: Iterate over a list and process each item.
Chat Memory: Persist conversation history across sessions.
Variable: Pass data between nodes.

You plug these together, set up input/output schemas, and Dify handles the orchestration. No async callbacks. No token streaming from five different libraries.

Here’s the rough idea of a RAG agent:

[User Input] → [Knowledge Retriever] → [LLM (with system prompt + context)] → [Tool (if needed)] → [Output]

Dify manages the wiring. You define the logic.

Knowledge Base: Documents Without the Boilerplate

Dify’s knowledge base is how you upload documents for RAG. You dump PDFs, Markdown, or plain text, and Dify:

Chunks them (configurable strategy: by token count, by paragraph, by separator).
Embeds them (using OpenAI, Jina, or local embeddings).
Stores them (local vector DB, Weaviate, Pinecone, Qdrant, Milvus).
Retrieves them (keyword search, semantic search, or hybrid).

You don’t need to think about vector databases or embedding pipelines. You click “Upload” and move on.

The retriever node in your workflow automatically handles the search and re-ranking. You can set temperature, top-k results, similarity threshold — all tunable without redeploying.

Agent Mode: Tool Calling Made Sane

Dify’s agent mode is where it gets fun. Instead of a linear workflow, you define:

An LLM
A set of tools (HTTP requests, knowledge retriever, code nodes)
A max iteration count
A stop condition

The LLM decides which tool to call, runs it, gets the result, and decides what to do next. Dify handles the loops and error handling. You don’t have to code a state machine.

It’s roughly equivalent to what you’d get with LangChain’s AgentExecutor, but without the cognitive overhead.

Publishing: Chat, Embed, or API

Once your workflow is done, Dify gives you three ways to expose it:

1. Chat Interface Dify generates a public chatbot URL. You can theme it, set conversation starters, customize the header. It’s a complete standalone app. Share the link with your team and they’re good to go.

2. Embedded Chat Widget Copy a snippet of HTML/JavaScript and embed the chat in your website. Same as Intercom or Drift, but for your own workflows.

3. REST API Dify publishes the workflow as an HTTP endpoint. You get a /chat-messages endpoint that accepts inputs (a dict of your workflow’s input variables) and query (the user message). You can integrate it with anything.

# Example: POST to a published Dify workflow API
curl -X POST "https://your-dify-instance.com/v1/chat-messages" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "inputs": {
      "company": "Acme Corp",
      "tone": "professional"
    },
    "query": "Write me a job posting",
    "response_mode": "streaming"
  }'

The API supports streaming, so large model outputs don’t timeout.

Model Flexibility

Dify isn’t locked to OpenAI (unlike a lot of “AI” tooling). You can:

Use Ollama locally (run models on your hardware).
Point to vLLM (inference engine for faster serving).
Use Anthropic, Cohere, Hugging Face Inference API.
Use Bedrock (AWS).
Connect to any OpenAI-compatible API.

This means you can prototype with a local Ollama model, then swap to Claude or GPT-4 in production without touching the workflow.

Self-Hosting: Just Docker

Dify’s deployment is straightforward. The officially recommended setup uses Docker Compose and includes:

The Dify web UI and API.
Postgres (for workflow storage and user data).
Redis (for caching and job queues).
A sandbox container (for running code nodes safely).

Here’s a minimal docker-compose.yaml:

version: '3.8'

services:
  dify:
    image: langgenius/dify-api:0.5.0
    container_name: dify-api
    environment:
      DB_HOST: postgres
      DB_PORT: 5432
      DB_USERNAME: dify
      DB_PASSWORD: difypassword
      DB_DATABASE: dify
      REDIS_HOST: redis
      REDIS_PORT: 6379
      REDIS_PASSWORD: redispassword
      SECRET_KEY: your-secret-key-change-me
      # Use your own LLM provider
      OPENAI_API_KEY: sk-...
      # Or Anthropic
      ANTHROPIC_API_KEY: sk-ant-...
      # Or point to a local Ollama instance
      OLLAMA_HOST: http://ollama:11434
    ports:
      - "5001:5001"
    depends_on:
      - postgres
      - redis
    volumes:
      - ./storage:/home/dify/storage

  dify-web:
    image: langgenius/dify-web:0.5.0
    container_name: dify-web
    environment:
      NEXT_PUBLIC_API_URL: http://localhost:5001
    ports:
      - "3000:3000"

  postgres:
    image: postgres:15-alpine
    container_name: dify-postgres
    environment:
      POSTGRES_DB: dify
      POSTGRES_USER: dify
      POSTGRES_PASSWORD: difypassword
    volumes:
      - postgres_data:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    container_name: dify-redis
    command: redis-server --requirepass redispassword
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:

Spin this up:

docker-compose up -d
# Wait 30 seconds for migrations to finish
# Visit http://localhost:3000
# Create an account, you're in.

That’s it. You’ve got a full LLM app builder running on your machine.

For production, you’d add SSL (via Caddy or Cloudflare), a reverse proxy, maybe increase container resource limits, and consider external Postgres/Redis. But the basic model stays the same.

Versioning and Workflow Management

Every time you edit a workflow, Dify creates a new draft. You can:

Test it without affecting the published version.
Publish a new version when you’re happy.
Roll back to a previous version if something breaks.

This is huge if you’re iterating on prompts or retrieval settings. You don’t have to worry about breaking production.

Each published version gets its own API endpoint, so you can A/B test workflows or gradually roll out changes.

The RAG Gotcha: Chunking and Retrieval Strategy

Here’s where a lot of people stumble: just uploading documents isn’t enough. You need to think about how they’re chunked and how they’re retrieved.

Dify lets you configure:

Chunking method: Fixed token count, by paragraph, by separator.
Overlap: How much text repeats between chunks.
Retrieval mode: Keyword-only, semantic-only, or hybrid.
Similarity threshold: How close a match needs to be.

Pick wrong and your RAG will either hallucinate (because retrieval is too loose) or miss relevant context (because chunking split important concepts). There’s no magic setting. You have to test with real queries.

Start with semantic search (it’s usually better than keyword-only) and a chunk size around 500–1000 tokens. Then iterate based on results.

Code Nodes: The Escape Hatch

Sometimes the workflow can’t express what you need. That’s where code nodes come in. You write Python or JavaScript, and Dify runs it in a sandboxed container.

# Dify passes input as a dict
inputs = {
    "user_query": "What's the price of milk?",
    "product_id": "123"
}

# Do something
price = fetch_price_from_api(inputs["product_id"])
formatted = f"The current price is ${price:.2f}"

# Return output as a dict
output = {
    "result": formatted
}

The sandbox is isolated (can’t access the host filesystem), so you’re safe. But you can make HTTP requests, do math, transform data — anything you’d do in a normal Python script.

Multi-Tenancy: A Note of Caution

Dify supports multiple workspaces and teams, which is great for managed hosting. But if you’re self-hosting and planning to serve multiple customers, be aware:

Workflows and knowledge bases are workspace-scoped. You can’t share a single workflow across teams (by design).
API keys are per-workspace. Each customer needs their own.
There’s no built-in metering or usage quotas. You have to track that yourself.

It’s not a blocker — just know what you’re getting into if you’re building a SaaS on top of Dify.

Migrating from LangChain

If you’ve already built something in LangChain and want to move to Dify, here’s the rough path:

Map your chain to a workflow (usually 1:1, sometimes better as an agent).
Move your documents to a knowledge base.
Extract environment variables (API keys, model names) into Dify’s environment config.
Publish it as an API endpoint.
Update your application to call the Dify API instead of running LangChain locally.

You’ll lose some low-level control (custom embedding functions, bespoke token streaming), but you gain:

Zero maintenance. Dify handles upgrades.
Easier iteration. Change prompts without redeploying your app.
A UI for non-engineers. Product managers can tweak retrieval settings.

In most cases, the trade-off is worth it.

Where Dify Shines

Rapid prototyping: Visual workflows beat Python boilerplate. A lot.
Non-engineer iteration: PMs and ops folks can tweak prompts, adjust retrieval, add tools.
Multi-model support: Switch between Ollama, OpenAI, Claude without refactoring.
Built-in observability: Logs, token counts, latency — all visible in the UI.
Team collaboration: Workspace isolation, version history, API key management.
Knowledge base management: Document upload, chunking, embedding all handled.

Where It Falls Short

Complex branching logic: Dify’s conditional nodes work, but deeply nested logic gets unwieldy. Pure code might be clearer.
Performance tuning: You can’t easily profile or optimize individual nodes like you can in Python.
Custom integrations: If you need a connector to an obscure API or internal system, you might need to write a code node or contribute to Dify.
Streaming UI: The embedded chat widget is functional but basic. If you need fine-grained control over message formatting, write your own frontend.
Cost visibility: There’s no built-in cost estimation for running against paid APIs. You have to calculate it yourself.

Getting Started

Clone the Dify repo or pull the Docker image.
Spin up Docker Compose (see above).
Create an account on the web UI.
Create a new workflow: blank canvas, drag an LLM node, wire it to input/output.
Publish it.
Play with the chat interface.

From there, upgrade to a real workflow: add a knowledge base, retriever, conditionals, tool calls. Each piece is incremental.

The documentation is solid (https://docs.dify.ai), and the community is active on GitHub. If you get stuck, there’s usually an answer in an issue or a Discord channel.

The Bottom Line

Dify is the LLM app builder that actually feels like a product. It’s open source, self-hostable, and doesn’t lock you into a vendor or language.

If you’ve been eyeing LangChain for something simple and wondering if it’s overkill, it is. If you’ve been wondering if there’s something better than Flowise, there is. Dify is the sweet spot: powerful enough for production, friendly enough for iteration, sane enough that you won’t want to torch your laptop at 2 AM.

Give it a spin on your home lab. Worst case, you learned something. Best case, you’ve got a visual agent builder that’ll handle 80% of your LLM app ideas without touching Python.