Longhorn vs Rook-Ceph

By SumGuy 11 min read

I just want PVCs to work in my cluster

You deploy a database to your Kubernetes cluster. Pod spins up. You add data. Then you reschedule the pod onto a different node. Your data evaporates like your budget after a conference trip.

Welcome to the persistent storage problem.

Kubernetes has this cute fiction that pods are ephemeral. Practical life says your Postgres needs data to actually… persist. So you need a storage backend — something that survives pod death and node reboots. You declare a PersistentVolumeClaim (PVC) in your manifests, and some provider has to make that real.

The two popular choices in the homelab / small-to-medium cluster space are Longhorn and Rook-Ceph. Both work. Both will chew through your afternoon. But they solve the problem in radically different ways — and which one doesn’t make you want to flip a table depends entirely on what you’re actually running.


The Storage Backend Problem

Before we pit them against each other, let’s be clear on what we’re actually solving.

Kubernetes itself is just an orchestrator. It schedules workloads, restarts failed pods, and manages networking. It does not come with persistent storage. (LocalPath provisioners exist, but they’re a trap — your data lives on a single node and if that node dies, so does your data. Great for testing. A disaster for anything stateful.)

A real storage backend needs to:

  1. Accept data from a pod on any node
  2. Keep it durable (replicated across multiple nodes, ideally)
  3. Serve it back to the pod even if the pod moves
  4. Not lose anything if a node catches fire

Longhorn and Rook-Ceph represent two different philosophies for doing this at homelab scale.


Longhorn: The Lightweight Option

Longhorn is Rancher’s answer to the question: “What if we just made a storage provisioner that doesn’t require DevOps superpowers?”

Design Philosophy

Longhorn treats each volume independently. When you create a PVC, Longhorn:

  1. Starts a dedicated “engine” for the volume (the controller that coordinates that volume’s replicas)
  2. Creates N replicas (default 3) across different nodes
  3. Writes synchronously to every healthy replica
  4. Exposes the volume as a block device over iSCSI on the node where the workload pod runs

This is intentionally simple. You’re not managing a distributed filesystem or a crush map. You’re just saying “I have a volume, make copies of it on multiple nodes.”
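
The replica count is set per StorageClass, so "make copies of it on multiple nodes" can be tuned without touching Longhorn itself. Here is a minimal sketch, assuming Longhorn is already installed. The class name `longhorn-2-replicas` and the file name are made up for this example. `driver.longhorn.io`, `numberOfReplicas`, and `staleReplicaTimeout` are Longhorn's CSI provisioner name and StorageClass parameters.

```shell
# Write a StorageClass that asks Longhorn for 2 replicas instead of the default 3.
# "longhorn-2-replicas" is a hypothetical name chosen for this example.
cat > longhorn-2-replicas.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-2-replicas
provisioner: driver.longhorn.io
allowVolumeExpansion: true
reclaimPolicy: Delete
parameters:
  numberOfReplicas: "2"         # copies of each volume, placed on different nodes
  staleReplicaTimeout: "2880"   # minutes before a failed replica is cleaned up
EOF

# Then apply it to the cluster:
# kubectl apply -f longhorn-2-replicas.yaml
```

Any PVC that sets `storageClassName: longhorn-2-replicas` would then get two replicas. That is a reasonable trade on a three-node cluster where disk space is tight.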

What Longhorn Gives You

Longhorn’s Tradeoffs


Rook-Ceph: The Industrial Warehouse

Rook is an operator that deploys real Ceph into Kubernetes. Ceph is an open-source distributed storage system that’s been hardened in production at massive scale (CERN, OpenStack clouds, etc.).

Design Philosophy

Ceph is a distributed storage system. It doesn’t think in terms of individual volumes. It thinks in terms of:

Rook wraps this into Kubernetes operators so you don’t have to hand-configure a Ceph cluster separately. You write a CephCluster CR and Rook does the heavy lifting.
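
To give a feel for what that CR looks like, here is a minimal sketch of a `CephCluster` manifest, written to `cluster.yaml`. It assumes the Rook operator runs in the `rook-ceph` namespace and that every node has at least one empty raw disk. The Ceph image tag is an assumption, so pin it to whatever your Rook release supports. Rook's own example manifests are the better starting point for anything real.

```shell
# Write a minimal CephCluster manifest. Values marked "assumption" are placeholders.
cat > cluster.yaml <<'EOF'
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.io/ceph/ceph:v18   # assumption: use the tag your Rook release supports
  dataDirHostPath: /var/lib/rook   # where Ceph daemons keep state on each host
  mon:
    count: 3                       # three monitors for quorum
    allowMultiplePerNode: false
  storage:
    useAllNodes: true
    useAllDevices: true            # claims every empty raw disk Rook can find
EOF
```

`useAllDevices: true` is convenient in a lab, but it is also the setting most likely to surprise you. On shared machines it is safer to list devices explicitly.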

What Rook-Ceph Gives You

Rook-Ceph’s Tradeoffs


The Comparison Matrix

| Factor | Longhorn | Rook-Ceph |
|---|---|---|
| Install time | 5 min | 15 min |
| Cluster size fit | 3–10 nodes | 3–100+ nodes |
| RAM overhead | ~2GB (3 nodes) | ~15–20GB (3 nodes) |
| Disk requirement | Shared OK | Dedicated devices required |
| Setup complexity | Simple | Moderate–complex |
| Block storage (RWO) | Yes, iSCSI | Yes, RBD |
| Shared FS (RWX) | Yes, via a built-in NFS share manager | Yes, CephFS |
| Snapshots | Yes | Yes |
| Backups | S3, NFS integration | S3 via RGW, snapshots |
| Performance | Good | Excellent at scale |
| Failure domain | Per-volume replicas | Cluster-wide, tunable |
| Web UI | Friendly | Industrial |
| Observability | Basic | Prometheus + Ceph Dashboard |
| Multi-cluster | No | Possible (advanced) |
| Upgrade story | Smooth | Careful coordination |

The Real Decision Tree

Start with Longhorn if:

Go Rook-Ceph if:

Honest check: Do you need either?

Before you choose, ask yourself: Am I actually running stateful workloads that can’t tolerate data loss? Or am I running databases that should be managed by the cloud provider (RDS, etc.)?

For many homelab setups, local-path-provisioner with backups to S3 gets 90% of the way there. You’re not handling node failures gracefully, but you’re also not running a second full Kubernetes cluster just for storage.
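
If you go the local-path-plus-backups route, the backup half can be as small as this. It is a sketch under assumptions, not a hardened job: the pod name, database name, and bucket are hypothetical, and it expects `kubectl` and the AWS CLI to be installed and authenticated.

```shell
# Hypothetical names: adjust the pod, database, and bucket to your setup.
POD="postgres-test"
DB="postgres"
BUCKET="s3://my-homelab-backups"   # assumption: a bucket you already own

# Build a timestamped object key of the form postgres/<db>-<UTC timestamp>.sql.gz
backup_key() {
  printf 'postgres/%s-%s.sql.gz' "$1" "$(date -u +%Y%m%dT%H%M%SZ)"
}

# Dump the database from inside the pod, compress it, and stream it to S3.
# The subshell body keeps pipefail from leaking into the calling shell.
backup_postgres() (
  set -o pipefail
  kubectl exec "$POD" -- pg_dump -U postgres "$DB" \
    | gzip \
    | aws s3 cp - "$BUCKET/$(backup_key "$DB")"
)

# Run it from cron or a Kubernetes CronJob:
# backup_postgres
```

This protects you from losing the data, not from downtime. If the node dies, you still restore by hand onto another one.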

If you’re running your own Postgres, Redis, or Elasticsearch — pick Longhorn. It’s simpler, and Postgres can handle brief rebalancing. If you’re running something that demands serious distributed storage semantics (OpenStack compute, a data lake), Rook-Ceph is the correct choice.


The Setup You’ll Actually Do

Longhorn in 10 Lines

# Add the Helm repo
helm repo add longhorn https://charts.longhorn.io && helm repo update
# Install to the cluster
helm install longhorn longhorn/longhorn \
  --namespace longhorn-system \
  --create-namespace \
  --set defaultSettings.defaultDataPath="/var/lib/longhorn"
# Verify
kubectl get po -n longhorn-system

Now create a PVC and use it:

example-pvc.yaml
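apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mydata
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: longhorn
  resources:
    requests:
      storage: 10Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: postgres-test
spec:
  containers:
    - name: postgres
      image: postgres:16
      env:
        # The postgres image will not start without a password (placeholder; use a Secret for real workloads)
        - name: POSTGRES_PASSWORD
          value: change-me
        # Use a subdirectory so initdb does not trip over lost+found on a fresh ext4 volume
        - name: PGDATA
          value: /var/lib/postgresql/data/pgdata
      volumeMounts:
        - name: data
          mountPath: /var/lib/postgresql/data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: mydata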

Deploy it. Longhorn creates the volume, replicas sync, Postgres starts. Done.

Rook-Ceph: A Bit More Ceremony

# Add the Rook Helm repo
helm repo add rook-release https://charts.rook.io/release && helm repo update
# Install the Rook operator
helm install rook-ceph rook-release/rook-ceph \
  --namespace rook-ceph \
  --create-namespace
# Wait for the operator to be ready
kubectl wait --for=condition=ready pod \
  -l app=rook-ceph-operator \
  -n rook-ceph \
  --timeout=300s
# Deploy the cluster (your own CephCluster manifest, or one of Rook's examples)
kubectl apply -f cluster.yaml
# Monitor the rollout
kubectl get cephcluster -n rook-ceph -w

Once the cluster is healthy (all MONs running, OSDs joining), create a CephBlockPool and a StorageClass that points at it. After that, PVCs work the same way as before: just put the new class name in storageClassName.
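
The pool and StorageClass step looks roughly like this. It is a sketch modeled on Rook's example manifests, assuming the operator and cluster both live in the `rook-ceph` namespace. The provisioner prefix, `clusterID`, and secret namespaces all depend on that. The names `replicapool` and `rook-ceph-block` are conventions, not requirements.

```shell
# Write a replicated RBD pool plus a StorageClass that provisions from it.
cat > rbd-storageclass.yaml <<'EOF'
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool
  namespace: rook-ceph
spec:
  failureDomain: host      # keep each copy on a different host
  replicated:
    size: 3
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block
provisioner: rook-ceph.rbd.csi.ceph.com   # prefix is the operator namespace
parameters:
  clusterID: rook-ceph                    # namespace of the CephCluster
  pool: replicapool
  imageFormat: "2"
  imageFeatures: layering
  csi.storage.k8s.io/provisioner-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/provisioner-secret-namespace: rook-ceph
  csi.storage.k8s.io/controller-expand-secret-name: rook-csi-rbd-provisioner
  csi.storage.k8s.io/controller-expand-secret-namespace: rook-ceph
  csi.storage.k8s.io/node-stage-secret-name: rook-csi-rbd-node
  csi.storage.k8s.io/node-stage-secret-namespace: rook-ceph
  csi.storage.k8s.io/fstype: ext4
reclaimPolicy: Delete
allowVolumeExpansion: true
EOF

# Then apply it to the cluster:
# kubectl apply -f rbd-storageclass.yaml
```

Compare that to the Longhorn side, where the default StorageClass shows up on its own after the Helm install. That gap is most of the "ceremony."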

The difference: Longhorn gives you storage in 5 minutes. Ceph gives you enterprise storage in 20 minutes, and you’ll spend the next 3 months understanding CRUSH rules.


The Honest Take

Longhorn wins on simplicity. It’s the forklift that works. It’s not the fastest, it’s not the most scalable, but it gets your data off the floor without requiring a structural engineering degree.

Rook-Ceph wins on durability, scale, and features. If you’re serious about self-hosting infrastructure, Ceph is what many OpenStack and private clouds run under the hood. It handles complexity so you don’t have to (after you learn it once).

For most homelab scenarios, Longhorn is the right call. Save Ceph for when you’ve outgrown Longhorn and have the ops bandwidth to manage it.

Your 2 AM self will thank you for picking the simpler option.

