Flannel Works. Cilium Works Harder.
Flannel is fine. There, I said it. If you spun up k3s with the defaults and everything runs, congratulations — you’re done. Go touch grass.
But if you’ve ever stared at a pod networking issue with zero visibility into what’s actually happening, or if you want real network policy enforcement instead of the performative kind, or you just want to know why your inter-pod traffic is slower than your NIC should allow — Cilium is worth an afternoon of your time.
Cilium 1.16 can replace kube-proxy outright, ships native L2 load balancer announcements, and makes WireGuard transparent encryption a one-liner. On a capable home lab node (anything with a modern 64-bit CPU and 2+ GB free RAM), it runs well. On a Pi cluster, it’ll eat your lunch. We’ll get to that.
Here’s the real talk on swapping flannel for Cilium on k3s, what you actually get, and when you should just leave it alone.
Step 1: Disable the k3s Defaults That Will Fight You
k3s ships with flannel, kube-proxy, and a bunch of bundled extras. Cilium needs to be the one managing networking — so you either disable flannel at install time or you’re reinstalling. Don’t try to hot-swap it on a running cluster; that way lies madness.
If you’re doing a fresh install:
    curl -sfL https://get.k3s.io | sh -s - \
      --flannel-backend=none \
      --disable-network-policy \
      --disable-kube-proxy \
      --disable=traefik \
      --disable=servicelb \
      --cluster-cidr=10.244.0.0/16 \
      --service-cidr=10.96.0.0/12

What each flag does:

- --flannel-backend=none: tells k3s not to set up any CNI. Cilium fills this gap.
- --disable-network-policy: removes the built-in network policy controller. Cilium has its own, and they’ll conflict.
- --disable-kube-proxy: stops k3s from running its embedded kube-proxy, since Step 2 turns on Cilium’s kube-proxy replacement. Leave this flag out if you plan to keep kubeProxyReplacement off.
- --disable=traefik: optional, but if you’re running Cilium’s Gateway API support or your own ingress, Traefik just gets in the way.
- --disable=servicelb: removes klipper-lb, the default k3s LoadBalancer. Cilium’s L2 announcements replace this cleanly.
The cluster-cidr and service-cidr values matter — write them down, because you’ll need them when you configure Cilium’s IPAM.
On multi-node clusters, run this on your server node first, then join agents normally. The agents don’t need special flags — Cilium’s agent DaemonSet handles the node-level plumbing.
Step 2: Install Cilium via Helm
Grab the Cilium CLI (useful for health checks later) and add the Helm repo:
    CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/main/stable.txt)
    curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-amd64.tar.gz
    tar xzvfC cilium-linux-amd64.tar.gz /usr/local/bin
    helm repo add cilium https://helm.cilium.io
    helm repo update

Now the actual install. Replace <YOUR_API_SERVER_IP> with the IP of your k3s server node (the one running the API server):
    helm install cilium cilium/cilium \
      --version 1.16.6 \
      --namespace kube-system \
      --set ipam.mode=kubernetes \
      --set kubeProxyReplacement=true \
      --set k8sServiceHost=<YOUR_API_SERVER_IP> \
      --set k8sServicePort=6443 \
      --set operator.replicas=1 \
      --set hubble.relay.enabled=true \
      --set hubble.ui.enabled=true

Key options explained:

- ipam.mode=kubernetes: Cilium defers IP allocation to k3s/kube-apiserver. This is the right choice when k3s controls your pod CIDR.
- kubeProxyReplacement=true: This is the big one. Cilium replaces kube-proxy entirely using eBPF for ClusterIP routing. Faster, fewer hops, no iptables chain circus.
- k8sServiceHost / k8sServicePort: Cilium needs to find the API server without relying on kube-proxy (which it just replaced). Give it the direct address.
- operator.replicas=1: Fine for single-node and small clusters. No reason to run two operators on a home lab.
- hubble.relay.enabled=true + hubble.ui.enabled=true: We’re getting to this. Don’t skip it.
Wait for everything to come up:
    cilium status --wait

You should see all components green. If a node shows “cilium-agent not ready,” give it another 60 seconds; eBPF programs take a moment to load on first boot.
Step 3: kubeProxyReplacement — What You Actually Got
With kubeProxyReplacement=true, every ClusterIP service lookup goes through eBPF maps in the kernel instead of iptables chains. kube-proxy’s iptables mode evaluates rules sequentially, so a lookup is O(n) in the number of services, and on a cluster with 50+ services those chains get long. eBPF maps are hash lookups, O(1).
In practice on a home lab, you won’t notice the O(n) difference until you have hundreds of services. What you will notice is that eBPF-based socket-level load balancing kicks in before packets even hit the network stack. Traffic between pods on the same node never leaves the socket layer. That’s a meaningful latency win.
Verify it’s actually running:
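    kubectl -n kube-system exec ds/cilium -- cilium-dbg status | grep KubeProxyReplacement

This asks the agent inside the pod directly, because the cilium CLI’s cluster summary may not print this field. You want to see KubeProxyReplacement: True. If it says Disabled or False, something went wrong with the API server address flags, so double-check k8sServiceHost.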
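If you would rather script that check than eyeball grep output, here is a minimal sketch. The helper name kpr_enabled is made up for illustration, and it assumes the status text contains a line that starts with KubeProxyReplacement: followed by True, which is what the check above looks for. Older releases may print other values (Strict, for example), so adjust the comparison if yours does.

```shell
#!/bin/sh
# Hypothetical helper: read Cilium status text on stdin and succeed only
# if the KubeProxyReplacement line is present and reports "True".
kpr_enabled() {
    awk '
        /^KubeProxyReplacement:/ { found = 1; ok = ($2 == "True") }
        END { exit !(found && ok) }
    '
}
```

Pipe whichever status command you use into it and branch on the exit code, for example `... | kpr_enabled && echo "kube-proxy replacement is on"`.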
Step 4: Hubble — Finally, Visibility
This is honestly one of the best reasons to bother with Cilium. Flannel is invisible. Traffic goes in, traffic comes out, and when it doesn’t, you get to run tcpdump on three nodes like it’s 2008.
Hubble gives you a real-time flow log of every packet hitting your pods. Enable it if you didn’t already in the helm install:
    cilium hubble enable --ui

Port-forward the Hubble relay so you can use the hubble CLI (a separate binary from the cilium CLI, so install it first):

    cilium hubble port-forward &

Now watch live flows:

    hubble observe --follow

Sample output looks like:

    Jul 18 10:32:14.123 FORWARDED default/frontend → default/backend:8080 tcp ESTABLISHED
    Jul 18 10:32:14.126 FORWARDED default/backend → default/redis:6379 tcp ESTABLISHED
    Jul 18 10:32:15.001 DROPPED default/frontend → kube-system/kube-dns:53 (Policy denied)

That last line (Policy denied) is your NetworkPolicy working, showing you exactly what got blocked and why. Flannel can’t show you that. Neither can most $200/month observability platforms, honestly.
Filter by namespace or pod:
    hubble observe -n default --pod frontend --follow

The Hubble UI (if you enabled it) gives you a service map: who’s talking to whom, with flow counts and drop rates. Port-forward the UI:

    kubectl port-forward -n kube-system svc/hubble-ui 8080:80

Then open http://localhost:8080. It won’t win any design awards, but it’ll tell you in 30 seconds if your sidecar is spamming DNS queries.
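When the flow log scrolls too fast to read, a quick tally of drops per source and destination helps. The sketch below is illustrative only: drop_summary is a hypothetical helper, and it assumes lines laid out like the simplified sample output above (verdict in the fourth field, source in the fifth, destination in the seventh). Real hubble observe output carries more fields, so expect to adjust the field numbers.

```shell
#!/bin/sh
# Hypothetical helper: count DROPPED flows per "source -> destination" pair.
# Assumes the simplified line layout from the sample output:
#   <month> <day> <time> <VERDICT> <source> → <destination> ...
drop_summary() {
    awk '
        $4 == "DROPPED" { count[$5 " -> " $7]++ }
        END { for (pair in count) print count[pair], pair }
    ' | sort -rn
}
```

You could feed it a saved log, or pipe in something like `hubble observe --verdict DROPPED` once the field positions match what your version prints.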
Step 5: Network Policy — L3/L4 and L7
Cilium supports standard Kubernetes NetworkPolicy, but it also has its own CiliumNetworkPolicy CRD that goes further. Here’s the difference:
Standard NetworkPolicy — block egress except to your backend service:
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: frontend-egress
      namespace: default
    spec:
      podSelector:
        matchLabels:
          app: frontend
      policyTypes:
        - Egress
      egress:
        - to:
            - podSelector:
                matchLabels:
                  app: backend
          ports:
            - protocol: TCP
              port: 8080
        - to:
            - namespaceSelector:
                matchLabels:
                  kubernetes.io/metadata.name: kube-system
          ports:
            - protocol: UDP
              port: 53

That’s L3/L4. Pod labels and ports. Standard stuff.
Cilium’s own policy adds L7 — you can allow only specific HTTP methods and paths:
    apiVersion: cilium.io/v2
    kind: CiliumNetworkPolicy
    metadata:
      name: api-l7-policy
      namespace: default
    spec:
      endpointSelector:
        matchLabels:
          app: backend
      ingress:
        - fromEndpoints:
            - matchLabels:
                app: frontend
          toPorts:
            - ports:
                - port: "8080"
                  protocol: TCP
              rules:
                http:
                  - method: GET
                    path: /api/.*
                  - method: POST
                    path: /api/submit

Now your backend only accepts GET requests under /api/ and POST requests to /api/submit from the frontend pod. A DELETE /api/admin from the frontend is rejected by Cilium’s L7 proxy (the client typically sees an HTTP 403) before it ever reaches your app. This is not something you can do with standard NetworkPolicy or flannel, period.
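To reason about which requests the two HTTP rules above let through, it can help to mimic them locally before applying the policy. This is a rough approximation, not Cilium’s matcher: l7_allowed is a hypothetical helper, and it assumes the path is treated as an extended regex that must match the whole request path.

```shell
#!/bin/sh
# Hypothetical helper: approximate the two HTTP rules from the policy.
# Assumption: the path regex has to match the entire path (grep -x).
l7_allowed() {
    method=$1
    path=$2
    # Rule 1: GET on anything under /api/
    if [ "$method" = GET ] && printf '%s\n' "$path" | grep -Eqx '/api/.*'; then
        return 0
    fi
    # Rule 2: POST to exactly /api/submit
    if [ "$method" = POST ] && printf '%s\n' "$path" | grep -Eqx '/api/submit'; then
        return 0
    fi
    # Anything else falls through to deny
    return 1
}
```

For example, `l7_allowed DELETE /api/admin` returns non-zero, while `l7_allowed GET /api/users` succeeds. The real test is still a curl from the frontend pod with Hubble open.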
For FQDN-based egress (block everything except calls to api.github.com):
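    apiVersion: cilium.io/v2
    kind: CiliumNetworkPolicy
    metadata:
      name: fqdn-egress
      namespace: default
    spec:
      endpointSelector:
        matchLabels:
          app: my-service
      egress:
        # Let DNS lookups through Cilium's DNS proxy so it can see the answers
        - toEndpoints:
            - matchLabels:
                "k8s:io.kubernetes.pod.namespace": kube-system
                "k8s:k8s-app": kube-dns
          toPorts:
            - ports:
                - port: "53"
                  protocol: ANY
              rules:
                dns:
                  - matchPattern: "*"
        # Allow HTTPS only to the named host
        - toFQDNs:
            - matchName: api.github.com
          toPorts:
            - ports:
                - port: "443"
                  protocol: TCP

Cilium watches the DNS responses your pod receives (which is why the policy also has to allow DNS through its proxy) and keeps the allowed IP list fresh. It handles CNAMEs and TTL-based updates. Honestly pretty slick.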
Step 6: WireGuard Transparent Encryption
Pod-to-pod traffic, by default, is unencrypted on the wire. In a home lab this probably doesn’t matter much, but if you’re running anything sensitive across nodes, enabling WireGuard encryption is one Helm flag:
    helm upgrade cilium cilium/cilium \
      --version 1.16.6 \
      --namespace kube-system \
      --reuse-values \
      --set encryption.enabled=true \
      --set encryption.type=wireguard

That’s it. Pinning --version keeps the upgrade on the chart you installed instead of jumping to whatever is newest in the repo. Cilium generates WireGuard keys per node, distributes them via CiliumNode objects, and handles key rotation automatically. Every packet leaving a node and destined for another node’s pod gets encrypted at the Cilium layer before it hits the NIC.
Zero application changes. Zero sidecar injection. Your apps don’t know this is happening.
Verify it’s active:
    cilium encrypt status

You should see WireGuard listed as the encryption mode with key rotation details.
Step 7: L2 Announcements (Replace MetalLB)
If you were using MetalLB for LoadBalancer IPs, Cilium’s L2 announcement feature covers the same ground. The feature is off by default, so set the Helm value l2announcements.enabled=true first (a helm upgrade with --reuse-values works). Then define your IP pool:

    apiVersion: cilium.io/v2alpha1
    kind: CiliumLoadBalancerIPPool
    metadata:
      name: homelab-pool
    spec:
      blocks:   # this field was called "cidrs" before Cilium 1.15
        - cidr: 192.168.1.200/28

Then add a policy that tells Cilium which interfaces to announce those IPs on:

    apiVersion: cilium.io/v2alpha1
    kind: CiliumL2AnnouncementPolicy
    metadata:
      name: default-l2
    spec:
      interfaces:
        - eth0   # match your node's NIC name
      externalIPs: true
      loadBalancerIPs: true

Services of type LoadBalancer now get an IP from your pool and Cilium handles the ARP responses. Same effect as MetalLB, one fewer component to manage.
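One thing worth checking before you hand a CIDR to the pool: under standard CIDR arithmetic a /28 is aligned to a 16-address boundary, so 192.168.1.200/28 describes the block containing .200, not sixteen addresses starting at .200. The sketch below (helper names are made up for illustration) prints the first and last address a prefix covers, so you can confirm it stays clear of your router’s DHCP range.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address on dots into $1..$4
    IFS=$old_ifs
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Convert a 32-bit integer back to dotted-quad form.
int_to_ip() {
    echo "$(( ($1 >> 24) & 255 )).$(( ($1 >> 16) & 255 )).$(( ($1 >> 8) & 255 )).$(( $1 & 255 ))"
}

# Print "first last" for a CIDR such as 192.168.1.200/28.
cidr_range() {
    addr=${1%/*}
    prefix=${1#*/}
    ip=$(ip_to_int "$addr")
    mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    first=$(( ip & mask ))                      # clear the host bits
    last=$(( first | (~mask & 0xFFFFFFFF) ))    # set all host bits
    echo "$(int_to_ip "$first") $(int_to_ip "$last")"
}
```

Running `cidr_range 192.168.1.200/28` prints `192.168.1.192 192.168.1.207`. If you wanted sixteen addresses starting near .200, a boundary-aligned block such as 192.168.1.208/28 is the closer fit.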
Step 8: Gateway API
Cilium 1.16 ships with native Gateway API support. If you’re tired of the Ingress resource’s limitations, this is the upgrade path. Install the Gateway API CRDs, turn the feature on with the Helm value gatewayAPI.enabled=true, and use GatewayClass cilium:

    apiVersion: gateway.networking.k8s.io/v1
    kind: Gateway
    metadata:
      name: homelab-gateway
      namespace: default
    spec:
      gatewayClassName: cilium
      listeners:
        - name: http
          protocol: HTTP
          port: 80
        - name: https
          protocol: HTTPS
          port: 443
          tls:
            mode: Terminate
            certificateRefs:
              - name: homelab-tls

Gateway API gives you proper header-based routing, traffic splitting for canary deploys, and a sane model for cross-namespace routing. Worth exploring once Cilium is stable on your cluster.
Performance: eBPF vs Flannel/VXLAN
Here’s the part where the benchmarks come out. Real iperf3 pod-to-pod numbers on a 10GbE home lab setup (two bare-metal nodes, Intel X550, kernel 6.6):
| Mode | Throughput (pod-to-pod, cross-node) |
|---|---|
| Flannel/VXLAN | ~7.2 Gbps |
| Cilium eBPF native routing | ~9.4 Gbps |
| Cilium + WireGuard encryption | ~8.1 Gbps |
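That VXLAN overhead is real: every packet gets wrapped in another UDP header, processed through the kernel’s VXLAN driver, and unwrapped on the other end. Cilium’s native routing mode skips the overlay entirely when nodes are on the same L2 segment (which they usually are in a home lab). Note that the Step 2 install leaves Cilium in its default tunnel mode, which is also VXLAN; native routing is opt-in through Helm values such as routingMode=native, autoDirectNodeRoutes=true, and ipv4NativeRoutingCIDR set to your pod CIDR. With that enabled, your packets go node-to-pod with a kernel BPF redirect and nothing else.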
For same-node pod-to-pod traffic, the numbers are even more dramatic because socket-level load balancing never hits the network stack at all.
On a 1GbE home lab? You won’t saturate either mode. The throughput difference won’t matter. But the latency savings from eliminating iptables chain traversal — that shows up in p99 latency, which matters if you’re running databases or anything that does a lot of small requests.
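To put the table in relative terms, here is a small sketch that turns two throughput figures into a percentage change. pct_change is a hypothetical helper, and the inputs are simply the numbers from the table above.

```shell
#!/bin/sh
# Hypothetical helper: percentage change going from value a to value b.
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) / a * 100 }'
}

# Flannel/VXLAN (7.2) to Cilium native routing (9.4):  pct_change 7.2 9.4
# Cilium native (9.4) to Cilium + WireGuard (8.1):     pct_change 9.4 8.1
```

On those figures, native routing comes out roughly 31% ahead of Flannel/VXLAN, and turning on WireGuard gives back about 14% of that.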
When NOT to Bother
Let’s be real about the cases where Cilium is the wrong tool:
Raspberry Pi clusters: Cilium’s agent pod sits at roughly 200-250MB of RAM at idle, plus the operator. On a Pi 4 with 4GB, that’s a significant chunk of your total memory for networking alone. Flannel uses maybe 30-40MB. If you’re running a Pi cluster for fun, flannel is the sensible choice.
Single-node clusters: You lose most of the benefits. Same-node traffic is fast with anything. You don’t have inter-node encryption to worry about. NetworkPolicy still works, but Hubble’s service map is less interesting when there’s only one node.
You just want stuff to run: If your goal is deploying apps and not thinking about networking, flannel works and it keeps working. Cilium adds operational surface area. You’re trading simplicity for capability. If you don’t need L7 policies, Hubble, or transparent encryption — keep flannel.
Older kernels: Cilium 1.16 requires kernel 5.10 at minimum, with 5.15+ recommended for full feature support. If you’re on something ancient, check uname -r before you start.
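If you want that kernel check as a preflight step, a minimal sketch follows. kernel_at_least is a made-up helper name; it compares only the major and minor numbers of a release string such as the one uname -r prints, and the 5.10 floor is the figure quoted above, passed in as arguments so you can change it.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a kernel release string (for example
# "6.6.0-generic") is at least MAJOR.MINOR.
kernel_at_least() {
    release=$1
    want_major=$2
    want_minor=$3
    major=${release%%.*}          # text before the first dot
    rest=${release#*.}            # text after the first dot
    minor=${rest%%[!0-9]*}        # leading digits of what remains
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}
```

Usage on a node would look like `kernel_at_least "$(uname -r)" 5 10 || echo "kernel too old for this guide"`.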
Should You Bother?
Yes, if you have:
- Multi-node cluster on capable hardware (not Pi-class)
- A need for visibility — Hubble alone is worth it if you’re debugging networking issues regularly
- Security requirements — L7 policies and WireGuard encryption for anything you care about
- MetalLB already in the mix — Cilium consolidates this
No, if you have:
- A Pi or resource-constrained cluster
- A single-node setup where flannel just works
- No interest in network policy or observability
- A cluster that’s already stable and you’d rather not touch it
The honest answer: Cilium on k3s is a weekend project that pays off over months. The setup is maybe 3-4 hours including troubleshooting. What you get is production-grade networking visibility, real security policy enforcement, and a CNI that won’t be the bottleneck when you start running actual workloads.
Flannel is the sensible default. Cilium is what you install when you’ve outgrown sensible defaults.