You’re Lying to Kubernetes (And It’s Lying Back)
Every Kubernetes deployment has requests and limits. Requests are what the scheduler uses to decide where to place a pod. Limits are the guardrails that prevent a runaway container from eating your node alive. They’re important. They’re also, in most real-world clusters, completely made up.
I’ve seen it. You’ve done it. Somebody copy-pasted requests: cpu: 100m, memory: 128Mi from a StackOverflow answer five years ago, and now it’s in production, and nobody’s touched it since. Meanwhile the actual app uses 800m CPU under load and 900Mi memory before it gets OOMKilled at 2 AM, which wakes you up, which is not great.
The opposite happens too: devs who have been burned set requests to 2000m and 4Gi “just to be safe,” and now their pods are claiming four times the resources they actually use. Your scheduler thinks the node is full. It isn’t. You’re just bad at estimating.
This is a solvable problem. Goldilocks and the Vertical Pod Autoscaler exist specifically to replace your guesswork with actual data.
VPA: The Component That Watches and Recommends
Vertical Pod Autoscaler (VPA) is a Kubernetes project that observes your containers over time, analyzes their actual CPU and memory consumption, and computes recommendations for what requests and limits should be. It runs as a set of controllers in your cluster.
Three components ship with VPA:
- vpa-recommender — collects metrics, builds usage history, outputs recommendations
- vpa-admission-controller — intercepts pod creation and can inject recommended values at startup
- vpa-updater — in Auto mode, evicts pods so they restart with updated resource values
VPA has four operating modes, set per-workload via a VPA CRD:
| Mode | Behavior |
|---|---|
| Off | Recommendations computed, nothing applied (read-only) |
| Initial | Values injected at pod creation only, no live mutations |
| Recreate | Evicts and recreates pods when recommendations change significantly |
| Auto | Full lifecycle management: evicts and restarts on updates |
For most home lab situations, Off is where you start. Look at the recommendations. Copy the values you agree with into your deployment YAML manually. Trust but verify.
Auto sounds tempting. Don’t use it on StatefulSets with single replicas or anything else where eviction means downtime. The updater will happily restart your Postgres pod at midnight because it decided memory requests should go from 256Mi to 512Mi. Your database will be fine. Your family will be annoyed.
Goldilocks: A Dashboard for Humans
VPA gives you raw CRD output. Goldilocks wraps that in a web UI and adds some opinionated structure on top. It creates VPA objects in Off mode for every workload in namespaces you’ve opted in, then surfaces the recommendations in a clean dashboard showing:
- Current requests and limits
- Recommended requests and limits
- QoS class (Guaranteed vs. Burstable)
- The delta between what you have and what you need
It’s made by Fairwinds, the same folks behind Polaris and other Kubernetes policy tools. It’s open source, actively maintained, and genuinely useful even if you only run it once a quarter.
Installing the Stack
Step 1: Metrics Server
VPA needs actual metrics to work. If you’re on k3s, metrics-server ships as a packaged component and is deployed by default. Verify it’s running:

    kubectl get pods -n kube-system | grep metrics-server

If it’s there and Running, you’re good. If not, deploy it:

    kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

On k3s with self-signed certs, you may need to patch the deployment to skip TLS verification:

    kubectl patch deployment metrics-server -n kube-system \
      --type='json' \
      -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

Give it a minute, then confirm it’s collecting data:

    kubectl top nodes
    kubectl top pods -A

If you see output instead of errors, you’re ready.
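“Give it a minute” can be scripted. Here is a minimal sketch of a retry helper; the name `retry_until_ok` and its arguments are made up for illustration and are not part of kubectl or metrics-server.

```shell
# retry_until_ok ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: re-runs COMMAND until it exits 0 or ATTEMPTS runs out.
retry_until_ok() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Discard the command's own output; only its exit status matters here.
    if "$@" >/dev/null 2>&1; then
      echo "ok after $i attempt(s)"
      return 0
    fi
    # Sleep between attempts, but not after the final failure.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "gave up after $attempts attempt(s)" >&2
  return 1
}
```

Something like `retry_until_ok 12 10 kubectl top nodes` would poll every ten seconds for up to two minutes, which is usually enough for metrics-server to produce its first samples.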
Step 2: Install VPA
The VPA controllers aren’t on a Helm chart you’d normally find in a standard registry. Clone the repo and run the installer:

    git clone https://github.com/kubernetes/autoscaler.git
    cd autoscaler/vertical-pod-autoscaler
    ./hack/vpa-up.sh

This deploys the three VPA components into kube-system. Verify:

    kubectl get pods -n kube-system | grep vpa

You should see vpa-admission-controller, vpa-recommender, and vpa-updater all in Running state.
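If you would rather have a script confirm that than eyeball it, here is a rough sketch. The function name `check_vpa_pods` is hypothetical, and it assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, ...), so STATUS is the third field.

```shell
# check_vpa_pods: reads `kubectl get pods` output on stdin and exits non-zero
# if any of the three VPA components has no pod in Running state.
check_vpa_pods() {
  awk '
    BEGIN { split("vpa-admission-controller vpa-recommender vpa-updater", want, " ") }
    {
      # Match on the pod-name prefix; column 3 is assumed to be STATUS.
      for (i in want)
        if (index($1, want[i]) == 1 && $3 == "Running") seen[want[i]] = 1
    }
    END {
      missing = 0
      for (i in want)
        if (!(want[i] in seen)) { print "not running: " want[i]; missing = 1 }
      exit missing
    }'
}
```

Usage would be `kubectl get pods -n kube-system | check_vpa_pods && echo "VPA is up"`.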
If you want to install VPA via manifests directly without cloning the whole autoscaler repo, the latest release manifests are at:

    VPA_VERSION=1.2.1
    kubectl apply -f https://github.com/kubernetes/autoscaler/releases/download/vertical-pod-autoscaler-${VPA_VERSION}/vertical-pod-autoscaler.yaml

Check the VPA releases page to confirm the latest version; it moves faster than you’d expect.
Step 3: Install Goldilocks
Add the Fairwinds Helm repo and install:

    helm repo add fairwinds-stable https://charts.fairwinds.com/stable
    helm repo update
    helm install goldilocks fairwinds-stable/goldilocks \
      --namespace goldilocks \
      --create-namespace \
      --version 9.0.1

Version 9.0.1 is current as of mid-2026. Pin it. Fairwinds occasionally introduces breaking chart changes between minors.

Check that it’s up:

    kubectl get pods -n goldilocks

You want goldilocks-controller and goldilocks-dashboard both running.
Enabling Namespaces
Goldilocks is opt-in per namespace. You enable it with a label:

    kubectl label namespace myapp goldilocks.fairwinds.com/enabled=true

The Goldilocks controller watches for this label. When it sees it, it creates a VPA resource in Off mode for every Deployment (and optionally StatefulSet, DaemonSet) in that namespace. The VPA recommender starts collecting data immediately.

To enable for multiple namespaces at once:

    for ns in staging production monitoring; do
      kubectl label namespace "$ns" goldilocks.fairwinds.com/enabled=true
    done

To disable and have Goldilocks clean up the VPA objects:

    kubectl label namespace myapp goldilocks.fairwinds.com/enabled-

Viewing the Dashboard
Port-forward the dashboard service:

    kubectl port-forward -n goldilocks svc/goldilocks-dashboard 8080:80

Open http://localhost:8080 in your browser. You’ll see each enabled namespace with a table of workloads and their recommendations.
The dashboard shows two columns per recommendation: Guaranteed QoS (requests == limits, no burst allowed) and Burstable QoS (requests < limits, pod can burst). For most workloads you want Burstable — it gives you breathing room without hoarding node capacity.
A sample recommendation might look like:

    Workload: api-server
    Current Requests: cpu: 500m, memory: 512Mi
    Current Limits: cpu: 1000m, memory: 1Gi

    Recommended (Burstable):
      Requests: cpu: 45m, memory: 87Mi
      Limits: cpu: 850m, memory: 400Mi

That’s not a typo. A “500m CPU request” workload that actually uses 45m at steady state. That’s the kind of thing you find when you stop guessing.
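To put a number on that gap, here is a small sketch of the arithmetic. The helper names `to_millicores` and `request_reduction` are made up for this post, and the parsing only handles plain cores (`2`, `0.5`) and millicores (`500m`).

```shell
# to_millicores: converts a Kubernetes CPU quantity to an integer millicore count.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;                                            # "500m" -> 500
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;  # "2" -> 2000
  esac
}

# request_reduction CURRENT RECOMMENDED
# Prints how many millicores the scheduler gets back, and the percentage cut.
request_reduction() {
  local cur rec
  cur=$(to_millicores "$1")
  rec=$(to_millicores "$2")
  echo "freed=$((cur - rec))m pct=$(( (cur - rec) * 100 / cur ))"
}

request_reduction 500m 45m   # prints "freed=455m pct=91"
```

That is 455m of schedulable CPU handed back from a single workload, which is the whole pitch in one line.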
The VPA CRD Itself
Goldilocks is creating these under the hood. Knowing what the CRD looks like is useful when you want to check recommendations without the dashboard, or when you want to run VPA on a workload Goldilocks isn’t managing.

    apiVersion: autoscaling.k8s.io/v1
    kind: VerticalPodAutoscaler
    metadata:
      name: api-server
      namespace: myapp
    spec:
      targetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: api-server
      updatePolicy:
        updateMode: "Off"
      resourcePolicy:
        containerPolicies:
          - containerName: "*"
            minAllowed:
              cpu: 10m
              memory: 50Mi
            maxAllowed:
              cpu: 2
              memory: 4Gi

To read the recommendation output directly:

    kubectl describe vpa api-server -n myapp

The Status.Recommendation section shows the lower bound, target, and upper bound for each container. Target is your happy path. Lower bound is “don’t go below this.” Upper bound is “requesting more than this is probably wasted.”
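If you want those three numbers in a script rather than in `describe` output, here is a sketch. It assumes the recommendation lives under `.status.recommendation.containerRecommendations` in the VPA object; the function names and the example bounds are hypothetical.

```shell
# vpa_cpu_bounds NAME NAMESPACE
# Prints one line per container: name, lower bound, target, upper bound (CPU).
# Needs a live cluster, so it is only defined here, not called.
vpa_cpu_bounds() {
  kubectl get vpa "$1" -n "$2" -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.lowerBound.cpu}{"\t"}{.target.cpu}{"\t"}{.upperBound.cpu}{"\n"}{end}'
}

# classify_request CURRENT LOWER UPPER (all integer millicores)
# Says where the current request sits relative to the recommended range.
classify_request() {
  if [ "$1" -lt "$2" ]; then
    echo "below-lower-bound"
  elif [ "$1" -gt "$3" ]; then
    echo "above-upper-bound"
  else
    echo "within-bounds"
  fi
}

# Made-up bounds of 25m..120m against the 500m request from the sample above.
classify_request 500 25 120   # prints "above-upper-bound"
```

Anything that lands outside the bounds is a candidate for a manual fix; anything inside is probably not worth a redeploy.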
Applying the Recommendations
The manual way (recommended for production):
Copy the target values from the dashboard into your deployment YAML:

    resources:
      requests:
        cpu: 45m
        memory: 87Mi
      limits:
        cpu: 850m
        memory: 400Mi

Commit it. Let it roll out on the next deploy. This is the right approach for anything stateful or business-critical.
The VPA Auto way (fine for stateless workloads):
Change the updateMode in the VPA object:

    spec:
      updatePolicy:
        updateMode: "Auto"

The updater will now evict pods when it decides the current resources are significantly off from the recommendation. The admission controller injects the new values at startup. For a Deployment with multiple replicas, this is mostly fine: pods roll one at a time. For a single-replica StatefulSet, this means downtime. You’ve been warned.
The Time Problem
Here’s where people go wrong: they install Goldilocks, enable a namespace, wait ten minutes, look at the dashboard, and see recommendations that look weird or have huge variance ranges.
VPA needs data. Real data, from real usage, over time. The recommender uses an exponential decay algorithm that weights recent data more heavily, but it needs at least 24 hours of metrics to produce stable recommendations. For bursty workloads — batch jobs, APIs with weekly traffic patterns — you want 7 days before trusting the numbers.
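To get a feel for what “weights recent data more heavily” means, here is a toy calculation of an exponential-decay weight. The 24-hour half-life is an assumption chosen for illustration, and the real recommender is more involved than a single formula, so treat this as intuition rather than a description of its internals.

```shell
# decay_weight AGE_HOURS HALF_LIFE_HOURS
# Relative weight of a sample that is AGE_HOURS old: 0.5 ^ (age / half_life).
decay_weight() {
  awk -v age="$1" -v half="$2" 'BEGIN { printf "%.4f\n", exp(-log(2) * age / half) }'
}

decay_weight 0 24     # prints 1.0000 (a fresh sample counts fully)
decay_weight 24 24    # prints 0.5000 (a day-old sample counts half)
decay_weight 168 24   # prints 0.0078 (a week-old sample barely counts)
```

With only ten minutes of history, every sample has a weight near 1 and there are very few of them, so one spike dominates the result.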
Don’t apply five-minute-old recommendations to production. That’s just replacing one guess with a slightly more confident guess.
The good news: in a home lab, you can leave Goldilocks running passively for a week, look at the dashboard once on the weekend, apply the recommendations to everything that’s obviously over-provisioned, and pick up a meaningful chunk of your node capacity back without touching anything critical.
What You’ll Actually Find
On a typical home lab cluster where workloads were deployed with chart defaults or hand-wavy estimates, Goldilocks usually reveals:
- Memory: Most apps use 15–30% of their requested memory at steady state. Chart defaults are written to work on any machine, so they’re always conservative. Your Grafana instance isn’t doing analytics for Netflix.
- CPU: Idle-ish services request 100m–500m and use 3m–15m. The scheduler has no idea your “busy” node has 80% of its CPU sitting idle because every pod claimed 200m and uses 8m.
- Occasional surprises: One or two workloads that are legitimately under-provisioned. Usually something doing background processing, image resizing, or anything with an in-memory cache that the VPA recommender sees as “usage” even when it’s just warm cache.
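You can sanity-check numbers like these before Goldilocks has any history by totalling live usage for a namespace. A rough sketch, assuming `kubectl top pods --no-headers` prints CPU in millicores (`8m`) and memory in `Mi`; the function name `sum_top_usage` and the sample rows are made up.

```shell
# sum_top_usage: reads `kubectl top pods -n <ns> --no-headers` on stdin
# (columns: NAME CPU MEMORY) and prints the namespace totals.
sum_top_usage() {
  awk '
    {
      cpu = $2; mem = $3
      sub(/m$/, "", cpu)    # "8m"    -> 8
      sub(/Mi$/, "", mem)   # "120Mi" -> 120
      c += cpu; m += mem
    }
    END { printf "cpu=%dm mem=%dMi\n", c, m }'
}

# Hypothetical sample data standing in for real `kubectl top` output.
printf '%s\n' \
  'grafana-abc   8m   120Mi' \
  'api-def       3m    90Mi' \
  'worker-ghi   12m   100Mi' | sum_top_usage   # prints "cpu=23m mem=310Mi"
```

Against a live cluster that would be `kubectl top pods -n myapp --no-headers | sum_top_usage`. Compare the totals with the sum of the requests in the same namespace and the gap tends to speak for itself.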
Honestly, the first time you run this on a cluster that’s been chugging along for a year, it’s a little humbling. You thought you were good at this.
Caveats and Edge Cases
Bursty apps need Burstable QoS. If your app spikes to 10x CPU during peak load (a build server, a media transcoder, a scraper), set limits well above requests. Don’t use Guaranteed QoS for these — you’ll CPU-throttle yourself during the exact moments that matter.
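StatefulSets with single replicas fight the updater. If updateMode is Auto, the updater will evict your pod to apply new resources, and for a single-replica StatefulSet that means downtime. The updater has a --min-replicas setting (2 by default, as far as I know) that makes it skip workloads below that replica count, so check how yours is configured rather than assuming either way. Either use Off mode and apply manually, or add a PodDisruptionBudget that blocks eviction.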
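A PodDisruptionBudget that blocks eviction can be as small as this sketch. The `app: <name>` selector and the `-no-evict` naming are assumptions; match them to whatever labels your pods actually carry.

```shell
# pdb_block_eviction APP_LABEL NAMESPACE
# Prints a PodDisruptionBudget that allows zero voluntary disruptions.
pdb_block_eviction() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: $1-no-evict
  namespace: $2
spec:
  maxUnavailable: 0        # no voluntary evictions allowed
  selector:
    matchLabels:
      app: $1              # assumed label; adjust to your pods
EOF
}
```

Apply it with `pdb_block_eviction postgres myapp | kubectl apply -f -`. Keep in mind that `maxUnavailable: 0` also blocks `kubectl drain` on that node until you delete the budget, so it is a blunt instrument.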
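HPA + VPA conflict. If you’re running Horizontal Pod Autoscaler on a workload, don’t let VPA manage CPU requests in Auto mode at the same time. They’ll fight. Use VPA for memory only while HPA scales on CPU, or scale on custom or external metrics instead. KEDA doesn’t sidestep this by itself, since it drives an HPA under the hood, but its event-based triggers (queue depth, request rate) don’t depend on the requests VPA is changing.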
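“VPA for memory only” is set in the VPA object through `controlledResources` in the container policy. A minimal sketch, assuming a Deployment target and a hypothetical helper name:

```shell
# vpa_memory_only NAME NAMESPACE
# Prints a VPA that manages memory requests only, leaving CPU for the HPA.
vpa_memory_only() {
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: $1
  namespace: $2
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $1
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["memory"]   # CPU requests are left untouched
EOF
}
```

`vpa_memory_only api-server myapp | kubectl apply -f -` would apply it. If Goldilocks already created a VPA for the same workload, use a different name or leave that workload out of Goldilocks, so two VPA objects are not targeting one Deployment.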
Goldilocks only creates VPAs for Deployments by default. StatefulSets and DaemonSets need explicit configuration. Check the Goldilocks Helm values if you need broader coverage.
Should You Bother?
If you’re running more than five or six services on a k3s node (or any Kubernetes node), yes. Absolutely. The install is under an hour. The payoff is immediate visibility into wasted capacity, and the actual recommendations are usually good enough to apply within a week of data collection.
The sweet spot is this: install it, forget about it for a week, come back, copy the recommendations into your Helm values or deployment YAMLs, and push. You’ll free up 20–40% of your memory requests on a typical home lab cluster. That’s one or two extra services you can run without adding hardware, or just a node that runs cooler and doesn’t trigger your alerting every time there’s a memory spike.
You don’t have to run VPA in Auto mode. You don’t have to make it complicated. Goldilocks in Off mode is basically a permanent advisor that watches your cluster and tells you when you’ve been lying to the scheduler.
Stop guessing. Let the data tell you.
The Bottom Line
| Tool | Role |
|---|---|
| metrics-server | Exposes container resource metrics |
| VPA recommender | Analyzes metrics, computes recommendations |
| VPA admission controller | Injects values at pod creation (Initial, Recreate, and Auto modes) |
| VPA updater | Evicts pods to apply live updates (Recreate and Auto modes) |
| Goldilocks | Creates VPA objects per namespace, surfaces recommendations in a dashboard |
Start here:

    # 1. Confirm metrics-server is running
    kubectl top nodes

    # 2. Install VPA (run from autoscaler/vertical-pod-autoscaler)
    ./hack/vpa-up.sh

    # 3. Install Goldilocks
    helm install goldilocks fairwinds-stable/goldilocks -n goldilocks --create-namespace

    # 4. Enable a namespace
    kubectl label namespace myapp goldilocks.fairwinds.com/enabled=true

    # 5. Wait 24+ hours, then
    kubectl port-forward -n goldilocks svc/goldilocks-dashboard 8080:80

That’s it. Go look at the dashboard. Try not to be surprised.