
Goldilocks + VPA: Right-Size Pods Without Guessing

By SumGuy 11 min read

You’re Lying to Kubernetes (And It’s Lying Back)

Every Kubernetes deployment has requests and limits. Requests are what the scheduler uses to decide where to place a pod. Limits are the guardrails that prevent a runaway container from eating your node alive. They’re important. They’re also, in most real-world clusters, completely made up.

I’ve seen it. You’ve done it. Somebody copy-pasted requests: cpu: 100m, memory: 128Mi from a Stack Overflow answer five years ago, and now it’s in production, and nobody’s touched it since. Meanwhile the actual app uses 800m CPU under load and 900Mi memory before it gets OOMKilled at 2 AM, which wakes you up, which is not great.

The opposite happens too: devs who have been burned set requests to 2000m and 4Gi “just to be safe,” and now their pods are claiming four times the resources they actually use. Your scheduler thinks the node is full. It isn’t. You’re just bad at estimating.

This is a solvable problem. Goldilocks and the Vertical Pod Autoscaler exist specifically to replace your guesswork with actual data.


VPA: The Component That Watches and Recommends

Vertical Pod Autoscaler (VPA) is a Kubernetes project that observes your containers over time, analyzes their actual CPU and memory consumption, and computes recommendations for what requests and limits should be. It runs as a set of controllers in your cluster.

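Three components ship with VPA: the recommender, which analyzes metrics and computes recommendations; the admission controller, which injects the recommended values at pod creation; and the updater, which evicts pods so that new values can be applied to running workloads.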

VPA has four operating modes, set per-workload via a VPA CRD:

Mode      Behavior
Off       Recommendations computed, nothing applied (read-only)
Initial   Values injected at pod creation only, no live mutations
Recreate  Evicts and recreates pods when recommendations change significantly
Auto      Full lifecycle management: evicts and restarts on updates

For most home lab situations, Off is where you start. Look at the recommendations. Copy the values you agree with into your deployment YAML manually. Trust but verify.

Auto sounds tempting. Don’t use it on StatefulSets with single replicas or anything else where eviction means downtime. The updater will happily restart your Postgres pod at midnight because it decided memory requests should go from 256Mi to 512Mi. Your database will be fine. Your family will be annoyed.


Goldilocks: A Dashboard for Humans

VPA gives you raw CRD output. Goldilocks wraps that in a web UI and adds some opinionated structure on top. It creates VPA objects in Off mode for every workload in namespaces you’ve opted in, then surfaces the recommendations in a clean dashboard showing each workload’s current requests and limits next to the recommended values for both Guaranteed and Burstable QoS.

It’s made by Fairwinds, the same folks behind Polaris and other Kubernetes policy tools. It’s open source, actively maintained, and genuinely useful even if you only run it once a quarter.


Installing the Stack

Step 1: Metrics Server

VPA needs actual metrics to work. If you’re on k3s, metrics-server ships as a bundled HelmChart resource. Verify it’s running:

kubectl get pods -n kube-system | grep metrics-server

If it’s there and Running, you’re good. If not, deploy it:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

On k3s with self-signed certs, you may need to patch the deployment to skip TLS verification:

kubectl patch deployment metrics-server -n kube-system \
  --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

Give it a minute, then confirm it’s collecting data:

kubectl top nodes
kubectl top pods -A

If you see output instead of errors, you’re ready.

Step 2: Install VPA

The VPA controllers aren’t on a Helm chart you’d normally find in a standard registry. Clone the repo and run the installer:

git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

This deploys the three VPA components into kube-system. Verify:

kubectl get pods -n kube-system | grep vpa

You should see vpa-admission-controller, vpa-recommender, and vpa-updater all in Running state.

If you want to install VPA via manifests directly without cloning the whole autoscaler repo, the latest release manifests are at:

VPA_VERSION=1.2.1
kubectl apply -f https://github.com/kubernetes/autoscaler/releases/download/vertical-pod-autoscaler-${VPA_VERSION}/vertical-pod-autoscaler.yaml

Check the VPA releases page to confirm the latest version and the exact manifest file name before running this; the project moves faster than you’d expect.

Step 3: Install Goldilocks

Add the Fairwinds Helm repo and install:

helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update
helm install goldilocks fairwinds-stable/goldilocks \
  --namespace goldilocks \
  --create-namespace \
  --version 9.0.1

Version 9.0.1 is current as of mid-2026. Pin it. Fairwinds occasionally introduces breaking chart changes between minors.

Check that it’s up:

kubectl get pods -n goldilocks

You want goldilocks-controller and goldilocks-dashboard both running.


Enabling Namespaces

Goldilocks is opt-in per namespace. You enable it with a label:

kubectl label namespace myapp goldilocks.fairwinds.com/enabled=true

The Goldilocks controller watches for this label. When it sees it, it creates a VPA resource in Off mode for every Deployment (and optionally StatefulSet, DaemonSet) in that namespace. The VPA recommender starts collecting data immediately.
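To confirm the controller actually did that, you can list the VPA objects in the namespace. A minimal sketch: list_vpas is a hypothetical helper name, and the column paths assume the autoscaling.k8s.io/v1 VerticalPodAutoscaler schema.

```shell
# Hypothetical helper: list VPA objects in a namespace with their update mode.
# Usage: list_vpas <namespace>
list_vpas() {
  kubectl get vpa -n "$1" \
    -o custom-columns='NAME:.metadata.name,MODE:.spec.updatePolicy.updateMode'
}
```

Running list_vpas myapp shortly after labeling should show one row per workload Goldilocks picked up, each with mode Off.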

To enable for multiple namespaces at once:

for ns in staging production monitoring; do
  kubectl label namespace "$ns" goldilocks.fairwinds.com/enabled=true
done

To disable and have Goldilocks clean up the VPA objects:

kubectl label namespace myapp goldilocks.fairwinds.com/enabled-

Viewing the Dashboard

Port-forward the dashboard service:

kubectl port-forward -n goldilocks svc/goldilocks-dashboard 8080:80

Open http://localhost:8080 in your browser. You’ll see each enabled namespace with a table of workloads and their recommendations.

The dashboard shows two columns per recommendation: Guaranteed QoS (requests == limits, no burst allowed) and Burstable QoS (requests < limits, pod can burst). For most workloads you want Burstable — it gives you breathing room without hoarding node capacity.
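Those two columns map onto Kubernetes QoS classes. Here is a rough sketch of the rule for a single container. The helper qos_class is made up for illustration, it compares quantities as plain strings (so 1000m and 1 would not count as equal), and real pods are classified across all of their containers.

```shell
# Hypothetical helper: QoS class for ONE container, simplified.
# Usage: qos_class <cpu_request> <cpu_limit> <mem_request> <mem_limit>
# Pass "" for a value that is not set.
qos_class() {
  if [ -z "$1$2$3$4" ]; then
    # Nothing set at all
    echo BestEffort
  elif [ -n "$2" ] && [ -n "$4" ] && [ "${1:-$2}" = "$2" ] && [ "${3:-$4}" = "$4" ]; then
    # Both limits set and each request equals its limit
    # (an unset request defaults to the limit)
    echo Guaranteed
  else
    echo Burstable
  fi
}
```

For example, qos_class 45m 850m 87Mi 400Mi prints Burstable, while qos_class 45m 45m 87Mi 87Mi prints Guaranteed.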

A sample recommendation might look like:

Workload: api-server
  Current Requests:  cpu: 500m, memory: 512Mi
  Current Limits:    cpu: 1000m, memory: 1Gi
  Recommended (Burstable):
    Requests: cpu: 45m, memory: 87Mi
    Limits:   cpu: 850m, memory: 400Mi

That’s not a typo. A “500m CPU request” workload that actually uses 45m at steady state. That’s the kind of thing you find when you stop guessing.
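To put a number on that gap, here is a small sketch that converts CPU quantities to millicores and works out how much of the current request the recommendation would hand back. The helper names to_millicores and pct_freed are made up for this example, and it only handles plain cores and the m suffix.

```shell
# Hypothetical helper: convert a CPU quantity ("500m", "2", "0.5") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%.0f\n", c * 1000 }' ;;
  esac
}

# Hypothetical helper: percent of the current request freed by the recommendation.
# Usage: pct_freed <current_request> <recommended_request>
pct_freed() {
  cur=$(to_millicores "$1")
  rec=$(to_millicores "$2")
  awk -v c="$cur" -v r="$rec" 'BEGIN { printf "%.0f\n", (c - r) * 100 / c }'
}
```

With the sample above, pct_freed 500m 45m prints 91: roughly nine tenths of that CPU request was never being used.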


The VPA CRD Itself

Goldilocks is creating these under the hood. Knowing what the CRD looks like is useful when you want to check recommendations without the dashboard, or when you want to run VPA on a workload Goldilocks isn’t managing.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server
  namespace: myapp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 10m
          memory: 50Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi

To read the recommendation output directly:

kubectl describe vpa api-server -n myapp

The Status.Recommendation section shows the lower bound, target, and upper bound for each container. Target is your happy path. Lower bound is “don’t go below this.” Upper bound is “anything you request beyond this is probably wasted.”
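If you only want the target numbers, a jsonpath query is less noisy than describe. This is a sketch that assumes the recommendation lives under status.recommendation.containerRecommendations, as in the autoscaling.k8s.io/v1 API; vpa_targets is a hypothetical helper name.

```shell
# Hypothetical helper: print "<container>\t<target cpu>\t<target memory>" per container.
# Usage: vpa_targets <vpa_name> <namespace>
vpa_targets() {
  kubectl get vpa "$1" -n "$2" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target.cpu}{"\t"}{.target.memory}{"\n"}{end}'
}
```

Call it as vpa_targets api-server myapp. Expect empty output until the recommender has produced its first recommendation for the workload.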


Applying the Recommendations

The manual way (recommended for production):

Copy the target values from the dashboard into your deployment YAML:

resources:
  requests:
    cpu: 45m
    memory: 87Mi
  limits:
    cpu: 850m
    memory: 400Mi

Commit it. Let it roll out on the next deploy. This is the right approach for anything stateful or business-critical.

The VPA Auto way (fine for stateless workloads):

Change the updateMode in the VPA object:

spec:
  updatePolicy:
    updateMode: "Auto"

The updater will now evict pods when it decides the current resources are significantly off from the recommendation. The admission controller injects the new values at startup. For a Deployment with multiple replicas, this is mostly fine — pods roll one at a time. For a single-replica StatefulSet, this means downtime. You’ve been warned.


The Time Problem

Here’s where people go wrong: they install Goldilocks, enable a namespace, wait ten minutes, look at the dashboard, and see recommendations that look weird or have huge variance ranges.

VPA needs data. Real data, from real usage, over time. The recommender uses an exponential decay algorithm that weights recent data more heavily, but it needs at least 24 hours of metrics to produce stable recommendations. For bursty workloads — batch jobs, APIs with weekly traffic patterns — you want 7 days before trusting the numbers.

Don’t apply five-minute-old recommendations to production. That’s just replacing one guess with a slightly more confident guess.
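To get a feel for how quickly old samples fade, here is a sketch of plain exponential decay with a half-life. The 24-hour half-life is an assumption used for illustration (check the recommender flags for your VPA version), and sample_weight is a made-up helper, not something VPA ships.

```shell
# Hypothetical helper: relative weight of a usage sample that is <age_hours> old.
# Usage: sample_weight <age_hours> [half_life_hours]
# The half-life defaults to 24 hours, which is an assumption for this example.
sample_weight() {
  awk -v age="$1" -v hl="${2:-24}" \
    'BEGIN { printf "%.3f\n", exp(log(0.5) * age / hl) }'
}
```

Under those assumptions sample_weight 24 prints 0.500 and sample_weight 168 prints 0.008, so a sample from yesterday counts half as much as a fresh one and a sample from a week ago counts for under 1%.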

The good news: in a home lab, you can leave Goldilocks running passively for a week, look at the dashboard once on the weekend, apply the recommendations to everything that’s obviously over-provisioned, and pick up a meaningful chunk of your node capacity back without touching anything critical.


What You’ll Actually Find

On a typical home lab cluster where workloads were deployed with chart defaults or hand-wavy estimates, Goldilocks usually reveals:

Honestly, the first time you run this on a cluster that’s been chugging along for a year, it’s a little humbling. You thought you were good at this.


Caveats and Edge Cases

Bursty apps need Burstable QoS. If your app spikes to 10x CPU during peak load (a build server, a media transcoder, a scraper), set limits well above requests. Don’t use Guaranteed QoS for these — you’ll CPU-throttle yourself during the exact moments that matter.

StatefulSets with single replicas fight the updater. If updateMode is Auto, the updater will evict your pod to apply new resources. Single-replica StatefulSet means downtime. Either use Off mode and apply manually, or add a PodDisruptionBudget that blocks eviction.
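One way to sketch the PodDisruptionBudget option: a budget with maxUnavailable: 0 refuses every voluntary eviction for the pods it selects. The helper name pdb_manifest, the -no-evict name suffix, and the app label selector are all assumptions, so match the selector to your own pod labels. Keep in mind this also blocks node drains for that pod until you remove the budget.

```shell
# Hypothetical helper: print a PodDisruptionBudget that allows zero evictions.
# Usage: pdb_manifest <app_label_value> <namespace>
# Assumes the pods carry the label app=<app_label_value>.
pdb_manifest() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: $1-no-evict
  namespace: $2
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      app: $1
EOF
}
```

Review the output first, then apply it with pdb_manifest postgres myapp | kubectl apply -f -.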

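HPA + VPA conflict. If you’re running Horizontal Pod Autoscaler on a workload and it scales on CPU or memory utilization, don’t let VPA manage that same resource in Auto mode. They’ll fight: VPA changes the request, which changes the utilization percentage HPA is watching. Use VPA for memory only while HPA handles CPU, or scale horizontally on a custom or external metric (KEDA makes that easy) so the two aren’t reacting to the same signal.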
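The memory-only option can be expressed with controlledResources in the container policy. A sketch, assuming a Deployment target; vpa_memory_only is a hypothetical helper and the names you pass in are your own.

```shell
# Hypothetical helper: print a VPA manifest that manages memory only,
# leaving CPU requests alone so an HPA can scale on CPU.
# Usage: vpa_memory_only <deployment_name> <namespace>
vpa_memory_only() {
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: $1
  namespace: $2
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $1
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["memory"]
EOF
}
```

Apply it with vpa_memory_only api-server myapp | kubectl apply -f -. If Goldilocks has already created a VPA for that workload, avoid pointing a second VPA object at it.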

Goldilocks only creates VPAs for Deployments by default. StatefulSets and DaemonSets need explicit configuration. Check the Goldilocks Helm values if you need broader coverage.


Should You Bother?

If you’re running more than five or six services on a k3s node (or any Kubernetes node), yes. Absolutely. The install is under an hour. The payoff is immediate visibility into wasted capacity, and the actual recommendations are usually good enough to apply within a week of data collection.

The sweet spot is this: install it, forget about it for a week, come back, copy the recommendations into your Helm values or deployment YAMLs, and push. You’ll free up 20–40% of your memory requests on a typical home lab cluster. That’s one or two extra services you can run without adding hardware, or just a node that runs cooler and doesn’t trigger your alerting every time there’s a memory spike.

You don’t have to run VPA in Auto mode. You don’t have to make it complicated. Goldilocks in Off mode is basically a permanent advisor that watches your cluster and tells you when you’ve been lying to the scheduler.

Stop guessing. Let the data tell you.


The Bottom Line

Tool                      Role
metrics-server            Exposes container resource metrics
VPA recommender           Analyzes metrics, computes recommendations
VPA admission controller  Injects values at pod creation (Initial/Auto mode)
VPA updater               Evicts pods to apply live updates (Auto mode)
Goldilocks                Creates VPA objects per namespace, surfaces recommendations in a dashboard

Start here:

# 1. Confirm metrics-server is running
kubectl top nodes
# 2. Install VPA (run from autoscaler/vertical-pod-autoscaler in the cloned repo)
./hack/vpa-up.sh
# 3. Install Goldilocks (after adding the fairwinds-stable Helm repo)
helm install goldilocks fairwinds-stable/goldilocks -n goldilocks --create-namespace --version 9.0.1
# 4. Enable a namespace
kubectl label namespace myapp goldilocks.fairwinds.com/enabled=true
# 5. Wait 24+ hours, then open the dashboard
kubectl port-forward -n goldilocks svc/goldilocks-dashboard 8080:80

That’s it. Go look at the dashboard. Try not to be surprised.

