Goldilocks + VPA: Right-Size Pods Without Guessing

You’re Lying to Kubernetes (And It’s Lying Back)

Every Kubernetes deployment has requests and limits. Requests are what the scheduler uses to decide where to place a pod. Limits are the guardrails that prevent a runaway container from eating your node alive. They’re important. They’re also, in most real-world clusters, completely made up.

I’ve seen it. You’ve done it. Somebody copy-pasted requests: cpu: 100m, memory: 128Mi from a Stack Overflow answer five years ago, and now it’s in production, and nobody’s touched it since. Meanwhile the actual app uses 800m CPU under load and 900Mi memory before it gets OOMKilled at 2 AM, which wakes you up, which is not great.

The opposite happens too: devs who have been burned set requests to 2000m and 4Gi “just to be safe,” and now their pods are claiming four times the resources they actually use. Your scheduler thinks the node is full. It isn’t. You’re just bad at estimating.

This is a solvable problem. Goldilocks and the Vertical Pod Autoscaler exist specifically to replace your guesswork with actual data.

VPA: The Component That Watches and Recommends

Vertical Pod Autoscaler (VPA) is a Kubernetes project that observes your containers over time, analyzes their actual CPU and memory consumption, and computes recommendations for what requests and limits should be. It runs as a set of controllers in your cluster.

Three components ship with VPA:

vpa-recommender: collects metrics, builds usage history, outputs recommendations
vpa-admission-controller: intercepts pod creation and can inject recommended values at startup
vpa-updater: in Auto mode, evicts pods so they restart with updated resource values

VPA has four operating modes, set per-workload via a VPA CRD:

Mode	Behavior
`Off`	Recommendations computed, nothing applied, read-only
`Initial`	Values injected at pod creation only, no live mutations
`Recreate`	Evicts and recreates pods when recommendations change significantly
`Auto`	Full lifecycle management, evicts and restarts on updates

For most home lab situations, Off is where you start. Look at the recommendations. Copy the values you agree with into your deployment YAML manually. Trust but verify.

Auto sounds tempting. Don’t use it on StatefulSets with single replicas or anything else where eviction means downtime. The updater will happily restart your Postgres pod at midnight because it decided memory requests should go from 256Mi to 512Mi. Your database will be fine. Your family will be annoyed.

Goldilocks: A Dashboard for Humans

VPA gives you raw CRD output. Goldilocks wraps that in a web UI and adds some opinionated structure on top. It creates VPA objects in Off mode for every workload in namespaces you’ve opted in, then surfaces the recommendations in a clean dashboard showing:

Current requests and limits
Recommended requests and limits
QoS class (Guaranteed vs. Burstable)
The delta between what you have and what you need

It’s made by Fairwinds, the same folks behind Polaris and other Kubernetes policy tools. It’s open source, actively maintained, and genuinely useful even if you only run it once a quarter.

Installing the Stack

Step 1: Metrics Server

VPA needs actual metrics to work. If you’re on k3s, metrics-server ships as a bundled component and is deployed by default. Verify it’s running:

kubectl get pods -n kube-system | grep metrics-server

If it’s there and Running, you’re good. If not, deploy it:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

On k3s with self-signed certs, you may need to patch the deployment to skip TLS verification:

kubectl patch deployment metrics-server -n kube-system \
  --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

Give it a minute, then confirm it’s collecting data:

kubectl top nodes
kubectl top pods -A

If you see output instead of errors, you’re ready.

Step 2: Install VPA

The VPA controllers don’t come from a Helm chart in the registries you’d normally reach for. Clone the repo and run the installer:

git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh

This deploys the three VPA components into kube-system. Verify:

kubectl get pods -n kube-system | grep vpa

You should see vpa-admission-controller, vpa-recommender, and vpa-updater all in Running state.

If you want to install VPA via manifests directly without cloning the whole autoscaler repo, the latest release manifests are at:

VPA_VERSION=1.3.0
kubectl apply -f https://github.com/kubernetes/autoscaler/releases/download/vertical-pod-autoscaler-${VPA_VERSION}/vertical-pod-autoscaler.yaml

Check the VPA releases page to confirm the latest version; it moves faster than you’d expect.

Step 3: Install Goldilocks

Add the Fairwinds Helm repo and install:

helm repo add fairwinds-stable https://charts.fairwinds.com/stable
helm repo update

helm install goldilocks fairwinds-stable/goldilocks \
  --namespace goldilocks \
  --create-namespace \
  --version 10.4.0

Version 10.4.0 is current as of mid-2026. Pin it. Fairwinds occasionally introduces breaking chart changes between minors.

Check that it’s up:

kubectl get pods -n goldilocks

You want goldilocks-controller and goldilocks-dashboard both running.

Enabling Namespaces

Goldilocks is opt-in per namespace. You enable it with a label:

kubectl label namespace myapp goldilocks.fairwinds.com/enabled=true

The Goldilocks controller watches for this label. When it sees it, it creates a VPA resource in Off mode for every Deployment (and optionally StatefulSet, DaemonSet) in that namespace. The VPA recommender starts collecting data immediately.

To enable for multiple namespaces at once:

for ns in staging production monitoring; do
  kubectl label namespace $ns goldilocks.fairwinds.com/enabled=true
done

To disable and have Goldilocks clean up the VPA objects:

kubectl label namespace myapp goldilocks.fairwinds.com/enabled-
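
If you want to confirm the controller actually reacted to the label, two plain kubectl queries are enough. A minimal sketch: goldilocks_namespaces and goldilocks_vpas are hypothetical helper names of mine, not Goldilocks commands, and myapp is the example namespace from above.

```shell
# List every namespace that carries the Goldilocks opt-in label.
goldilocks_namespaces() {
  kubectl get namespaces -l goldilocks.fairwinds.com/enabled=true -o name
}

# List the VPA objects that exist in one namespace (argument: namespace).
goldilocks_vpas() {
  kubectl get vpa -n "$1"
}
```

Run goldilocks_namespaces, then goldilocks_vpas myapp. If the second list is still empty a few minutes after labeling, the VPA CRDs or the Goldilocks controller are the first things I’d check.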

Viewing the Dashboard

Port-forward the dashboard service:

kubectl port-forward -n goldilocks svc/goldilocks-dashboard 8080:80

Open http://localhost:8080 in your browser. You’ll see each enabled namespace with a table of workloads and their recommendations.

The dashboard shows two columns per recommendation: Guaranteed QoS (requests == limits, no burst allowed) and Burstable QoS (requests < limits, pod can burst). For most workloads you want Burstable, because it gives you breathing room without hoarding node capacity.
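
Those two columns map onto how Kubernetes assigns a QoS class. Here’s a rough sketch for a single-container pod. It assumes quantities are written in the same units on both sides (it compares strings, so 1000m and 1 would not match), and qos_class is a hypothetical helper, not a kubectl feature.

```shell
# Classify one container's resources the way Kubernetes assigns QoS classes.
# Arguments: cpu_request cpu_limit mem_request mem_limit ("" means unset).
qos_class() {
  if [ -z "$1$2$3$4" ]; then
    echo BestEffort            # nothing set at all
  elif [ -n "$2" ] && [ -n "$4" ] &&
       [ "${1:-$2}" = "$2" ] && [ "${3:-$4}" = "$4" ]; then
    echo Guaranteed            # both limits set, requests equal to limits
  else
    echo Burstable             # everything in between
  fi
}

qos_class 45m 850m 87Mi 400Mi      # Burstable
qos_class 500m 500m 512Mi 512Mi    # Guaranteed
```

Real pods can have several containers, and every one of them has to qualify before the pod is Guaranteed, so treat this as a mental model rather than a validator.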

A sample recommendation might look like:

Workload: api-server
Current Requests: cpu: 500m, memory: 512Mi
Current Limits:   cpu: 1000m, memory: 1Gi

Recommended (Burstable):
  Requests: cpu: 45m, memory: 87Mi
  Limits:   cpu: 850m, memory: 400Mi

That’s not a typo. A “500m CPU request” workload that actually uses 45m at steady state. That’s the kind of thing you find when you stop guessing.
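
To put a number on that gap, here’s a small sketch that turns CPU quantities into millicores and divides one by the other. Both function names are hypothetical helpers, and the inputs are the sample values above.

```shell
# Convert a Kubernetes CPU quantity ("500m" or "2") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 }' ;;
  esac
}

# Print how many times larger the current request is than the recommendation.
# Arguments: current request, recommended request.
overprovision_factor() {
  awk -v cur="$(to_millicores "$1")" -v rec="$(to_millicores "$2")" \
    'BEGIN { printf "%.1f\n", cur / rec }'
}

overprovision_factor 500m 45m   # prints 11.1
```

An 11x gap on one workload doesn’t sound like much until you multiply it across every pod on the node.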

The VPA CRD Itself

Goldilocks is creating these under the hood. Knowing what the CRD looks like is useful when you want to check recommendations without the dashboard, or when you want to run VPA on a workload Goldilocks isn’t managing.

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server
  namespace: myapp
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Off"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 10m
          memory: 50Mi
        maxAllowed:
          cpu: 2
          memory: 4Gi

To read the recommendation output directly:

kubectl describe vpa api-server -n myapp

The Status.Recommendation section shows the lower bound, target, and upper bound for each container. Target is your happy path. Lower bound is “don’t go below this.” Upper bound is “above this, the recommender thinks something is wrong with your app.”
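
If you only want the target numbers, a jsonpath query is easier to read than the full describe output. A minimal sketch, assuming the VPA has already produced a recommendation (the fields are empty until it has); vpa_targets is a hypothetical wrapper name.

```shell
# Print "container<TAB>cpu<TAB>memory" target values for one VPA object.
# Arguments: VPA name, namespace.
vpa_targets() {
  kubectl get vpa "$1" -n "$2" \
    -o jsonpath='{range .status.recommendation.containerRecommendations[*]}{.containerName}{"\t"}{.target.cpu}{"\t"}{.target.memory}{"\n"}{end}'
}
```

Call it as vpa_targets api-server myapp. Swapping .target for .lowerBound or .upperBound in the query gives you the other two values.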

Applying the Recommendations

The manual way (recommended for production):

Copy the recommended values from the dashboard into your deployment YAML:

resources:
  requests:
    cpu: 45m
    memory: 87Mi
  limits:
    cpu: 850m
    memory: 400Mi

Commit it. Let it roll out on the next deploy. This is the right approach for anything stateful or business-critical.

The VPA Auto way (fine for stateless workloads):

Change the updateMode in the VPA object:

spec:
  updatePolicy:
    updateMode: "Auto"

The updater will now evict pods when it decides the current resources are significantly off from the recommendation. The admission controller injects the new values at startup. For a Deployment with multiple replicas, this is mostly fine: pods are evicted a few at a time rather than all at once. For a single-replica StatefulSet, this means downtime. You’ve been warned.

The Time Problem

Here’s where people go wrong: they install Goldilocks, enable a namespace, wait ten minutes, look at the dashboard, and see recommendations that look weird or have huge variance ranges.

VPA needs data. Real data, from real usage, over time. The recommender uses an exponential decay algorithm that weights recent data more heavily, but it needs at least 24 hours of metrics to produce stable recommendations. For bursty workloads, such as batch jobs or APIs with weekly traffic patterns, you want 7 days before trusting the numbers.
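
To get a feel for what “weights recent data more heavily” means, here’s a sketch of plain exponential decay. The 24-hour half-life is my assumption for illustration (check the recommender flags for your version), and decay_weight is a hypothetical helper, not VPA’s actual code.

```shell
# Weight of a usage sample that is AGE hours old, given a half-life in hours.
# weight = 0.5 ^ (age / half_life)
decay_weight() {
  awk -v age="$1" -v half="$2" \
    'BEGIN { printf "%.3f\n", exp(-log(2) * age / half) }'
}

# Print the weight of samples that are 0, 1, 2 and 7 days old.
for age in 0 24 48 168; do
  printf '%sh old: weight %s\n' "$age" "$(decay_weight "$age" 24)"
done
```

Under that assumption a sample from yesterday counts half as much as one from right now, and one from a week ago barely registers. It also shows why ten minutes of data is a thin basis for a recommendation: there’s nothing older to balance it against.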

Don’t apply five-minute-old recommendations to production. That’s just replacing one guess with a slightly more confident guess.

The good news: in a home lab, you can leave Goldilocks running passively for a week, look at the dashboard once on the weekend, apply the recommendations to everything that’s obviously over-provisioned, and pick up a meaningful chunk of your node capacity back without touching anything critical.

What You’ll Actually Find

On a typical home lab cluster where workloads were deployed with chart defaults or hand-wavy estimates, Goldilocks usually reveals:

Memory: Most apps use 15 to 30% of their requested memory at steady state. Chart defaults are written to work on any machine, so they’re always conservative. Your Grafana instance isn’t doing analytics for Netflix.
CPU: Idle-ish services request 100m to 500m and use 3m to 15m. The scheduler has no idea your “busy” node has 80% of its CPU sitting idle because every pod claimed 200m and uses 8m.
Occasional surprises: One or two workloads that are legitimately under-provisioned. Usually something doing background processing, image resizing, or anything with an in-memory cache that the VPA recommender sees as “usage” even when it’s just warm cache.

Honestly, the first time you run this on a cluster that’s been chugging along for a year, it’s a little humbling. You thought you were good at this.

Caveats and Edge Cases

Bursty apps need Burstable QoS. If your app spikes to 10x CPU during peak load (a build server, a media transcoder, a scraper), set limits well above requests. Don’t use Guaranteed QoS for these, or you’ll CPU-throttle yourself during the exact moments that matter.

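StatefulSets with single replicas fight the updater. If updateMode is Auto, the updater evicts pods to apply new resources, and for a single-replica StatefulSet an eviction means downtime. Upstream VPA only evicts from workloads with at least two replicas by default (the updater’s --min-replicas flag, overridable per VPA via updatePolicy.minReplicas), but don’t lean on a default that someone can change. Either use Off mode and apply manually, or add a PodDisruptionBudget that blocks eviction.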
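
A PodDisruptionBudget that blocks eviction can be as small as this. It’s a sketch, not a drop-in: the function name, the PDB name, and the app=postgres label are all hypothetical, so match the selector to your own pod labels.

```shell
# Print a PodDisruptionBudget that forbids voluntary evictions for pods
# matching one label. Arguments: name, namespace, label key, label value.
pdb_block_eviction() {
  cat <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: $1
  namespace: $2
spec:
  maxUnavailable: 0
  selector:
    matchLabels:
      $3: $4
EOF
}

# Review the output first, then pipe it to: kubectl apply -f -
pdb_block_eviction postgres-no-evict myapp app postgres
```

One trade-off to know about: maxUnavailable: 0 also blocks kubectl drain for that pod, so you’ll have to delete or relax the PDB before node maintenance.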

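HPA + VPA conflict. If you’re running Horizontal Pod Autoscaler on a workload, don’t let VPA manage CPU requests in Auto mode at the same time. They’ll fight, because both react to the same CPU signal. Use VPA for memory only, or scale horizontally on a custom or external metric instead of CPU. KEDA makes that easy, and it plays nicer with VPA as long as its triggers aren’t CPU or memory.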
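
“VPA for memory only” is a one-line change in the container policy: controlledResources. A minimal sketch, assuming the target is a Deployment with the same name as the VPA; vpa_memory_only is a hypothetical helper that just prints the manifest.

```shell
# Print a VPA manifest that manages memory but leaves CPU requests alone.
# Arguments: Deployment name (also used as the VPA name), namespace.
vpa_memory_only() {
  cat <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: $1
  namespace: $2
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: $1
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["memory"]
EOF
}

# Review the output first, then pipe it to: kubectl apply -f -
vpa_memory_only api-server myapp
```

With CPU left out of controlledResources, an HPA scaling on CPU utilization and the VPA no longer touch the same number.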

Goldilocks only creates VPAs for Deployments by default. StatefulSets and DaemonSets need explicit configuration. Check the Goldilocks Helm values if you need broader coverage.

Should You Bother?

If you’re running more than five or six services on a k3s node (or any Kubernetes node), yes. Absolutely. The install takes under an hour. The payoff is immediate visibility into wasted capacity, and the actual recommendations are usually good enough to apply within a week of data collection.

The sweet spot is this: install it, forget about it for a week, come back, copy the recommendations into your Helm values or deployment YAMLs, and push. You’ll free up 20 to 40% of your memory requests on a typical home lab cluster. That’s one or two extra services you can run without adding hardware, or just a node that runs cooler and doesn’t trigger your alerting every time there’s a memory spike.

You don’t have to run VPA in Auto mode. You don’t have to make it complicated. Goldilocks in Off mode is basically a permanent advisor that watches your cluster and tells you when you’ve been lying to the scheduler.

Stop guessing. Let the data tell you.

The Bottom Line

Tool	Role
metrics-server	Exposes container resource metrics
VPA recommender	Analyzes metrics, computes recommendations
VPA admission controller	Injects values at pod creation (Initial/Auto mode)
VPA updater	Evicts pods to apply live updates (Auto mode)
Goldilocks	Creates VPA objects per namespace, surfaces recommendations in a dashboard

Start here:

# 1. Confirm metrics-server is running
kubectl top nodes

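# 2. Install VPA (run from autoscaler/vertical-pod-autoscaler in your clone)
./hack/vpa-up.sh

# 3. Install Goldilocks (after adding the fairwinds-stable Helm repo)
helm install goldilocks fairwinds-stable/goldilocks -n goldilocks --create-namespace --version 10.4.0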

# 4. Enable a namespace
kubectl label namespace myapp goldilocks.fairwinds.com/enabled=true

# 5. Wait 24+ hours, then
kubectl port-forward -n goldilocks svc/goldilocks-dashboard 8080:80

That’s it. Go look at the dashboard. Try not to be surprised.
