OPA & Gatekeeper: Policy as Code

The Moment You Need a Bouncer

Your cluster started with just you. You deployed whatever you wanted, ran containers however felt right, and life was blissful chaos. Then the team grew. Then the standards committee got involved. Then someone asked: “Wait, why does our database pod have hostPath access to the entire filesystem?” and suddenly it’s 3 AM and you’re explaining why that wasn’t a good idea.

Welcome to policy enforcement.

Kubernetes gives you incredible power, maybe too much. Any user with pod creation rights can request privileged mode, mount the host filesystem, pull images from anywhere, or spike resource usage to tank the cluster. You could solve this with yelling and Slack messages. Or you could use Open Policy Agent (OPA) and Gatekeeper to automate it. Think of it as hiring a bouncer: policies are the rules, Gatekeeper is the velvet rope, and Rego is the clipboard the bouncer reads from.

This article covers what OPA and Gatekeeper are, how they work together, real-world constraints for a homelab Kubernetes cluster, and why you’d pick them over alternatives like Kyverno.

What Is OPA (Open Policy Agent)?

OPA is a general-purpose policy engine. It’s not Kubernetes-specific: you can use it to enforce policies on Terraform plans, REST APIs, Envoy proxies, or even static files. The magic is in Rego, a declarative language designed for policy logic.

At its core, OPA answers one question: given an input (a JSON document), what does the policy decide? OPA can return any JSON value as its decision, but for admission control it boils down to allow or deny, and you control what triggers a denial by writing rules.

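A simple Rego rule looks like this (written in the classic pre-1.0 syntax used throughout this article; standalone OPA 1.0 and later expects the rule head written as deny contains msg if):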

package main

# Deny if the input is invalid
deny[msg] {
    not input.name
    msg := "name field is required"
}

If input.name is missing, the policy returns a denial with a message. OPA doesn’t care what the input represents; it’s all JSON to OPA. That’s why it’s so flexible. Kubernetes just happens to be a really good use case.
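To make the input-to-decision flow concrete, here is a minimal Python model of that one rule. It is not how OPA evaluates Rego internally; the function name deny is borrowed from the rule above and the sample inputs are made up.

```python
def deny(input_doc: dict) -> set[str]:
    """Return the set of denial messages produced for one JSON input."""
    messages = set()
    # `not input.name` holds when the field is absent or is literally false.
    if "name" not in input_doc or input_doc["name"] is False:
        messages.add("name field is required")
    return messages


print(deny({"name": "web"}))   # set()
print(deny({"replicas": 3}))   # {'name field is required'}
```

An empty set means "nothing to complain about"; any message in the set is a reason to reject the input.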

Gatekeeper: OPA for Kubernetes Admission Control

Gatekeeper is the Kubernetes controller that wraps OPA in a validating admission webhook. It doesn’t sit in front of the API server; the API server calls it after a request has been authenticated and authorized and before the object is persisted. That means the pods, deployments, services, and custom resources you try to create or update all pass through it, subject to the webhook’s configured rules.

When you apply a manifest, the API server sends the request to Gatekeeper’s webhook, Gatekeeper evaluates it with OPA, and the API server allows or denies the request based on the answer. It’s transparent: users don’t interact with OPA directly; they just get denied if their manifest violates a policy.
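The request/response exchange can be sketched in a few lines of Python. This is a simplified model, not Gatekeeper’s code (Gatekeeper is written in Go and formats its messages differently); the function name admission_response and the sample uid are made up, while the AdmissionReview field names follow the admission.k8s.io/v1 API.

```python
def admission_response(review: dict, violations: list[str]) -> dict:
    """Build the AdmissionReview reply for one incoming request."""
    response = {"uid": review["request"]["uid"], "allowed": not violations}
    if violations:
        # A denial carries an HTTP-style status that kubectl shows to the user.
        response["status"] = {"code": 403, "message": "; ".join(violations)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


request = {
    "apiVersion": "admission.k8s.io/v1",
    "kind": "AdmissionReview",
    "request": {"uid": "example-uid", "operation": "CREATE", "object": {"kind": "Pod"}},
}
reply = admission_response(request, ["Privileged container test-pod not allowed"])
print(reply["response"]["allowed"])  # False
```

The reply must echo the uid of the request, which is how the API server pairs answers with questions.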

Two key Gatekeeper concepts:

ConstraintTemplate: defines a reusable policy (the bouncer’s manual)
Constraint: applies that policy to specific resources (the bouncer at a specific venue)

A ConstraintTemplate contains Rego code and describes what inputs it validates. A Constraint says “enforce this template on all Pods in the default namespace” or “enforce it on all Deployments everywhere.”

Installing Gatekeeper

If you’re running a decent Kubernetes cluster (even a homelab one), installation is straightforward:

kubectl apply -f https://raw.githubusercontent.com/open-policy-agent/gatekeeper/v3.22.2/deploy/gatekeeper.yaml

This deploys the Gatekeeper controller manager (which serves the webhook), the audit controller, the CRDs, and the RBAC rules. It all lives in the gatekeeper-system namespace and watches your cluster from there.

To verify it’s running:

kubectl get pods -n gatekeeper-system
kubectl get validatingwebhookconfigurations | grep gatekeeper

You’ll see something like gatekeeper-validating-webhook-configuration. That’s the hook that intercepts API requests.

ConstraintTemplate and Constraint: The Two-Part Enforcement

Here’s where the abstraction pays off. A ConstraintTemplate is the policy definition: YAML plus Rego code. A Constraint is a lightweight YAML file that references a ConstraintTemplate and says which resources to enforce it on.

Why split it? Because you might want the same policy rule (e.g., “no privileged pods”) applied in different ways (all namespaces vs. just production, pods only vs. all workloads, etc.). The template is the logic; the constraint is the binding.
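One way to picture the split is a function (the template) plus a binding that decides which objects the function sees (the constraint). The sketch below is a Python analogy under that assumption; bind, privileged_rule, and the sample pod are illustrative names, not Gatekeeper APIs.

```python
from typing import Callable

Rule = Callable[[dict], list[str]]  # object -> violation messages


def privileged_rule(obj: dict) -> list[str]:
    """Template: the reusable logic."""
    return [
        f"Privileged container {c['name']} not allowed"
        for c in obj.get("spec", {}).get("containers", [])
        if (c.get("securityContext") or {}).get("privileged") is True
    ]


def bind(rule: Rule, kinds: set[str], excluded_namespaces: frozenset = frozenset()) -> Rule:
    """Constraint: choose which objects the rule is applied to."""
    def evaluate(obj: dict) -> list[str]:
        namespace = obj.get("metadata", {}).get("namespace", "default")
        if obj.get("kind") not in kinds or namespace in excluded_namespaces:
            return []  # object not selected, so the rule never runs
        return rule(obj)
    return evaluate


everywhere = bind(privileged_rule, kinds={"Pod"})
skip_system = bind(privileged_rule, kinds={"Pod"}, excluded_namespaces=frozenset({"kube-system"}))

pod = {
    "kind": "Pod",
    "metadata": {"name": "cni", "namespace": "kube-system"},
    "spec": {"containers": [{"name": "agent", "securityContext": {"privileged": True}}]},
}
print(everywhere(pod))   # ['Privileged container agent not allowed']
print(skip_system(pod))  # []
```

Same logic, two bindings, two different outcomes for the same pod.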

Example 1: Disallow Privileged Pods

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sdisallowprivileged
spec:
  crd:
    spec:
      names:
        kind: K8sDisallowPrivileged
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8sdisallowprivileged

        # Gatekeeper requires the rule to be named "violation" and
        # each result to be an object with a "msg" field.
        violation[{"msg": msg}] {
            container := input.review.object.spec.containers[_]
            container.securityContext.privileged == true
            msg := sprintf("Privileged container %v not allowed", [container.name])
        }

This ConstraintTemplate:

Defines a new custom resource type called K8sDisallowPrivileged (the name is arbitrary, so use something descriptive; metadata.name must be its lowercase form)
Targets Kubernetes admission webhooks
Contains a Rego rule that reports a violation for any pod with a container that sets securityContext.privileged: true

The Rego logic is straightforward: iterate through all containers (input.review.object.spec.containers[_]), and if any has privileged == true, report a violation with a message. Note that this rule does not look at initContainers or ephemeralContainers; cover them with additional rules if you need to.

Now apply this template:

kubectl apply -f template.yaml

The template alone doesn’t enforce anything; it just makes the policy available. To actually enforce it, create a Constraint:

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sDisallowPrivileged
metadata:
  name: block-privileged-pods
spec:
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    excludedNamespaces: ["kube-system", "gatekeeper-system"]

This Constraint says: “Apply the K8sDisallowPrivileged policy to all Pods, except those in kube-system and gatekeeper-system.”

Try to create a privileged pod:

kubectl run test-pod --image=nginx --privileged

You’ll get:

Error from server (Forbidden): admission webhook "validation.gatekeeper.sh" denied the request: [block-privileged-pods] Privileged container test-pod not allowed

It worked. The bouncer saw a rule violation and sent the request away.

Real-World Constraints for a Homelab

Let’s build a few practical policies you’d want in a real cluster.

Constraint 2: Require Resource Limits

Uncontrolled pods can starve your cluster. Require every container to declare CPU and memory requests/limits:

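package k8srequiredresources

violation[{"msg": msg}] {
    container := input.review.object.spec.containers[_]
    not container.resources.limits
    msg := sprintf("Container %v must define resource limits", [container.name])
}

violation[{"msg": msg}] {
    container := input.review.object.spec.containers[_]
    not container.resources.requests
    msg := sprintf("Container %v must define resource requests", [container.name])
}

Create the ConstraintTemplate (same structure as before, just swap the Rego code), then create a Constraint that applies it to Pods. The rule reads spec.containers, which only exists on Pods; Deployments keep their containers under spec.template.spec.containers, so they need their own rule, although the Pods a Deployment creates are still checked at admission. Note too that the rule only checks that the limits and requests blocks exist, not that each one sets both cpu and memory.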
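If you do want one check to cover both Pods and pod-templated workloads, the container list has to be looked up in two places. Here is a hedged Python model of that lookup; pod_spec and missing_resources are hypothetical helper names, and the sketch only handles kinds that keep their template at spec.template (Deployment, StatefulSet, DaemonSet, Job), not CronJob.

```python
def pod_spec(obj: dict) -> dict:
    """Find the pod spec: top-level for Pods, under spec.template otherwise."""
    if obj.get("kind") == "Pod":
        return obj.get("spec", {})
    return obj.get("spec", {}).get("template", {}).get("spec", {})


def missing_resources(obj: dict) -> list[str]:
    """List containers that lack a limits or requests block."""
    messages = []
    for container in pod_spec(obj).get("containers", []):
        resources = container.get("resources", {})
        for field in ("limits", "requests"):
            if field not in resources:
                messages.append(
                    f"Container {container['name']} must define resource {field}"
                )
    return messages


deployment = {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"containers": [
        {"name": "web", "resources": {"limits": {"memory": "128Mi"}}},
    ]}}},
}
print(missing_resources(deployment))
# ['Container web must define resource requests']
```

In Rego the same idea usually shows up as two rules, one per path, rather than a helper function.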

Constraint 3: Require Specific Labels

For billing, ownership, or cost-center tracking, enforce that certain labels are present:

package k8srequiredlabels

required_labels := ["owner", "cost-center", "app"]

violation[{"msg": msg}] {
    missing_label := required_labels[_]
    not input.review.object.metadata.labels[missing_label]
    msg := sprintf("Label %v is required", [missing_label])
}

This iterates through your required labels and reports one violation per missing label. Users can’t deploy anything the Constraint matches without tagging it properly.

Constraint 4: Restrict Image Registries

Block images from untrusted registries. Only allow images from your private registry or a curated list:

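package k8sallowedregistries

allowed_registries := [
    "docker.io/library/",
    "ghcr.io/",
    "registry.internal:5000/"
]

violation[{"msg": msg}] {
    container := input.review.object.spec.containers[_]
    image := container.image
    not startswith_allowed(image)
    msg := sprintf("Image %v from untrusted registry. Allowed: %v", [image, allowed_registries])
}

startswith_allowed(image) {
    prefix := allowed_registries[_]
    startswith(image, prefix)
}

Now when someone tries to deploy quay.io/sketchy-project/backdoor:latest, Gatekeeper stops it. Be aware that a prefix allowlist trusts everything under the prefix: ghcr.io/ admits any public image on GitHub’s registry, so narrow it to ghcr.io/your-org/ if that matters. Short image names such as nginx carry no registry prefix and are rejected too, so spell out docker.io/library/nginx.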
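Prefix matching has a few edge cases worth poking at before you roll it out. A quick Python model of the allowlist check (same three prefixes as above; image_allowed is a made-up helper name and the image names are illustrative):

```python
ALLOWED_REGISTRIES = ("docker.io/library/", "ghcr.io/", "registry.internal:5000/")


def image_allowed(image: str) -> bool:
    """True when the image reference starts with one of the allowed prefixes."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    return image.startswith(ALLOWED_REGISTRIES)


for image in (
    "docker.io/library/nginx:1.27",  # fully qualified, allowed
    "nginx:1.27",                    # short name, no registry prefix
    "ghcr.io/example-org/app:1.0",   # anything under ghcr.io/ passes
    "ghcr.io.example.net/app:1.0",   # look-alike host, blocked by the trailing slash
    "quay.io/example-org/app:1.0",   # registry not on the list
):
    print(f"{image}: {'allowed' if image_allowed(image) else 'denied'}")
```

The trailing slash on each prefix is doing real work: without it, ghcr.io would also match the look-alike hostname.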

Constraint 5: Block hostPath Volumes

HostPath volumes bypass container isolation. In a multi-tenant or security-conscious cluster, ban them:

package k8snohostpath

violation[{"msg": msg}] {
    volume := input.review.object.spec.volumes[_]
    volume.hostPath
    msg := sprintf("hostPath volume %v not allowed", [volume.name])
}

Testing Policies: The Dryrun Approach

Before enforcing a policy cluster-wide, test it in dry-run mode first. Gatekeeper can evaluate policies without blocking requests; violations just get recorded.

In your Constraint, add:

spec:
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]

With dryrun, violating resources are still admitted, and the violations are recorded in the Constraint’s status by Gatekeeper’s audit process. (A third option, warn, also admits the request but returns the message to the user as a warning.) Roll out this way for a week, review the recorded violations, tune your policy, then switch to deny, which is also the default when enforcementAction is omitted:

spec:
  enforcementAction: deny

This dryrun pattern prevents surprise outages. You’ll catch badly written policies or overly broad constraints before users hit them in production.
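The difference between the modes is easiest to see side by side. The sketch below models the three enforcementAction values Gatekeeper documents for constraints (deny, dryrun, warn); the admit function and its return shape are illustrative, not part of Gatekeeper.

```python
def admit(enforcement_action: str, violations: list[str]) -> tuple[bool, list[str]]:
    """Return (allowed, warnings shown to the user) for one request."""
    if not violations:
        return True, []
    if enforcement_action == "deny":
        return False, []          # request rejected outright
    if enforcement_action == "warn":
        return True, violations   # admitted, user sees the messages
    if enforcement_action == "dryrun":
        return True, []           # admitted silently, recorded by audit only
    raise ValueError(f"unknown enforcementAction: {enforcement_action}")


problems = ["Container web must define resource limits"]
for action in ("dryrun", "warn", "deny"):
    print(action, admit(action, problems))
```

Moving a constraint from dryrun to warn to deny is a gentle way to tighten the screws without surprising anyone.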

Understanding Rego: The Policy Language

Rego is declarative: you describe facts and rules, not steps. Here’s what you need to know:

Packages organize rules:

package k8srules

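Rules define policy logic. In Gatekeeper, a violation rule blocks a request:

violation[{"msg": msg}] {
    condition
    msg := "reason"
}

Input is the document being evaluated. In Gatekeeper, the object under review lives at input.review.object, and the Constraint’s parameters arrive as input.parameters:

input.review.object.spec.containers[_]  # iterate containers
input.review.object.metadata.labels     # access labels
input.review.object.metadata.namespace  # namespace
input.parameters                        # values from the Constraint's spec.parameters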

Iteration uses [_] to iterate over arrays:

violation[{"msg": msg}] {
    container := input.review.object.spec.containers[_]
    container.securityContext.privileged == true
    msg := "no privileged containers"
}

This loops through every container and denies if any is privileged.
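If the "loop" framing feels fuzzy, it can help to write out what the rule does as ordinary imperative code. This Python version is only an approximation of Rego’s semantics, under two assumptions spelled out in the comments: results form a set, and a missing field makes the condition fail rather than raise an error. The evaluate function and its include_name switch are made up for the illustration.

```python
def evaluate(pod: dict, include_name: bool = False) -> set[str]:
    """Collect messages the way a Rego partial-set rule would."""
    messages = set()
    for container in pod.get("spec", {}).get("containers", []):
        # A missing securityContext or privileged field is "undefined" in Rego:
        # the comparison simply fails for this container, with no error.
        privileged = (container.get("securityContext") or {}).get("privileged")
        if privileged is not True:
            continue
        if include_name:
            messages.add(f"no privileged containers: {container['name']}")
        else:
            messages.add("no privileged containers")
    return messages


pod = {"spec": {"containers": [
    {"name": "a", "securityContext": {"privileged": True}},
    {"name": "b", "securityContext": {"privileged": True}},
    {"name": "c"},
]}}
print(evaluate(pod))                     # {'no privileged containers'}
print(len(evaluate(pod, include_name=True)))  # 2
```

Because the result is a set, a constant message collapses into a single entry no matter how many containers match. Put the container name in the message if you want one entry per offender.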

Built-in functions like startswith(), contains(), sprintf(), and regex.match() let you do string operations:

violation[{"msg": msg}] {
    image := input.review.object.spec.containers[_].image
    regex.match(`^docker\.io/`, image)  # escaped dot; matches docker.io/* images
    msg := "no Docker Hub images"
}

The Rego docs (openpolicyagent.org/docs) are excellent. You don’t need to memorize everything; most policies are just iterations and condition checks.

Mutation: Assigning Defaults

Beyond enforcement, Gatekeeper can mutate resources, rewriting them to meet policy standards. For example, it can automatically add resource limits or assign namespace-based labels.

Mutations are not defined in ConstraintTemplates and Rego cannot perform mutations in Gatekeeper. Instead, Gatekeeper provides separate dedicated CRD resources for mutation:

Assign: set or overwrite a field on a resource
AssignMetadata: add or overwrite labels/annotations
ModifySet: add or remove items from a list field

An Assign that injects a default memory limit looks like this:

apiVersion: mutations.gatekeeper.sh/v1
kind: Assign
metadata:
  name: assign-default-memory-limit
spec:
  applyTo:
  - groups: [""]
    kinds: ["Pod"]
    versions: ["v1"]
  match:
    scope: Namespaced
    kinds:
    - apiGroups: ["*"]
      kinds: ["Pod"]
  location: "spec.containers[name:*].resources.limits.memory"
  parameters:
    pathTests:
    - subPath: "spec.containers[name:*].resources.limits.memory"
      condition: MustNotExist
    assign:
      value: "256Mi"

Because of the pathTests condition, this fills in the default only for containers that have no memory limit; without it, Assign would overwrite whatever value is already there. Gatekeeper applies it through a mutating admission webhook, which runs before validation and before the object is persisted.

This is powerful but also risky: a bad mutation can silently change resource configurations. Test mutations in a staging cluster and audit the results before rolling out to production.
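One cheap way to reason about a mutation before it touches a cluster is to model it on plain data and compare before and after. The Python below mimics the behaviour the text describes (fill in a memory limit only where none is set); it is not Gatekeeper’s mutation engine, and assign_default_memory_limit is a hypothetical helper.

```python
import copy


def assign_default_memory_limit(pod: dict, value: str = "256Mi") -> dict:
    """Return a copy of the pod with a memory limit added where it is missing."""
    mutated = copy.deepcopy(pod)  # never modify the caller's object
    for container in mutated.get("spec", {}).get("containers", []):
        limits = container.setdefault("resources", {}).setdefault("limits", {})
        limits.setdefault("memory", value)  # existing values are left alone
    return mutated


pod = {"spec": {"containers": [
    {"name": "app"},
    {"name": "cache", "resources": {"limits": {"memory": "1Gi"}}},
]}}
after = assign_default_memory_limit(pod)
for before_c, after_c in zip(pod["spec"]["containers"], after["spec"]["containers"]):
    changed = "changed" if before_c != after_c else "unchanged"
    print(after_c["name"], after_c["resources"]["limits"]["memory"], changed)
# app 256Mi changed
# cache 1Gi unchanged
```

The before/after comparison is the part worth keeping: in staging, the equivalent is diffing the manifest you applied against what kubectl get -o yaml returns.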

OPA Beyond Kubernetes

OPA’s real superpower is that it’s not Kubernetes-specific. You can use OPA for:

Conftest: Static policy checking for Terraform, Docker, Kubernetes manifests before they hit the cluster
REST APIs: Validate HTTP requests and responses against policies
Envoy/Istio: Enforce policies at the service mesh level
CI/CD pipelines: Block unsafe commits or configurations before they’re deployed

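This means your policy rules can be consistent across your entire infrastructure. The core Rego logic for “no privileged containers” can be shared between Gatekeeper, Conftest, and API policies, with a thin wrapper per tool because each one shapes its input differently (Gatekeeper nests the object under input.review.object, while Conftest passes the parsed file as input). One language, many contexts.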
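The reuse story comes down to keeping the rule separate from the code that digs the object out of the input. A small Python illustration of that layering, assuming Gatekeeper’s input.review.object nesting and Conftest’s habit of passing the parsed manifest as the input itself; all three function names are made up.

```python
def privileged_names(pod: dict) -> list[str]:
    """Shared logic: names of containers that ask for privileged mode."""
    return [
        c["name"]
        for c in pod.get("spec", {}).get("containers", [])
        if (c.get("securityContext") or {}).get("privileged") is True
    ]


def check_gatekeeper_input(input_doc: dict) -> list[str]:
    # Admission: the object is nested inside the review.
    return privileged_names(input_doc["review"]["object"])


def check_conftest_input(input_doc: dict) -> list[str]:
    # Static check: the manifest file is the input.
    return privileged_names(input_doc)


manifest = {
    "kind": "Pod",
    "spec": {"containers": [{"name": "debug", "securityContext": {"privileged": True}}]},
}
print(check_conftest_input(manifest))                            # ['debug']
print(check_gatekeeper_input({"review": {"object": manifest}}))  # ['debug']
```

In Rego the same split is usually a shared helper package plus a two-line entry-point rule per tool.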

Kyverno: The Friendly Cousin

If Gatekeeper feels complex, meet Kyverno, a Kubernetes-native policy engine that uses YAML instead of Rego.

A Kyverno ClusterPolicy looks like:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-requests-limits
spec:
  validationFailureAction: Audit
  rules:
  - name: check-resources
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "CPU and memory limits required."
      pattern:
        spec:
          containers:
          - resources:
              limits:
                memory: "?*"
                cpu: "?*"
              requests:
                memory: "?*"
                cpu: "?*"

This is easier to read if you’re not comfortable with Rego. The trade-off: logic that goes beyond pattern matching (cross-field conditions, nested iteration, string parsing) tends to get awkward when it has to be expressed in YAML.

OPA wins if: You need expressive policy logic, want to reuse rules across platforms, or plan to use Conftest for static checks.

Kyverno wins if: Your policies are simple, you prefer YAML, and you don’t want to learn Rego.

For a homelab, either works. For large organizations, OPA’s flexibility usually wins.

Monitoring Policy Violations

Denied requests show up in the Kubernetes API server audit log, if you have audit logging enabled. Gatekeeper’s own audit controller also periodically re-checks existing resources against your constraints; its logs are a quick way to see what it found:

kubectl logs -n gatekeeper-system deployment/gatekeeper-audit

This shows violations found by the audit controller, including those from constraints running in dryrun. To see all constraints and their status:

kubectl get constraints
kubectl describe constraint block-privileged-pods

The status section lists the violations. In a mature cluster, you might have a dashboard (Prometheus + Grafana) scraping Gatekeeper’s /metrics endpoint to alert on new violations.
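If you’d rather not eyeball kubectl describe output, the same status can be summarised with a few lines of Python. This sketch assumes you feed it the JSON from kubectl get constraints -o json and that each constraint reports status.totalViolations, as Gatekeeper’s audit does; summarize_violations and the sample payload are made up.

```python
import json


def summarize_violations(constraints_json: str) -> dict[str, int]:
    """Map 'Kind/name' to the violation count reported in each constraint's status."""
    doc = json.loads(constraints_json)
    summary = {}
    for item in doc.get("items", []):
        key = f"{item['kind']}/{item['metadata']['name']}"
        # Constraints that audit has not processed yet have no status; count as 0.
        summary[key] = item.get("status", {}).get("totalViolations", 0)
    return summary


# Hand-written sample shaped like a kubectl List response, for illustration only.
sample = json.dumps({"kind": "List", "items": [
    {"kind": "K8sNoHostPath", "metadata": {"name": "block-hostpath"},
     "status": {"totalViolations": 2}},
    {"kind": "K8sAllowedRegistries", "metadata": {"name": "trusted-registries"}},
]})
print(summarize_violations(sample))
# {'K8sNoHostPath/block-hostpath': 2, 'K8sAllowedRegistries/trusted-registries': 0}
```

Piped from kubectl in a cron job, something like this is often enough for a homelab before you bother with Prometheus.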

The Reality of Policy as Code

Here’s the honest part: policy enforcement is cultural before it’s technical. The best policy engine fails if your team doesn’t understand or agree with the policies. Before rolling out Gatekeeper:

Document your policies: Why does every pod need resource limits? Because uncontrolled pods starve the cluster. Tell your team.
Start with dry-run mode: Enforce in production only after your team has seen real violations and understands the rules.
Provide escape hatches: Exempt specific namespaces (e.g., kube-system) through the Constraint’s match block, and require a recorded owner and reason for any other override.
Iterate: Your policies will change as your cluster matures. Gatekeeper makes that easy.

The bouncer metaphor only works if everyone agrees the bouncer’s rules are fair.

Wrapping Up

OPA and Gatekeeper turn policy enforcement from a hope-for-the-best exercise into a codified guardrail. You define what’s allowed, the system enforces it, and violators get immediate feedback instead of surprises in production.

For a homelab Kubernetes cluster, starting with three constraints makes sense:

No privileged pods (security)
Resource limits required (stability)
Specific labels required (operations)

From there, add constraints as pain points emerge. Your 3 AM self will thank you.

Now go forth and let OPA be the bouncer your cluster deserves.
