What is Open Policy Agent (OPA) and OPA Gatekeeper

Posted on October 7, 2022 by Adrian Wyssmann ‐ 8 min read

I finally found time to look at our clusters and at what could replace pod security policies, which have been deprecated.

Before I go into details of Open Policy Agent and OPA Gatekeeper, let me recap some security basics in Kubernetes.

Security contexts

When managing Kubernetes, you want to enforce a certain level of security and make sure that deployed resources follow best practices. For that, you should know these elements:

  • PodSecurityContext is a collection of fields for security-relevant settings at the pod level. For example, securityContext.runAsNonRoot indicates that containers must run as a non-root user.
  • SecurityContext is the equivalent at the container level and holds the security attributes of a single container running in a pod.

The security context tells the kubelet and the container runtime how the Pod should actually be run. Some fields are present in both PodSecurityContext and SecurityContext; for those, the value set on the container takes precedence over the value set on the pod.
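As a rough illustration of how pod-level and container-level settings combine, here is a minimal Python sketch. The function name `effective_security_context` and the selection of fields in `SHARED_FIELDS` are assumptions made for this example, not part of any Kubernetes client library; the sketch only models that a container-level value overrides the pod-level one.

```python
from typing import Any, Dict

# Fields that exist on both PodSecurityContext and SecurityContext.
# This is only a subset, chosen for illustration.
SHARED_FIELDS = ("runAsUser", "runAsGroup", "runAsNonRoot")


def effective_security_context(pod_spec: Dict[str, Any], container: Dict[str, Any]) -> Dict[str, Any]:
    """Merge pod-level and container-level settings; the container value wins."""
    pod_ctx = pod_spec.get("securityContext") or {}
    container_ctx = container.get("securityContext") or {}
    merged: Dict[str, Any] = {}
    for field in SHARED_FIELDS:
        if field in container_ctx:
            merged[field] = container_ctx[field]
        elif field in pod_ctx:
            merged[field] = pod_ctx[field]
    return merged


pod_spec = {
    "securityContext": {"runAsNonRoot": True, "runAsUser": 1000},
    "containers": [
        {"name": "app"},
        {"name": "sidecar", "securityContext": {"runAsUser": 2000}},
    ],
}

for c in pod_spec["containers"]:
    print(c["name"], effective_security_context(pod_spec, c))
```

The container named app inherits both pod-level values, while sidecar overrides only runAsUser and still inherits runAsNonRoot.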

Pod Security Policies

While the security context defines how a pod runs, the pod security policy (PSP) constrains the values that may be set. So while the security context may tell the pod to run as user 0 (root), the PSP may require this value to be greater than 0. In essence, it is a resource that ensures a Pod meets certain requirements.

However, as you may know, pod security policies have been deprecated since Kubernetes v1.21:
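The relationship between "what the pod asks for" and "what the policy allows" can be sketched as a simple range check. This is a simplified model and not the PSP implementation: the function `check_run_as_user` is hypothetical, and treating an unset user ID as a violation is an assumption made to keep the example short.

```python
from typing import List, Optional, Tuple


def check_run_as_user(uid: Optional[int], allowed_ranges: List[Tuple[int, int]]) -> Optional[str]:
    """Return an error message if uid is not allowed, otherwise None."""
    if uid is None:
        # Assumption for this sketch: an unset user ID is rejected.
        return "runAsUser is not set"
    for low, high in allowed_ranges:
        if low <= uid <= high:
            return None
    return f"runAsUser {uid} is outside the allowed ranges {allowed_ranges}"


# A policy that forbids root (UID 0) by only allowing UIDs from 1 upwards.
non_root_only = [(1, 65535)]

print(check_run_as_user(0, non_root_only))     # an error message
print(check_run_as_user(1000, non_root_only))  # None, the value is allowed
```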

PodSecurityPolicy was deprecated in Kubernetes v1.21, and removed from Kubernetes in v1.25. Instead of using PodSecurityPolicy, you can enforce similar restrictions on Pods using either or both:

  • Pod Security Admission
  • a 3rd party admission plugin, that you deploy and configure yourself

For a migration guide, see Migrate from PodSecurityPolicy to the Built-In PodSecurity Admission Controller. For more information on the removal of this API, see PodSecurityPolicy Deprecation: Past, Present, and Future.

If you are not running Kubernetes v1.25, check the documentation for your version of Kubernetes.

If you want to know more about pod security policies (PSP) and the reasons why they have been deprecated, I recommend reading PodSecurityPolicy Deprecation: Past, Present, and Future. You certainly still want to have something like this in place, so there are some alternatives to consider: K-Rail, Kyverno and OPA/Gatekeeper. The latter is the one we chose, because Rancher offers an app for it. But before we go into details on how to install it, let’s have a look at some conceptual and technical details first.

Admission Controller

Kubernetes has admission controllers, which intercept requests to the Kubernetes API server before an object is persisted. Custom admission logic is implemented as HTTP callbacks (webhooks): when a Kubernetes resource is created, updated or deleted, the admission controller receives the request and may validate it, mutate it, or do both. There are two types of admission webhooks, which are executed in this order:

  1. MutatingAdmissionWebhook modifies the object(s), for example to enforce custom defaults. This means the object a user gets back may differ from what she created. The configuration for such webhooks is stored in a MutatingWebhookConfiguration.
  2. ValidatingAdmissionWebhook validates the request and then either accepts or denies it. The configuration for such webhooks is stored in a ValidatingWebhookConfiguration.

Some admission controllers are implemented and enabled by default, but the mechanism is extensible, so additional admission controllers can be added to the platform to enhance its functionality.
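The two phases can be pictured as a small pipeline: first all mutating steps run, then all validating steps check the mutated object. The following Python sketch only models that ordering. The functions `add_default_team_label`, `require_name` and `admit` are hypothetical names for this example and do not correspond to any Kubernetes API.

```python
import copy
from typing import Any, Callable, Dict, List, Optional, Tuple

Obj = Dict[str, Any]


def add_default_team_label(obj: Obj) -> Obj:
    """Mutating step (hypothetical): set a default label if it is missing."""
    labels = obj.setdefault("metadata", {}).setdefault("labels", {})
    labels.setdefault("team", "unknown")
    return obj


def require_name(obj: Obj) -> Optional[str]:
    """Validating step (hypothetical): return an error message or None."""
    if not obj.get("metadata", {}).get("name"):
        return "metadata.name is required"
    return None


def admit(
    obj: Obj,
    mutators: List[Callable[[Obj], Obj]],
    validators: List[Callable[[Obj], Optional[str]]],
) -> Tuple[bool, Obj, List[str]]:
    """Run all mutators, then validate the mutated object."""
    candidate = copy.deepcopy(obj)  # never modify the caller's object
    for mutate in mutators:  # phase 1: mutation
        candidate = mutate(candidate)
    errors: List[str] = []
    for validate in validators:  # phase 2: validation of the mutated object
        message = validate(candidate)
        if message is not None:
            errors.append(message)
    return (not errors, candidate, errors)


allowed, stored, errors = admit(
    {"kind": "Pod", "metadata": {"name": "myapp"}},
    mutators=[add_default_team_label],
    validators=[require_name],
)
print(allowed, stored["metadata"]["labels"], errors)
```

Because validation runs after mutation, a validating step always sees the defaults that a mutating step has added.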

Open Policy Agent (OPA)

Open Policy Agent (OPA, pronounced “oh-pa”) is an …

… open source, general-purpose policy engine that unifies policy enforcement across the stack. OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more

OPA decouples policy decision-making from policy enforcement. As you can see in the diagram below, OPA receives a query, makes a decision based on the policy and data, and then sends back that decision, which can be arbitrary structured data:

Figure: workflow in OPA
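From the caller's side, this decoupling means that a service only sends its input to OPA and acts on the returned decision. The sketch below builds such a query against OPA's REST Data API (POST /v1/data/<path> with the input wrapped in an "input" key). The address `http://localhost:8181` assumes a local OPA started in server mode, and the rule path `authz/allow` as well as the helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

# Assumption: OPA runs locally in server mode on its default port.
OPA_URL = "http://localhost:8181"


def build_opa_query(rule_path: str, input_doc: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a Data API request for the given rule path."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/{rule_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def query_opa(rule_path: str, input_doc: Dict[str, Any]) -> Any:
    """Send the query and return the decision found under the "result" key."""
    with urllib.request.urlopen(build_opa_query(rule_path, input_doc)) as resp:
        return json.load(resp).get("result")


# Only the request is built here, so the example runs without a live OPA server.
request = build_opa_query("authz/allow", {"user": "alice", "action": "read"})
print(request.get_method(), request.full_url)
```

The enforcement stays in the calling service: it decides what to do with the returned decision, while the policy itself lives in OPA.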

OPA can run as a single binary or, in the case of Kubernetes, as an admission controller, which receives the request from the Kubernetes API server and sends the response back to it:

Figure: workflow in OPA with Kubernetes

So, the response sent back to the Kubernetes API server may look like this:

apiVersion: admission.k8s.io/v1
kind: AdmissionReview
response:
  uid: 8d836dfd-e0c0-4490-93ba-85ed4a04261e
  allowed: false
  status:
    message: "image fails to come from trusted registry: nginx"

Outside of Kubernetes, for example when used as an AWS CloudFormation Hook, the response may look like this:

{
  "allow": false,
  "violations": ["bucket must not be public", "bucket name must follow naming standard"]
}

What are Policies?

OPA uses a declarative language called Rego (pronounced “ray-go”) to define policies. Some examples of what such a policy could describe:

  • Which users can access which resources.
  • Which subnets egress traffic is allowed to.
  • Which clusters a workload must be deployed to.
  • Which registries binaries can be downloaded from.
  • Which OS capabilities a container can execute with.
  • Which times of day the system can be accessed at.

Let’s take an example:

package kubernetes.admission                                                                  # line 1

deny[msg] {                                                                                    # line 2
    input.request.kind.kind == "Pod"                                                           # line 3
    image := input.request.object.spec.containers[_].image                                     # line 4
    not startswith(image, "wyssmann.com/")                                                     # line 5
    msg := sprintf("image '%v' comes from untrusted registry (must be wyssmann.com)", [image]) # line 6
}                                                                                              # line 7

Line 1 is the package statement:

Packages group the rules defined in one or more modules into a particular namespace. Because rules are namespaced they can be safely shared across projects.

Lines 2 to 7 are the rule. Rules are if-then logic statements that can be either “complete” or “partial”. Every rule consists of a head and a body, where the body is enclosed in {...}.

  • line 2: the head is deny[msg], a partial rule that builds up a set of msg values.
  • line 3 accesses the input document created by OPA and reads the element input.request.kind.kind to check whether it is a Pod.
  • lines 4 and 5 check whether each container image comes from a trusted registry.
  • line 6 adds an error message to the deny set if the image does not come from wyssmann.com/.

OPA binds the data from the query to the global variable input, which can be used to access data in the policy using dot notation, as we have seen in line 3.

For more about rules, you might also check Rego Language and function versus rules.

How does it work?

Let’s assume we use OPA with Kubernetes and a user wants to create the following pod:

kind: Pod
apiVersion: v1
metadata:
  name: myapp
spec:
  containers:
  - image: nginx
    name: nginx-frontend
  - image: mysql
    name: mysql-backend

OPA will receive an AdmissionReview request which looks as follows:

{
  "kind": "AdmissionReview",
  "request": {
    "kind": {
      "kind": "Pod",
      "version": "v1"
    },
    "object": {
      "metadata": {
        "name": "myapp"
      },
      "spec": {
        "containers": [
          {
            "image": "nginx",
            "name": "nginx-frontend"
          },
          {
            "image": "mysql",
            "name": "mysql-backend"
          }
        ]
      }
    }
  }
}

OPA will create the input document and then evaluate all policies. With the policy from above, this results in the following answer:

{
  "deny": [
    "image 'mysql' comes from untrusted registry (must be wyssmann.com)",
    "image 'nginx' comes from untrusted registry (must be wyssmann.com)"
  ]
}
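To make the evaluation easier to follow, here is the same rule logic re-implemented in plain Python. This is only an emulation for illustration and not how OPA evaluates Rego internally; the function name `deny` mirrors the rule name, and sorting the result is a choice made here to get a stable output.

```python
from typing import Any, Dict, List

TRUSTED_PREFIX = "wyssmann.com/"


def deny(input_doc: Dict[str, Any]) -> List[str]:
    """Emulate the deny rule: one message per container with an untrusted image."""
    request = input_doc.get("request", {})
    # Line 3 of the policy: the rule only applies to Pods.
    if request.get("kind", {}).get("kind") != "Pod":
        return []
    containers = request.get("object", {}).get("spec", {}).get("containers", [])
    # Lines 4 to 6: collect a message for every image outside the trusted registry.
    # A set is used because a partial rule like deny[msg] produces a set.
    messages = {
        f"image '{c['image']}' comes from untrusted registry (must be wyssmann.com)"
        for c in containers
        if "image" in c and not c["image"].startswith(TRUSTED_PREFIX)
    }
    return sorted(messages)


admission_review = {
    "kind": "AdmissionReview",
    "request": {
        "kind": {"kind": "Pod", "version": "v1"},
        "object": {
            "metadata": {"name": "myapp"},
            "spec": {
                "containers": [
                    {"image": "nginx", "name": "nginx-frontend"},
                    {"image": "mysql", "name": "mysql-backend"},
                ]
            },
        },
    },
}

for message in deny(admission_review):
    print(message)
```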

OPA ultimately has to generate an admission review response in the format Kubernetes expects. This response is sent back to the API server and looks like this:

{
  "kind": "AdmissionReview",
  "apiVersion": "admission.k8s.io/v1",
  "response": {
    "allowed": false,
    "status": {
      "message": "image 'mysql' comes from untrusted registry (must be wyssmann.com)"
    }
  }
}
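The step from the set of deny messages to such a response can be sketched as follows. The helper `to_admission_review` is hypothetical, and joining several messages with "; " is an assumption made for this example. A real response also echoes the uid of the incoming request, which is why the sketch takes it as a parameter.

```python
from typing import Any, Dict, List


def to_admission_review(request_uid: str, deny_messages: List[str]) -> Dict[str, Any]:
    """Wrap the policy decision into an AdmissionReview response."""
    response: Dict[str, Any] = {
        "uid": request_uid,  # must match the uid of the request
        "allowed": not deny_messages,  # allowed only if nothing was denied
    }
    if deny_messages:
        # Assumption: all messages are joined into a single status message.
        response["status"] = {"message": "; ".join(deny_messages)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


review = to_admission_review(
    "8d836dfd-e0c0-4490-93ba-85ed4a04261e",
    ["image 'mysql' comes from untrusted registry (must be wyssmann.com)"],
)
print(review["response"]["allowed"], review["response"]["status"]["message"])
```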

OPA Gatekeeper

While OPA can work with Kubernetes directly, it is recommended to use OPA Gatekeeper, the policy engine for cloud native environments hosted by the CNCF. It enhances OPA as follows:

  • An extensible, parameterized policy library
  • Native Kubernetes CRDs for instantiating the policy library (aka “constraints”)
  • Native Kubernetes CRDs for extending the policy library (aka “constraint templates”)
  • Audit functionality, so that admins can see which resources are currently violating any given policy.
  • External data support

Image (c) https://www.openpolicyagent.org/docs/latest/kubernetes-introduction/

How to use it?

Two kinds of custom resources are required:

  • A Constraint Template describes the constraint in Rego. For example, to enforce that container images only come from approved repositories:
apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8sallowedrepos
spec:
  crd:
    spec:
      names:
        kind: K8sAllowedRepos
      validation:
        legacySchema: true
        openAPIV3Schema:
          properties:
            repos:
              items:
                type: string
              type: array
  targets:
  - rego: |
      package k8sallowedrepos

      violation[{"msg": msg}] {
        container := input.review.object.spec.containers[_]
        satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
        not any(satisfied)
        msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
      }

      violation[{"msg": msg}] {
        container := input.review.object.spec.initContainers[_]
        satisfied := [good | repo = input.parameters.repos[_] ; good = startswith(container.image, repo)]
        not any(satisfied)
        msg := sprintf("container <%v> has an invalid image repo <%v>, allowed repos are %v", [container.name, container.image, input.parameters.repos])
      }
    target: admission.k8s.gatekeeper.sh
  • A Constraint defines how and on which objects a Constraint Template is enforced. For example, container images have to come from the repo wyssmann.com/:
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sAllowedRepos
metadata:
  name: repo-is-wysssmann-com
spec:
  enforcementAction: warn
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
  parameters:
    repos:
      - "wyssmann.com/"
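The interplay between template and constraint may be easier to see in plain Python: the template provides the logic, the constraint provides the parameters. The following is an emulation of the two violation rules for illustration only, not Gatekeeper code. The function name `violations` is hypothetical, and skipping containers without an image field is a simplification made in this sketch.

```python
from typing import Any, Dict, List


def violations(review_object: Dict[str, Any], parameters: Dict[str, Any]) -> List[Dict[str, str]]:
    """Emulate the template: check containers and initContainers against repos."""
    repos = parameters.get("repos", [])
    spec = review_object.get("spec", {})
    found: List[Dict[str, str]] = []
    # The template has one rule for containers and one for initContainers.
    for field in ("containers", "initContainers"):
        for container in spec.get(field, []):
            if "image" not in container:
                continue  # simplification: nothing to check without an image
            image = container["image"]
            # The image is fine if it starts with at least one allowed repo.
            if not any(image.startswith(repo) for repo in repos):
                found.append({
                    "msg": f"container <{container.get('name')}> has an invalid image repo "
                           f"<{image}>, allowed repos are {repos}"
                })
    return found


pod = {
    "spec": {
        "containers": [
            {"name": "frontend", "image": "nginx"},
            {"name": "backend", "image": "wyssmann.com/backend:1.0"},
        ],
        "initContainers": [{"name": "init", "image": "busybox"}],
    }
}

# The parameters correspond to spec.parameters of the Constraint.
for v in violations(pod, {"repos": ["wyssmann.com/"]}):
    print(v["msg"])
```

Changing the allowed repositories only requires a different Constraint, while the template stays untouched.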

By explicitly adding the enforcementAction field, one defines the action for handling Constraint violations. The available actions are:

  • deny (default): deny the admission request.
  • dryrun: test constraints without enforcing them; requests are admitted and violations only show up in the audit results.
  • warn: same as dryrun, but with immediate feedback on why the request would have been denied.
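How the three actions differ for an incoming request can be summarized in a few lines of Python. This is a simplified model of the observable behaviour, not Gatekeeper's implementation; the function `apply_enforcement_action` is hypothetical, and the action names are written in lowercase as they are set in the enforcementAction field.

```python
from typing import Any, Dict, List


def apply_enforcement_action(action: str, messages: List[str]) -> Dict[str, Any]:
    """Map violation messages to an admission outcome for one constraint."""
    if not messages:
        return {"allowed": True}
    if action == "deny":
        # The request is rejected and the user sees the reason.
        return {"allowed": False, "status": {"message": "; ".join(messages)}}
    if action == "warn":
        # The request is admitted, but the user gets the messages as warnings.
        return {"allowed": True, "warnings": list(messages)}
    if action == "dryrun":
        # The request is admitted silently; violations are only visible via audit.
        return {"allowed": True}
    raise ValueError(f"unknown enforcementAction: {action}")


msgs = ["container <nginx-frontend> has an invalid image repo <nginx>"]
for action in ("deny", "warn", "dryrun"):
    print(action, apply_enforcement_action(action, msgs))
```

Starting with warn or dryrun on an existing cluster shows what would be rejected before any request is actually denied.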

For more examples of Constraint Templates and sample Constraints, have a look at the Gatekeeper Policy Library. There are a lot of configuration options as well as some hints for debugging. You may also download gator, the CLI to evaluate Constraint Templates and Constraints.

Conclusion

OPA seems very powerful when it comes to enforcing security policies, and it is luckily not limited to pods only, as was the case for PSP. However, it will take me some time to fully understand Rego and to be able to write my own policies. Luckily, there is the Gatekeeper Policy Library, which already offers quite a comprehensive collection of useful templates. Finally, it will be interesting to actually install and enable OPA Gatekeeper in an existing cluster without breaking existing apps by denying their admission requests.