Overview and comparison of possible alternatives to pod security policies

Posted in cloud on September 29, 2022 by Adrian Wyssmann ‐ 7 min read

Why

Currently we have pod security policies (PSPs) in place, assigned on a per-project basis.

PSPs have been deprecated since Kubernetes v1.21:

PodSecurityPolicy was deprecated in Kubernetes v1.21, and removed from Kubernetes in v1.25. Instead of using PodSecurityPolicy, you can enforce similar restrictions on Pods using either or both:
Pod Security Admission
a 3rd party admission plugin, that you deploy and configure yourself
For a migration guide, see Migrate from PodSecurityPolicy to the Built-In PodSecurity Admission Controller. For more information on the removal of this API, see PodSecurityPolicy Deprecation: Past, Present, and Future
If you are not running Kubernetes v1.25, check the documentation for your version of Kubernetes.

If you want to know more about pod security policies (PSPs) and the reasons why they have been deprecated, I recommend reading PodSecurityPolicy Deprecation: Past, Present, and Future. You certainly still want to have something like this in place, so there are some alternatives to consider: K-Rail, Kyverno and OPA/Gatekeeper.

Admission Controller

Kubernetes has an admission controller, which intercepts requests to the Kubernetes API server. This is implemented as HTTP callbacks, so when a Kubernetes resource is to be created, updated or deleted, the admission controller receives the request and may then validate it, mutate it, or do both. There are therefore two types of admission controllers, which are executed in a fixed order (mutating first, then validating):

MutatingAdmissionWebhook will modify the object(s) to enforce custom defaults. This means the object a user gets back may differ from what she created. The configuration for such webhooks is stored in a MutatingWebhookConfiguration.
ValidatingAdmissionWebhook will validate the request and then either accept or deny the creation of the object. The configuration for such webhooks is stored in a ValidatingWebhookConfiguration.

Some admission controllers are implemented and enabled by default, but the mechanism is implemented in an extensible way, so additional admission controllers can be added to the platform to enhance its functionality.
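
To make the two webhook types more concrete, here is a minimal Python sketch of what such a callback might compute. It only builds the AdmissionReview response bodies and leaves out the HTTPS server a real webhook needs. The function names, the `team` label and its default value are assumptions made up for illustration; the response fields (`uid`, `allowed`, `patchType`, `patch`, `status`) follow the `admission.k8s.io/v1` API as I understand it.

```python
import base64
import json


def _wrap(response: dict) -> dict:
    """Wrap a response in the AdmissionReview envelope the API server expects."""
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


def mutate(review: dict) -> dict:
    """Mutating step: default a hypothetical `team` label via a JSON Patch."""
    request = review["request"]
    labels = request["object"].get("metadata", {}).get("labels") or {}
    response = {"uid": request["uid"], "allowed": True}
    if "team" not in labels:
        if labels:
            patch = [{"op": "add", "path": "/metadata/labels/team", "value": "unknown"}]
        else:
            # No labels yet: create the whole labels map.
            patch = [{"op": "add", "path": "/metadata/labels", "value": {"team": "unknown"}}]
        response["patchType"] = "JSONPatch"
        # The patch is sent as base64-encoded JSON.
        response["patch"] = base64.b64encode(json.dumps(patch).encode()).decode()
    return _wrap(response)


def validate(review: dict) -> dict:
    """Validating step: deny Pods that contain a privileged container."""
    request = review["request"]
    containers = request["object"].get("spec", {}).get("containers", [])
    offenders = [
        c["name"]
        for c in containers
        if (c.get("securityContext") or {}).get("privileged")
    ]
    response = {"uid": request["uid"], "allowed": not offenders}
    if offenders:
        response["status"] = {
            "code": 403,
            "message": "privileged containers are not allowed: " + ", ".join(offenders),
        }
    return _wrap(response)
```

Because mutation runs before validation, the validating step always sees the object after defaults have been applied.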

K-Rail

K-Rail is a Kubernetes security tool for policy enforcement and offers the following features:

Passive report-only mode of running policies
Structured violation data logged, ready for analysis and dashboards
Flexible and powerful policy exemptions by cluster, resource name, namespace, groups, and users
Realtime interactive feedback for engineers and systems that apply resources

It is based on a dynamic admission controller in a Kubernetes cluster, where it receives validating and mutating admission webhook HTTP callbacks from the kube-apiserver and applies matching policies to return results that enforce admission policies or reject requests.

Policies are built into K-Rail, although additional policies can be defined. The built-in policies are:

No ShareProcessNamespace - prevents use of the shareProcessNamespace Pod spec directive, which puts all containers in a Pod within the same PID namespace
No Exec - prevents users from execing into running pods unless they have an exemption
No Bind Mounts - disable hostPath mounts
No Docker Sock Mount - disable docker socket bind mount
No Root User - forbids running as the root user
EmptyDir size limit - enforces a limit on emptyDir
Mutate Default Seccomp Profile - Sets a default seccomp profile
Immutable Image Reference - Sets the docker image tags in a registry to immutable
No Host Network - disables the use of host network
No Host PID - disables the access of PIDs from host
No New Capabilities - disallows adding kernel capabilities
No Privileged Container - ensures the container runs unprivileged
No Helm Tiller - blocks images with tiller in their name from being deployed
Trusted Image Repository - ensure images come only from known registries
Mutate Safe to Evict - mutates Pods that do not have the annotation cluster-autoscaler.kubernetes.io/safe-to-evict=true specified
Mutate Image Pull Policy - modifies ImagePullPolicy to a specific state (regex based)
Require Ingress Exemption - requires the configured ingress classes to have a policy exemption to be used
Unique Ingress Host - requires the configured ingress hosts to be unique across cluster namespaces
Service type LoadBalancer annotation check - validates the annotations put on a service and will reject services defined with annotations outside the acceptable range.
Istio VirtualService Gateways check - validates the gateways listed on an Istio virtual service and will reject virtual services defined with gateways outside the acceptable range
No Persistent Volume Host Path - prevents direct access to potentially sensitive files or directories at the Node-level via Persistent Volumes.
No Anonymous Cluster Role Binding - prevents the creation of cluster level role bindings that authorize unauthenticated or anonymous users to access resources
No Anonymous Role Binding - prevents the creation of namespace level role bindings that authorize unauthenticated or anonymous users to access resources
Invalid Pod Disruption Budget - Prevent misconfigured pod disruption budgets from disrupting normal system maintenance such as node drains.
No External IP on Service - Prevents providing External IPs on a Service to mitigate CVE-2020-8554.
Deny Unconfined AppArmor Policies - Prevents users from specifying an unconfined AppArmor policy, which can be combined with other conditions to lead to container escape.
Protect CRD From Accidental Deletion - allows the user to set the annotation k-rail.crd.protect: enabled on any CRD which will prevent its deletion if any children CRs exist
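
To illustrate how report-only mode and exemptions interact with policies like the ones above, here is a small Python sketch. It is not K-Rail's code (K-Rail itself is written in Go), and all names in it (`Policy`, `evaluate`, the two check functions) are hypothetical. It only models the idea: each policy inspects a Pod spec, exempt namespaces are skipped, and a report-only policy logs a violation without blocking the request.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


def no_host_network(pod_spec: dict) -> Optional[str]:
    """Return a violation message if the Pod uses the host network."""
    return "hostNetwork is not allowed" if pod_spec.get("hostNetwork") else None


def no_host_pid(pod_spec: dict) -> Optional[str]:
    """Return a violation message if the Pod shares the host PID namespace."""
    return "hostPID is not allowed" if pod_spec.get("hostPID") else None


@dataclass
class Policy:
    name: str
    check: Callable[[dict], Optional[str]]
    report_only: bool = False  # log violations but never reject
    exempt_namespaces: List[str] = field(default_factory=list)


def evaluate(policies: List[Policy], namespace: str, pod_spec: dict) -> Tuple[bool, List[dict]]:
    """Return (allowed, violations) for one Pod spec in one namespace."""
    allowed = True
    violations = []
    for policy in policies:
        if namespace in policy.exempt_namespaces:
            continue  # exemption: the policy does not apply here
        message = policy.check(pod_spec)
        if message is None:
            continue
        # Structured violation record, suitable for logging or dashboards.
        violations.append({
            "policy": policy.name,
            "namespace": namespace,
            "message": message,
            "enforced": not policy.report_only,
        })
        if not policy.report_only:
            allowed = False
    return allowed, violations
```

With `no_host_network` enforced but exempted for `kube-system`, and `no_host_pid` in report-only mode, a Pod in `kube-system` that sets both fields would be admitted with a single, non-enforced violation on record.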

What speaks against this solution?

Policies are built-in
As of today (19 Oct 2022), the last change happened 12 months ago

Kyverno

Kyverno is a policy engine designed for Kubernetes. It can validate, mutate, and generate configurations using admission controls and background scans.

It offers the following features:

policies as Kubernetes resources (no new language to learn!)
validate, mutate, or generate any resource
verify container images for software supply chain security
inspect image metadata
match resources using label selectors and wildcards
validate and mutate using overlays (like Kustomize!)
synchronize configurations across Namespaces
block non-conformant resources using admission controls, or report policy violations
test policies and validate resources using the Kyverno CLI, in your CI/CD pipeline, before applying to your cluster
manage policies as code using familiar tools like git and kustomize

TODO: IMAGE

A Kyverno Policy is a collection of rules. Each rule has a match declaration, an optional exclude declaration, and one child declaration: validate, mutate, generate, or verifyImages. Policies can be cluster-wide (ClusterPolicy) or namespaced (Policy).
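
As an illustration of that structure, the following Python sketch assembles a ClusterPolicy with a single validate rule and prints it as JSON, which kubectl accepts just like YAML. The policy and rule names are examples I chose, and the field layout reflects my understanding of the `kyverno.io/v1` API, so check it against the documentation of the Kyverno version you run. In particular, the spelling of the `validationFailureAction` value has varied between releases.

```python
import json


def privileged_pattern() -> dict:
    """Pattern for one container entry.

    "=(key)" is Kyverno's equality anchor: the value is only checked
    if the key exists, so containers without a securityContext pass.
    """
    return {"=(securityContext)": {"=(privileged)": "false"}}


policy = {
    "apiVersion": "kyverno.io/v1",
    "kind": "ClusterPolicy",  # cluster-wide; use "Policy" for a namespaced one
    "metadata": {"name": "disallow-privileged-containers"},
    "spec": {
        # Assumption: "Audit" only reports; older releases spell it "audit".
        "validationFailureAction": "Audit",
        "background": True,
        "rules": [
            {
                "name": "privileged-containers",
                # match declaration: which resources the rule applies to
                "match": {"any": [{"resources": {"kinds": ["Pod"]}}]},
                # child declaration: exactly one of validate, mutate,
                # generate, or verifyImages
                "validate": {
                    "message": "Privileged mode is disallowed.",
                    "pattern": {"spec": {"containers": [privileged_pattern()]}},
                },
            }
        ],
    },
}

if __name__ == "__main__":
    print(json.dumps(policy, indent=2))
```

Keeping the policy as plain data is what makes the "policies as Kubernetes resources" point work: it can live in git and be applied with the same tooling as any other manifest.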

What speaks for Kyverno?

Very active project
Separation of engine and policies

Open Policy Agent / OPA Gatekeeper

Open Policy Agent (OPA, pronounced “oh-pa”) is an …

… open source, general-purpose policy engine that unifies policy enforcement across the stack. OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more

OPA decouples policy decision-making from policy enforcement. As you can see in the diagram below, OPA receives a query, makes a decision based on the policy and data, and then sends back that decision, which can be arbitrary structured data:

TODO: IMAGE
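
The split between decision and enforcement can be sketched in a few lines of Python. This is not OPA and not Rego. It is a hypothetical stand-in in which `decide` plays the role of the policy engine and `deploy` plays the role of the service asking for a decision. The registry names and the shape of the decision are assumptions for illustration.

```python
def decide(query_input: dict, data: dict) -> dict:
    """Policy decision: combine the query input with external data.

    Returns structured data rather than a bare boolean, mirroring the
    idea that a decision can be an arbitrary structured document.
    """
    image = query_input["image"]
    # Simplification: treat everything before the first "/" as the registry.
    registry = image.split("/", 1)[0]
    allowed = registry in data["trusted_registries"]
    reason = "registry is trusted" if allowed else f"registry {registry!r} is not trusted"
    return {"allow": allowed, "registry": registry, "reason": reason}


def deploy(image: str, data: dict) -> str:
    """Policy enforcement: the caller acts on the decision it gets back."""
    decision = decide({"image": image}, data)
    if not decision["allow"]:
        raise PermissionError(decision["reason"])
    return f"deployed {image}"


# Hypothetical data document supplied alongside the policy.
DATA = {"trusted_registries": ["registry.example.com"]}
```

Changing which registries are trusted only touches the data, and changing the rule only touches `decide`. The enforcing code in `deploy` stays the same in both cases.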

OPA uses a declarative language called Rego (pronounced “ray-go”) to define security policies. Some examples of what such a policy could describe:

Which users can access which resources.
Which subnets egress traffic is allowed to.
Which clusters a workload must be deployed to.
Which registries binaries can be downloaded from.
Which OS capabilities a container can execute with.
Which times of day the system can be accessed at.

While OPA can work with Kubernetes, it is recommended to use OPA Gatekeeper, the policy engine for Cloud Native environments hosted by CNCF. It enhances OPA as follows:

An extensible, parameterized policy library
Native Kubernetes CRDs for instantiating the policy library (aka “constraints”)
Native Kubernetes CRDs for extending the policy library (aka “constraint templates”)
Audit functionality, so that admins can see what resources are currently violating any given policy.
External data support
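
To show how constraint templates and constraints relate, here is a Python sketch that builds both objects as plain dictionaries, modelled on the required-labels example from the Gatekeeper documentation as I remember it. The API versions, the object names and the embedded Rego are assumptions to verify against the Gatekeeper release you deploy.

```python
import json

# Rego carried inside the template. It reports a violation when a
# required label is missing from the reviewed object.
REGO = """
package k8srequiredlabels

violation[{"msg": msg, "details": {"missing_labels": missing}}] {
  provided := {label | input.review.object.metadata.labels[label]}
  required := {label | label := input.parameters.labels[_]}
  missing := required - provided
  count(missing) > 0
  msg := sprintf("you must provide labels: %v", [missing])
}
"""

# The constraint template extends the policy library: it defines a new
# constraint kind and the schema of its parameters.
constraint_template = {
    "apiVersion": "templates.gatekeeper.sh/v1",
    "kind": "ConstraintTemplate",
    "metadata": {"name": "k8srequiredlabels"},
    "spec": {
        "crd": {
            "spec": {
                "names": {"kind": "K8sRequiredLabels"},
                "validation": {
                    "openAPIV3Schema": {
                        "type": "object",
                        "properties": {
                            "labels": {"type": "array", "items": {"type": "string"}}
                        },
                    }
                },
            }
        },
        "targets": [{"target": "admission.k8s.gatekeeper.sh", "rego": REGO}],
    },
}

# The constraint instantiates the template: where it applies and with
# which parameters.
constraint = {
    "apiVersion": "constraints.gatekeeper.sh/v1beta1",
    "kind": constraint_template["spec"]["crd"]["spec"]["names"]["kind"],
    "metadata": {"name": "ns-must-have-gk"},
    "spec": {
        "match": {"kinds": [{"apiGroups": [""], "kinds": ["Namespace"]}]},
        "parameters": {"labels": ["gatekeeper"]},
    },
}

if __name__ == "__main__":
    print(json.dumps([constraint_template, constraint], indent=2))
```

The same template can back many constraints, for example one per team with a different list of required labels, which is what makes the library parameterized.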

TODO: IMAGE

What speaks for Gatekeeper?

Very active project
Separation of engine and policies
There is an app for Rancher and it is the default for AKS clusters

Where to go from here?

Because Gatekeeper is the default for AKS clusters and is available in Rancher, we decided to go with Gatekeeper. The solutions presented here may not be everything that exists, but they are the ones I had a look at. Also the decisions
