Community post originally published on Medium by Mathieu Benoit

Stable since Kubernetes 1.25 (and beta since 1.23), the Pod Security Admission (PSA) controller replaces PodSecurityPolicy (PSP), making it easier to enforce predefined Pod Security Standards (PSS) by simply adding a label to a namespace.

Pod Security admission places requirements on a Pod’s Security Context and other related fields according to the three levels defined by the Pod Security Standards: privileged, baseline, and restricted.

Why do you need this?

If you don’t restrict the default privileged capabilities of your containers running on Kubernetes, here are some examples I captured showing how an attacker could hack your containers, your clusters, and ultimately your Cloud services.

Let’s see how you can easily protect your containers and your clusters!

In this blog post, we will see this out-of-the-box PSA/PSS feature in action to improve your Kubernetes security posture. We will also cover how you can guarantee that every Namespace has PSS enforced, via admission controllers like Validating Admission Policies or Kyverno. Finally, we will look at a managed offering, GKE Autopilot, which applies the Baseline policy with some modifications for usability.

Let’s start with an insecure container deployed in a dedicated namespace:

kubectl create ns psa-tests
kubectl create deploy nginx \
    --image nginx:latest \
    -n psa-tests

Enforce PSS on a Namespace

Enforce the restricted Pod Security Standard for this namespace:

kubectl label ns psa-tests pod-security.kubernetes.io/enforce=restricted

You will get these warnings:

Warning: existing pods in namespace "psa-tests" violate the new PodSecurity enforce level "restricted:latest"
Warning: nginx-7bf8c77b5b-wpd5g: allowPrivilegeEscalation != false, unrestricted capabilities, runAsNonRoot != true, seccompProfile
namespace/psa-tests labeled
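Note that `enforce` is only one of the modes the Pod Security Admission controller supports: `audit` records violations in the audit log without blocking, `warn` returns user-facing warnings without blocking, and each mode can optionally pin a policy version. The same configuration can also be declared on the Namespace manifest itself; here is an illustrative example (the combination of modes is a suggestion, not from the original post):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: psa-tests
  labels:
    # Reject Pods that violate the restricted policy
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    # Record violations in the audit log, without blocking
    pod-security.kubernetes.io/audit: restricted
    # Return warnings to the user, without blocking
    pod-security.kubernetes.io/warn: restricted
```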

If you try to deploy a new insecure container, you will get this warning:

Warning: would violate PodSecurity "restricted:latest": allowPrivilegeEscalation != false (container "nginx" must set securityContext.allowPrivilegeEscalation=false), unrestricted capabilities (container "nginx" must set securityContext.capabilities.drop=["ALL"]), runAsNonRoot != true (pod or container "nginx" must set securityContext.runAsNonRoot=true), seccompProfile (pod or container "nginx" must set securityContext.seccompProfile.type to "RuntimeDefault" or "Localhost")
deployment.apps/nginx created

We can see that the Deployment is created, but the Pod won’t be created and raises this error:

'pods "nginx-7b9554bccd-zgjqb" is forbidden: violates PodSecurity "restricted:latest":
      allowPrivilegeEscalation != false (container "nginx" must set securityContext.allowPrivilegeEscalation=false),
      unrestricted capabilities (container "nginx" must set securityContext.capabilities.drop=["ALL"]),
      runAsNonRoot != true (pod or container "nginx" must set securityContext.runAsNonRoot=true),
      seccompProfile (pod or container "nginx" must set securityContext.seccompProfile.type
      to "RuntimeDefault" or "Localhost")'

Only highly secure containers can be deployed. Here is an example of a secure Deployment:

cat << EOF | kubectl apply -n psa-tests -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      automountServiceAccountToken: false
      securityContext:
        fsGroup: 65532
        runAsGroup: 65532
        runAsNonRoot: true
        runAsUser: 65532
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: nginx
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop:
                - ALL
            privileged: false
            readOnlyRootFilesystem: true
          image: nginxinc/nginx-unprivileged:alpine-slim
          ports:
            - containerPort: 8080
          volumeMounts:
          - mountPath: /tmp
            name: tmp
      volumes:
      - emptyDir: {}
        name: tmp
EOF

This is for a secure nginx container. You can find another example with a .NET app in Sail Sharp, 9 tips to optimize and secure your .NET containers for Kubernetes. The same approach applies to any container, all your containers.

That’s it!

Just one label on your Namespaces and you just improved your security posture with an out-of-the-box feature in Kubernetes!

Awesome, isn’t it?! 🤩

Now, how do you ensure that every Namespace has this label? That’s what we will discuss in the next two sections, with Validating Admission Policies and Kyverno.

Enforcement of PSS with Validating Admission Policies

Validating Admission Policies (still in beta in Kubernetes 1.28) are an easy way to ensure that every Namespace has this label.

Here is the policy that checks that the label is set on Namespaces:

cat << EOF | kubectl apply -f -
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingAdmissionPolicy
metadata:
  name: pss-label-enforcement
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups:   [""]
      apiVersions: ["v1"]
      operations:  ["CREATE", "UPDATE"]
      resources:   ["namespaces"]
  validations:
  - expression: "'pod-security.kubernetes.io/enforce' in object.metadata.labels"
    message: "pod-security.kubernetes.io/enforce label not existing"
  - expression: "object.metadata.labels['pod-security.kubernetes.io/enforce'] in ['restricted']"
    message: "pod-security.kubernetes.io/enforce label not one of the required values: restricted"
EOF
cat << EOF | kubectl apply -f -
apiVersion: admissionregistration.k8s.io/v1beta1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: pss-label-enforcement
spec:
  policyName: pss-label-enforcement
  validationActions:
    - Deny
  matchResources:
    namespaceSelector:
      matchExpressions:
      - key: kubernetes.io/metadata.name
        operator: NotIn
        values:
        - kube-node-lease
        - kube-public
        - kube-system
EOF
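To make the two CEL expressions above concrete, here is a minimal Python sketch of the same validation logic. It is purely illustrative (the function name is mine, and it is not part of the cluster setup): the policy first requires the label key to exist, then requires its value to be in the allowed list.

```python
# Illustrative Python equivalent of the two CEL validations above.
REQUIRED_KEY = "pod-security.kubernetes.io/enforce"
ALLOWED_VALUES = ["restricted"]

def validate_namespace_labels(labels):
    """Return the list of validation messages; an empty list means the Namespace passes."""
    messages = []
    if REQUIRED_KEY not in labels:
        # Mirrors: "'pod-security.kubernetes.io/enforce' in object.metadata.labels"
        messages.append(f"{REQUIRED_KEY} label not existing")
    elif labels[REQUIRED_KEY] not in ALLOWED_VALUES:
        # Mirrors: "object.metadata.labels['...enforce'] in ['restricted']"
        messages.append(f"{REQUIRED_KEY} label not one of the required values: restricted")
    return messages
```

For example, a Namespace labeled with `enforce: baseline` would be rejected, while `enforce: restricted` passes.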

Enforcement of PSS with Kyverno

An admission controller like Kyverno allows you to enforce that every Namespace has the PSS label. You can use either the validate method or the mutate method.

Here is the policy for the validate method, checking that the label is set on Namespaces:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: check-namespaces-with-pss-restricted
spec:
  validationFailureAction: Enforce
  background: true
  rules:
  - name: check-namespace-labels
    match:
      any:
      - resources:
          kinds:
            - Namespace
    exclude:
      any:
      - resources:
          namespaces:
          - kube-node-lease
          - kube-public
          - kube-system
    validate:
      message: This Namespace is missing a PSS restricted label.
      pattern:
        metadata:
          labels:
            pod-security.kubernetes.io/enforce: restricted
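With this ClusterPolicy applied, you can quickly verify the behavior (illustrative commands, assuming a cluster with Kyverno installed; the Namespace name is mine): the first request should be denied with the message above, the second should succeed.

```shell
# Should be denied: Namespace without the PSS label
kubectl create ns kyverno-test

# Should succeed: Namespace created with the required label
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Namespace
metadata:
  name: kyverno-test
  labels:
    pod-security.kubernetes.io/enforce: restricted
EOF
```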

And here is the policy for the mutate method, which automatically adds the label to Namespaces:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-restricted-existing-namespaces
spec:
  mutateExistingOnPolicyUpdate: true
  rules:
  - name: label-restricted-namespaces
    match:
      any:
      - resources:
          kinds:
          - Namespace
    mutate:
      targets:
        - apiVersion: v1
          kind: Namespace
      patchStrategicMerge:
        metadata:
          <(name): "!kube-*"
          labels:
            pod-security.kubernetes.io/enforce: restricted

Note: the same approaches can be implemented with OPA Gatekeeper as well.

Kyverno goes one step further here. Imagine you want to apply the restricted policies, except for one specific container which needs to bypass one specific security control. That level of granularity is not possible with the default PSS labels. Kyverno makes it possible with a dedicated podSecurity validation. Below is an example of a policy which enforces the restricted profile, except for the Seccomp check, and only on containers which use the cadvisor image:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: psa
spec:
  background: true
  validationFailureAction: Enforce
  rules:
  - name: restricted
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      podSecurity:
        level: restricted
        exclude:
        - controlName: Seccomp
          images:
          - cadvisor*

Baseline policies applied by default with GKE Autopilot

Autopilot applies the Baseline policy with some modifications for usability. Autopilot also applies many constraints from the Restricted policy, but avoids restrictions that would block a majority of your workloads from running. Autopilot applies these constraints at the cluster level using an admission controller that Google controls. If you need to apply additional restrictions to comply with the full Restricted policy, you can optionally use the PodSecurity admission controller in specific namespaces.

You can find out more details about how GKE Autopilot uses these policies and constraints: GKE Autopilot security capabilities | Google Kubernetes Engine (GKE) | Google Cloud.

You can learn, for example, that if you need NET_ADMIN, which service meshes such as Istio or Anthos Service Mesh (ASM) require, you can specify --workload-policies=allow-net-admin in your cluster creation command.
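As an illustration (the cluster name and region below are placeholders, and flag availability depends on your GKE version, so check the GKE documentation), the Autopilot cluster creation command could look like this:

```shell
# Create a GKE Autopilot cluster that allows NET_ADMIN for workloads
gcloud container clusters create-auto my-autopilot-cluster \
    --region us-central1 \
    --workload-policies=allow-net-admin
```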

That’s a wrap!

We saw how just one label on your Namespaces can radically improve the security posture of your containers and your clusters. Just do it!

Kubernetes recommends two groups of policies to enforce, either restricted or baseline, depending on your needs and requirements.

We also saw how Validating Admission Policies or Kyverno can help you ensure that every Namespace has the required label and value.

Hope you enjoyed this one, stay safe out there, and happy sailing!