Community post originally posted on Neon Mirrors by Chip Zoller

It seems just about everyone is doing GitOps in Kubernetes these days. With so many tools available, and with how mature they have become, it’s hard to avoid. But because a single tool is responsible for actually creating in the cluster the resources stored in git, it becomes difficult or impossible to answer the question, “who is the author of this thing?” In this post, I’ll show one nifty method for getting more mileage out of Kyverno by using its CLI to help you answer this question in an automated fashion.

One of the inevitabilities of Kubernetes, or indeed any IT system, is that once it proves successful, more people within an organization begin to adopt it. More people means more hands in the pie, as it were, and being able to identify those people is often quite important. From an organizational perspective, people are grouped into teams, and it’s a well-known and commonly accepted practice to require things like team names, often as labels or annotations, when creating Kubernetes resources. This requirement falls under the governance category of policy, and Kyverno handles it easily today with validate rules. Team names are often necessary, but they still don’t capture the individual person. Which person on this team was responsible here? When multiple people author resources in a single git repo, things get tricky.

Assigning owners automatically is also possible today with a mutate policy. In cases where users are allowed to create resources individually and directly against a cluster, this can be a viable approach. But in the GitOps world, this isn’t how things happen. The tool of choice (Argo CD, Flux, etc.) is the sole creator of these resources, so such a policy would indicate that all resources were created by the same ServiceAccount, as shown below in a snippet of a Pod deployed to a cluster by Argo CD.

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kyverno.io/created-by: '{"username":"system:serviceaccount:argocd:argocd-application-controller"}'
  creationTimestamp: "2023-02-12T15:51:27Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: signed-development
  name: nginx-platform
  namespace: platform-a
  resourceVersion: "1750857"
  uid: ad2b0ce9-1b6e-47ed-a9d2-946c7856c179
spec:
  <snip>

The workflow this represents is shown in Figure 1.

Figure 1: Individual authorship is lost when using a GitOps tool to deploy to a Kubernetes cluster.

Even though the ServiceAccount named, in this case, argocd-application-controller was the principal responsible for the creation of the resource, it wasn’t the author. What’s needed here is an automated way to add this authorship information inside of the pipeline to the resources being created by their human operators.

The Kyverno CLI has a wide range of functionalities. One of its capabilities is being able to apply a Kubernetes resource manifest to a policy and view the result. This works not just for validate rules but for mutate rules as well. When used with a mutate rule, it can also show the final result of that mutation. This result can very easily be sent to a file by using the -o flag.

See the output of the kyverno apply -h command for all the possibilities.

Kyverno also has a rich variable system, and the CLI can set the values of those variables at runtime. A mutate rule with a variable can be written to add a label or annotation to any Kubernetes resource quite simply. In this example, I’ve chosen a label named corp.org/author. Its value will be set dynamically in the next step.

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-labels
spec:
  background: false
  rules:
  - name: add-author
    match:
      any:
      - resources:
          kinds:
          - "*"
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            corp.org/author: "{{request.githubprauthor}}"

If you’re already familiar with Kyverno and its CLI, the “trick” you might have noticed here is the use of a policy variable whose name begins with the request. prefix. Although variables with this prefix normally come from the Kubernetes API server via its AdmissionReview, I’m piggybacking off of that to define my own. This obviously would never work in a live cluster, but it allows the CLI to accept the variable rather than flagging it as unrecognizable.
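
To try the policy locally before wiring it into CI, the CLI call can be wrapped in a small shell function. This is only a sketch: the function name apply_author and the KYVERNO_BIN override are hypothetical names made up for illustration, while the flags are the same resource, --set, and -o flags used elsewhere in this post.

```shell
#!/usr/bin/env bash
# Sketch: apply the add-author policy to one manifest and write the mutated result.
# apply_author and KYVERNO_BIN are hypothetical names, not part of Kyverno.
apply_author() {
  local policy="$1" resource="$2" author="$3" output="$4"

  # --set supplies the value the policy reads as {{request.githubprauthor}}.
  "${KYVERNO_BIN:-kyverno}" apply "$policy" \
    --resource "$resource" \
    --set "request.githubprauthor=${author}" \
    -o "$output"
}
```

For example, apply_author author.yaml incoming/pod.yaml chipzoller outgoing/pod.yaml would write the labeled manifest to outgoing/pod.yaml, assuming the policy above is saved as author.yaml.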

Basically all of the CI tools out there have some type of pre-defined variable which captures the ID of the user who initiated a pull/merge request. In GitHub Actions, which is where I’ll be showing this, it’s found under the event type at github.event.pull_request.user.login. We should then be able to pair a mutate rule, like the above, with this information to capture the ID of that user at the time such a request is opened. Once pieced together, we can do something like the example GitHub Actions workflow shown below. Let’s break it down.

  1. A user writes their manifests into the /incoming directory.
  2. When a PR is submitted, all your normal workflows fire. This is where you could validate those resources and fail or return messages allowing the user to correct them as needed.
  3. When the PR is merged, the workflow uses the Kyverno CLI to mutate each of the manifests in /incoming adding the user responsible for the PR as the value of the corp.org/author label.
  4. The mutated manifests are sent to the /outgoing directory and the /incoming directory is scrubbed clean.
  5. Your GitOps tool of choice is configured to look at the /outgoing directory and deploy whatever is inside.
  6. Once a new manifest has been committed, your tool then deploys those changes to your cluster.

name: Merge workflow

# only trigger on pull request closed events
on:
  pull_request:
    types: [ closed ]
env:
  VERSION: v1.9.0

jobs:
  merge_job:
    # this job will only run if the PR has been merged
    if: github.event.pull_request.merged == true
    runs-on: ubuntu-latest
    permissions:
      contents: write
      actions: read
      id-token: write
    steps:
    - name: Checkout
      uses: actions/checkout@v3
      with:
        fetch-depth: 0
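    - name: Write author
      env:
        # Pass the login through an environment variable instead of interpolating it into the script.
        PR_AUTHOR: ${{ github.event.pull_request.user.login }}
      run: |
        curl -sLO https://github.com/kyverno/kyverno/releases/download/${{ env.VERSION }}/kyverno-cli_${{ env.VERSION }}_linux_x86_64.tar.gz
        tar -xf kyverno-cli_${{ env.VERSION }}_linux_x86_64.tar.gz
        ./kyverno version
        # Iterate over the manifests directly rather than parsing ls output.
        for path in incoming/*.yaml
        do
          # Skip the unexpanded pattern when no manifests are present.
          [ -e "$path" ] || continue
          f=$(basename "$path")
          echo "Adding authorship to incoming/$f"
          ./kyverno apply author.yaml -r "incoming/$f" --set "request.githubprauthor=${PR_AUTHOR}" -o outgoing/temp.yaml
          # Drop blank lines from the CLI output before saving the final manifest.
          sed '/^[[:space:]]*$/d' outgoing/temp.yaml > "outgoing/$f"
          rm "incoming/$f" outgoing/temp.yaml
        done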
    - name: Push manifests
      uses: EndBug/add-and-commit@v9
      with:
        author_name: GitHub Actions
        commit: --signoff
        default_author: github_actions
        message: 'Manifests committed.'
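
One caveat with feeding a login straight into a label: Kubernetes label values are limited to 63 characters made up of alphanumerics, "-", "_", and ".", and must start and end with an alphanumeric character. Ordinary GitHub usernames fit, but bot accounts such as dependabot[bot] do not. Below is a minimal sketch of a sanitizing helper, assuming you would rather rewrite such logins than skip them. The function name sanitize_label_value is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: turn an arbitrary login into a valid Kubernetes label value.
# sanitize_label_value is a hypothetical helper, not part of Kyverno or kubectl.
sanitize_label_value() {
  local value="$1"

  # Replace every character not allowed in a label value with '-'.
  value=${value//[^A-Za-z0-9_.-]/-}

  # Enforce the 63-character limit.
  value=${value:0:63}

  # Label values must begin and end with an alphanumeric character.
  value=$(printf '%s' "$value" | sed -E 's/^[^A-Za-z0-9]+//; s/[^A-Za-z0-9]+$//')

  printf '%s\n' "$value"
}

sanitize_label_value 'chipzoller'        # prints chipzoller
sanitize_label_value 'dependabot[bot]'   # prints dependabot-bot
```

The sanitized value could then be passed to --set in place of the raw login. Note that rewriting is lossy, so two different logins could in principle map to the same label value.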

After this flow completes and your GitOps tool deploys the resources, you can easily see who the original author of a resource was. Below you can tell by the value of corp.org/author that my GitHub account (“chipzoller”) was used to author this resource, irrespective of the GitOps tool used to actually deploy it.

$ kubectl get deploy product-bravo -o yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    kubectl.kubernetes.io/last-applied-configuration: |
            {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app.kubernetes.io/instance":"crest","corp.org/author":"chipzoller"},"name":"product-bravo","namespace":"default"},"spec":{"replicas":1,"selector":{"matchLabels":{"app":"prodb"}},"template":{"metadata":{"labels":{"app":"prodb"}},"spec":{"containers":[{"args":["sleep","1d"],"image":"busybox:1.28","name":"busybox"}]}}}}
  creationTimestamp: "2023-02-12T18:49:11Z"
  generation: 1
  labels:
    app.kubernetes.io/instance: crest
    corp.org/author: chipzoller
  name: product-bravo
  namespace: default
  resourceVersion: "1769507"
  uid: 7779e7f9-0a95-477b-9098-3758c5330e80
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: prodb
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: prodb
    spec:
      containers:
      - args:
        - sleep
        - 1d
        image: busybox:1.28
        imagePullPolicy: IfNotPresent
        name: busybox
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 1
  conditions:
  - lastTransitionTime: "2023-02-12T18:49:13Z"
    lastUpdateTime: "2023-02-12T18:49:13Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2023-02-12T18:49:11Z"
    lastUpdateTime: "2023-02-12T18:49:13Z"
    message: ReplicaSet "product-bravo-84fc4998bd" has successfully progressed.
    reason: NewReplicaSetAvailable
    status: "True"
    type: Progressing
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updatedReplicas: 1
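
With the label in place, the original question can be answered straight from the command line. Here is a small sketch, assuming the corp.org/author label used in this post. The function names and the KUBECTL_BIN override are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: query authorship using the corp.org/author label.
# author_of, authored_by, and KUBECTL_BIN are hypothetical names.

# Print the author recorded on a single resource.
author_of() {
  local kind="$1" name="$2" namespace="${3:-default}"
  # Dots inside the label key are escaped so jsonpath treats it as one key.
  "${KUBECTL_BIN:-kubectl}" get "$kind" "$name" -n "$namespace" \
    -o jsonpath='{.metadata.labels.corp\.org/author}'
}

# List every resource of a kind, across all namespaces, authored by a given user.
authored_by() {
  local kind="$1" author="$2"
  "${KUBECTL_BIN:-kubectl}" get "$kind" --all-namespaces \
    -l "corp.org/author=${author}"
}
```

For the Deployment above, author_of deploy product-bravo would print the stored label value, and authored_by deploy chipzoller would list every Deployment carrying that author.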

This technique can be used to further enhance these manifests with even more information about the individuals involved in the authoring or approval process. For example, you might also want to know the pull request from which this particular resource manifest originated, or the account responsible for merging it. You can add this information as additional labels and pass these as variables to the Kyverno CLI referencing the appropriate context variables.

    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            corp.org/author: "{{request.githubprauthor}}"
            corp.org/pr: "{{request.githubpr | to_string(@) }}"
            corp.org/merger: "{{request.githubmerger}}"

The corresponding arguments passed to the CLI:

--set request.githubprauthor=${{github.event.pull_request.user.login}},\
request.githubpr=${{github.event.number}},\
request.githubmerger=${{github.event.pull_request.merged_by.login}}

A Namespace processed this way comes out with all three labels:

apiVersion: v1
kind: Namespace
metadata:
  labels:
    corp.org/author: realshuting
    corp.org/merger: chipzoller
    corp.org/pr: "13"
  name: org-ns-bar
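
As the number of variables grows, assembling the comma-separated --set value by hand gets error-prone. Here is a minimal sketch of a helper that joins key=value pairs. The name build_set_arg is hypothetical, and in a real workflow the values would come from the CI context rather than being typed literally.

```shell
#!/usr/bin/env bash
# Sketch: join key=value pairs into the comma-separated form used with --set.
# build_set_arg is a hypothetical helper, not part of the Kyverno CLI.
build_set_arg() {
  # "$*" joins all arguments using the first character of IFS.
  local IFS=','
  printf '%s\n' "$*"
}

build_set_arg \
  "request.githubprauthor=realshuting" \
  "request.githubpr=13" \
  "request.githubmerger=chipzoller"
# prints request.githubprauthor=realshuting,request.githubpr=13,request.githubmerger=chipzoller
```

The output can be handed to the CLI as a single quoted argument, for example --set "$(build_set_arg ...)".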

And that’s basically it. It’s not a super complex bit of automation, but it can help trace any given resource deployed into your Kubernetes environments back to its source. This method works with any GitOps tool and basically any CI tool you wish to use.

If you liked this post, I’m always glad to hear feedback so feel free to drop me a note on Twitter or come find me on Slack.