Guest post originally published on the InfraCloud Blog by Himanshu Verma

What if we didn’t have to worry about configuring Node Groups or right-sizing compute resources beforehand in our Kubernetes infrastructure? That is what Karpenter offers: it does not use Node Groups to manage workloads. Instead, it uses Launch Templates for nodes and manages each instance directly, with no extra orchestration mechanism to configure. This lets you take full advantage of the cloud’s flexibility. Before Karpenter, Kubernetes users had to rely on Amazon EC2 Auto Scaling Groups and the Kubernetes Cluster Autoscaler, or on a custom script run as a cron job, to dynamically adjust their cluster’s compute capacity. In this article, we will cover in detail how to improve the efficiency and cost of running workloads in Kubernetes using Karpenter.

What is Karpenter?

Karpenter is an open source node provisioning tool for Kubernetes that launches the right nodes at the right time. It automatically provisions new nodes in response to unschedulable pods, which significantly improves the efficiency and cost of running workloads on a cluster.

How is Karpenter different from Cluster Autoscaler?

How does Karpenter work?

Karpenter is tightly integrated with Kubernetes: it observes events within the cluster and then sends commands to the cloud provider. When new pods are detected, it evaluates their scheduling constraints, provisions nodes that satisfy those constraints, and schedules the pods on the newly created nodes. When nodes are no longer needed, it removes them. Together, these steps minimize scheduling latency and infrastructure costs.

Diagram flow showing Karpenter architecture

The key concept behind this is the custom resource named Provisioner. It is used by Karpenter to define provisioning configurations. Provisioners contain constraints that affect the nodes that can be provisioned and the attributes of those nodes (for example, timers for removing nodes).
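To make the idea concrete, here is a minimal sketch of what such a Provisioner might look like for the `karpenter.sh/v1alpha5` API used by Karpenter 0.26.x. The file name `provisioner-sketch.yaml`, the name `default`, the CPU limit, and the timer values are illustrative assumptions, not values from the original post; `providerRef` assumes an AWSNodeTemplate named `default` already exists in the cluster.

```shell
# Write a hypothetical Provisioner manifest to a local file.
# All names and numbers below are placeholders chosen for illustration.
cat > provisioner-sketch.yaml <<'EOF'
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: default
spec:
  # Constraints on the nodes this Provisioner may launch.
  requirements:
    - key: kubernetes.io/arch
      operator: In
      values: ["amd64"]
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["on-demand"]
  # Upper bound on the total CPU this Provisioner may provision.
  limits:
    resources:
      cpu: "100"
  # Timers for removing nodes: empty nodes after 30 seconds,
  # and any node after 30 days.
  ttlSecondsAfterEmpty: 30
  ttlSecondsUntilExpired: 2592000
  # Assumes an AWSNodeTemplate named "default" exists.
  providerRef:
    name: default
EOF
```

The `requirements` block restricts which nodes can be created, while the two `ttlSeconds...` fields are the removal timers mentioned above. The file only takes effect once it is applied to a cluster with `kubectl apply -f`.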

The Provisioner can be set to do things like:

Key Features & Benefits of Karpenter

Deploying Karpenter

First, we need to create an IAM role that the Karpenter controller will use to provision new instances. The controller uses IAM Roles for Service Accounts (IRSA), which requires an OIDC endpoint. Details can be found here: karpenter-IRSA-details

Step 1: Apply the Karpenter custom resource definitions:

Step 2: Adapt the sample values file to your environment: karpenter-values-sample
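As a rough illustration of what such a values file might contain, the sketch below writes a minimal `custom-values.yaml`. The key names (`serviceAccount.annotations`, `settings.aws.clusterName`, `settings.aws.defaultInstanceProfile`, `settings.aws.interruptionQueueName`) are the ones I recall from the Karpenter chart for 0.26.x and should be checked against the chart's own default values; the account ID, role name, cluster name, and instance profile name are made-up placeholders.

```shell
# Write a hypothetical values file for the Karpenter Helm chart.
# Note: this overwrites any existing custom-values.yaml in the current directory.
cat > custom-values.yaml <<'EOF'
serviceAccount:
  annotations:
    # IRSA: binds the controller's service account to the IAM role (placeholder ARN).
    eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/KarpenterControllerRole-example
settings:
  aws:
    # Placeholder names; replace with the values for your own cluster.
    clusterName: example-cluster
    defaultInstanceProfile: KarpenterNodeInstanceProfile-example-cluster
    interruptionQueueName: example-cluster
EOF
```

The annotation on the service account is what ties the controller to the IAM role created for IRSA; the remaining settings tell the controller which cluster it is managing.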

Step 3: Install or upgrade Karpenter:

helm upgrade --install \
        karpenter karpenter/karpenter \
        --version 0.26.1 \
        --values custom-values.yaml \
        --namespace karpenter \
        --wait

After the installation is complete, we can see the following resources are created:

code example

Step 4: Once the Karpenter pods are up and running, refer to the sample Provisioner file and update it accordingly: karpenter-provisioner-sample

For more details about the spec, see the Provisioners page in the Karpenter documentation.

Step 5: Make the required changes and apply the Provisioner manifest:

kubectl apply -f provisioner.yaml

Cost Reduction – Using Spot and On-Demand Flexibility

The real-world problem that Karpenter helps solve is handling workload fluctuations cost-effectively. Traditionally, handling increased traffic meant manually scaling worker nodes, which is time-consuming and costly. Because Karpenter responds quickly to dynamic resource requests, users can absorb increased traffic without downtime. To reduce costs, Spot instances can be used with On-Demand fallbacks. Additionally, Karpenter provides time-slicing GPU nodes, allowing users to run high-performance computing workloads. These features help users optimize their resources and save costs while ensuring their workloads run efficiently.
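The Spot-with-fallback setup can be sketched as a Provisioner that allows both capacity types. As I understand Karpenter's documented behavior, when both `spot` and `on-demand` are allowed it prefers Spot and falls back to On-Demand if Spot capacity is unavailable. The file name, the Provisioner name `spot-flexible`, the CPU limit, and the AWSNodeTemplate name `default` are assumptions made for this example.

```shell
# Write a hypothetical Provisioner that permits both Spot and On-Demand capacity.
cat > provisioner-spot-sketch.yaml <<'EOF'
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: spot-flexible
spec:
  requirements:
    # Allowing both capacity types lets Karpenter choose between them.
    - key: karpenter.sh/capacity-type
      operator: In
      values: ["spot", "on-demand"]
  # Placeholder cap on total provisioned CPU.
  limits:
    resources:
      cpu: "200"
  # Assumes an AWSNodeTemplate named "default" exists.
  providerRef:
    name: default
EOF
```

Leaving instance types unconstrained, as here, gives Karpenter more Spot pools to choose from, which generally helps when a particular instance type is short on Spot capacity.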

Limitations of Karpenter

Some Learnings

Summary

Karpenter allows us to do much more than the Kubernetes Cluster Autoscaler when it comes to provisioning configuration. For example, Karpenter manages nodes directly without an ASG, and new pods are bound to a node immediately, which makes it much faster than the Cluster Autoscaler.

We can greatly reduce the cost of infrastructure by using a flexible provisioner that dynamically allocates Spot and On-Demand instances. A provisioner can also create arm64 (AWS Graviton) worker nodes, which reduce cost and boost performance. With the consolidation feature enabled, Karpenter automatically adjusts the cluster size by deleting underutilized nodes and consolidating pods onto fewer, right-sized nodes.
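For reference, the arm64 and consolidation settings mentioned here might look like the following in a `karpenter.sh/v1alpha5` Provisioner. This is only a sketch: the file name, the Provisioner name `graviton`, and the AWSNodeTemplate name `default` are assumptions. Note that in this API version `consolidation` and `ttlSecondsAfterEmpty` cannot be set on the same Provisioner.

```shell
# Write a hypothetical Provisioner for arm64 nodes with consolidation enabled.
cat > provisioner-graviton-sketch.yaml <<'EOF'
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: graviton
spec:
  requirements:
    # Restrict this Provisioner to arm64 (AWS Graviton) nodes.
    - key: kubernetes.io/arch
      operator: In
      values: ["arm64"]
  # Let Karpenter remove or replace underutilized nodes.
  consolidation:
    enabled: true
  # Assumes an AWSNodeTemplate named "default" exists.
  providerRef:
    name: default
EOF
```

Workloads scheduled onto these nodes need container images built for arm64, so a Provisioner like this is usually paired with pods that explicitly select that architecture.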

Thanks for reading this post. We hope it was informative and engaging for you. We would love to hear your thoughts on this post, so do start a conversation on LinkedIn.

Looking to adopt a cloud native and Kubernetes stack? Learn why so many startups and enterprises trust InfraCloud as one of the best Kubernetes consulting services providers for Kubernetes adoption and implementation.