Karpenter vs Cluster Autoscaler
Karpenter is an open-source Kubernetes cluster autoscaler designed to launch just-in-time compute resources that match the needs of your cluster’s workloads. By dynamically provisioning and de-provisioning nodes based on the current state of the cluster, it tailors nodes to the exact needs of your workloads and optimizes for resource utilization and performance.
Karpenter vs Cluster Autoscaler
Cluster Autoscaler and Karpenter are both tools for managing the scaling of Kubernetes clusters but differ in their approaches and features.
Cluster Autoscaler automatically adjusts the cluster size based on workload needs, adding nodes when there are unschedulable pods due to insufficient resources and removing nodes when they are underutilized. It is widely used, integrated with many cloud providers (such as AWS, GCP, and Azure), and is known for its stability and predictability. It suits users who prefer a mature and well-supported solution.
Karpenter, on the other hand, is a newer and more flexible autoscaler that makes more intelligent scaling decisions. It can launch new nodes faster with specifications tailored to current workload requirements and integrates closely with AWS, supporting advanced features like spot instances and customized node templates. Karpenter is ideal for users seeking more responsive and efficient scaling, particularly in AWS environments.
Install
I assume that you have a working Kubernetes cluster; mine is on EKS, but this shouldn’t matter much. Follow the official guide to install Karpenter: https://karpenter.sh/v0.37/getting-started/getting-started-with-karpenter/#4-install-karpenter
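For reference, the heart of that guide is a Helm release of the Karpenter chart. The sketch below assumes CLUSTER_NAME is exported and that the controller IAM role and the (optional) interruption SQS queue were already created as described in the guide, so treat it as an outline rather than a copy-paste install:
helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "0.37.0" \
  --namespace karpenter --create-namespace \
  --set "settings.clusterName=${CLUSTER_NAME}" \
  --set "settings.interruptionQueue=${CLUSTER_NAME}" \
  --wait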
Once Karpenter is installed, you will have to register the resources that tell it how to provision nodes: NodePools and EC2NodeClasses.
Configure
In Karpenter, the concepts of Node Pools and EC2NodeClass provide mechanisms to manage and customize the behavior of nodes within a Kubernetes cluster, particularly when using AWS as the infrastructure provider.
NodePools and EC2NodeClass
Node Pools in Karpenter are collections of nodes that share similar characteristics, such as instance types, labels, taints, and other configuration parameters. They are used to define groups of nodes that should be treated similarly by the autoscaler. Node Pools can help manage different types of workloads by ensuring that specific nodes with required resources and configurations are available.
EC2NodeClass in Karpenter allows you to define the configuration for EC2 instances that Karpenter will use to scale the cluster. This includes specifying instance types, AMIs, networking, and other AWS-specific settings. It provides a way to customize the nodes according to your specific needs and workload requirements.
Here is a very simple example configuration.
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: simple-nodepool
spec:
  disruption:
    consolidationPolicy: WhenEmpty
    consolidateAfter: 2m # node must sit empty for 2 minutes before removal (short, just for the example)
    expireAfter: 720h # 30 * 24h = 720h
  limits:
    cpu: 1k
    memory: 1000Gi
  template:
    metadata:
      labels:
        app: blue
        managed-by: karpenter
    spec:
      nodeClassRef:
        apiVersion: karpenter.k8s.aws/v1beta1
        kind: EC2NodeClass
        name: default
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values:
            - m5.4xlarge
            - m5.2xlarge
        - key: kubernetes.io/arch
          operator: In
          values:
            - amd64
        - key: karpenter.sh/capacity-type
          operator: In
          values:
            - spot
        - key: kubernetes.io/os
          operator: In
          values:
            - linux
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2 # Amazon Linux 2
  securityGroupSelectorTerms:
    - tags:
        Name: "my-cluster-dev-us-east-1-nodegroup-default"
    - tags:
        Name: "my-cluster-dev-us-east-1-nodegroups-shared"
  role: "my-cluster-dev-us-east-1-node"
  amiSelectorTerms:
    - tags:
        Name: "packer-my-cluster-dev-node-*"
  subnetSelectorTerms:
    - tags:
        Name: my-vpc-us-east-1-private-*
EOF
I will not go into every single line; the official documentation page explains it much better.
Basically, the above config tells Karpenter:
- which kinds of nodes to launch
- which labels to add to the nodes
- the overall limits
- which security groups, IAM role, AMIs and subnets to use
For the last point, I strongly advise tagging all your resources on AWS: it makes configuring your EC2NodeClass much easier than hardcoding values.
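To illustrate the difference (the subnet ID below is hypothetical), both selector forms are valid in an EC2NodeClass, but the tag-based one keeps working when the underlying resource is recreated:
# Tag-based selection (what the example above uses)
subnetSelectorTerms:
  - tags:
      Name: my-vpc-us-east-1-private-*

# ID-based selection: hardcoded, breaks if the subnet is ever replaced
subnetSelectorTerms:
  - id: subnet-0123456789abcdef0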
Verify it actually works
To verify that Karpenter can launch new nodes, I will simply scale a simple deployment up and down and check Karpenter’s controller logs for any errors.
cat <<EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inflate
spec:
  replicas: 0
  selector:
    matchLabels:
      app: inflate
  template:
    metadata:
      labels:
        app: inflate
    spec:
      terminationGracePeriodSeconds: 0
      securityContext:
        runAsUser: 1000
        runAsGroup: 3000
        fsGroup: 2000
      containers:
        - name: inflate
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.7
          resources:
            requests:
              cpu: 1
          securityContext:
            allowPrivilegeEscalation: false
EOF
kubectl scale deployment inflate --replicas 5
kubectl logs -f -n karpenter -l app.kubernetes.io/name=karpenter -c controller
You should normally see Karpenter spawn a new node.
{"level":"INFO","time":"2024-08-04T10:39:17.358Z","logger":"controller","message":"registered nodeclaim","commit":"490ef94","controller":"nodeclaim.lifecycle",
"controllerGroup":"karpenter.sh","controllerKind":"NodeClaim","NodeClaim":{"name":"simple-nodepool-w8grs"},
"namespace":"","name":"simple-nodepool-w8grs","reconcileID":"XXXXXXXXXX",
"provider-id":"aws:///us-east-1a/i-XXXXXXXXXX","Node":{"name":"ip-XX-XX-XX-XX.ec2.internal"}}
{"level":"INFO","time":"2024-08-04T10:39:29.885Z","logger":"controller","message":"initialized nodeclaim",
"commit":"490ef94","controller":"nodeclaim.lifecycle","controllerGroup":"karpenter.sh","controllerKind":"NodeClaim",
"NodeClaim":{"name":"simple-nodepool-w8grs"},"namespace":"","name":"simple-nodepool-w8grs","reconcileID":"XXXXXXXX",
"provider-id":"aws:///us-east-1a/i-XXXXXXXXXX","Node":{"name":"ip-XX-XX-XX-XX.ec2.internal"},
"allocatable":{"cpu":"7910m","ephemeral-storage":"18242267924","hugepages-1Gi":"0","hugepages-2Mi":"0","memory":"31369688Ki","pods":"58"}}
Since I configured the NodePool with consolidationPolicy: WhenEmpty and consolidateAfter: 2m, the node should be removed shortly after it becomes empty. Scale the application back down and check the controller’s logs:
kubectl scale deployment inflate --replicas 0
kubectl logs -f -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter -c controller
You should see Karpenter starting to take down the nodes.
{"level":"INFO","time":"2024-08-04T10:42:48.863Z","logger":"controller","message":"disrupting via emptiness delete, terminating 1 nodes (0 pods) ip-XX-XX-XX-XX.ec2.internal/m5.2xlarge/spot","commit":"490ef94","controller":"disruption","command-id":"XXXXXXXX"}
{"level":"INFO","time":"2024-08-04T10:42:49.357Z","logger":"controller","message":"command succeeded","commit":"490ef94","controller":"disruption.queue","command-id":"XXXXXXXXXXX"}
For the sake of the example, I have set consolidationPolicy: WhenEmpty with a very small consolidateAfter.
The choice between a consolidationPolicy of WhenEmpty or WhenUnderutilized depends on your specific use case and priorities, but in most cases WhenUnderutilized is much more cost-efficient.
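As a rough sketch (not part of the setup above), switching the NodePool to continuous consolidation only means changing its disruption block; note that in the v1beta1 API, consolidateAfter can only be combined with WhenEmpty, so it is dropped here:
spec:
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h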