Kubernetes has become the de facto standard for orchestrating containerized workloads, and Amazon Elastic Kubernetes Service (EKS) is the AWS offering that makes running Kubernetes in the cloud more manageable by abstracting away the low-level control plane operations. As cluster workloads grow, it becomes crucial to manage compute nodes efficiently, and this is where Karpenter, an open-source node autoscaler built for AWS, comes in. In this blog post we will see how to set up a powerful solution that runs our workloads on EKS, scales our nodes with Karpenter, and defines the whole infrastructure as code deployed via CDK.
Why EKS?
EKS gives us a managed Kubernetes control plane: we don't have to worry about etcd management, API server scaling, or control plane updates, as all of this is handled by AWS. Instead, our teams can focus on delivering business value. Beyond operational simplification, EKS provides seamless integration with common AWS services, enterprise-grade security, and high availability across multiple Availability Zones.
Autoscaling with Karpenter
Karpenter introduces a new approach to EKS autoscaling by provisioning nodes directly through the EC2 API rather than relying on pre-configured Auto Scaling Groups. Unlike traditional autoscalers that force workloads into predetermined instance types, Karpenter evaluates the full spectrum of available instances in real time to find the optimal match for pending pods. You define resource limits at the NodePool level, setting boundaries on total CPU, memory, or instance count, and Karpenter intelligently selects the most efficient instance types within those constraints based on pending workload requirements. This approach delivers nodes in seconds instead of minutes while reducing costs through smart bin-packing and automatic consolidation of underutilized resources. The result is a self-optimizing cluster that respects your budget guardrails while continuously adapting to demand without manual intervention, making it ideal for teams looking to reduce both infrastructure costs and operational complexity.
Code time! - A working example using CDK
AWS CDK lets you define infrastructure in TypeScript, C#, and other general-purpose languages instead of wrestling with JSON/YAML CloudFormation templates. Below we will see how to define our cluster, a few Helm charts, IAM policies and so on in order to deploy a simple workload on EKS. Our language of choice is TypeScript.
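All of the snippets in this post are assumed to live inside a single CDK stack. If you are starting from scratch, a minimal scaffold might look like the following (the stack name is a placeholder):

import * as cdk from 'aws-cdk-lib';
import { Construct } from 'constructs';

// Hypothetical stack that holds the cluster resources shown in the sections below
export class EksKarpenterStack extends cdk.Stack {
  constructor(scope: Construct, id: string, props?: cdk.StackProps) {
    super(scope, id, props);
    // cluster, node group, Helm charts and manifests from the following sections go here
  }
}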
The EKS cluster
Before the cluster definition we need to create an IAM masters role for the cluster. I usually organize this in a separate folder where I define all my IAM roles and policies. Since the policies for some roles can grow very large, I also keep roles and policies separate.
import { Construct } from 'constructs';
import * as iam from 'aws-cdk-lib/aws-iam';

export function getMastersRole(scope: Construct): iam.Role {
  return new iam.Role(scope, 'MyMastersRole', {
    assumedBy: new iam.AccountRootPrincipal(),
    description: 'Role for EKS masters to access the cluster',
    roleName: 'MyMastersRole',
  });
}

export function getMastersRolePolicy(clusterArn: string): iam.PolicyStatement {
  return new iam.PolicyStatement({
    actions: [
      'eks:AccessKubernetesApi',
      'eks:Describe*',
      'eks:List*',
      'eks:DescribePodIdentityAssociation',
      'sts:AssumeRole',
      'sts:TagSession',
      'iam:PassRole',
    ],
    resources: [clusterArn],
  });
}
The cluster also needs to know the VPC it will be deployed to. Since I have a pre-existing VPC in my AWS account, I will perform a lookup to find it:
const vpc = Vpc.fromLookup(this, 'Vpc', { vpcId: 'my-vpc-id' });
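One thing to keep in mind: Vpc.fromLookup resolves the VPC via a context lookup at synth time, so the stack needs a concrete account and region. In the CDK app entry point that could look something like this (the stack name matches the hypothetical scaffold above):

const app = new cdk.App();
new EksKarpenterStack(app, 'EksKarpenterStack', {
  // Vpc.fromLookup cannot resolve against an environment-agnostic stack
  env: { account: process.env.CDK_DEFAULT_ACCOUNT, region: process.env.CDK_DEFAULT_REGION },
});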
Now we can finally define the cluster object:
const mastersRole = getMastersRole(this);

const cluster = new eks.Cluster(this, 'MyCluster', {
  vpc,
  defaultCapacity: 0,
  kubectlLayer: new KubectlV33Layer(this, 'KubectlLayer'),
  version: eks.KubernetesVersion.V1_33,
  clusterName: clusterName,
  mastersRole: mastersRole,
});

mastersRole.addToPolicy(getMastersRolePolicy(cluster.clusterArn));
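Optionally, you can export the cluster name and masters role ARN as stack outputs to make it easier to connect with kubectl afterwards. This is purely a convenience and not required by anything else in this post:

new cdk.CfnOutput(this, 'ClusterNameOutput', { value: cluster.clusterName });
new cdk.CfnOutput(this, 'MastersRoleArnOutput', { value: mastersRole.roleArn });
// These values can then be used with:
// aws eks update-kubeconfig --name <cluster name> --role-arn <masters role arn>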
Next we are going to create a default node group to handle the cluster's baseline system workloads. A t4g.small instance type should be sufficient for this, and we run two instances for resilience:
cluster.addNodegroupCapacity('CoreNodeGroup', {
  nodegroupName: 'core-nodegroup',
  instanceTypes: [new ec2.InstanceType('t4g.small')],
  amiType: eks.NodegroupAmiType.AL2023_ARM_64_STANDARD,
  minSize: 2,
  desiredSize: 2,
  maxSize: 2,
  subnets: { subnetType: ec2.SubnetType.PRIVATE_WITH_EGRESS },
  capacityType: eks.CapacityType.ON_DEMAND,
  labels: { nodegroup: 'core-nodegroup' },
});
Now we are going to set up an ALB Controller for the cluster. In order to do this, we will need to define a service account and the actual controller itself which comes in the form of a Helm chart.
// The service account name is reused by the Helm values below; pick whatever fits your naming scheme
const albControllerSAName = 'aws-load-balancer-controller';

const albControllerSA = new eks.ServiceAccount(this, 'aws-load-balancer-controller', {
  cluster: cluster,
  name: albControllerSAName,
  namespace: 'kube-system',
  identityType: eks.IdentityType.POD_IDENTITY,
});

enrichAlbControllerRole(this, albControllerSA.role as iam.Role);

const albControllerHelmChart = cluster.addHelmChart('MyALBController', {
  chart: 'aws-load-balancer-controller',
  release: 'aws-load-balancer-controller',
  repository: 'https://aws.github.io/eks-charts',
  namespace: 'kube-system',
  values: {
    clusterName: cluster.clusterName,
    serviceAccount: {
      create: false,
      name: albControllerSAName,
    },
    region: '<my aws region>',
    vpcId: vpc.vpcId,
  },
});
When working with EKS via CDK, we often encounter resources that depend on other resources becoming available first. We can express this ordering by adding explicit dependencies, as shown below:
albControllerHelmChart.node.addDependency(albControllerSA);
The ALB Controller needs a myriad of IAM policies, which again are defined separately in the enrichAlbControllerRole function. To keep things clean, the full list of actions needed for the controller and the other resources is included at the end of this post.
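For reference, enrichAlbControllerRole is just a thin wrapper that attaches those actions to the service account role. A minimal sketch could look like the following (the action list is abbreviated here, and I'm scoping to '*' purely for brevity); enrichKarpenterRole and enrichPodRole, used later, follow the same pattern with their own action lists:

export function enrichAlbControllerRole(scope: Construct, role: iam.Role): void {
  // scope is available in case you prefer to create a managed policy here instead
  role.addToPolicy(
    new iam.PolicyStatement({
      effect: iam.Effect.ALLOW,
      // abbreviated: the full list of ALB Controller actions is at the end of this post
      actions: ['acm:DescribeCertificate', 'ec2:DescribeSubnets' /* ... */],
      resources: ['*'],
    })
  );
}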
Now let's take a look at our Karpenter setup. The first pieces of infrastructure we need, similarly to the ALB Controller, are the service account and the Karpenter Helm chart.
const karpenterSAName = 'karpenter-sa';
const kubeSystemNamespace = 'kube-system';

const karpenterSA = new eks.ServiceAccount(this, 'KarpenterServiceAccount', {
  cluster: props.cluster,
  name: karpenterSAName,
  namespace: kubeSystemNamespace,
  identityType: eks.IdentityType.POD_IDENTITY,
});

enrichKarpenterRole(karpenterSA.role as iam.Role);

const karpenterHelmChart = props.cluster.addHelmChart('KarpenterChart', {
  repository: 'oci://public.ecr.aws/karpenter/karpenter',
  chart: 'karpenter',
  release: 'karpenter',
  version: '1.6.0',
  namespace: kubeSystemNamespace,
  values: {
    serviceAccount: {
      create: false,
      name: karpenterSAName,
    },
    settings: {
      clusterName: props.cluster.clusterName,
    },
  },
  skipCrds: false,
});
In the next step, we are going to get a little more involved with the Karpenter configuration by creating an EC2NodeClass. An EC2NodeClass defines the AWS-specific configuration Karpenter uses when launching EC2 instances, such as the AMI, instance profile, and the subnet and security group selectors.
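The EC2NodeClass manifest below also references karpenterNodeRole, the IAM role that Karpenter-launched instances assume, which isn't shown elsewhere in this post. A minimal sketch, assuming ConfigMap-based node auth, might look like this:

// Node role for instances provisioned by Karpenter (sketch only)
const karpenterNodeRole = new iam.Role(this, 'KarpenterNodeRole', {
  assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
  managedPolicies: [
    iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKSWorkerNodePolicy'),
    iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEKS_CNI_Policy'),
    iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonEC2ContainerRegistryReadOnly'),
    iam.ManagedPolicy.fromAwsManagedPolicyName('AmazonSSMManagedInstanceCore'),
  ],
});

// The nodes also need permission to join the cluster; with ConfigMap-based auth this can
// be done through the aws-auth mapping (EKS access entries are an alternative approach).
props.cluster.awsAuth.addRoleMapping(karpenterNodeRole, {
  username: 'system:node:{{EC2PrivateDNSName}}',
  groups: ['system:bootstrappers', 'system:nodes'],
});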
// Tag key commonly used for Karpenter resource discovery
const discoveryTag = 'karpenter.sh/discovery';

const nodeClassManifest = props.cluster.addManifest('KarpenterNodeClass', {
  apiVersion: 'karpenter.k8s.aws/v1',
  kind: 'EC2NodeClass',
  metadata: {
    name: 'my-nodeclass',
  },
  spec: {
    // consider pinning a specific AMI version here instead of using @latest
    amiSelectorTerms: [{ alias: 'al2023@latest' }],
    role: karpenterNodeRole.roleName,
    subnetSelectorTerms: [
      { id: vpc.privateSubnets[0].subnetId },
      { id: vpc.privateSubnets[1].subnetId },
    ],
    securityGroupSelectorTerms: [
      { id: props.cluster.clusterSecurityGroup.securityGroupId },
    ],
    tags: {
      [discoveryTag]: props.cluster.clusterName,
      Name: 'my-karpenter-node',
    },
  },
});
nodeClassManifest.node.addDependency(karpenterHelmChart);
Our Karpenter setup requires one more resource, a NodePool. A NodePool defines how a set of nodes is provisioned and managed based on constraints such as allowed instance types, architectures, taints (workload placement rules), resource limits such as total CPU cores, and disruption policies, i.e. when and how nodes are consolidated, expired, or replaced. I added some example values, but feel free to play around with the constraints to meet your workload and budget needs.
const nodePoolManifest = props.cluster.addManifest('KarpenterNodePool', {
  apiVersion: 'karpenter.sh/v1',
  kind: 'NodePool',
  metadata: { name: 'karpenter-nodepool' },
  spec: {
    template: {
      metadata: {
        labels: {
          nodepool: 'karpenter-nodepool',
        },
      },
      spec: {
        taints: [
          {
            key: 'workload',
            value: 'karpenter-nodes',
            effect: 'NoSchedule',
          },
        ],
        requirements: [
          {
            key: 'node.kubernetes.io/instance-type',
            operator: 'In',
            values: ['t4g.medium', 't4g.large', 't4g.xlarge'],
          },
          {
            key: 'karpenter.sh/capacity-type',
            operator: 'In',
            values: ['on-demand'],
          },
          {
            key: 'kubernetes.io/arch',
            operator: 'In',
            values: ['arm64'],
          },
          {
            key: 'kubernetes.io/os',
            operator: 'In',
            values: ['linux'],
          },
          {
            key: 'karpenter.k8s.aws/instance-generation',
            operator: 'Gt',
            values: ['2'],
          },
        ],
        nodeClassRef: {
          group: 'karpenter.k8s.aws',
          kind: 'EC2NodeClass',
          name: 'my-nodeclass',
        },
        expireAfter: '720h',
      },
    },
    limits: {
      cpu: '100',
    },
    disruption: {
      consolidationPolicy: 'WhenEmptyOrUnderutilized',
      consolidateAfter: '1m',
    },
    weight: 1,
  },
});
nodePoolManifest.node.addDependency(nodeClassManifest);
The infrastructure we created above concludes our EKS cluster setup. We have created a cluster and implemented a robust autoscaling strategy with Karpenter. Let's continue and see how we can run a simple containerized application on our newly created cluster.
Create a hello world container app
It has probably gotten old by now, but we will start by creating a service account.
const namespace = 'my-app'; // namespace shared by all of the application resources below
const serviceAccountName = 'my-app-sa';
const deploymentName = 'my-app';
const ingressName = 'my-app-ingress';
const albName = 'my-app-alb';

const myAppNamespace = cluster.addManifest('MyAppNamespace', {
  apiVersion: 'v1',
  kind: 'Namespace',
  metadata: { name: namespace },
});

const myAppSA = new eks.ServiceAccount(this, 'MyAppServiceAccount', {
  cluster: cluster,
  name: serviceAccountName,
  namespace: namespace,
  identityType: eks.IdentityType.POD_IDENTITY,
});
myAppSA.node.addDependency(myAppNamespace);

// The policies for this role depend entirely on the AWS resources your application uses,
// such as SQS queues, S3 buckets and so on
enrichPodRole(myAppSA.role as iam.Role);
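As an illustration only, if the application needed to read from an S3 bucket, enrichPodRole might look like this (the bucket name is purely hypothetical):

export function enrichPodRole(role: iam.Role): void {
  role.addToPolicy(
    new iam.PolicyStatement({
      effect: iam.Effect.ALLOW,
      actions: ['s3:GetObject', 's3:ListBucket'],
      // hypothetical bucket; replace with whatever your workload actually needs
      resources: [
        'arn:aws:s3:::my-example-bucket',
        'arn:aws:s3:::my-example-bucket/*',
      ],
    })
  );
}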
Next we need to create a Deployment for our application. A Deployment in EKS (and Kubernetes in general) is a higher-level resource that manages a set of identical Pods, ensuring the desired number of replicas is running and automatically handling operations such as rolling updates, rollbacks, and self-healing. The nodeSelector and tolerations attributes give us control over scheduling, letting Pods target specific nodes or tolerate defined taints for precise workload placement. For this example I used the nginx image, which runs an HTTP server on port 80 and is simple enough for demonstration purposes.
const deploymentManifest = cluster.addManifest('MyDeployment', {
  apiVersion: 'apps/v1',
  kind: 'Deployment',
  metadata: { name: deploymentName, namespace: namespace },
  spec: {
    replicas: 2,
    selector: { matchLabels: { app: deploymentName } },
    template: {
      metadata: { labels: { app: deploymentName } },
      spec: {
        serviceAccountName: serviceAccountName,
        tolerations: [
          {
            key: 'workload',
            operator: 'Equal',
            value: 'karpenter-nodes',
            effect: 'NoSchedule',
          },
        ],
        nodeSelector: {
          nodepool: 'karpenter-nodepool',
        },
        containers: [
          {
            name: deploymentName,
            image: 'docker.io/library/nginx:alpine',
            // nginx listens on port 80 by default
            ports: [{ containerPort: 80, name: 'server' }],
            readinessProbe: {
              httpGet: {
                // stock nginx serves '/' with a 200; point this at a dedicated
                // health endpoint if your application exposes one
                path: '/',
                port: 80,
              },
              initialDelaySeconds: 10,
              periodSeconds: 5,
              timeoutSeconds: 3,
              failureThreshold: 2,
            },
            livenessProbe: {
              httpGet: {
                path: '/',
                port: 80,
              },
              initialDelaySeconds: 30,
              periodSeconds: 10,
              timeoutSeconds: 5,
              failureThreshold: 3,
            },
          },
        ],
      },
    },
  },
});
Next we need to define a Service for our application in the cluster. A Service provides a stable network endpoint that routes traffic to a set of Pods and load-balances across them.
const myAppServiceManifest = cluster.addManifest('MyAppService', {
  apiVersion: 'v1',
  kind: 'Service',
  metadata: {
    name: deploymentName,
    namespace: namespace,
  },
  spec: {
    type: 'ClusterIP',
    selector: { app: deploymentName },
    ports: [
      {
        name: 'http',
        port: 80, // exposed service port
        targetPort: 80, // container port for nginx:alpine
        protocol: 'TCP',
      },
    ],
  },
});
The last piece in the puzzle is creating an Ingress. An Ingress manages external access to Services by defining routing rules that map incoming requests to the right backend Pods and providing HTTPS termination so that traffic is securely encrypted between the clients and the load balancer.
myAppServiceManifest.node.addDependency(deploymentManifest);

const ingress = cluster.addManifest('MyAppIngress', {
  apiVersion: 'networking.k8s.io/v1',
  kind: 'Ingress',
  metadata: {
    name: ingressName,
    namespace: namespace,
    annotations: {
      'alb.ingress.kubernetes.io/load-balancer-name': albName,
      'alb.ingress.kubernetes.io/scheme': 'internet-facing',
      'alb.ingress.kubernetes.io/target-type': 'ip',
      'alb.ingress.kubernetes.io/listen-ports': '[{"HTTP": 80}, {"HTTPS": 443}]',
      'alb.ingress.kubernetes.io/certificate-arn': '<your certificate ARN goes here>',
      'alb.ingress.kubernetes.io/ssl-redirect': '443',
      'alb.ingress.kubernetes.io/load-balancer-attributes': 'idle_timeout.timeout_seconds=60',
      // '/' works for stock nginx; use your app's health endpoint if it has one
      'alb.ingress.kubernetes.io/healthcheck-path': '/',
      'alb.ingress.kubernetes.io/healthcheck-port': '80',
    },
  },
  spec: {
    ingressClassName: 'alb',
    rules: [
      {
        host: `<your domain name>`,
        http: {
          paths: [
            {
              path: '/',
              pathType: 'Prefix',
              backend: {
                service: { name: deploymentName, port: { number: 80 } },
              },
            },
          ],
        },
      },
    ],
  },
});

ingress.node.addDependency(myAppSA);
ingress.node.addDependency(myAppServiceManifest);
These are the basic resources needed to deploy an EKS cluster with Karpenter for autoscaling and a simple, internet-facing application. I hope you enjoyed this post, and stay tuned for more EKS-related content in the future.
Below you can find the policies required for the various resources:
ALB Controller policies
'acm:DescribeCertificate', 'acm:ListCertificates',
'cognito-idp:DescribeUserPoolClient',
'ec2:AuthorizeSecurityGroupIngress', 'ec2:RevokeSecurityGroupIngress', 'ec2:CreateFleet',
'ec2:CreateLaunchTemplate', 'ec2:CreateSecurityGroup', 'ec2:CreateTags',
'ec2:DeleteLaunchTemplate', 'ec2:DeleteSecurityGroup', 'ec2:DeleteTags',
'ec2:DescribeAccountAttributes', 'ec2:DescribeAddresses', 'ec2:DescribeAvailabilityZones',
'ec2:DescribeCoipPools', 'ec2:DescribeImages', 'ec2:DescribeInstanceTypeOfferings',
'ec2:DescribeInstanceTypes', 'ec2:DescribeInstances', 'ec2:DescribeInternetGateways',
'ec2:DescribeIpamPools', 'ec2:DescribeLaunchTemplates', 'ec2:DescribeNetworkInterfaces',
'ec2:DescribeRouteTables', 'ec2:DescribeSecurityGroups', 'ec2:DescribeSpotPriceHistory',
'ec2:DescribeSubnets', 'ec2:DescribeTags', 'ec2:DescribeVpcPeeringConnections',
'ec2:DescribeVpcs', 'ec2:GetCoipPoolUsage', 'ec2:GetSecurityGroupsForVpc',
'ec2:RunInstances', 'ec2:TerminateInstances',
'elasticloadbalancing:AddListenerCertificates', 'elasticloadbalancing:AddTags',
'elasticloadbalancing:CreateListener', 'elasticloadbalancing:CreateLoadBalancer',
'elasticloadbalancing:CreateRule', 'elasticloadbalancing:CreateTargetGroup',
'elasticloadbalancing:DeleteListener', 'elasticloadbalancing:DeleteLoadBalancer',
'elasticloadbalancing:DeleteRule', 'elasticloadbalancing:DeleteTargetGroup',
'elasticloadbalancing:DeregisterTargets', 'elasticloadbalancing:DescribeCapacityReservation',
'elasticloadbalancing:DescribeListenerAttributes', 'elasticloadbalancing:DescribeListenerCertificates',
'elasticloadbalancing:DescribeListeners', 'elasticloadbalancing:DescribeLoadBalancerAttributes',
'elasticloadbalancing:DescribeLoadBalancers', 'elasticloadbalancing:DescribeRules',
'elasticloadbalancing:DescribeSSLPolicies', 'elasticloadbalancing:DescribeTags',
'elasticloadbalancing:DescribeTargetGroupAttributes', 'elasticloadbalancing:DescribeTargetGroups',
'elasticloadbalancing:DescribeTargetHealth', 'elasticloadbalancing:DescribeTrustStores',
'elasticloadbalancing:ModifyListener', 'elasticloadbalancing:ModifyListenerAttributes',
'elasticloadbalancing:ModifyLoadBalancerAttributes', 'elasticloadbalancing:ModifyRule',
'elasticloadbalancing:ModifyTargetGroup', 'elasticloadbalancing:ModifyTargetGroupAttributes',
'elasticloadbalancing:RegisterTargets', 'elasticloadbalancing:RemoveListenerCertificates',
'elasticloadbalancing:RemoveTags', 'elasticloadbalancing:SetIpAddressType',
'elasticloadbalancing:SetRulePriorities', 'elasticloadbalancing:SetSecurityGroups',
'elasticloadbalancing:SetSubnets', 'elasticloadbalancing:SetWebAcl',
'elasticloadbalancing:ModifyCapacityReservation', 'elasticloadbalancing:ModifyIpPools',
'iam:AddRoleToInstanceProfile', 'iam:CreateInstanceProfile', 'iam:CreateServiceLinkedRole',
'iam:DeleteInstanceProfile', 'iam:GetServerCertificate', 'iam:ListServerCertificates',
'iam:PassRole', 'iam:RemoveRoleFromInstanceProfile',
'ssm:GetParameter*',
'waf-regional:AssociateWebACL', 'waf-regional:DisassociateWebACL',
'waf-regional:GetWebACL', 'waf-regional:GetWebACLForResource',
'wafv2:AssociateWebACL', 'wafv2:DisassociateWebACL',
'wafv2:GetWebACL', 'wafv2:GetWebACLForResource',
'shield:CreateProtection', 'shield:DeleteProtection',
'shield:DescribeProtection', 'shield:GetSubscriptionState',
'eks-auth:AssumeRoleForPodIdentity',
'sts:AssumeRole', 'sts:TagSession'
Karpenter policies
'ec2:CreateLaunchTemplate', 'ec2:CreateFleet', 'ec2:CreateTags', 'ec2:DeleteTags',
'ec2:DescribeAvailabilityZones', 'ec2:DescribeCapacityReservations', 'ec2:DescribeInstances',
'ec2:DescribeImages', 'ec2:DescribeInstanceTypeOfferings', 'ec2:DescribeInstanceTypes',
'ec2:DescribeLaunchTemplates', 'ec2:DescribeSecurityGroups', 'ec2:DescribeSpotPriceHistory',
'ec2:DescribeSubnets', 'ec2:DeleteLaunchTemplate', 'ec2:RunInstances', 'ec2:TerminateInstances',
'eks-auth:AssumeRoleForPodIdentity',
'iam:AddRoleToInstanceProfile', 'iam:CreateInstanceProfile', 'iam:DeleteInstanceProfile',
'iam:GetInstanceProfile', 'iam:PassRole', 'iam:RemoveRoleFromInstanceProfile',
'iam:TagInstanceProfile',
'ssm:GetParameter*', 'pricing:GetProducts',
'sqs:DeleteMessage', 'sqs:GetQueueUrl', 'sqs:ReceiveMessage'
Akis Papaditsas - Monex Insight