Optimizing Cost and Resilience by Integrating AWS Graviton with x86 CPUs Using Amazon EKS | Amazon VGT2 Las Vegas

This article is authored by Liam Carter and Mia Roberts, Cloud Solutions Architects. It explores how to incorporate AWS Graviton-based Amazon EC2 instances into an existing Amazon Elastic Kubernetes Service (Amazon EKS) deployment that operates on x86-based EC2 instances. By leveraging mixed-CPU architectures, customers can enhance application resilience and take advantage of a broader array of Amazon EC2 instance types. Prior to deploying production applications on Graviton-based instances, it is crucial to assess application performance in a testing environment. For further information, you can refer to AWS’ transition guide for moving your application to AWS Graviton.

This example illustrates how KEDA can manage application capacity across different CPU types in EKS. KEDA scales deployments in response to application response latency, as reported by the Application Load Balancer (ALB). The article also uses Karpenter, an open-source Kubernetes node provisioning project, together with the AWS Load Balancer Controller to streamline resource allocation.

Solution Overview

Two configurations are presented for testing a mixed-CPU application. The first setup, known as the “A/B Configuration” (illustrated in Figure 1), utilizes an ALB-based Ingress to manage traffic between x86 and Graviton-based node pools. This configuration allows for a gradual migration of a live application from x86 to Graviton instances, while monitoring response times via Amazon CloudWatch.

Figure 1: A/B Configuration

In the second setup, termed the “Karpenter Controlled Configuration” (depicted in Figure 2), Karpenter automates the management of instance types. It is configured with weighted provisioners that prioritize AWS Graviton-based EC2 instances over their x86 counterparts.

Figure 2: Karpenter Controlled Configuration with Weighting Provisioners
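
The weighted-provisioner idea behind this configuration can be sketched as below. This is a minimal sketch, assuming Karpenter's v1alpha5 `Provisioner` API and its `spec.weight` field (higher-weight provisioners are tried first); the provisioner names match those used later in this article, and the weight values are illustrative:

```yaml
# Graviton capacity is preferred (higher weight); x86 acts as the fallback
# when arm64 capacity is unavailable or insufficient.
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: arm64provisioner
spec:
  weight: 50                 # preferred
  requirements:
  - key: kubernetes.io/arch
    operator: In
    values: ["arm64"]
---
apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  weight: 10                 # fallback
  requirements:
  - key: kubernetes.io/arch
    operator: In
    values: ["amd64"]
```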

It is advisable to start with the “A/B” configuration to accurately gauge response times for live requests. Once the workload performance on Graviton instances is validated, the second configuration can be implemented to streamline deployment and enhance resilience. This setup allows your application to automatically revert to x86 instances in the event of unexpected high-demand scenarios.

For a detailed step-by-step guide, see this GitHub resource, which accompanies the example application deployment discussed in this article. Below is an overview of the guide.

Code Migration to AWS Graviton

The initial step involves migrating your code from x86 to Graviton instances. AWS offers numerous resources to facilitate this migration, including the AWS Graviton Fast Start Program, AWS Graviton Technical Guide on GitHub, and the Porting Advisor for Graviton. After implementing necessary changes, recompiling your application for the Arm64 architecture may be needed. This step is essential for applications written in languages that compile to machine code, such as Golang and C/C++, or if you need to rebuild native libraries for interpreted or JIT compiled languages like Python or Java.
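
For a compiled language such as Golang, cross-compilation is usually a one-line change. The following is a sketch for a hypothetical Go service (it assumes the service does not use cgo, so `CGO_ENABLED=0` is safe):

```shell
# Build one binary per target architecture from the same source tree.
# Go cross-compiles natively, so no extra toolchain is needed when cgo is off.
CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -o bin/app-linux-amd64 .
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 go build -o bin/app-linux-arm64 .
```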

To ensure your containerized application operates on both x86 and Graviton nodes, you must construct OCI images for both architectures, upload them to your image repository (like Amazon ECR), and create a multi-architecture manifest list. For a comprehensive overview of these procedures, you can refer to another insightful article here.
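
One common way to produce both images and the manifest list in a single step is Docker Buildx. The commands below are a sketch; the ECR account ID and repository name are hypothetical placeholders:

```shell
# Build and push a multi-architecture image to Amazon ECR with Docker Buildx.
# The resulting manifest list lets the same tag resolve to the correct image
# on both x86 and Graviton nodes.
REPO=123456789012.dkr.ecr.us-east-1.amazonaws.com/simplemultiarchapp

docker buildx create --use          # one-time builder setup
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag "${REPO}:latest" \
  --push .

# Verify that the pushed tag is a manifest list covering both platforms.
docker buildx imagetools inspect "${REPO}:latest"
```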

Additionally, to simplify the process, consider using a Linux distribution package manager that accommodates cross-platform packages, avoiding the use of platform-specific package names. For instance, use:

RUN yum install httpd

instead of:

ARG ARCH=aarch64   # or amd64
RUN yum install httpd.${ARCH}

This blog post dives deeper into automating the building of multi-architecture OCI images.

Application Deployment

Config 1 – A/B Controlled Topology

This topology supports the migration to Graviton while validating that application response times stay within the target (approximately 300 ms) on both x86 and Graviton-based instances. As depicted in Figure 1, the design uses a single Listener that directs incoming requests to two Target Groups, one backed by Graviton instances and the other by x86 instances. The traffic split between them is defined in the Ingress configuration.
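
The weighted traffic split can be expressed with the AWS Load Balancer Controller's `alb.ingress.kubernetes.io/actions.${action-name}` annotation. The sketch below assumes two NodePort Services named `armsimplemultiarchapp-svc` and `amdsimplemultiarchapp-svc` (names chosen here for illustration) and an even 50/50 split:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: simplemultiarchapp
  annotations:
    alb.ingress.kubernetes.io/scheme: internet-facing
    # A custom "forward" action that weights traffic between the two
    # architecture-specific target groups; adjust weights to shift traffic.
    alb.ingress.kubernetes.io/actions.weighted-routing: >
      {"type":"forward","forwardConfig":{"targetGroups":[
        {"serviceName":"armsimplemultiarchapp-svc","servicePort":"80","weight":50},
        {"serviceName":"amdsimplemultiarchapp-svc","servicePort":"80","weight":50}]}}
spec:
  ingressClassName: alb
  rules:
  - http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: weighted-routing   # must match the action name above
            port:
              name: use-annotation
```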

To create Config 1:

  • Establish two KEDA ScaledObjects that adjust the number of pods based on the latency metric (AWS/ApplicationELB TargetResponseTime) for the corresponding target group. Set the maximum acceptable latency with targetMetricValue: "0.3". Below is the ScaledObject for the Graviton deployment (the inline comment shows the corresponding name for the x86 deployment):
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
…
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: armsimplemultiarchapp #amdsimplemultiarchapp
…
  triggers:                 
    - type: aws-cloudwatch
      metadata:
        namespace: "AWS/ApplicationELB"
        dimensionName: "LoadBalancer"
        dimensionValue: "app/simplemultiarchapp/xxxxxx"
        metricName: "TargetResponseTime"
        targetMetricValue: "0.3"

Once the topology is established, incorporate Amazon CloudWatch Container Insights to monitor CPU usage, network throughput, and instance performance. To account for potential performance discrepancies between instance generations, create two dedicated Karpenter provisioners and two Kubernetes Deployments (each managing its own ReplicaSet), specifying instance generation, CPU count, and architecture for each. In this example, c7g (Graviton3) and c6i (Intel) instances are used. These constraints can be relaxed in the second topology to allow greater flexibility.

The Karpenter provisioner for x86-based instances:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: x86provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "6"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In 
    values:
    - "2"
  - key: kubernetes.io/arch
    operator: In
    values:
    - amd64

The Karpenter provisioner for Graviton-based instances:

apiVersion: karpenter.sh/v1alpha5
kind: Provisioner
metadata:
  name: arm64provisioner
spec:
  requirements:
  - key: karpenter.k8s.aws/instance-generation
    operator: In
    values:
    - "7"
  - key: karpenter.k8s.aws/instance-cpu
    operator: In
    values:
    - "2"
  - key: kubernetes.io/arch
    operator: In
    values:
    - arm64

Create two Kubernetes Deployment resources—one for each CPU architecture—using nodeSelector to schedule one Deployment on Graviton instances and the other on x86 instances. Similarly, create two NodePort Service resources that direct traffic to their corresponding architecture-specific ReplicaSet. Finally, establish an Application Load Balancer via the AWS Load Balancer Controller to distribute incoming requests across the two sets of pods. Potential pitfalls in this process are discussed here.
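
The Graviton-side pair of resources can be sketched as below. The Deployment name matches the `armsimplemultiarchapp` name used in the ScaledObject earlier; the Service name, image URI, and ports are illustrative assumptions. The x86 pair is identical apart from the names and a `nodeSelector` value of `amd64`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: armsimplemultiarchapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: armsimplemultiarchapp
  template:
    metadata:
      labels:
        app: armsimplemultiarchapp
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64    # pin these pods to Graviton nodes
      containers:
      - name: app
        image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/simplemultiarchapp:latest
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: armsimplemultiarchapp-svc
spec:
  type: NodePort
  selector:
    app: armsimplemultiarchapp
  ports:
  - port: 80
    targetPort: 8080
```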

