Autoscaling Amazon ECS Services Utilizing Custom Metrics with Application Auto Scaling

Introduction

Application Auto Scaling is a service designed for developers and system administrators seeking to automate the scaling of their resources within AWS services, such as Amazon Elastic Container Service (Amazon ECS), Amazon DynamoDB, AWS Lambda Provisioned Concurrency, and more. Recently, Application Auto Scaling introduced the capability to scale these resources based on custom Amazon CloudWatch metrics, evaluated through a metric math expression. In this article, we will illustrate how to utilize this feature in a practical scenario, specifically focusing on scaling an Amazon ECS service based on the average rate of HTTP requests processed.

Background

Horizontal scalability is a vital component of cloud-native applications. Application Auto Scaling integrates with multiple AWS services, allowing you to implement scaling capabilities to meet your application’s demands. It can utilize predefined metrics from Amazon CloudWatch along with target tracking or step scaling policies to appropriately scale resources. However, there are instances where these predefined metrics may not accurately indicate when to initiate scaling actions. In such cases, custom metrics that monitor specific application factors—like the number of HTTP requests received or database transactions executed—may be more effective.

When employing a target tracking policy with Application Auto Scaling, it is essential that the chosen metric reflects an average utilization, which indicates how busy a scalable target is. Metrics such as the total number of HTTP requests received are cumulative and therefore monotonically increasing, requiring conversion into a utilization metric that fluctuates with the scalable target’s capacity. Previously, customers had to develop custom code to achieve this conversion. In this article, we provide an example of scaling the number of tasks in an Amazon ECS service based on the rate of messages published to a topic in Amazon Managed Streaming for Apache Kafka (Amazon MSK). This involves using an AWS Lambda function to periodically retrieve custom metric data from Amazon CloudWatch, calculate an average utilization metric, and publish this as a new custom metric in CloudWatch. A target tracking policy is then established, leveraging this newly computed utilization metric.

The recent update allows customers to incorporate metric math expressions directly within their custom metric specifications, simplifying the process of defining how an average utilization metric is calculated using one or more Amazon CloudWatch metrics. This change alleviates the need for additional coding and infrastructure maintenance, which often added operational complexity without enhancing application differentiation. Let’s explore how this feature functions in detail.

Solution Overview

We will illustrate the workings of this new feature using a sample workload deployed in an Amazon ECS cluster. The setup follows a microservices architecture, in which a backend datastore service interacts with an Amazon Aurora PostgreSQL database and exposes REST APIs for CRUD operations.

The application is equipped with the Prometheus client library, utilizing a Prometheus Counter named http_requests_total to monitor the number of HTTP requests directed to the service. To gather Prometheus metrics from Amazon ECS clusters, you can either use the CloudWatch agent with Prometheus monitoring or the AWS Distro for OpenTelemetry collector. For this demonstration, we will utilize the CloudWatch agent. The custom metric is published by the agent to the CloudWatch namespace ECS/ContainerInsights/Prometheus. The objective is to auto-scale the datastore service in relation to the average rate of HTTP requests processed by the running tasks. It’s important to note that since the backend service isn’t registered with a load balancer, metrics provided by Elastic Load Balancing to CloudWatch are not reliable indicators of service load.

Walkthrough

The initial step involves registering the scalable dimension of the target resource that will trigger scaling actions; in this case, it’s the DesiredCount of the service’s tasks.

CLUSTER_NAME=ecs-ec2-cluster
SERVICE_NAME=BackendService
aws application-autoscaling register-scalable-target \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id service/$CLUSTER_NAME/$SERVICE_NAME \
    --min-capacity 2 \
    --max-capacity 10
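To confirm that the registration succeeded, you can list the scalable targets for the resource; this query should return the target with the minimum and maximum capacity configured above:

```shell
aws application-autoscaling describe-scalable-targets \
    --service-namespace ecs \
    --resource-ids service/$CLUSTER_NAME/$SERVICE_NAME
```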

Next, we will establish a target tracking policy using the Application Auto Scaling API. The custom metric intended for the target tracking policy is defined within the CustomizedMetricSpecification field of a policy configuration JSON file, as illustrated below:

{
    "TargetValue": 5.0,
    "ScaleOutCooldown": 120,
    "ScaleInCooldown": 120,
    "CustomizedMetricSpecification": {
        "MetricName": "http_request_rate_average_1m",
        "Namespace": "ECS/CloudWatch/Custom",
        "Dimensions": [
            {
                "Name": "ClusterName",
                "Value": "ecs-ec2-cluster"
            },
            {
                "Name": "TaskGroup",
                "Value": "service:BackendService"
            }
        ],
        "Statistic": "Average"
    }
}
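With this configuration saved to a file, the target tracking policy can be created using the put-scaling-policy API. The file name and policy name below are illustrative; substitute your own:

```shell
aws application-autoscaling put-scaling-policy \
    --service-namespace ecs \
    --scalable-dimension ecs:service:DesiredCount \
    --resource-id service/$CLUSTER_NAME/$SERVICE_NAME \
    --policy-name http-request-rate-policy \
    --policy-type TargetTrackingScaling \
    --target-tracking-scaling-policy-configuration file://config.json
```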

Previously, the custom metric specified in the MetricName field needed to be a pre-computed utilization metric readily available in CloudWatch. The new feature extends the schema of the CustomizedMetricSpecification field to accommodate metric math expressions via the Metrics field. Metric math enables the querying of multiple CloudWatch metrics and the application of various functions and operators to generate new time series based on these metrics. This development allows users to dynamically create a custom utilization metric by simply specifying a math expression in the policy configuration JSON file.
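As a concrete illustration of the arithmetic such an expression performs, consider a cumulative request counter sampled at two points in time. CloudWatch's RATE() function returns the per-second rate of change between datapoints; multiplying by 60 and dividing by the task count yields an average per-task request rate per minute. A minimal shell sketch with made-up sample values:

```shell
# Reproduce RATE(m1) * 60 / m2 on sample datapoints:
# the counter grew from 1200 to 1800 over 60 seconds, with 4 tasks running.
awk 'BEGIN {
    prev = 1200; curr = 1800; interval = 60; tasks = 4
    rate_per_second = (curr - prev) / interval   # RATE(m1): change per second
    print rate_per_second * 60 / tasks           # per-minute requests per task
}'
# prints 150
```

Each task handled 150 requests per minute on average, which is the kind of utilization value a target tracking policy can compare against its TargetValue.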

For example, the following math expression calculates a metric named http_request_rate_average_1m based on two other CloudWatch metrics. The first, http_requests_total, is the custom Prometheus metric mentioned earlier, while the second, RunningTaskCount, is automatically collected by Container Insights for Amazon ECS and published in the ECS/ContainerInsights namespace. Ensure that Container Insights is enabled for your Amazon ECS clusters so that this metric is available.
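A policy configuration using the new Metrics field could look like the following sketch. The RATE(m1) * 60 / m2 expression computes the per-minute request rate per running task; the dimension names and values shown for the two source metrics are assumptions based on the cluster and service described above and on how the CloudWatch agent publishes Prometheus metrics, so verify them against your own CloudWatch namespaces:

```json
{
    "TargetValue": 5.0,
    "ScaleOutCooldown": 120,
    "ScaleInCooldown": 120,
    "CustomizedMetricSpecification": {
        "Metrics": [
            {
                "Id": "m1",
                "Label": "Total number of HTTP requests",
                "MetricStat": {
                    "Metric": {
                        "MetricName": "http_requests_total",
                        "Namespace": "ECS/ContainerInsights/Prometheus",
                        "Dimensions": [
                            { "Name": "ClusterName", "Value": "ecs-ec2-cluster" },
                            { "Name": "TaskGroup", "Value": "service:BackendService" }
                        ]
                    },
                    "Stat": "Sum"
                },
                "ReturnData": false
            },
            {
                "Id": "m2",
                "Label": "Number of running tasks",
                "MetricStat": {
                    "Metric": {
                        "MetricName": "RunningTaskCount",
                        "Namespace": "ECS/ContainerInsights",
                        "Dimensions": [
                            { "Name": "ClusterName", "Value": "ecs-ec2-cluster" },
                            { "Name": "ServiceName", "Value": "BackendService" }
                        ]
                    },
                    "Stat": "Average"
                },
                "ReturnData": false
            },
            {
                "Id": "e1",
                "Label": "http_request_rate_average_1m",
                "Expression": "RATE(m1) * 60 / m2",
                "ReturnData": true
            }
        ]
    }
}
```

Only the entry with "ReturnData": true is used for target tracking; the other entries merely supply inputs to the expression.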

Conclusion

In summary, Application Auto Scaling’s support for metric math expressions in target tracking policies lets you scale Amazon ECS services directly on custom CloudWatch metrics, eliminating the extra code and infrastructure previously needed to pre-compute utilization metrics.

