Enhancing Efficiency and Minimizing Costs with Availability Zone Affinity | Amazon IXD – VGT2 Las Vegas Blog


Updated in April 2025: This blog post has been revised to incorporate the latest features in Elastic Load Balancing (ELB).

A key strategy for constructing resilient systems within Amazon Virtual Private Cloud (VPC) networks involves leveraging multiple Availability Zones (AZs). An AZ consists of one or more distinct data centers equipped with redundant power, networking, and connectivity. By utilizing multiple AZs, you can run workloads that offer heightened availability, fault tolerance, and scalability, which would be unattainable with a single data center. Nevertheless, transferring data between AZs incurs additional latency and cost.

This post explores an architectural approach known as Availability Zone Affinity, which enhances performance and reduces expenses while preserving the advantages of multi-AZ architectures.

Impact of Cross Availability Zone Transfers

AZs within the same AWS Region are physically separated from one another by a meaningful distance, yet all sit within 60 miles (100 kilometers) of each other. This separation typically results in single-digit millisecond roundtrip latency between AZs in a Region. When instances communicate within the same AZ, roundtrip latency is often below a millisecond with enhanced networking, and it can be lower still when instances are placed in cluster placement groups. Furthermore, transferring data between AZs incurs charges in both directions.

To illustrate these effects, let’s examine a hypothetical workload, the “bar service,” depicted in Figure 1. The bar service serves as a storage platform for other AWS workloads to redundantly store data. Requests are initially processed by an Application Load Balancer (ALB). By default, ALBs utilize cross-zone load balancing to evenly distribute requests among all targets. The request is then directed from the load balancer to a request router that performs tasks such as authorization checks and input validation before sending it to the storage tier. The storage tier sequentially replicates data from the lead node to the middle node and finally to the tail node. Once data is written to all three nodes, it is considered committed. The response travels back from the tail node to the request router, through the load balancer, and finally returns to the client.

In the worst-case scenario, as shown in Figure 1, the request crosses an AZ boundary eight times. Let’s calculate the best-case (zeroth percentile, p0) latency. Assuming the non-network processing time in the load balancer, request router, and storage tier totals 4 ms, and adding 1 ms for each AZ traversal, the end-to-end request time cannot be less than 4 ms + (8 × 1 ms) = 12 ms. For the 50th percentile (p50), assuming 1.5 ms of cross-AZ latency per traversal and 8 ms of non-network processing, the end-to-end time comes to 8 ms + (8 × 1.5 ms) = 20 ms. At millions of requests, the cross-AZ data transfer charges also add up quickly. Now, let’s consider how the bar service can modify its system design to meet a p50 latency requirement of under 20 ms.
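The arithmetic is simple enough to capture in a few lines. The sketch below only restates the assumptions above (8 crossings, 1 ms or 1.5 ms per crossing, 4 ms or 8 ms of processing); the numbers are illustrative, not measurements.

```python
# Back-of-the-envelope latency for the baseline design in Figure 1.
# All values are the illustrative assumptions from this post, not measurements.

AZ_CROSSINGS = 8        # worst case: client -> ALB -> router -> storage chain and back

p0_total_ms = 4.0 + AZ_CROSSINGS * 1.0    # 4 ms processing + 8 x 1.0 ms = 12 ms
p50_total_ms = 8.0 + AZ_CROSSINGS * 1.5   # 8 ms processing + 8 x 1.5 ms = 20 ms

print(f"p0 ≈ {p0_total_ms} ms, p50 ≈ {p50_total_ms} ms")
```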

Availability Zone Affinity

The AZ Affinity architectural pattern minimizes the number of times an AZ boundary is crossed. In the example system illustrated in Figure 1, AZ Affinity can be implemented through two changes.

First, the ALB’s target group is adjusted to disable cross-zone load balancing. This ensures that requests are directed only to targets residing in the same AZ as the ALB node handling the request. Second, clients send requests to the ALB node in their own AZ by using its zonal DNS name. ELB publishes these zonal DNS records for both NLBs and ALBs, for example us-east-1a.my-load-balancer-1234567890abcdef.elb.us-east-1.amazonaws.com. You can create your own DNS entries that point to these zonal records and incorporate the AZ ID in the name, such as use1-az1.bar.com; because AZ IDs refer to the same physical location in every AWS account (unlike AZ names), clients consistently reach the correct zonal endpoint across accounts.
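As a sketch of what these two changes might look like with the AWS SDK for Python, the snippet below disables cross-zone load balancing on the target group and publishes a zonal CNAME. The target group ARN, hosted zone ID, and record names are placeholders, and the zonal ALB DNS name is the example from above.

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Placeholder ARN: disable cross-zone load balancing at the target-group level
# so each ALB node only sends requests to targets in its own Availability Zone.
elbv2.modify_target_group_attributes(
    TargetGroupArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/bar-service/abc123",
    Attributes=[
        {"Key": "load_balancing.cross_zone.enabled", "Value": "false"},
    ],
)

# Optionally publish a friendly zonal name (for example use1-az1.bar.com) that
# points at the ALB's zonal DNS record, so in-zone clients can address it directly.
route53 = boto3.client("route53")
route53.change_resource_record_sets(
    HostedZoneId="Z0EXAMPLE",  # placeholder hosted zone for bar.com
    ChangeBatch={
        "Changes": [{
            "Action": "UPSERT",
            "ResourceRecordSet": {
                "Name": "use1-az1.bar.com",
                "Type": "CNAME",
                "TTL": 60,
                "ResourceRecords": [{
                    "Value": "us-east-1a.my-load-balancer-1234567890abcdef.elb.us-east-1.amazonaws.com"
                }],
            },
        }]
    },
)
```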

Figure 2 illustrates the system with AZ Affinity. In this implementation each request crosses an AZ boundary at most four times, and data transfer costs decrease by roughly 40 percent compared to the previous approach. Assuming a p50 of 300 μs for an intra-AZ hop, the median network latency is now (4 × 300 μs) + (4 × 1.5 ms) = 7.2 ms. Combined with the median processing time of 8 ms, the total median latency drops to 15.2 ms; the network portion alone falls by 40 percent (from 12 ms to 7.2 ms). For p90, p99, or even p99.9 latencies, the reduction can be even more pronounced.
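Extending the earlier sketch with the same assumed numbers reproduces the figures above:

```python
# Same illustrative assumptions, now with AZ Affinity (Figure 2):
# 4 hops stay inside the AZ, 4 still cross a boundary (the replication chain).
intra_az_ms, cross_az_ms, processing_ms = 0.3, 1.5, 8.0

network_ms = 4 * intra_az_ms + 4 * cross_az_ms    # 1.2 + 6.0 = 7.2 ms
total_ms = network_ms + processing_ms             # 15.2 ms
reduction = 1 - network_ms / (8 * cross_az_ms)    # 0.4 -> 40% less network latency
print(f"network ≈ {network_ms} ms, total ≈ {total_ms} ms, reduction ≈ {reduction:.0%}")
```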

You can also apply this pattern with Network Load Balancers (NLB). NLBs have cross-zone load balancing disabled by default and offer a feature called Availability Zone DNS affinity. When enabled, DNS queries from a client resolve preferentially to the NLB IP address in that client’s AZ, which improves both latency and resilience because traffic no longer has to cross an AZ boundary to reach targets. Clients do not need to change the DNS name they use to access your service.
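On an NLB this behavior is controlled through a load balancer attribute; a sketch of enabling it with the AWS SDK for Python might look like the following (the load balancer ARN is a placeholder).

```python
import boto3

elbv2 = boto3.client("elbv2", region_name="us-east-1")

# Placeholder NLB ARN: have client DNS queries resolve preferentially to the
# NLB IP address in the client's own Availability Zone.
elbv2.modify_load_balancer_attributes(
    LoadBalancerArn="arn:aws:elasticloadbalancing:us-east-1:123456789012:loadbalancer/net/bar-service/abc123",
    Attributes=[
        # "availability_zone_affinity" keeps zonal DNS answers fully local;
        # "partial_availability_zone_affinity" and "any_availability_zone" relax this.
        {"Key": "dns_record.client_routing_policy", "Value": "availability_zone_affinity"},
    ],
)
```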

Figure 3 demonstrates a more advanced approach that uses service discovery. Rather than requiring clients to know AZ-specific DNS names for load balancers, AWS Cloud Map can be employed for service discovery. AWS Cloud Map is a fully managed service that lets clients look up the IP address and port combinations of service instances using DNS, or dynamically retrieve abstract endpoints, such as URLs, through its HTTP-based service discovery API. Service discovery can remove the need for load balancers altogether, eliminating their cost and latency.

Clients first retrieve the service instances in their own AZ from the AWS Cloud Map registry, filtering by AZ through an optional query parameter, and then send requests directly to the discovered request routers.
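A minimal lookup sketch follows, assuming the bar service registered each request router in Cloud Map with a custom attribute (here called AZ_ID) recording its zone; the namespace, service, and attribute names are illustrative, not part of the original design.

```python
import boto3

cloudmap = boto3.client("servicediscovery", region_name="us-east-1")

# Hypothetical namespace/service names; AZ_ID is a custom attribute this sketch
# assumes was set when each request router instance was registered.
response = cloudmap.discover_instances(
    NamespaceName="bar.internal",
    ServiceName="request-router",
    HealthStatus="HEALTHY",
    QueryParameters={"AZ_ID": "use1-az1"},  # only return instances in the client's AZ
)

endpoints = [
    (inst["Attributes"]["AWS_INSTANCE_IPV4"], inst["Attributes"]["AWS_INSTANCE_PORT"])
    for inst in response["Instances"]
]
```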

Ensuring Workload Resiliency

In the new architecture with AZ Affinity, clients must choose which AZ to communicate with. Because they are “pinned” to a single AZ rather than load balanced across several, they can be affected by an event impacting the AWS infrastructure or the bar service in that AZ.

During such events, clients may opt to use retries with exponential backoff to navigate transient issues or redirect requests to unaffected AZs. By utilizing the NLB AZ DNS affinity feature, queries can automatically resolve to other zones if there are no healthy Network Load Balancer IP addresses in their zone. Both methods provide multi-AZ resilience while capitalizing on AZ Affinity during regular operations.
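A simplified client-side sketch of this retry-then-failover behavior is shown below, reusing the illustrative zonal endpoint names from earlier; the retry counts, delays, and the use of the requests library are all assumptions for the example.

```python
import random
import time

import requests  # assumed HTTP client for the sketch; any equivalent works

# In-zone endpoint first, then the other zones as fallbacks (names are illustrative).
ENDPOINTS = ["https://use1-az1.bar.com", "https://use1-az2.bar.com", "https://use1-az3.bar.com"]

def put_object(path: str, data: bytes, attempts_per_zone: int = 3) -> requests.Response:
    """Retry the in-zone endpoint with exponential backoff, then fail over to other AZs."""
    last_error = None
    for endpoint in ENDPOINTS:
        for attempt in range(attempts_per_zone):
            try:
                resp = requests.put(f"{endpoint}{path}", data=data, timeout=2)
                if resp.status_code < 500:
                    return resp  # success or client error: stop retrying
            except requests.RequestException as err:
                last_error = err
            # Exponential backoff with jitter before the next attempt in this zone.
            time.sleep((2 ** attempt) * 0.1 + random.uniform(0, 0.1))
    raise RuntimeError("all zonal endpoints failed") from last_error
```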

Client Libraries

The most straightforward way to implement service discovery, retries, and failover is to offer a client library or SDK. The library handles this logic on the caller’s behalf, much as the AWS SDKs and CLI do, while still leaving users free to choose between the low-level APIs and the higher-level library.

Conclusion

This post has explored how the AZ Affinity pattern reduces latency and data transfer costs for multi-AZ systems while preserving high availability. To dive deeper into data transfer costs, check out this blog post and explore insights from experts in this domain.

Location: Amazon IXD – VGT2, 6401 E Howdy Wells Ave, Las Vegas, NV 89115.

