Amazon Onboarding with Learning Manager Chanci Turner

In the telecommunications sector, Communication Service Providers (CSPs) are increasingly exploring use cases that harness the full potential of their networks. Building a 5G core network on a public cloud such as AWS is emerging as a popular approach, especially for practical applications like private networks for enterprises and the rollout of new 5G networks. The AWS white paper on the evolution of 5G networks emphasizes that the AWS global cloud infrastructure (AWS Regions, Availability Zones (AZs), Local Zones, and Outposts) offers a flexible environment for hosting 5G core networks, one that can be tailored to the specific needs of different network functions (NFs). For instance, the user plane function (UPF) can be placed in an AWS Local Zone or on an Outpost to ensure low-latency processing.

Among the many applications for hosting 5G network functions on AWS, one of the most compelling for CSPs with existing 5G core networks is the implementation of a disaster recovery (DR) solution, or the creation of a more resilient network using AWS. This DR network aims to provide scalable and swift responses to 5G NF failures, complete data center outages, or maintenance periods. Specifically, since this DR network functions as an additional environment that activates only during unforeseen failures or maintenance, its design should prioritize cost efficiency through rapid scaling capabilities. In contrast to traditional telco data center redundancy, AWS empowers CSPs to minimize costs and energy consumption during regular operations while allowing them to swiftly adapt to changes in network demand, such as traffic surges or maintenance events.

This article outlines how AWS can serve as an alternative virtual data center for 5G networks to fulfill “disaster-resiliency” and “disaster-recovery” aims. It emphasizes applying 3GPP high-availability concepts on AWS, using autoscaling, automation tools, and cost-optimization options. For instance, Amazon Elastic Compute Cloud (Amazon EC2) Auto Scaling, together with horizontal pod autoscaling and cluster autoscaling on Amazon Elastic Kubernetes Service (Amazon EKS), can keep the footprint of Container-based Network Functions (CNFs) in the DR VPC small during normal operation. The same setup can then scale out quickly to accommodate sudden spikes in demand.
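The horizontal pod autoscaling mentioned above follows a proportional rule documented for the Kubernetes Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The sketch below reproduces that calculation in Python as an illustration only. The function name, the metric values, and the replica bounds are assumptions chosen for the example, not figures from a real 5G deployment.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas, tolerance=0.1):
    """Approximate the Kubernetes HPA rule:
    desired = ceil(current_replicas * current_metric / target_metric).
    """
    ratio = current_metric / target_metric
    # The HPA skips scaling when the ratio is close enough to 1.0
    # (0.1 is the documented default tolerance).
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp the result to the configured replica bounds.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical DR footprint: 2 small replicas, target 60% average CPU.
# A swing-over event pushes the observed average to 240%.
print(desired_replicas(2, 240, 60, min_replicas=2, max_replicas=20))
```

With these assumed numbers the rule asks for four times the current replica count. In practice the new pods also need nodes to run on, which is where cluster autoscaling and EC2 Auto Scaling come in.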

To optimize cost and energy use while ensuring that network functions on AWS can handle swing-over traffic (traffic that previously went to on-premises sites and is migrated to the AWS Cloud), AWS Graviton instances can be employed to host 5G core NFs. This post describes the DR model and strategies that apply to general applications on AWS, then focuses on how these principles apply to 5G networks. It also discusses how the 3GPP architecture can be leveraged to support DR objectives and how AWS capabilities such as Amazon EC2 Auto Scaling and Kubernetes cluster autoscaling on Amazon EKS, among others, can facilitate implementation, including insights from open-source examples.

Disaster Recovery Model for 5G Core Network in AWS

As highlighted in previous DR posts and white papers, there are two primary objectives in disaster recovery: Recovery Time Objective (RTO) and Recovery Point Objective (RPO). RTO refers to the acceptable interval between a service interruption and its restoration, while RPO indicates the maximum allowable time since the last data recovery point. For general applications operating on AWS, well-known DR services include AWS Elastic Disaster Recovery (AWS DRS) and Amazon Route 53 Application Recovery Controller (Route 53 ARC).
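To make the two objectives concrete, the following sketch computes the achieved RPO and RTO for a single incident from three timestamps and compares them with targets. It is a minimal illustration: the function names, the timestamps, and the target values are all hypothetical and are not taken from the DR whitepaper.

```python
from datetime import datetime, timedelta


def measure_recovery(last_recovery_point, outage_start, service_restored):
    """Return (achieved_rpo, achieved_rto) as timedeltas for one incident."""
    # RPO: how far back the last usable recovery point lies when the outage hits.
    achieved_rpo = outage_start - last_recovery_point
    # RTO: how long the service stays interrupted before it is restored.
    achieved_rto = service_restored - outage_start
    return achieved_rpo, achieved_rto


def meets_objectives(achieved_rpo, achieved_rto, rpo_target, rto_target):
    """True only if both objectives are satisfied."""
    return achieved_rpo <= rpo_target and achieved_rto <= rto_target


# Hypothetical incident timeline.
rpo, rto = measure_recovery(
    last_recovery_point=datetime(2024, 1, 1, 12, 0),
    outage_start=datetime(2024, 1, 1, 12, 10),
    service_restored=datetime(2024, 1, 1, 12, 25),
)
ok = meets_objectives(rpo, rto,
                      rpo_target=timedelta(minutes=15),
                      rto_target=timedelta(minutes=30))
print(rpo, rto, ok)
```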

However, 5G core network applications, which are the focus of this article, have more stringent requirements for networking interfaces and protocols, as defined by the 3GPP standard. In addition, AWS DRS and Route 53 ARC are not applicable to every core network component. Therefore, while these services may be relevant to specific NF components, this article takes a broader view of how AWS services can support DR implementation within the 3GPP standard architecture.

In the context of 5G NFs, components like the AMF, SMF, and UPF are crucial with regard to RTO, as they play a significant role in the rapid recovery and restoration of 5G voice and data services. Conversely, the UDM is important for both RPO and RTO due to its management of subscriber profiles and data. Each NF has distinct objectives, necessitating different DR strategies. The accompanying figure illustrates four DR strategies, as discussed in the DR whitepaper, showcasing how these strategies incur varying RTO and RPO. For telco 5G core NFs, given that these applications provide mission-critical services, the RTO must be significantly lower than what is depicted in the figure.

For example, as mentioned earlier, the UDM’s requirements for both RTO and RPO are near real-time. Therefore, when establishing a DR site for UDM on AWS, it may be necessary to maintain an always-active UDM with synchronization to the legacy data center UDM. In this scenario, a Hot-standby (Active-Active) strategy would be most suitable.

Other possible strategies include Warm-standby, Pilot Light, and Backup & Restore, which can be applied based on the specific use case and the characteristics of the NFs. The Backup & Restore approach may be suitable for non-mission-critical use cases with relaxed RTO requirements. Provided that an AWS Direct Connect link is already established between your data center and AWS (or, alternatively, a Site-to-Site VPN, which has bandwidth limitations), you can use AWS tools such as AWS CloudFormation, AWS Cloud Development Kit (AWS CDK), and AWS CodePipeline to instantiate NFs rapidly, taking advantage of Infrastructure-as-Code (IaC). For further information on this DR strategy, refer to the post on DR Architecture on AWS, Part II. Additionally, for insights into building a continuous integration/continuous delivery (CI/CD) pipeline for 5G NF deployment on AWS that supports rapid service recovery, see the AWS white paper on CI/CD for 5G Networks on AWS.
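The idea that each NF gets a strategy matched to its own RTO and RPO can be sketched as a small lookup. The thresholds below are purely hypothetical placeholders chosen for illustration; real values depend on the operator's service-level requirements, and the article only states that UDM needs near real-time RTO and RPO while AMF, SMF, and UPF are driven mainly by RTO.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NfObjectives:
    name: str
    rto_seconds: float
    rpo_seconds: float


def pick_strategy(nf):
    """Map an NF's objectives to one of the four DR strategies.

    All thresholds are illustrative assumptions, not standard values.
    """
    if nf.rto_seconds <= 1 and nf.rpo_seconds <= 1:
        return "Hot-standby (Active-Active)"
    if nf.rto_seconds <= 60:
        return "Warm-standby"
    if nf.rto_seconds <= 600:
        return "Pilot Light"
    return "Backup & Restore"


# Hypothetical objectives for three workloads.
nfs = [
    NfObjectives("UDM", rto_seconds=1, rpo_seconds=1),
    NfObjectives("AMF", rto_seconds=30, rpo_seconds=300),
    NfObjectives("lab-test-core", rto_seconds=14400, rpo_seconds=86400),
]
for nf in nfs:
    print(nf.name, "->", pick_strategy(nf))
```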

Another viable option for a cost-effective DR site for non-mission-critical 5G network use cases is the Cold-standby strategy. In this approach, all EC2 instances are pre-configured but kept in a stopped state, which enables faster activation than Backup & Restore while being more economical than Warm-standby.

Warm-standby, on the other hand, is the most practical method for building a DR 5G network on AWS, given the RTO of macro telecom voice and data services, which are mission-critical. In this strategy, most of the 5G NFs in the DR site on AWS run with a small deployment footprint and handle a minimal traffic load. Based on the configured scaling policy, the deployment expands to handle increased traffic during cutover periods. This approach yields a disaster-resilient 5G network with continuous service availability. However, for 5G NFs running on Amazon EKS, Kubernetes autoscaling may not react quickly enough to a sudden traffic surge, because the required RTO can be shorter than the typical response time of autoscaling actions. Later sections of this post describe implementation strategies that address this gap.
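The tension between the RTO and the response time of autoscaling can be expressed as a simple check. The sketch below estimates how many replicas a warm-standby NF needs for the swing-over load and whether reactive scale-out alone could be ready before the RTO expires. Every number and name here is an assumption made for illustration; actual per-replica capacity and scale-out latency have to be measured for each NF.

```python
import math


def plan_swing_over(expected_load, load_per_replica, standby_replicas,
                    scale_out_seconds, rto_seconds):
    """Compare reactive scale-out time against the RTO for a warm-standby NF."""
    needed = math.ceil(expected_load / load_per_replica)
    shortfall = max(0, needed - standby_replicas)
    # Reactive autoscaling is enough only if no extra replicas are needed,
    # or the extra replicas can be ready before the RTO expires.
    reactive_ok = shortfall == 0 or scale_out_seconds <= rto_seconds
    return {
        "needed_replicas": needed,
        "shortfall": shortfall,
        "reactive_scaling_meets_rto": reactive_ok,
    }


# Hypothetical figures: 50,000 sessions swing over, 4,000 sessions per replica,
# 2 replicas kept warm, ~180 s to add nodes and start pods, 60 s RTO.
plan = plan_swing_over(expected_load=50_000, load_per_replica=4_000,
                       standby_replicas=2, scale_out_seconds=180,
                       rto_seconds=60)
print(plan)
```

When the check fails, as it does with these assumed values, reactive autoscaling by itself is not sufficient and some form of proactive capacity is needed.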
