Enhancing Cloud Infrastructure Efficiency and Cost with Starburst at Amazon IXD – VGT2 Las Vegas

Amazon Web Services (AWS) offers a flexible, user-friendly cloud environment that makes it easy to onboard workloads. However, the cost of running those workloads is often overlooked. A common misconception is that moving workloads to the cloud automatically resolves issues of agility, scalability, performance, and cost. Agility and scalability may improve, but you still need to optimize the workload itself. Services such as Amazon EC2 Auto Scaling and Amazon EC2 Spot Instances help you capture the performance and cost advantages of cloud computing.

In this blog post, we explore how Starburst brought a surge in costs for its data analytics platform under control as its internal teams grew and its infrastructure scaled. After a comprehensive architecture review with AWS specialist architects, Starburst and AWS identified two changes that significantly reduce expenses:

  1. Utilize Spot Instances to execute workloads.
  2. Integrate Amazon EC2 Auto Scaling into their training and demonstration environments, capitalizing on the Starburst platform’s ability to scale elastically.

When optimizing analytics workloads, cost reduction can often lead to performance constraints. Therefore, Starburst and AWS partnered to strike a balance between cost and performance for Starburst’s data analytics platform while fully utilizing the cloud’s flexibility, scalability, security, and performance.

What is Starburst Enterprise?

Starburst offers a Massively Parallel Processing SQL (MPP SQL) engine built on open-source Trino. It serves as the analytics platform for customers’ intelligent data mesh and delivers the following advantages:

  • A unified access point for monitoring, securing, and managing your data mesh.
  • Flexible data compute options, eliminating the need for data migrations or extract, transform, and load (ETL) processes, thus avoiding vendor lock-in and allowing the continued use of existing analytics tools.
  • Starburst Stargate ensures that large jobs are executed across various data domains within your data mesh, retrieving only the necessary result sets.

Stargate minimizes data output, which not only reduces costs but also enhances performance. Additionally, data governance policies can be uniquely applied within each data domain, ensuring compliance and security.

As illustrated in Figure 1, numerous connectors facilitate improved performance and security.

Integrating Starburst Enterprise with AWS

As shown in Figure 2, Starburst Enterprise leverages AWS services to achieve elastic scaling and cost optimization. The architecture features decoupled storage and compute, allowing the platform to scale as required to analyze vast amounts of data. Deployment options include AWS CloudFormation or Amazon Elastic Kubernetes Service (Amazon EKS), enabling Starburst to execute analytic queries across AWS data sources and on-premises systems like Teradata and Oracle.

Amazon EC2 Auto Scaling

Organizations often deal with a variety of analytic workloads, each with distinct compute and memory demands. Starburst employs Amazon EKS and Amazon EC2 Auto Scaling to dynamically adjust compute resources according to the needs of their analytics workloads.

Amazon EC2 Auto Scaling helps maintain the compute capacity that workloads need, which makes it possible to build elastic and resilient applications on AWS. Starburst uses the scheduled scaling feature of Amazon EC2 Auto Scaling to scale the cluster up and down based on time, so Starburst does not pay for compute while the cluster is not in use.
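Time-based scaling of this kind can be sketched as a pair of recurring scheduled actions: one that brings workers up at the start of the workday and one that scales them to zero in the evening. The example below is a minimal sketch, not Starburst's actual configuration. The group name, schedule, time zone, and sizes are assumptions chosen for illustration; the dictionary keys follow the parameters of the Auto Scaling `PutScheduledUpdateGroupAction` API, and the optional `apply` helper assumes the third-party boto3 package and valid AWS credentials.

```python
from typing import Any, Dict, List


def scheduled_actions(group_name: str, workday_size: int) -> List[Dict[str, Any]]:
    """Build arguments for two recurring scheduled actions on one group.

    Keys match the parameters of PutScheduledUpdateGroupAction.
    The schedule and time zone below are illustrative assumptions.
    """
    common = {"AutoScalingGroupName": group_name, "TimeZone": "America/New_York"}
    scale_out = {
        **common,
        "ScheduledActionName": "workday-start",
        "Recurrence": "0 8 * * 1-5",  # cron: 08:00, Monday to Friday
        "MinSize": workday_size,
        "MaxSize": workday_size,
        "DesiredCapacity": workday_size,
    }
    scale_in = {
        **common,
        "ScheduledActionName": "workday-end",
        "Recurrence": "0 19 * * 1-5",  # cron: 19:00, Monday to Friday
        "MinSize": 0,
        "MaxSize": workday_size,
        "DesiredCapacity": 0,  # no worker instances outside working hours
    }
    return [scale_out, scale_in]


def apply(actions: List[Dict[str, Any]]) -> None:
    """Send the actions to AWS (requires boto3 and credentials)."""
    import boto3  # imported lazily so the builder above runs without it

    client = boto3.client("autoscaling")
    for action in actions:
        client.put_scheduled_update_group_action(**action)


if __name__ == "__main__":
    # "starburst-demo-workers" is a hypothetical Auto Scaling group name.
    for action in scheduled_actions("starburst-demo-workers", 4):
        print(action["ScheduledActionName"], action["Recurrence"])
```

Keeping the builder separate from the AWS call makes the schedule easy to review and test before anything is changed in the account.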

Amazon EKS is a fully managed Kubernetes service, allowing users to run Kubernetes on AWS without the complexities of managing their own control plane. On-demand scaling of cloud resources significantly impacts cost control, and Starburst’s ability to scale down elastically means that removing compute resources does not disrupt ongoing processes.

Amazon EC2 Spot Instances

Spot Instances provide a cost-effective solution by leveraging unused EC2 capacity in the AWS Cloud, offering discounts of up to 90% compared to On-Demand Instance pricing. However, if EC2 requires capacity for On-Demand usage, Spot Instances may be interrupted with a two-minute notice. Proper handling of these interruptions is critical to maintaining application resilience and fault tolerance.
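One common way to react to the two-minute notice is to poll the instance metadata service from the node itself. The sketch below is illustrative and is not taken from Starburst's deployment: it reads the `spot/instance-action` metadata path using an IMDSv2 session token and computes how many seconds remain before the interruption. The function names are hypothetical, and `fetch_notice` only works when run on an EC2 instance.

```python
import json
import urllib.error
import urllib.request
from datetime import datetime, timezone
from typing import Optional

IMDS = "http://169.254.169.254/latest"
# This path returns HTTP 404 until EC2 schedules an interruption.
INSTANCE_ACTION_URL = IMDS + "/meta-data/spot/instance-action"


def fetch_notice(timeout: float = 2.0) -> Optional[str]:
    """Return the raw notice body, or None if no interruption is pending."""
    token_req = urllib.request.Request(
        IMDS + "/api/token",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
        method="PUT",
    )
    with urllib.request.urlopen(token_req, timeout=timeout) as resp:
        token = resp.read().decode()
    notice_req = urllib.request.Request(
        INSTANCE_ACTION_URL, headers={"X-aws-ec2-metadata-token": token}
    )
    try:
        with urllib.request.urlopen(notice_req, timeout=timeout) as resp:
            return resp.read().decode()
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return None
        raise


def seconds_until_interruption(body: str, now: datetime) -> Optional[float]:
    """Parse a notice such as {"action": "terminate", "time": "...Z"}."""
    notice = json.loads(body)
    if "action" not in notice or "time" not in notice:
        return None
    deadline = datetime.strptime(notice["time"], "%Y-%m-%dT%H:%M:%SZ")
    deadline = deadline.replace(tzinfo=timezone.utc)
    return (deadline - now).total_seconds()


if __name__ == "__main__":
    sample = '{"action": "terminate", "time": "2017-09-18T08:22:00Z"}'
    now = datetime(2017, 9, 18, 8, 20, 0, tzinfo=timezone.utc)
    print(seconds_until_interruption(sample, now))  # 120.0
```

A node agent could call `fetch_notice` every few seconds and start draining work as soon as it returns a notice.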

Starburst has integrated Spot Instances into their Amazon EKS managed node groups to reduce the cost of analytics workloads. Following Spot best practices, they diversify instance types by using eksctl together with the instance selector and the dry-run flag, which generates a list of instance types of the same size (the same vCPU and memory) for use in the underlying node groups. Consistent instance sizes matter because the Kubernetes Cluster Autoscaler assumes that all nodes in a node group offer the same capacity when it plans scaling decisions.
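The idea behind instance diversification with matching sizes can be illustrated without any AWS tooling. The following is a minimal sketch, not the instance selector itself: it filters a small hand-written catalog for instance types that share the same vCPU count and memory, which is the kind of list the dry run produces. The catalog and the function name are assumptions for illustration; a real selection would query current EC2 instance type data.

```python
from typing import Dict, List, Tuple

# Small illustrative catalog: instance type -> (vCPUs, memory in GiB).
CATALOG: Dict[str, Tuple[int, int]] = {
    "m4.xlarge": (4, 16),
    "m5.xlarge": (4, 16),
    "m5a.xlarge": (4, 16),
    "m5d.xlarge": (4, 16),
    "c5.xlarge": (4, 8),
    "r5.xlarge": (4, 32),
    "m5.2xlarge": (8, 32),
}


def matching_types(
    catalog: Dict[str, Tuple[int, int]], vcpus: int, memory_gib: int
) -> List[str]:
    """Return instance types whose vCPU count and memory both match exactly."""
    return sorted(
        name for name, shape in catalog.items() if shape == (vcpus, memory_gib)
    )


if __name__ == "__main__":
    # Candidate types for a node group built from 4 vCPU / 16 GiB nodes.
    print(matching_types(CATALOG, vcpus=4, memory_gib=16))
    # ['m4.xlarge', 'm5.xlarge', 'm5a.xlarge', 'm5d.xlarge']
```

Offering several interchangeable types gives Spot more capacity pools to draw from, while the identical shapes keep every node in the group equivalent from the scheduler's point of view.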

Managing “scaling in” of an active application can be challenging; however, Starburst was designed with resiliency in mind, allowing it to handle shutdowns effectively. Spot Instances are an ideal compute solution, as Starburst can inherently manage potential interruptions. Additionally, utilizing Amazon EKS managed node groups streamlines node provisioning, requiring significantly less operational effort than self-managed alternatives. This enables Starburst to adopt best practices such as capacity-optimized allocation strategies and instance diversification.
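To make the shutdown handling concrete, the sketch below builds the HTTP request that open-source Trino documents for gracefully shutting down a worker: a `PUT` to `/v1/info/state` with the JSON string `"SHUTTING_DOWN"`. It is an assumption that a given Starburst deployment exposes and permits this endpoint in the same way. The worker URL and user name are hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def graceful_shutdown_request(worker_url: str, user: str) -> urllib.request.Request:
    """Build (but do not send) a graceful shutdown request for one worker.

    A worker that accepts this request stops taking new tasks and exits
    after its running tasks have finished.
    """
    return urllib.request.Request(
        url=worker_url.rstrip("/") + "/v1/info/state",
        data=json.dumps("SHUTTING_DOWN").encode("utf-8"),  # body: "SHUTTING_DOWN"
        headers={
            "Content-Type": "application/json",
            "X-Trino-User": user,  # the user must be allowed to manage the system
        },
        method="PUT",
    )


if __name__ == "__main__":
    # "http://worker-1:8080" and "admin" are hypothetical values.
    req = graceful_shutdown_request("http://worker-1:8080", "admin")
    print(req.get_method(), req.full_url, req.data.decode())
    # To actually send it: urllib.request.urlopen(req, timeout=5)
```

Triggering a request like this when a scale-in event or Spot interruption notice arrives lets the worker finish in-flight tasks before the instance goes away, as long as the grace period fits inside the notice window.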

When it comes to “scaling out” deployments, Amazon EKS and Amazon EC2 Auto Scaling facilitate efficient capacity provisioning, as depicted in Figure 3.

Benefits Realized from Using AWS Services

In a relatively short time, Starburst increased the number of Solutions Architects working on AWS fivefold. In initial tests of the new deployment architecture, those architects completed up to three times as much work as before. Even though the overall workload grew by more than 15 times, the two changes described above (Spot Instances and Amazon EC2 Auto Scaling) kept the increase in overall cost minimal.

This optimization of cost and performance enables Starburst to enhance internal productivity and derive greater value from their investments, justifying further infrastructure development.

Conclusion

By developing their architecture on AWS, Starburst recognized how important it is to establish a robust and comprehensive cloud management strategy. They now balance cloud costs with performance and reliability while still meeting their SLA requirements. Looking ahead, Starburst plans to educate their clients on best practices for using Spot Instances and Amazon EC2 Auto Scaling, so that those clients can maintain a cost-effective, performance-oriented cloud architecture.

Location: Amazon IXD – VGT2, 6401 E Howdy Wells Ave, Las Vegas, NV 89115.

