For over a decade, customers have successfully executed Microsoft workloads on Amazon Web Services (AWS), establishing it as the leading cloud provider in this domain. Many users have expressed interest in utilizing scalable AWS infrastructure for high-performance computing (HPC) simulations that operate on the Windows operating system. If you’ve ever attempted to configure an HPC cluster manually, you may have encountered obstacles such as insufficient documentation, Active Directory domain joining complications, networking topology issues, and a series of tedious deployment steps. Moreover, HPC in the cloud typically operates differently from traditional HPC setups. Instead of relying on long-term, static compute clusters, AWS allows for elastic clusters that can be quickly provisioned to run simulations and dismantled once the task is complete.
To streamline the setup process for an HPC cluster designed for Windows HPC workloads, we have created an AWS CloudFormation template that automates the deployment of an HPC Pack 2019 Windows cluster. This tool enables you to rapidly initiate Windows-based HPC workloads while leveraging AWS’s scalable, resilient, and secure infrastructure.
In this blog post, we will show how to run a Windows HPC workload on AWS using EnergyPlus, an open-source energy simulation tool maintained by the U.S. Department of Energy’s Building Technologies Office. EnergyPlus is used to model energy consumption in buildings.
This article will cover the solution, how to implement it, and how to execute a sample parametric sweep job using EnergyPlus.
Solution Overview
This solution facilitates the installation of HPC Pack 2019 to manage a cluster of Windows Servers dedicated to High Performance Computing. The AWS CloudFormation template will launch a Windows-based HPC cluster using Windows Server 2016, along with core infrastructure components such as Amazon Virtual Private Cloud (VPC) and Active Directory Domain Controllers through AWS Managed Microsoft AD. From a security standpoint, while the head node is publicly accessible, the compute nodes are secured within private subnets. Input files are retrieved from EnergyPlus, and output files are stored in Amazon Simple Storage Service (S3). Compute nodes can access Amazon S3 via an S3 Gateway Endpoint, which enhances performance and cost efficiency by ensuring private connectivity to S3 instead of routing through a NAT Gateway.
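The CloudFormation template creates the S3 gateway endpoint for you, so no manual step is needed here. For readers who want to see what that resource amounts to, the following is a minimal sketch using the AWS CLI. The function name and the VPC ID, Region, and route table ID arguments are hypothetical placeholders, not values taken from the template.

```shell
# Sketch only: create a gateway endpoint so instances in private subnets
# reach Amazon S3 without going through a NAT Gateway.
# All three arguments are placeholders (assumptions), e.g.
#   create_s3_gateway_endpoint vpc-0example us-east-1 rtb-0example
create_s3_gateway_endpoint() {
  vpc_id="$1"
  region="$2"
  route_table_id="$3"
  # Gateway endpoints are attached to route tables, not to subnets directly.
  aws ec2 create-vpc-endpoint \
    --vpc-id "$vpc_id" \
    --vpc-endpoint-type Gateway \
    --service-name "com.amazonaws.$region.s3" \
    --route-table-ids "$route_table_id"
}
```

The route table passed in should be the one associated with the private subnets that host the compute nodes, since that is where the S3 route is added.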
AWS Secrets Manager is utilized to securely store the configuration details for your HPC Pack cluster, including the necessary information for integrating the compute nodes with Active Directory and the password for the associated certificate. By using Secrets Manager for sensitive information, compute nodes can dynamically access this data when they join the cluster without retaining it on the instance itself.
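To make the idea of reading configuration at join time concrete, here is a minimal sketch of fetching a secret with the AWS CLI. The function name and the secret name are hypothetical, and the scripts shipped with the template may retrieve the values differently (for example from PowerShell on the Windows nodes).

```shell
# Sketch only: print the SecretString of a secret without writing it to disk.
# The secret name is an assumption; use the name your stack actually created.
get_cluster_secret() {
  secret_id="$1"
  aws secretsmanager get-secret-value \
    --secret-id "$secret_id" \
    --query SecretString \
    --output text
}
```

A call such as get_cluster_secret my-hpc-cluster-config (a made-up name) only succeeds when the caller, for instance the instance role of a compute node, is allowed the secretsmanager:GetSecretValue action on that secret.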
The cluster generated by the CloudFormation template is optimized for loosely coupled workloads. For scenarios requiring tightly coupled workloads, modifications can include implementing a shared file system like Amazon FSx for Windows File Server, along with Amazon EC2 placement groups, which can position instances closer together within an Availability Zone to achieve the low-latency network performance essential for tightly coupled communications.
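As an illustration of the placement group modification mentioned above, the sketch below creates a cluster placement group with the AWS CLI. It is not part of the provided template, and the function name and group name are hypothetical.

```shell
# Sketch only: a "cluster" placement group packs instances close together
# inside one Availability Zone for low-latency networking.
# The group name is a placeholder (assumption).
create_cluster_placement_group() {
  group_name="$1"
  aws ec2 create-placement-group \
    --group-name "$group_name" \
    --strategy cluster
}
```

Compute nodes would then need to be launched into that group, so the launch configuration in the template would have to reference it as well.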
The example parametric sweep job will perform simulations across the worker nodes in the cluster, initially scaling out the number of nodes and subsequently scaling them in after the task is completed.
Walkthrough
The walkthrough is divided into three sections: prerequisites, deploying the artifacts, and launching the AWS CloudFormation template to run a building energy HPC simulation.
Prerequisites
To follow this walkthrough, ensure you have the following:
- An AWS account
- Basic understanding of core AWS services – compute, networking, and storage
- Beginner to intermediate knowledge of HPC
Clone GitHub Repository
The solution is available on GitHub at this URL: https://github.com/aws-samples/aws-cfn-windows-hpc-template.
git clone https://github.com/aws-samples/aws-cfn-windows-hpc-template
Create S3 Buckets
We need to create two S3 buckets: one for deployment artifacts and another for simulation results.
- Visit the Amazon S3 Console at https://console.aws.amazon.com/s3.
- Click on Create bucket.
- On the Create bucket page:
  - In the Bucket name field, input a unique name for the deployment artifacts, such as hpcpack-2019-[AWS ACCOUNT ID] to ensure global uniqueness.
  - Confirm that Block all public access is selected.
  - In the Default encryption section:
    - Enable encryption.
    - Choose Amazon S3 key (SSE-S3) as the Encryption key type.
  - Click Create bucket at the bottom of the page.
- After successful creation, select Create bucket again.
- On the Create bucket page:
  - Enter a unique name for the simulation results, like hpcpack-output-2019-[AWS ACCOUNT ID].
  - Ensure Block all public access is selected.
  - In the Default encryption section:
    - Enable encryption.
    - Select Amazon S3 key (SSE-S3) as the Encryption key type.
  - Click Create bucket at the bottom of the screen.
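If you prefer the command line, the console steps above can be approximated with the AWS CLI. The following is a sketch under stated assumptions: ACCOUNT_ID and REGION are placeholder defaults, and the run and create_private_bucket helpers are hypothetical names introduced here. By default the script only prints the commands (DRY_RUN=1); set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Sketch only: create both buckets with public access blocked and SSE-S3.
# ACCOUNT_ID and REGION are placeholders (assumptions); substitute your own.
# Your account ID can be read with:
#   aws sts get-caller-identity --query Account --output text
ACCOUNT_ID="${ACCOUNT_ID:-123456789012}"
REGION="${REGION:-us-east-1}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

create_private_bucket() {
  bucket="$1"
  # us-east-1 rejects an explicit LocationConstraint; other Regions need one.
  if [ "$REGION" = "us-east-1" ]; then
    run aws s3api create-bucket --bucket "$bucket" --region "$REGION"
  else
    run aws s3api create-bucket --bucket "$bucket" --region "$REGION" \
      --create-bucket-configuration "LocationConstraint=$REGION"
  fi
  # Equivalent of selecting "Block all public access" in the console.
  run aws s3api put-public-access-block --bucket "$bucket" \
    --public-access-block-configuration \
    "BlockPublicAcls=true,IgnorePublicAcls=true,BlockPublicPolicy=true,RestrictPublicBuckets=true"
  # Equivalent of default encryption with an Amazon S3 key (SSE-S3).
  run aws s3api put-bucket-encryption --bucket "$bucket" \
    --server-side-encryption-configuration \
    '{"Rules":[{"ApplyServerSideEncryptionByDefault":{"SSEAlgorithm":"AES256"}}]}'
}

create_private_bucket "hpcpack-2019-$ACCOUNT_ID"
create_private_bucket "hpcpack-output-2019-$ACCOUNT_ID"
```

Reviewing the printed commands before running with DRY_RUN=0 is a cheap way to catch a wrong account ID or Region.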
Deploy Artifacts
- Download HPC Pack from Microsoft’s website (https://www.microsoft.com/en-us/download/confirmation.aspx?id=101360).
- Rename the downloaded HPC Pack file to HPCPack.zip.
- Upload the artifacts to the bucket via the Console:
  - Open the Amazon S3 Console at https://console.aws.amazon.com/s3.
  - In the Buckets section, select the deployment bucket you created.
  - Click Upload.
  - Select Add Files.
  - Choose HPCPack.zip (the file you downloaded and renamed), along with ScriptsForComputeNode2019.zip and ScriptsForHeadNode2019.zip from the cloned repository.
  - Click Upload and wait for the Upload succeeded message.
- Alternatively, upload the artifacts using the AWS CLI (optional):
  Execute the following CLI commands, replacing the S3 bucket name with your chosen deployment bucket name:
aws s3 cp HPCPack.zip s3://hpcpack-2019-[AWS ACCOUNT ID]
aws s3 cp ScriptsForComputeNode2019.zip s3://hpcpack-2019-[AWS ACCOUNT ID]
aws s3 cp ScriptsForHeadNode2019.zip s3://hpcpack-2019-[AWS ACCOUNT ID]
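A missing or misnamed artifact is easy to overlook, so a quick check before and after uploading can help. The sketch below uses two hypothetical helper functions: one confirms that the three files exist in a local directory, and the other lists the bucket so the uploads can be confirmed. It assumes all three files have been gathered into one directory.

```shell
# Sketch only: report any expected artifact missing from the given directory.
# Returns non-zero if at least one file is absent.
check_artifacts() {
  dir="${1:-.}"
  status=0
  for f in HPCPack.zip ScriptsForComputeNode2019.zip ScriptsForHeadNode2019.zip; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f" >&2
      status=1
    fi
  done
  return "$status"
}

# List the bucket contents so the three uploads can be confirmed by eye.
# The bucket name argument is whatever you chose for the deployment bucket.
list_deployment_bucket() {
  aws s3 ls "s3://$1/"
}
```

For example, run check_artifacts . in the directory holding the files before the copy commands, then list_deployment_bucket with your deployment bucket name afterwards.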
Deploy CloudFormation Templates
With the artifacts uploaded, we can now deploy the CloudFormation template.
- Navigate to the CloudFormation console at https://console.aws.amazon.com/cloudformation/home.
- Click Create stack.
- Select With new resources (standard).
- On the Create stack page, proceed as follows:
  - In the Specify template section, choose Upload a template file.
  - Click Choose file and navigate to the cloned GitHub solution.
  - Select the HPCLab2019.yml file from the HPCLab/CloudFormation directory.
  - Click Open, then Next.
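Before uploading the template in the console, it can be useful to validate it locally and see which parameters it declares. The sketch below uses the AWS CLI validate-template command; the function name is hypothetical, and the path in the usage note assumes the repository was cloned into its default directory. Inline template bodies are limited to 51,200 bytes, so if the template is larger than that, it would need to be uploaded to S3 and referenced with --template-url instead.

```shell
# Sketch only: validate a template file and print its parameter keys.
# The path argument is an assumption based on the repository layout
# described in this walkthrough.
list_template_parameters() {
  template_path="$1"
  aws cloudformation validate-template \
    --template-body "file://$template_path" \
    --query 'Parameters[].ParameterKey' \
    --output text
}
```

For example: list_template_parameters aws-cfn-windows-hpc-template/HPCLab/CloudFormation/HPCLab2019.yml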