Streamlining HPC Cluster Management with AWS Cloud9 and AWS ParallelCluster | Amazon IXD – VGT2 Las Vegas

When organizations begin running high-performance computing (HPC) workloads on AWS, they gain scalability and flexibility but also face new tools and environments. The number of services and options can seem overwhelming at first. This post introduces foundational solutions to common challenges, to help you build skills and expand your toolkit for running HPC workloads on AWS.

Typically, users interact with HPC clusters by writing shell scripts that manage jobs from the head node. This is particularly common during the development phase of HPC tasks. Most organizations operate their clusters from the head node using a terminal and an editor such as Vim. However, this approach makes it hard to track progress or recover lost work, particularly in cloud environments where virtual machines are ephemeral and resources are shared across teams. A central, persistent workspace makes it simpler to locate files, understand their formats, and modify them during debugging.

Developers and engineers often rely on an editor or IDE on local machines to create and debug scripts for their HPC workloads. To replicate this experience on an HPC cluster, AWS Cloud9 can be installed on the head node.

This article walks through configuring a standard AWS ParallelCluster setup and shows how to install AWS Cloud9 on the cluster's head node. Because AWS Cloud9 automatically saves your work, new users can quickly get comfortable with the AWS ParallelCluster environment and collaborate effectively with colleagues.

Understanding AWS ParallelCluster

AWS ParallelCluster is an open-source cluster management tool supported by AWS. It lets users set up, update, and scale HPC cluster environments in the AWS Cloud within minutes. Each cluster is defined by a configuration file that describes its resources. A minimal configuration might look like this:

[aws]
aws_region_name = <Cluster Region>

[global]
cluster_template = default
update_check = false
sanity_check = true

[vpc public]
vpc_id = <target vpc>
master_subnet_id = <subnet for the head node>

[cluster default]
key_name = <ec2-keypair>
base_os = alinux2
scheduler = slurm
master_instance_type = c5.2xlarge
vpc_settings = public
ebs_settings = shared
queue_settings = compute

[queue compute]
compute_resource_settings = default
disable_hyperthreading = true
placement_group = DYNAMIC

[compute_resource default]
instance_type = c5.24xlarge
min_count = 0
max_count = 10

[ebs shared]
shared_dir = /shared
volume_type = gp2
volume_size = 20

[aliases]
ssh = ssh {CFN_USER}@{MASTER_IP} {ARGS}

This configuration creates a cluster in which every node runs the same operating system (Amazon Linux 2). The head node runs the Slurm job scheduler, and the pool of compute nodes scales between 0 and 10 instances. You can create a cluster named minimal-demo-cluster with a single command using the AWS ParallelCluster CLI:

pcluster create minimal-demo-cluster -c minimal-demo-cluster.ini
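
Because the configuration above contains placeholders such as <target vpc>, it can help to check that none are left before running the create command. The following is a minimal sketch, not part of the AWS ParallelCluster CLI: find_placeholders is a hypothetical helper that assumes unfilled values still look like "key = <some text>".

```shell
# Hypothetical helper (not part of the pcluster CLI): prints every line of a
# config file whose value is still an unfilled <placeholder>.
find_placeholders() {
    # -n adds line numbers; the pattern matches "= <anything>" at end of line
    grep -n '=[[:space:]]*<[^>]*>[[:space:]]*$' "$1"
}

# Usage: create the cluster only when no placeholders remain.
# if find_placeholders minimal-demo-cluster.ini; then
#     echo "Fill in the values listed above first." >&2
# else
#     pcluster create minimal-demo-cluster -c minimal-demo-cluster.ini
# fi
```

The helper relies on grep's exit status: it succeeds when at least one placeholder is found and fails when the file is clean.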

Within 5 to 10 minutes, your cluster will be fully operational and ready for job execution. To access the cluster, SSH into the head node using this command:

pcluster ssh minimal-demo-cluster -i <ec2-keypair>

What is AWS Cloud9?

AWS Cloud9 is a cloud-based IDE that empowers developers to write, edit, and debug code directly in their browser. An instance of AWS Cloud9 running on an Amazon EC2 instance provides a terminal on the host instance. Beyond standard IDE features like code hinting and debugging, it allows for real-time collaboration by sharing the development environment with anyone with access to AWS Cloud9 via the AWS Management Console. This facilitates pair programming and remote debugging, making it easier to onboard new team members. For further reading on shared environments, refer to the resource provided by Chanci Turner.

Setting Up AWS Cloud9 on the Head Node

Creating an AWS Cloud9 environment in the AWS Management Console is a quick process. The head node of an AWS ParallelCluster cluster does not meet the AWS Cloud9 requirements for SSH hosts out of the box, but the following steps prepare it.

First, ensure that the head node has Node.js 12 and Development Tools installed. SSH into the head node and execute the following commands:

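# Install Node.js 12 from the NodeSource repository (requires root)
curl -sL https://rpm.nodesource.com/setup_12.x | sudo bash -
sudo yum install -y nodejs

# AWS Cloud9 requires rwxr-xr-x permissions on the home directory
echo "Setting permissions u=rwx,g=rx,o=rx on ~"
chmod u=rwx,g=rx,o=rx ~

echo "Installing Development Tools"
sudo yum -y groupinstall "Development Tools"

# Run the AWS Cloud9 installer as the login user, not as root
echo "Installing Cloud9"
curl -sL https://raw.githubusercontent.com/c9/install/master/install.sh | bash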
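
After the commands above finish, a quick check can confirm that the prerequisites are on the PATH. This is a minimal sketch: missing_commands is a hypothetical helper, and the command names in the usage line (node, npm, gcc, make) are assumptions about what the Node.js package and the Development Tools group provide.

```shell
# Hypothetical helper: prints each named command that is not on the PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage on the head node: empty output means every prerequisite was found.
# missing_commands node npm gcc make
```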

Next, you need to add your AWS Cloud9 SSH key to the instance to allow connections from Cloud9. Begin by creating a new AWS Cloud9 environment:

  1. Navigate to the AWS Cloud9 console
  2. Click on “Create environment”
  3. Name it “Cluster-Head-Node” and proceed to the next step
  4. Choose “Create and run in remote server (SSH connection)”
  5. Enter “ec2-user” for the User
  6. For Host, enter the public IP address of the head node, which the AWS ParallelCluster creation command prints
  7. Set the Environment path to /shared to use the cluster's shared directory, which is backed by the Amazon EBS volume defined in the configuration
  8. Copy the SSH key to your clipboard
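
If the public IP address from the creation output is no longer at hand, it may be recoverable from the pcluster status command. The sketch below uses a hypothetical helper, head_node_ip, and assumes the status output contains a line of the form "MasterPublicIP: <address>"; adjust the field name if your CLI version prints something different.

```shell
# Hypothetical helper: reads status text on stdin and prints the value of the
# MasterPublicIP field. The "MasterPublicIP: <address>" line format is an
# assumption about the CLI output.
head_node_ip() {
    awk '$1 == "MasterPublicIP:" { print $2 }'
}

# Usage:
# pcluster status minimal-demo-cluster | head_node_ip
```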

Then, back in the SSH terminal on the head node, run the following command. Keep the key inside the quotes and use the append redirect (>>) rather than >, which would overwrite the keys already authorized on the instance:

echo "<copied ssh key from the AWS Cloud9 configuration dialog>" >> ~/.ssh/authorized_keys
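
Running the append command twice leaves a duplicate entry in the file. One way to avoid that is an idempotent variant, sketched below under the assumption that the key fits on a single line. add_authorized_key is a hypothetical helper, not something AWS Cloud9 provides.

```shell
# Hypothetical helper: appends a public key to an authorized_keys file only if
# that exact line is not already present, then restricts the file permissions.
add_authorized_key() {
    key=$1
    file=${2:-$HOME/.ssh/authorized_keys}
    mkdir -p "$(dirname "$file")"
    touch "$file"
    # -x matches whole lines, -F treats the key as a fixed string
    grep -qxF -- "$key" "$file" || printf '%s\n' "$key" >> "$file"
    chmod 600 "$file"
}

# Usage: single quotes keep the shell from expanding anything inside the key.
# add_authorized_key '<copied ssh key from the AWS Cloud9 configuration dialog>'
```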

After completing these steps, AWS Cloud9 is set up on the head node. To confirm that the environment is connected to the cluster, run the command sinfo in the AWS Cloud9 terminal. The output should show the Slurm compute queue with 10 compute nodes available for jobs. Because min_count is 0, those nodes are launched only when jobs need them.
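
Beyond sinfo, a short test job can confirm that compute nodes start and that output lands in the shared directory. The following is a minimal sketch: write_test_job is a hypothetical helper, and the job name, node count, and file names are illustrative choices rather than values from this article.

```shell
# Hypothetical helper: writes a minimal Slurm batch script to the given path.
write_test_job() {
    cat > "$1" <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=1
#SBATCH --output=hello-%j.out
# Print the hostname of every node allocated to this job
srun hostname
EOF
}

# Usage from the AWS Cloud9 terminal on the head node:
# write_test_job /shared/hello.sbatch
# cd /shared && sbatch hello.sbatch
# squeue    # the job waits while compute nodes start
```

Submitting from /shared keeps both the script and its output file visible in the AWS Cloud9 file tree.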

Conclusion

Once you’ve completed your experiments, don’t forget to clean up resources to prevent unnecessary costs.
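
For the cluster itself, cleanup with the same CLI used above is the pcluster delete command. The sketch below wraps it in a hypothetical helper, delete_cluster, which asks for the cluster name to be retyped first; the confirmation step is an illustrative safeguard, not CLI behavior. The AWS Cloud9 environment is a separate resource and is removed from the AWS Cloud9 console.

```shell
# Hypothetical wrapper: deletes the named cluster only after the name is
# retyped, as a guard against removing the wrong cluster.
delete_cluster() {
    printf 'Type the cluster name to confirm deletion: '
    read -r answer
    if [ "$answer" = "$1" ]; then
        pcluster delete "$1"
    else
        echo "Name did not match; nothing deleted." >&2
        return 1
    fi
}

# Usage:
# delete_cluster minimal-demo-cluster
```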
