In this article, you’ll discover how to train machine learning (ML) models on-premises utilizing the AWS Outposts rack alongside datasets stored locally in Amazon S3 on Outposts. With the increasing importance of data residency and privacy regulations, organizations are looking for adaptable solutions that ensure compliance while benefiting from the agility of cloud services. Sectors such as healthcare and finance leverage machine learning to improve patient outcomes and secure transactions, all while maintaining rigorous confidentiality. AWS Outposts provides an effective hybrid solution by extending AWS capabilities to any on-premises or edge location, giving you the freedom to manage data storage and processing according to your needs. Data sovereignty regulations are intricate and differ across countries. This article addresses scenarios where training datasets must be stored and processed in geographical locations that lack an AWS Region.
Amazon S3 on Outposts
When preparing datasets for ML model training, it is crucial to consider how you will store and retrieve your data, particularly when meeting data residency and regulatory requirements. You can store training datasets as object data in local buckets with Amazon S3 on Outposts. To access S3 on Outposts buckets for data operations, you need to create access points and route requests through an S3 on Outposts endpoint associated with your VPC. These endpoints are accessible both from within the VPC and on-premises through the local gateway.
Solution Overview
In this sample architecture, you will train a YOLOv5 model using a subset of categories from the Common Objects in Context (COCO) dataset. The COCO dataset is a well-known resource for object detection tasks, featuring a diverse range of image categories with detailed annotations. It is also available under the AWS Open Data Sponsorship Program via fast.ai datasets.
This example utilizes an Amazon Elastic Compute Cloud (Amazon EC2) g4dn.8xlarge instance for model training on the Outposts rack. Depending on your Outposts rack’s compute configuration, you can select different instance sizes or types and adjust training parameters such as learning rate, augmentation, or model architecture as needed. You will use the AWS Deep Learning AMI to launch your EC2 instance, which comes pre-equipped with frameworks, dependencies, and tools to streamline deep learning in the cloud.
For training dataset storage, you’ll use an S3 on Outposts bucket and connect to it from your on-premises location via the Outposts local gateway. The local gateway routing mode can be direct VPC routing or Customer-owned IP (CoIP), depending on your workload requirements. Your local gateway routing mode will dictate the S3 on Outposts endpoint configuration that you need to employ.
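The dependency between routing mode and endpoint configuration can be sketched in Python. This is a minimal illustration, not the article's own tooling: it assumes the boto3 `s3outposts` client and its `create_endpoint` call, where direct VPC routing maps to the `Private` access type and CoIP maps to `CustomerOwnedIp` plus a customer-owned IPv4 pool. The mode labels `"direct-vpc"` and `"coip"`, the function names, and all resource IDs are hypothetical.

```python
from typing import Optional


def build_endpoint_request(
    outpost_id: str,
    subnet_id: str,
    security_group_id: str,
    routing_mode: str,
    coip_pool_id: Optional[str] = None,
) -> dict:
    """Build keyword arguments for the s3outposts CreateEndpoint call."""
    request = {
        "OutpostId": outpost_id,
        "SubnetId": subnet_id,
        "SecurityGroupId": security_group_id,
    }
    if routing_mode == "direct-vpc":
        # Direct VPC routing: the endpoint uses private VPC addresses.
        request["AccessType"] = "Private"
    elif routing_mode == "coip":
        # CoIP routing: the endpoint draws addresses from a customer-owned pool.
        if not coip_pool_id:
            raise ValueError("CoIP routing needs a customer-owned IPv4 pool ID")
        request["AccessType"] = "CustomerOwnedIp"
        request["CustomerOwnedIpv4Pool"] = coip_pool_id
    else:
        raise ValueError(f"unknown routing mode: {routing_mode!r}")
    return request


def create_endpoint(request: dict, region: str) -> str:
    """Send the request with boto3 and return the new endpoint ARN."""
    import boto3  # imported lazily so the helper above works without boto3

    client = boto3.client("s3outposts", region_name=region)
    return client.create_endpoint(**request)["EndpointArn"]
```

Keeping the request builder separate from the API call lets you review the parameters before anything is created on the Outpost.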
Steps to Train Your Model
- Download and Populate Training Dataset
You can download the training dataset to your local client machine using the AWS CLI command:
$ aws s3 sync s3://fast-ai-coco/ .
After downloading, unzip the files annotations_trainval2017.zip, val2017.zip, and train2017.zip:
$ unzip annotations_trainval2017.zip
$ unzip val2017.zip
$ unzip train2017.zip
In the annotations folder, you will need instances_train2017.json and instances_val2017.json, which contain the annotations for the images in the training and validation folders.
- Filtering and Preparing Training Dataset
You will utilize the training, validation, and annotation files from the COCO dataset. While the dataset contains over 100K images spanning 80 categories, you can simplify the training process by focusing on 10 popular food items: banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, and cake. These models could be used for applications like self-stock monitoring or automatic checkouts in retail stores. YOLOv5 uses a specific annotations format, so you will need to convert the COCO dataset annotations to the required format.
- Load Training Dataset to S3 on Outposts Bucket
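Before uploading, the category filtering and annotation conversion from the previous step could be sketched in Python as follows. This is a sketch under stated assumptions rather than the article's own script: it relies on the standard COCO layout (`bbox` as `[x_min, y_min, width, height]` in pixels) and the YOLOv5 label format (one `class x_center y_center width height` line per object, normalized to the image size). The function names are hypothetical, and skipping `iscrowd` annotations is a choice made here, not something the article specifies.

```python
import json
from collections import defaultdict
from pathlib import Path

# Order matters: the list index becomes the YOLO class index.
FOOD_CLASSES = ["banana", "apple", "sandwich", "orange", "broccoli",
                "carrot", "hot dog", "pizza", "donut", "cake"]


def coco_box_to_yolo(bbox, img_w, img_h):
    """Convert COCO [x_min, y_min, width, height] in pixels to
    YOLO (x_center, y_center, width, height) normalized to 0..1."""
    x_min, y_min, box_w, box_h = bbox
    return ((x_min + box_w / 2) / img_w,
            (y_min + box_h / 2) / img_h,
            box_w / img_w,
            box_h / img_h)


def coco_to_yolo_labels(coco, class_names=FOOD_CLASSES):
    """Return {image file name: [label lines]} for the chosen classes only."""
    # Map COCO category ids to contiguous YOLO class indices (0..nc-1).
    name_to_index = {name: i for i, name in enumerate(class_names)}
    cat_to_index = {c["id"]: name_to_index[c["name"]]
                    for c in coco["categories"] if c["name"] in name_to_index}
    images = {img["id"]: img for img in coco["images"]}

    labels = defaultdict(list)
    for ann in coco["annotations"]:
        # Drop other categories and crowd regions (assumption, see lead-in).
        if ann["category_id"] not in cat_to_index or ann.get("iscrowd", 0):
            continue
        img = images[ann["image_id"]]
        xc, yc, w, h = coco_box_to_yolo(ann["bbox"], img["width"], img["height"])
        cls = cat_to_index[ann["category_id"]]
        labels[img["file_name"]].append(f"{cls} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return dict(labels)


def write_labels(annotation_file, label_dir):
    """Write one .txt label file per image that contains a selected class."""
    coco = json.loads(Path(annotation_file).read_text())
    out = Path(label_dir)
    out.mkdir(parents=True, exist_ok=True)
    for file_name, lines in coco_to_yolo_labels(coco).items():
        label_path = out / Path(file_name).with_suffix(".txt").name
        label_path.write_text("\n".join(lines) + "\n")
```

Images with none of the ten categories produce no label file, so the returned dictionary also tells you which images to keep in the filtered dataset.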
To upload the training data to S3 on Outposts, create a new bucket using the AWS Console or CLI, along with an access point and endpoint for the VPC. You can use a bucket-style access point alias to upload the data:
$ cd /your/local/target/upload/path/
$ aws s3 sync . s3://trainingdata-o0a2b3c4d5e6d7f8g9h10f--op-s3
Replace the alias in the command with the appropriate bucket alias for your environment. The s3 sync command will mirror the folder structure containing the images and labels for the training and validation data, which you will later load into the EC2 instance for model training.
- Launch the EC2 Instance
You can launch the EC2 instance using the Deep Learning AMI based on this getting started tutorial. For this task, the Deep Learning AMI GPU PyTorch 2.0.1 (Ubuntu 20.04) will be used.
- Download YOLOv5 and Install Dependencies
After SSH-ing into the EC2 instance, activate the pre-configured PyTorch environment and clone the YOLOv5 repository.
$ ssh -i /path/key-pair-name.pem ubuntu@instance-ip-address
$ conda activate pytorch
$ git clone https://github.com/ultralytics/yolov5.git
$ cd yolov5
Next, install the required dependencies.
$ pip install -U -r requirements.txt
You might need to adjust specific packages on your instance running the AWS Deep Learning AMI to ensure compatibility.
- Load the Training Dataset from S3 on Outposts to the EC2 Instance
To copy the training dataset to the EC2 instance, use the s3 sync CLI command and point it to your local workspace:
$ aws s3 sync s3://trainingdata-o0a2b3c4d5e6d7f8g9h10f--op-s3 .
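After the sync completes, a quick consistency check can confirm that images and labels arrived together. The sketch below assumes the usual YOLOv5 layout, where `images/<split>` and `labels/<split>` sit side by side and each label file shares its image's base name. The function name and the choice of image suffixes are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unlabeled_images(data_root, split):
    """List image files in images/<split> with no matching file in labels/<split>."""
    root = Path(data_root)
    label_dir = root / "labels" / split
    missing = []
    for image in sorted((root / "images" / split).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue  # ignore anything that is not an image
        if not (label_dir / f"{image.stem}.txt").is_file():
            missing.append(image.name)
    return missing
```

A non-empty result is not necessarily an error, since YOLOv5 treats an image without a label file as a background image, but it is worth reviewing if you expected every filtered image to be annotated.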
- Prepare the Configuration Files
Create data configuration files to represent your dataset’s structure, categories, and other parameters:
data.yml
train: /your/ec2/path/to/data/images/train
val: /your/ec2/path/to/data/images/val
nc: 10 # Number of classes in your dataset
names: ['banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake']
Create the model training parameter file using a sample configuration from the YOLOv5 repository. Remember to update the class count to 10 and modify other parameters as needed to optimize model performance.
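As one way to derive the parameter file, the class count in a sample configuration can be rewritten with a few lines of Python. This sketch assumes the sample file declares the class count on a top-level `nc:` line, as the model configurations under `models/` in the YOLOv5 repository do. The function names and the output file name in the usage note are hypothetical.

```python
import re
from pathlib import Path


def set_class_count(config_text, num_classes):
    """Return YAML text with its top-level `nc:` value replaced."""
    updated, count = re.subn(
        r"^nc:\s*\d+", f"nc: {num_classes}", config_text,
        count=1, flags=re.MULTILINE)
    if count == 0:
        raise ValueError("no top-level 'nc:' entry found")
    return updated


def write_custom_config(sample_path, output_path, num_classes=10):
    """Copy a sample model config, changing only the class count."""
    text = Path(sample_path).read_text()
    Path(output_path).write_text(set_class_count(text, num_classes))
```

For example, `write_custom_config("models/yolov5s.yaml", "models/yolov5s_food.yaml")` would leave the sample untouched and write a copy with `nc: 10`, matching the ten names listed in data.yml.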