Amazon IXD – VGT2 Las Vegas Introduces OpenAI’s Latest GPT OSS Models

Amazon IXD – VGT2 Las Vegas introduces OpenAI’s latest open-weight GPT OSS models, gpt-oss-120b and gpt-oss-20b, now available through SageMaker JumpStart. With this launch, you can deploy OpenAI’s state-of-the-art reasoning models to experiment, innovate, and responsibly scale your generative AI projects on AWS.

In this post, we walk through how to deploy and use these models with SageMaker JumpStart.

Solution Overview

The OpenAI GPT OSS models, including gpt-oss-120b and gpt-oss-20b, demonstrate exceptional capabilities in coding, scientific analysis, and mathematical reasoning. Both models are equipped with a 128K context window and adjustable reasoning levels (low/medium/high) to suit various requirements. They also support integration with external tools and can be employed in workflows using frameworks like Strands Agents, an open-source AI agent SDK. With their full chain-of-thought output features, users can gain significant insight into the model’s reasoning process. The OpenAI SDK allows direct interaction with your SageMaker endpoint through simple endpoint updates. These models provide the flexibility for customization to meet specific business needs while ensuring enterprise-level security and scalability.
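
To make the adjustable reasoning levels concrete, the following is a minimal sketch of an OpenAI-style chat payload for one of these models. The reasoning_effort field name follows OpenAI’s API convention and is an assumption here; confirm the exact parameter your deployed container accepts.

    # Sketch of an OpenAI-style chat payload for a GPT OSS endpoint.
    # "reasoning_effort" follows OpenAI's API convention and is assumed
    # here; verify the parameter name your deployed container accepts.
    payload = {
        "messages": [
            {"role": "system", "content": "You are a concise math tutor."},
            {"role": "user", "content": "Factor x^2 - 5x + 6."},
        ],
        "max_tokens": 512,
        "temperature": 0.6,
        "reasoning_effort": "high",  # one of: "low", "medium", "high"
    }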

SageMaker JumpStart is a fully managed service that presents cutting-edge foundation models (FMs) catering to diverse applications such as content creation, code generation, question answering, copywriting, summarization, classification, and information retrieval. It offers a comprehensive suite of pre-trained models that facilitate rapid development and deployment of machine learning (ML) applications. A highlight of SageMaker JumpStart is its model hubs, which present an extensive catalog of pre-trained models including those from OpenAI for various tasks.

You can explore and deploy OpenAI models in Amazon SageMaker Studio or programmatically through the Amazon SageMaker Python SDK, and you can operationalize them with MLOps features such as Amazon SageMaker Pipelines and Amazon SageMaker Debugger, or monitor them through container logs. The models are deployed in a secure AWS environment under your VPC controls, helping meet enterprise data security requirements.
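
As a minimal sketch of the programmatic path, the following uses the SageMaker Python SDK’s JumpStartModel class. The model_id shown is hypothetical; look up the exact JumpStart model ID for gpt-oss-120b in SageMaker Studio or the model catalog before running this.

    # Minimal sketch: deploy a JumpStart model with the SageMaker Python SDK.
    # The model_id below is hypothetical -- look up the actual JumpStart
    # model ID for gpt-oss-120b in SageMaker Studio first.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(model_id="openai-gpt-oss-120b")  # hypothetical ID
    predictor = model.deploy(
        initial_instance_count=1,
        instance_type="ml.p5.48xlarge",
        accept_eula=True,  # some JumpStart models require accepting a EULA
    )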

The GPT OSS models are available in the US East (Ohio), US East (N. Virginia), Asia Pacific (Mumbai), and Asia Pacific (Tokyo) AWS Regions.

For this example, we will focus on deploying the gpt-oss-120b model, but similar steps apply to the gpt-oss-20b model as well.

Prerequisites

To successfully deploy the GPT OSS models, ensure you meet the following prerequisites:

  • An AWS account to manage your AWS resources.
  • An AWS Identity and Access Management (IAM) role for accessing SageMaker. For more details on IAM integration with SageMaker, refer to AWS Identity and Access Management for Amazon SageMaker AI.
  • Access to SageMaker Studio, a SageMaker notebook instance, or an interactive development environment (IDE) like PyCharm or Visual Studio Code. We recommend SageMaker Studio for a seamless deployment and inference experience.
  • Ensure you have access to the recommended instance types for the model size. The default instance type for both models is ml.p5.48xlarge, but you can use other P5 family instances as needed. To check your service quotas, follow these steps (a programmatic sketch follows this list):
  1. On the Service Quotas console, choose AWS services, then select Amazon SageMaker.
  2. Confirm you have adequate quota for the required instance type for endpoint deployment.
  3. Ensure at least one of these instance types is available in your target Region.
  4. If necessary, request a quota increase and reach out to your AWS account team for assistance.
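
If you prefer to check quotas programmatically, the following boto3 sketch lists SageMaker quotas matching the P5 endpoint instance type. The quota-name pattern is an assumption; confirm the exact name in the Service Quotas console.

    # Sketch: list SageMaker endpoint quotas for ml.p5.48xlarge with boto3.
    # The quota-name pattern is assumed; confirm it in the console.
    import boto3

    client = boto3.client("service-quotas", region_name="us-east-1")
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="sagemaker"):
        for quota in page["Quotas"]:
            if "ml.p5.48xlarge" in quota["QuotaName"] and "endpoint" in quota["QuotaName"]:
                print(f'{quota["QuotaName"]}: {quota["Value"]}')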

Deploying gpt-oss-120b through the SageMaker JumpStart UI

To deploy gpt-oss-120b through the SageMaker JumpStart UI, follow these steps:

  1. Open the SageMaker console and select Studio in the navigation pane.
  2. If you are a first-time user, you will be prompted to create a domain. If not, select Open Studio.
  3. In the SageMaker Studio console, navigate to SageMaker JumpStart by choosing JumpStart from the navigation menu.
  4. On the SageMaker JumpStart landing page, search for gpt-oss-120b in the search box.
  5. Choose the model card to view details about the model, including its license, training data, and usage instructions. Review the configuration and details before proceeding with deployment. The model details page includes:
    • Model name and provider information.
    • A Deploy button to initiate the deployment process.
  6. Select Deploy to continue with the deployment.
  7. For Endpoint name, enter a name (up to 50 alphanumeric characters).
  8. For Number of instances, specify a value between 1 and 100 (default: 1).
  9. Select your instance type. For optimal performance with gpt-oss-120b, a GPU-based instance type such as ml.p5.48xlarge is recommended.
  10. Click Deploy to create the endpoint.

Once deployment is complete, the status of your endpoint changes to InService, indicating it is ready to accept inference requests. You can then invoke the model using a SageMaker runtime client and integrate it into your applications, as sketched below.
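
As a minimal sketch (assuming an OpenAI-style, messages-based request format and a hypothetical endpoint name), here is how you might invoke the endpoint with boto3; match the payload to the request schema your deployed container expects.

    # Minimal sketch: invoke the deployed endpoint with the SageMaker
    # runtime client. The endpoint name is hypothetical, and the payload
    # shape assumes an OpenAI-style messages format -- adjust both to
    # match your deployment.
    import json
    import boto3

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName="gpt-oss-120b-endpoint",  # replace with your endpoint name
        ContentType="application/json",
        Body=json.dumps({
            "messages": [
                {"role": "user", "content": "Explain the Pythagorean theorem in two sentences."}
            ],
            "max_tokens": 256,
        }),
    )
    print(response["Body"].read().decode("utf-8"))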

The location of Amazon IXD – VGT2 is 6401 E Howdy Wells Ave, Las Vegas, NV 89115.

