Creating a machine learning-driven REST API with Amazon API Gateway’s mapping templates and Amazon SageMaker can greatly enhance user experiences. Organizations can use SageMaker to build, train, and deploy machine learning models that personalize interactions, such as tailored product recommendations based on user preferences. A critical architectural decision in these applications is how to expose the runtime inference endpoint to client software on consumer devices. Typically, this endpoint is incorporated into a broader API, often following a RESTful architecture that provides a comprehensive suite of functions for client applications.

Amazon API Gateway is a fully managed service that simplifies the creation, publication, maintenance, monitoring, and security of APIs on any scale. By presenting an external-facing single entry point for Amazon SageMaker endpoints, API Gateway offers numerous advantages:

  • It abstracts the complexities of the underlying implementation, translating between a client-facing REST API and the Amazon SageMaker runtime inference API.
  • It facilitates authentication and authorization for client requests.
  • It manages client requests through throttling, rate limiting, and quota management.
  • It integrates with AWS WAF for protection against common web exploits.
  • It promotes cost-efficiency and operational optimization via caching and request validation.
  • It allows for safer canary deployments to introduce model changes gradually.

This article will explore how API Gateway can be utilized to expose an Amazon SageMaker inference endpoint as part of a REST API, leveraging a feature called mapping templates. This functionality allows for direct integration with an Amazon SageMaker runtime endpoint, eliminating the need for any intermediary compute resources, such as AWS Lambda or Amazon ECS containers. The outcome is a streamlined, faster, and cost-effective solution.

This direct integration approach proves particularly beneficial for consumer applications experiencing high peak traffic volumes. Amazon SageMaker automatically scales its inference endpoints, and API Gateway adjusts to match demand, ensuring that sufficient capacity is always available for incoming requests while only charging for actual usage.

While this article focuses on direct integration with Amazon SageMaker, mapping templates can also be employed in conjunction with an intermediate compute layer (for instance, using AWS Lambda), allowing for the reduction of load on the compute layer by reshaping payloads directly at the gateway.

In this walkthrough, I will demonstrate the process starting from the deployment of an Amazon SageMaker model endpoint, progressing to the creation of an Amazon API Gateway integration with this endpoint.

Solution Overview

To illustrate, this article presents a use case where a TV application requests ratings predictions for a selection of movies. Each time the app displays a movie page, it should prioritize showing films with higher predicted ratings. In this example, both users and movies are identified by unique numeric IDs, and predicted ratings range from 1 to 5, with higher values indicating a greater likelihood of user enjoyment.

Architecture

The architecture diagram outlines the key components and interactions within the solution. End-users interact with a client application (via a web browser or mobile device), which sends a REST-style request to an API Gateway endpoint. API Gateway translates this request into the required format for the Amazon SageMaker endpoint and invokes it to retrieve the model’s inference. The response from the SageMaker endpoint is then mapped back to a format suitable for the client.

Request and Response Formats

The REST API is designed to support a single resource (predicted-ratings) and utilizes a GET method. The request follows this format:

GET /<api-path>/predicted-ratings/{user_id}?items=id1,id2,…,idn

Here, the user_id path parameter indicates the user for whom ratings are requested, while the items query string parameter contains a comma-separated list of item identifiers.
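As an illustration, a client could assemble a request URL in this format with a small helper like the following (the helper name and base URL are hypothetical, not part of the API itself):

```python
def build_ratings_url(base_url: str, user_id: int, item_ids: list[int]) -> str:
    """Assemble the predicted-ratings request URL for one user and a list of item IDs."""
    # The items query string parameter is a comma-separated list of item identifiers.
    items = ",".join(str(i) for i in item_ids)
    return f"{base_url}/predicted-ratings/{user_id}?items={items}"

# Example: request predictions for user 321 and three movies
url = build_ratings_url(
    "https://example.execute-api.us-east-1.amazonaws.com/prod", 321, [101, 131, 162]
)
```

The client then issues a plain HTTP GET against the resulting URL; no request body is needed.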

Upon successful processing, the HTTP response returns a code 200, with a JSON object in the body that lists the predicted ratings for the specified items:

{
  "ratings": [
    rating_1,
    rating_2,
    …,
    rating_n
  ]
}

For instance, a request could be made as follows:

% curl "https://<api-path>/predicted-ratings/321?items=101,131,162"

This would yield a response similar to:

{
  "ratings": [
    3.527499437332153,
    3.951640844345093,
    3.658416271209717
  ]
}

Input and Output Formats for Amazon SageMaker Model

The rating prediction solution is based on a sample notebook provided with Amazon SageMaker: object2vec_movie_recommendation.ipynb. This model’s inference endpoint accepts a POST request and expects the request body to include a JSON payload structured as follows:

{
  "instances": [
    {"in0": [863], "in1": [882]},
    {"in0": [278], "in1": [311]},
    {"in0": [705], "in1": [578]},
    …
  ]
}

In this structure, in0 corresponds to a user ID, while in1 corresponds to a movie ID. The model’s inference produces the following output:

{
  "predictions": [
    {"scores": [3.047305107116699]},
    {"scores": [3.58427882194519]},
    {"scores": [4.356469631195068]},
    …
  ]
}

Mapping templates are utilized to convert the GET request format received by the REST API into the POST input format expected by the Amazon SageMaker model endpoint and to transform the model output format back into the response format needed by the REST API.
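To make the two transformations concrete, here is a plain-Python sketch of the reshaping that the mapping templates perform (the function names are illustrative; in the actual solution this work is done in VTL at the gateway, not in application code):

```python
def to_sagemaker_request(user_id: str, items_param: str) -> dict:
    """Reshape the REST GET parameters into the model's 'instances' payload.

    Each instance pairs the user ID (in0) with one movie ID (in1).
    """
    return {
        "instances": [
            {"in0": [int(user_id)], "in1": [int(item)]}
            for item in items_param.split(",")
        ]
    }

def to_rest_response(model_output: dict) -> dict:
    """Flatten the model's per-instance 'scores' into the REST 'ratings' list."""
    return {"ratings": [p["scores"][0] for p in model_output["predictions"]]}
```

For example, `to_sagemaker_request("321", "101,131")` yields `{"instances": [{"in0": [321], "in1": [101]}, {"in0": [321], "in1": [131]}]}`, mirroring how every requested movie is paired with the same user.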

API Gateway Method Integration

When creating a REST API using API Gateway, you define four configuration components for each API method:

  • Method Request: Defines the data model for the REST request format and specifies validation and authorization checks for incoming requests.
  • Integration Request: Specifies how the REST request and its parameters map to the format required by the backend service endpoint (in this case, the Amazon SageMaker inference endpoint).
  • Integration Response: Outlines how the response from the backend service (including potential errors) maps to the response format expected by API Gateway.
  • Method Response: Defines the data model and response format anticipated by the REST API.

The accompanying diagram illustrates the processing flow, showing how an example request and response are transformed at each stage.

Mapping Templates

As part of the integrations for both requests and responses, mapping templates are defined using the Apache Velocity Template Language (VTL). Initially designed for web development, VTL can also serve as a data transformation tool to convert one JSON format into another. With mapping templates, VTL can effectively convert the REST request into the required format for the model endpoint.
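As a sketch of what such a template can look like, the following illustrative request mapping template builds the model’s "instances" payload from the method’s path and query string parameters (parameter names and VTL details may differ in a real deployment):

```
#set($items = $input.params('items').split(","))
{
  "instances": [
#foreach($item in $items)
    {"in0": [$input.params('user_id')], "in1": [$item]}#if($foreach.hasNext),#end
#end
  ]
}
```

A corresponding response mapping template could flatten the model output back into the REST API’s "ratings" list:

```
{
  "ratings": [
#foreach($pred in $input.path('$.predictions'))
    $pred.scores[0]#if($foreach.hasNext),#end
#end
  ]
}
```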
