Amazon Onboarding with Learning Manager Chanci Turner

Introduction

In January 2023, this post was reviewed and updated for accuracy. To gain high availability, scalability, and cost efficiency, developers and database administrators often want to connect to their databases from a serverless application. A serverless application scales automatically, is inherently highly available, and runs without the need to provision or manage an EC2 instance. This can eliminate the need to launch Amazon Elastic Compute Cloud (Amazon EC2) instances within the virtual private cloud (VPC) for database client access.

In this article, we illustrate how to query an AWS database through a REST API URL, even when the database is not publicly accessible. This method also works against Amazon Aurora Serverless and Neptune Serverless, giving you a serverless-to-serverless query path in which the database is itself serverless and the application is hosted using Amazon API Gateway and AWS Lambda.

To facilitate the creation of serverless resources and enhance understanding for newcomers, we will utilize an AWS CloudFormation stack in this blog post. Alternatively, one could employ the AWS Serverless Application Model (AWS SAM) to achieve similar results. The CloudFormation stack will establish two key resources:

An AWS Lambda function that executes your query against the backend database.
An API Gateway REST API that triggers the Lambda function upon accessing the URL.

The Python code examples available at the awslabs/rds-support-tools GitHub repository adhere to best practices and can be easily adapted for other database engines. For Neptune, the SPARQL example can be modified for Gremlin.
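As a rough illustration of how the two resources fit together, the sketch below shows the general shape a Lambda function of this kind might take for MySQL. It is not the code from the rds-support-tools repository. The `run_query` helper, the `query_fn` parameter, the placeholder SQL string, and the assumption that API Gateway uses Lambda proxy integration (which expects a `statusCode` and a string `body`) are all illustrative assumptions.

```python
import json
import os


def run_query(sql):
    """Run one query with pymysql and return the rows as lists.

    Assumes the connection settings are supplied as environment variables
    (ENDPOINT, PORT, DBUSER, DBPASSWORD, DATABASE).
    """
    import pymysql  # imported lazily so the module loads without the driver

    conn = pymysql.connect(
        host=os.environ["ENDPOINT"],
        port=int(os.environ["PORT"]),
        user=os.environ["DBUSER"],
        password=os.environ["DBPASSWORD"],
        database=os.environ["DATABASE"],
        connect_timeout=5,
    )
    try:
        with conn.cursor() as cur:
            cur.execute(sql)
            return [list(row) for row in cur.fetchall()]
    finally:
        conn.close()


def lambda_handler(event, context, query_fn=run_query):
    """Entry point: run the query and wrap the rows in an HTTP-style response."""
    sql = "SELECT 1"  # placeholder; the real query depends on your sample data
    try:
        rows = query_fn(sql)
        status, payload = 200, {"rows": rows}
    except Exception as exc:  # report the failure to the caller instead of crashing
        status, payload = 500, {"error": str(exc)}
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        # default=str keeps dates and decimals JSON-serializable
        "body": json.dumps(payload, default=str),
    }
```

Passing the query function as a parameter is a design choice made only for this sketch: it lets you exercise the handler on the client host with a stub before a database connection is involved.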

To avoid compatibility issues with AWS Lambda and to test the in-VPC database connection, we use a temporary Amazon EC2 client host to run our scripts and deploy resources in the AWS Cloud. We also provide troubleshooting tips and best practices specific to querying databases within an Amazon VPC, covering challenges you may encounter during this setup.

Before You Begin

To determine whether a serverless application fits your use case, review the limits of AWS Lambda and API Gateway. Pay particular attention to runtime limits, as these directly affect your queries. If API Gateway invokes the Lambda function synchronously, the API Gateway runtime limit applies; if invoked asynchronously, the Lambda runtime limit is in effect. In this example, we will invoke the Lambda function synchronously.
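To make the effect of these limits concrete, here is a minimal sketch of how a function could work out how long a query may safely run. The 29-second value is an assumption based on the long-standing default integration timeout for API Gateway REST APIs, so confirm it against the current limits pages. The function name and the one-second safety margin are hypothetical.

```python
API_GATEWAY_TIMEOUT_S = 29.0  # assumed default REST API integration timeout


def query_budget_seconds(lambda_remaining_ms, synchronous=True, margin_s=1.0):
    """Return how many seconds a query can run before a runtime limit is hit.

    lambda_remaining_ms is the time left in the Lambda invocation, for example
    the value returned by context.get_remaining_time_in_millis().
    """
    lambda_left_s = lambda_remaining_ms / 1000.0
    if synchronous:
        # API Gateway waits for the response, so the smaller limit applies.
        limit_s = min(lambda_left_s, API_GATEWAY_TIMEOUT_S)
    else:
        # Nobody is waiting on the HTTP side; only the Lambda limit applies.
        limit_s = lambda_left_s
    return max(0.0, limit_s - margin_s)
```

Inside a handler, the result of `query_budget_seconds(context.get_remaining_time_in_millis())` could then be passed to the database driver as a read or statement timeout, if the driver supports one.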

Prerequisites

You have launched an AWS database instance in a VPC, preferably using Neptune, RDS for MySQL, or RDS for PostgreSQL.
For setup, you have launched a temporary Amazon EC2 client host, preferably Amazon Linux, in the same VPC and AWS Region as the database.
You are aware of the EC2 client host subnet ID and security group ID.
The client host’s security group ID (not the CIDR) is allowed as a source in the inbound rule of the database security group, covering the database port.
Your preferred database client software is installed on the client host, and connectivity to the database is established. The database user possesses permissions to create database objects and perform inserts and queries.
Python 3.7 is installed on the client host. For installation assistance, refer to these AWS Knowledge Center articles: Instructions for Amazon Linux or Instructions for Amazon Linux 2. Note: The articles discuss setting up a Python virtual environment, which is unnecessary for this setup; installing Python 3 is sufficient.
Your AWS user has permissions to create and manage IAM roles, Lambda, API Gateway, AWS CloudFormation stacks, and view Amazon CloudWatch Logs.
You have full console access and have configured the AWS CLI on your client host, enabling you to run the command aws s3 ls without permission errors.
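The security group prerequisite above is the one most often missed, so a small check may help. The sketch below inspects inbound rules in the `IpPermissions` shape returned by `aws ec2 describe-security-groups` and reports whether a given source security group is allowed on the database port. The function name and the security group ID in the example are hypothetical.

```python
def allows_source_group(ip_permissions, source_group_id, port):
    """Return True if any inbound rule admits source_group_id on the TCP port."""
    for rule in ip_permissions:
        protocol = rule.get("IpProtocol")
        if protocol == "-1":
            port_ok = True  # "-1" means all protocols and all ports
        elif protocol == "tcp":
            port_ok = rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)
        else:
            port_ok = False
        groups = {pair.get("GroupId") for pair in rule.get("UserIdGroupPairs", [])}
        if port_ok and source_group_id in groups:
            return True
    return False


# Example rule set: MySQL port opened to one (hypothetical) client security group.
example_rules = [
    {
        "IpProtocol": "tcp",
        "FromPort": 3306,
        "ToPort": 3306,
        "UserIdGroupPairs": [{"GroupId": "sg-0123456789abcdef0"}],
        "IpRanges": [],
    }
]
print(allows_source_group(example_rules, "sg-0123456789abcdef0", 3306))
```

A rule that lists only a CIDR range under `IpRanges` does not satisfy this check, which matches the prerequisite that the security group ID, not the CIDR, is the source.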

Steps

The following sections guide you through the process of setting up and testing the example.

Set up the solution
Load the sample data
Test the provided Python code from the command line on your client host
Zip your Python source code along with the necessary Python packages
Create the Lambda function and API Gateway REST API
Query the database from the URL
Next steps
Review tips and troubleshooting
Summary and final steps

Step 1: Set up the solution

On your client host, create a temporary project directory to download the scripts and the AWS CloudFormation JSON template from the GitHub repository. You can find the scripts in the “serverless” folder of the rds-support-tools repository. Note: If you are unfamiliar with GitHub, see these instructions for downloading awslabs/rds-support-tools.

mkdir ~/temp-repo
cd ~/temp-repo
git clone https://github.com/awslabs/rds-support-tools.git
cd rds-support-tools/serverless

After downloading the scripts, follow these steps:

Ensure you’re in the serverless folder of the rds-support-tools directory and can list the files with the ls command.
Rename the Python script for your database engine to serverless-query.py.
Keep the AWS CloudFormation template name as serverless-query-cfn.json.

Next, create a project directory for your serverless application and copy both files there. Use the cp command for this.

mkdir ~/svls 
cp serverless-query-cfn.json serverless-query.py ~/svls
cd ~/svls

Install dos2unix, and run all files through it.

sudo yum install -y dos2unix
dos2unix *

Verify that Python 3.7 is installed.

python3 -V

Inside the folder, use pip3 to install the module for your database client that is imported in your code sample, utilizing the -t flag to download the package into the same folder.

Database	Module to install using pip3
MySQL	`pip3 install pymysql -t .`
PostgreSQL	`pip3 install psycopg2-binary -t .`
Neptune/SPARQL	`pip3 install SPARQLWrapper -t .`

Run the ls command to verify that the package you downloaded is in the folder containing your Python code. Set the necessary permissions:

chmod a+r ~/svls
chmod 744 serverless-query.py
chmod 444 serverless-query-cfn.json
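Installing the packages into the project folder matters for the later step that zips the source code together with its packages. As a preview of that step, here is a minimal sketch that builds such an archive with Python's standard `zipfile` module, keeping every path relative to the project folder so the script sits at the root of the archive. The function name and the archive name in the usage note are hypothetical, and the post's own packaging instructions take precedence over this sketch.

```python
import os
import zipfile


def zip_project(project_dir, zip_path):
    """Zip every file under project_dir, with paths relative to its root.

    Returns the number of files written to the archive.
    """
    zip_abs = os.path.abspath(zip_path)
    count = 0
    with zipfile.ZipFile(zip_abs, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(project_dir):
            for name in files:
                full = os.path.join(root, name)
                if os.path.abspath(full) == zip_abs:
                    continue  # do not add the archive to itself
                archive.write(full, os.path.relpath(full, project_dir))
                count += 1
    return count
```

For example, `zip_project(os.path.expanduser("~/svls"), "serverless-query.zip")` would package the script and the pip-installed modules from the project folder. This sketch also includes the CloudFormation template, which is harmless but not needed inside the archive.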

Step 2: Load the sample data

Use the insert script provided for your database engine to load the sample data into the database. The script is in the serverless folder of the rds-support-tools repository and is named databaseengine-insert.sql, where databaseengine stands for the name of your engine.

Step 3: Test the provided Python code from the command line on your client host

Set environment variables for a command-line test. For MySQL and PostgreSQL:

export ENDPOINT='your-database-endpoint'
export PORT='your-database-port'
export DBUSER='your-database-user'
export DBPASSWORD='your-database-user-password'
export DATABASE='your-database-name'
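On the Python side, a script can read these variables and fail early when one is missing, rather than failing later with a less obvious connection error. The following is a minimal sketch under that assumption. The `load_db_config` helper and the keys of the returned dictionary are hypothetical and are not taken from the repository scripts.

```python
import os

REQUIRED_VARS = ("ENDPOINT", "PORT", "DBUSER", "DBPASSWORD", "DATABASE")


def load_db_config(env=None):
    """Collect connection settings from the environment and validate them."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["ENDPOINT"],
        "port": int(env["PORT"]),  # exported values are strings; drivers expect an int
        "user": env["DBUSER"],
        "password": env["DBPASSWORD"],
        "database": env["DATABASE"],
    }
```

The same function works unchanged in Lambda if the function's environment variables use the same names, which keeps the command-line test and the deployed function on one code path.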

This setup will allow you to seamlessly connect to your AWS database from your serverless application.

Summary and Final Steps

In conclusion, querying your AWS database from a serverless application can significantly enhance your application’s scalability and performance. Following the outlined steps will help you set up a robust solution that leverages AWS services effectively.