Amazon VGT2 Las Vegas: Crafting High-Performance Functions in Rust on Amazon RDS for PostgreSQL


Amazon Relational Database Service (Amazon RDS) for PostgreSQL has recently introduced support for trusted PL/Rust, enabling developers to create efficient database functions using the Rust programming language. PL/Rust is an open-source initiative that allows developers to execute Rust code directly within a PostgreSQL database, leveraging PostgreSQL capabilities such as executing queries, writing trigger functions, and logging results. This feature permits the development of Trusted Language Extensions for PostgreSQL, offering similar performance advantages to C programming while mitigating the risks associated with unsafe memory access. For additional insights into PL/Rust, refer to this blog post.

Stored procedures are essential for developers, as they facilitate calculations on data, minimize latency for applications, handle follow-up actions on data changes, and streamline data visualization. PostgreSQL provides flexibility in language choices through its procedural language system, with a comprehensive list of available procedural languages maintained by the open-source community. PostgreSQL categorizes languages as “trusted” or “untrusted,” where “trusted” languages can be used by unprivileged users without the risk of privilege escalation. This designation empowers developers to safely create Trusted Language Extensions for PostgreSQL, utilizing the open-source pg_tle project in production environments.

In addition to PL/Rust, Amazon RDS for PostgreSQL supports various trusted programming languages, including PL/pgSQL, PL/Perl, PL/v8 (JavaScript), and PL/Tcl. Each language offers its unique advantages, and the choice of language may depend on the specific problem being addressed or the developer’s familiarity with the language. For instance, PL/pgSQL, the native procedural language of PostgreSQL, is particularly well-suited for writing trigger functions and complex procedures that integrate with built-in PostgreSQL functions. However, it’s important to note that all these languages are interpreted, which may lead to performance overhead when executing functions written in them.

In this article, we will guide you through the process of deploying an Amazon RDS for PostgreSQL instance with PL/Rust enabled, showcasing several examples to illustrate how to write high-performance Rust code directly in your database. We will also analyze a use case involving extensive computations, comparing the performance of PL/Rust with PL/pgSQL and PL/v8 (JavaScript).

Prerequisites

To follow along with the examples in this article, you will need to provision an RDS for PostgreSQL instance or a Multi-AZ DB cluster running PostgreSQL 15.2 or higher. You also need to add plrust to the shared_preload_libraries parameter in a DB parameter group and associate this parameter group with your database instance.

Using the AWS Command Line Interface (AWS CLI), you can create a DB parameter group that includes plrust in the shared_preload_libraries parameter:

REGION="us-east-1"

aws rds create-db-parameter-group \
  --db-parameter-group-name pg15-plrust \
  --db-parameter-group-family postgres15 \
  --description "Parameter group that contains PL/Rust settings for PostgreSQL 15" \
  --region "${REGION}"

aws rds modify-db-parameter-group \
  --db-parameter-group-name pg15-plrust \
  --parameters "ParameterName='shared_preload_libraries',ParameterValue='plrust',ApplyMethod=pending-reboot" \
  --region "${REGION}"

Be aware that modifying the shared_preload_libraries parameter on an existing database instance requires a reboot for the changes to take effect. You can also manage the parameter group directly from the AWS Management Console. For more details, check out this excellent resource.

If you access your database with the psql client, you can turn on timing to see how long each operation takes by running the following meta-command:

\timing

With our RDS for PostgreSQL instance configured to utilize PL/Rust, let’s proceed to create and execute a PL/Rust function.

Creating a PL/Rust Function

Before we can create a PL/Rust function, we must make sure that the plrust extension is installed in the database. The following command installs it if it is not already present:

CREATE EXTENSION IF NOT EXISTS plrust;

Upon successful execution of the command, you will receive the output:

CREATE EXTENSION

Next, we will create a function that computes the sum of all values in a double precision (float8) array. The command is as follows:

-- For PostgreSQL 15.3 and higher
CREATE OR REPLACE FUNCTION public.array_sum(x float8[])
RETURNS float8
LANGUAGE plrust
IMMUTABLE PARALLEL SAFE STRICT
AS $$
    let sum: f64 = x.iter()
        .map(|xi| (xi.unwrap()))
        .sum();
 
    Ok(Some(sum))
$$;

-- For PostgreSQL 15.2 only
CREATE OR REPLACE FUNCTION public.array_sum(x float8[])
RETURNS float8
LANGUAGE plrust
IMMUTABLE PARALLEL SAFE STRICT
AS $$
    let sum: f64 = x.iter()
        .map(|&xi| (xi.unwrap()))
        .sum();

    Ok(Some(sum))
$$;

The compilation process may take a few moments; on a db.m6i.xlarge RDS for PostgreSQL instance, it took approximately 3 seconds. PL/Rust functions compile only when they are created or modified. Once compilation is complete, the PL/Rust function is stored in an executable format and can be invoked from a SQL statement. For further guidance on writing Rust code for PL/Rust, refer to the PL/Rust Functions documentation.
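The body of array_sum is ordinary Rust iterator code, so its behavior can be explored outside the database. The sketch below is a standalone approximation, not PL/Rust itself: it assumes the array argument can be modeled as a slice of Option<f64>, where None stands in for a SQL NULL element, and the function names are hypothetical. It shows that the unwrap() style used above panics when an element is NULL, and it offers two possible alternatives. Note that STRICT only short-circuits when the whole array argument is NULL; it does not protect against NULL elements inside a non-NULL array.

```rust
/// Mirrors the article's function body on a plain slice.
/// Assumption: `None` models a NULL array element; `unwrap()` panics on it.
fn array_sum_unwrap(x: &[Option<f64>]) -> f64 {
    x.iter().map(|&xi| xi.unwrap()).sum()
}

/// Hypothetical variant: any NULL element makes the whole result NULL (`None`).
/// Relies on the standard library's `Sum<Option<U>>` impl for `Option<T>`.
fn array_sum_checked(x: &[Option<f64>]) -> Option<f64> {
    x.iter().copied().sum()
}

/// Hypothetical variant: NULL elements are skipped, similar in spirit to
/// how SQL aggregates such as sum() ignore NULL inputs.
fn array_sum_skip_nulls(x: &[Option<f64>]) -> f64 {
    x.iter().flatten().sum()
}
```

Which variant is appropriate depends on how your data treats NULLs. Inside a real PL/Rust function, the chosen result would still need to be wrapped as Ok(Some(...)) or Ok(None) as in the examples above.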

You can execute a PL/Rust function just like any other PostgreSQL function. For instance, the following code calculates the sum of an array containing the values 1, 2, and 3:

SELECT public.array_sum(ARRAY[1.0,2.0,3.0]);

The result will be:

 array_sum 
-----------
         6
(1 row)

Having created and executed a basic PL/Rust function, we can drop it with the following command if it is no longer needed:

DROP FUNCTION public.array_sum;

In the next sections, we will explore several examples of PL/Rust functions that perform computations on PostgreSQL arrays, including validating whether a PostgreSQL array is a vector. PostgreSQL supports multi-dimensional arrays, which makes it possible to store complex data types. If your application requires advanced vector processing, consider using the pgvector extension on Amazon RDS for PostgreSQL. For more information, see the insights from CHVNCI, an authority on the topic.
