Critical applications with a global presence, such as those in finance, travel, or gaming, have stringent availability and disaster recovery requirements and often need to withstand outages that affect an entire Region. Traditionally, meeting these requirements has involved complex trade-offs among performance, availability, cost, and Recovery Point Objective (RPO) and Recovery Time Objective (RTO) targets. Routine maintenance tasks such as database upgrades can also be challenging when they require significant downtime. In addition, users may demand low-latency access to their data regardless of location, which calls for uninterrupted availability even during planned maintenance such as upgrades.
This article explores the replication features available with Amazon Aurora PostgreSQL-Compatible Edition and how they can help your applications withstand Region-wide outages and continue operating without interruption.
Amazon Aurora is a relational database built for the cloud and compatible with MySQL and PostgreSQL, combining the performance and reliability of traditional enterprise databases with the simplicity and cost-effectiveness of open-source databases. Database replication is the process of copying data to keep it consistent across redundant databases, improving reliability and fault tolerance. In this article, we focus on the replication features of Aurora PostgreSQL.
High Availability and Durability
Aurora is engineered to deliver over 99.99% availability, creating six copies of your data across three Availability Zones while continuously backing it up to Amazon Simple Storage Service (Amazon S3). The system automatically recovers from physical storage or instance failures, and the failover process to a read replica typically takes under 30 seconds in the event of an instance failure.
Replication Options in Aurora
Aurora provides multiple replication options. By default, each Aurora database cluster generates six copies of data across three Availability Zones at the storage level. When replicating data into or out of an Aurora cluster, you can opt for features such as Amazon Aurora global database or the native engine-specific replication methods available for MySQL or PostgreSQL. You can select the most suitable options based on your requirements for high availability and performance. The following sections outline how and when to utilize each technique.
Replication with Aurora PostgreSQL
For replication in Aurora PostgreSQL, your options include:
- Native PostgreSQL logical replication
- Logical replication via the pglogical extension
- Physical replication using the Aurora global database
Native PostgreSQL Logical Replication
Logical replication allows for the replication of data objects and their changes based on their replication identity, often a primary key. This term is used in contrast to physical replication, which uses exact block addresses for replication. Logical replication follows a publish and subscribe model, where one or more subscribers receive data from one or more publications on a publisher node. Subscribers pull data from the publications they subscribe to and can also re-publish data, enabling cascading replication or more intricate configurations. This feature was introduced in PostgreSQL 10.
Typically, logical replication starts by taking a snapshot of the data on the publisher database and copying it to the subscriber. Changes on the publisher are then sent to the subscriber in real-time. The subscriber applies the data in the same sequence to ensure transactional consistency.
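As a brief illustration of the publish and subscribe model, the following sketch creates a publication on the source cluster and a subscription on the target. The table, database, host, and credential names are placeholders, and on Aurora PostgreSQL the rds.logical_replication cluster parameter must be set to 1 so that the publisher generates logical write-ahead log records.

```sql
-- On the publisher (source) cluster.
-- Assumes rds.logical_replication = 1 is already set in the DB cluster
-- parameter group so that wal_level is 'logical'.
CREATE TABLE orders (
    order_id bigint PRIMARY KEY,   -- primary key serves as the replication identity
    customer text NOT NULL,
    amount   numeric(10,2) NOT NULL
);

-- Publish all changes to the orders table.
CREATE PUBLICATION orders_pub FOR TABLE orders;

-- On the subscriber (target) cluster.
-- The table definition must already exist here, because logical
-- replication does not copy schema.
CREATE SUBSCRIPTION orders_sub
    CONNECTION 'host=source-cluster.example.com port=5432 dbname=appdb user=repl_user password=replace-me'
    PUBLICATION orders_pub;
```

Once the subscription is created, the initial table contents are copied over and subsequent changes stream to the subscriber in commit order, as described above.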
Use Cases for Native PostgreSQL Logical Replication
Here are common scenarios for using logical replication:
- Migrating data from on-premises or self-managed PostgreSQL environments to Aurora PostgreSQL. For more details, see Migrating PostgreSQL from on-premises or Amazon EC2 to Amazon RDS using logical replication.
- Replicating data between two Aurora PostgreSQL clusters within the same Region for high availability (HA) and disaster recovery (DR). In an HA/DR setup, the target system mirrors the source system. If the source fails, application servers are redirected to the target, which takes over and begins replicating back to the original source once it is available again.
- Conducting database upgrades or application upgrades with minimal downtime. AWS Database Migration Service (AWS DMS) utilizes PostgreSQL logical replication for nearly real-time data synchronization between major versions. For more details, see Achieving minimum downtime for major version upgrades in Amazon Aurora for PostgreSQL using AWS DMS.
- Implementing role-based access control that allows different user groups access to specific subsets of replicated data. For example, application developers, data scientists, database administrators, DevOps engineers, and business analysts may require different levels of access for their daily tasks. Offloading this access to a secondary database reduces the load on the primary database (a sketch of this pattern follows this list).
- Replicating data into big data systems or data lakes. For instance, AWS DMS can be employed to replicate transactional data into a data warehouse (Amazon Redshift) or data lake (Amazon S3), supporting real-time dashboards, data visualizations, big data processing, real-time analytics, and machine learning.
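To sketch the access-offloading use case mentioned above, a publication can expose only the subset of tables a reporting team needs, and a secondary cluster subscribes to just that subset. The table, role, and connection details here are hypothetical.

```sql
-- On the primary (publisher) cluster: publish only the reporting tables.
CREATE PUBLICATION reporting_pub FOR TABLE customers, orders, order_items;

-- On the secondary (subscriber) cluster, after creating the same tables there.
CREATE SUBSCRIPTION reporting_sub
    CONNECTION 'host=primary-cluster.example.com port=5432 dbname=appdb user=repl_user password=replace-me'
    PUBLICATION reporting_pub;

-- Grant analysts read-only access on the subscriber so their queries
-- do not add load to the primary cluster.
GRANT SELECT ON customers, orders, order_items TO reporting_role;
```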
Limitations
Currently, logical replication has the following limitations, which may be addressed in future updates:
- The replication of database schema and Data Definition Language (DDL) commands such as CREATE, ALTER, and DROP isn’t supported. The initial schema can be copied manually using pg_dump --schema-only, but subsequent schema changes must be synchronized manually.
- Sequence data isn’t replicated. Data in serial or identity columns backed by sequences is replicated, but the sequence itself retains its start value on the subscriber. If a failover to the subscriber database is necessary, those sequences must be updated to their latest values, either by copying the values from the publisher (using pg_dump) or by determining a sufficiently high value from the tables themselves (a sketch follows this list).
- TRUNCATE command replication is supported starting in PostgreSQL 11, but caution is advised when truncating groups of tables connected by foreign keys. When a TRUNCATE command is replicated, the subscriber truncates the same group of tables as the publisher, minus any tables that are not part of the subscription. This can lead to issues if the affected tables have foreign key relationships with tables outside the subscription.
- Large objects are not replicated. PostgreSQL supports two ways of storing binary large objects (BLOBs): the bytea type and large objects (OID type). bytea columns are replicated, but large objects, which provide stream-style access to user data, are not.
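Because of the sequence limitation above, a failover to the subscriber typically requires advancing each sequence manually. A minimal sketch, assuming a hypothetical orders table whose order_id column is backed by the sequence orders_order_id_seq:

```sql
-- Run on the subscriber after failing over to it.
-- Advance the sequence past the highest replicated value, leaving some
-- headroom in case the last few publisher values never arrived.
SELECT setval('orders_order_id_seq',
              (SELECT COALESCE(MAX(order_id), 0) + 1000 FROM orders));
```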
In conclusion, understanding the replication capabilities of Amazon Aurora PostgreSQL is critical to ensuring the resilience and availability of your databases.