In Advanced (300), Amazon Aurora, PostgreSQL compatible, RDS for PostgreSQL, RDS for SQL Server, Technical How-to
When helping clients transition from SQL Server to PostgreSQL, we often face scenarios where the PIVOT function is heavily relied upon for creating dynamic reports. A previously documented approach is outlined in the Microsoft SQL Server 2019 to Amazon Aurora PostgreSQL Migration Playbook, which utilizes CASE WHEN statements for each pivoted column. However, this method has its drawbacks; it becomes challenging to maintain the code as the number of pivoted columns grows. Additionally, introducing a new column necessitates modifications to the source code.
In this article, we will demonstrate how to leverage the crosstab function, part of PostgreSQL’s tablefunc extension, to replicate the functionality of SQL Server’s PIVOT function while providing enhanced flexibility.
Overview of the Solution
This solution utilizes crosstab as a foundation to illustrate how to implement SQL Server-like PIVOT functionality through the PostgreSQL function get_dynamic_pivot_data
. The key advantage of using crosstab is its ability to dynamically create columns based on the results of the provided queries, making it adaptable for varying data sets. You will learn how to manage multiple fixed and variable columns in a pivot table, along with the associated cursor in the PostgreSQL function. This approach also incorporates handling a PostgreSQL refcursor. The function can be invoked through either psql or C#. We will guide you on how to call it with a cursor parameter using psql, a terminal-based interface for PostgreSQL. Typically, C# interacts with PostgreSQL via Npgsql, and we offer sample C# code to illustrate how to use the PostgreSQL function with Npgsql, including cursor variable management.
The diagram below depicts the solution architecture.
Prerequisites
Set up the following configurations in your AWS account and on your local machine, depending on how you plan to test the function.
Testing on the PostgreSQL Client Side
To test the function on the PostgreSQL client side, follow these steps:
- Provision Amazon Aurora PostgreSQL-Compatible Edition or Amazon RDS for PostgreSQL.
- Install a PostgreSQL client tool, such as pgAdmin on Amazon EC2 for Microsoft Windows Server, or psql on Amazon Linux 2023.
AWS CloudShell comes with the PostgreSQL client tool (psql) version 15 pre-installed. If you prefer not to provision an EC2 instance, you can directly use CloudShell to access Aurora PostgreSQL.
Testing with C#
If you choose to test using C#, follow these steps. If you do not wish to use C#, these installations are unnecessary.
- Download and install .NET SDK 8 on your EC2 instance. For example, you can check the .NET version on Windows using the command:
C:UsersAdministrator>dotnet --list-sdks
8.0.404 [C:Program Filesdotnetsdk]
C:UsersAdministrator>dotnet --version
8.0.404
- Download and install Visual Studio Code on your EC2 instance for Windows.
Utilizing the PIVOT Function on SQL Server
To transition a pivot query that includes dynamic columns, begin by creating two tables and a stored procedure on the SQL Server side. The relevant code can be found in our GitHub repository. Complete the following steps:
- Create two tables populated with sample data.
CREATE TABLE QuarterTbl (
QuarterID INT NOT NULL IDENTITY PRIMARY KEY,
QuarterItem VARCHAR(2)
);
INSERT INTO QuarterTbl (QuarterItem)
VALUES ('Q1'), ('Q2'), ('Q3'), ('Q4');
Sample data from QuarterTbl:
QuarterID QuarterItem ----------- ----------- 1 Q1 2 Q2 3 Q3 4 Q4
- Create the ProductSales table:
CREATE TABLE ProductSales (
ProductID INT NOT NULL IDENTITY PRIMARY KEY,
ProductName VARCHAR(10),
QuarterID INT,
Year VARCHAR(5),
Sales INT,
FOREIGN KEY (QuarterID) REFERENCES QuarterTbl(QuarterID)
);
INSERT INTO ProductSales (ProductName, QuarterID, Year, Sales)
VALUES
('ProductA', 1, 'Y2017', 100),
('ProductA', 2, 'Y2018', 150),
('ProductA', 2, 'Y2018', 200),
('ProductA', 1, 'Y2019', 300),
('ProductA', 2, 'Y2020', 500),
('ProductA', 3, 'Y2021', 450),
('ProductA', 1, 'Y2022', 675),
('ProductB', 2, 'Y2017', 0),
('ProductB', 1, 'Y2018', 900),
('ProductB', 3, 'Y2019', 1120),
('ProductB', 4, 'Y2020', 750),
('ProductB', 3, 'Y2021', 1500),
('ProductB', 2, 'Y2022', 1980);
The total sales figures for each quarter across each year can be displayed as follows:
SELECT
PS.ProductName,
Q.QuarterItem,
PS.Year,
SUM(PS.Sales) AS QuarterSales
FROM ProductSales AS PS
INNER JOIN QuarterTbl AS Q ON PS.QuarterID = Q.QuarterID
GROUP BY PS.ProductName, Q.QuarterItem, PS.Year
ORDER BY 1, 2, 3;
The resulting query will showcase quarterly sales data between 2017 and 2022.
In addition to the tables, create a stored procedure called GetProductSalesReport
that employs a PIVOT function and a dynamic query:
CREATE OR ALTER PROCEDURE GetProductSalesReport
@columns NVARCHAR(MAX)
AS
DECLARE @sql NVARCHAR(MAX);
SET @sql = 'SELECT * FROM
(
SELECT PS.ProductName, Q.QuarterItem, PS.Year, PS.Sales
FROM ProductSales PS
INNER JOIN QuarterTbl Q ON PS.QuarterID = Q.QuarterID
) t
PIVOT (
SUM(Sales)
FOR [Year] IN (' + @columns + ')
) AS PV
ORDER BY 1, 2;';
EXECUTE sp_executesql @sql;
GO
To obtain the pivoted results, run the stored procedure with a parameter that specifies the list of dynamic columns. For more information on this topic, check out another blog post here, which provides further insights. Also, you can gain additional knowledge from this resource, which is an excellent guide. For authoritative content, see here, they are an authority on this subject.
Leave a Reply