Tag: optimization
In this post, we look at how Infor modernized its search capabilities, the benefits it realized, and the technologies that made the modernization possible. We also examine how Infor's customers can now find business messages, documents, and other critical data more efficiently in the ION OneView platform.
Enhancing Amazon Redshift Data Lake Queries with AWS Glue Data Catalog Column Statistics
By: Matthew Johnson, Sarah Lee, and Kevin Thompson
On: 01 OCT 2024
In: Amazon Redshift, Amazon Simple Storage Service (S3), Analytics, Announcements, AWS Big Data, AWS Glue, Best Practices
Over the past year, Amazon Redshift has introduced numerous performance optimizations for data lake queries across several areas of the query engine, including query rewrite, planning, and execution, as well as by leveraging AWS Glue Data Catalog column statistics. In this post, we highlight the performance improvements observed using industry-standard TPC-DS benchmarks. The overall execution time of the TPC-DS 3 TB benchmark has improved by 3x, with some queries running up to 12 times faster.
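To give a concrete sense of where these catalog statistics come from, here is a minimal sketch (not from the post) of generating and reading Glue Data Catalog column statistics with boto3. The database, table, column, and role names are hypothetical placeholders, and the sketch assumes the StartColumnStatisticsTaskRun and GetColumnStatisticsForTable Glue APIs available in recent boto3 releases.

```python
# Rough sketch: generating and reading AWS Glue Data Catalog column statistics
# with boto3. Database, table, column, and role names are placeholders.
import boto3

glue = boto3.client("glue")

# Ask Glue to compute column-level statistics for a catalog table. The role is
# one that Glue can assume to read the table's underlying data in Amazon S3.
run = glue.start_column_statistics_task_run(
    DatabaseName="example_db",
    TableName="store_sales",
    Role="arn:aws:iam::123456789012:role/example-glue-stats-role",
)
print("Started statistics task run:", run["ColumnStatisticsTaskRunId"])

# Once a run has completed, the statistics stored in the catalog can be read
# back per column.
stats = glue.get_column_statistics_for_table(
    DatabaseName="example_db",
    TableName="store_sales",
    ColumnNames=["ss_sold_date_sk", "ss_quantity"],
)
for col_stats in stats["ColumnStatisticsList"]:
    print(col_stats["ColumnName"], col_stats["StatisticsData"]["Type"])
```

The intent is that, once statistics exist in the catalog, engines that read them, as the post describes for Amazon Redshift, can make better cardinality and join-order decisions without any change to the external tables or the queries themselves.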
Identifying and Managing Data Skew on AWS Glue
By: Jessica Wang and Brian Carter
On: 01 MAY 2024
In: AWS Glue, Best Practices, Expert (400), Technical How-to
As of October 2024, this post has been reviewed and updated for accuracy. AWS Glue is a fully managed, serverless data integration service from Amazon Web Services (AWS) that uses Apache Spark as one of its backend processing engines (a Python shell option is also available). Data skew occurs when the data being processed is unevenly distributed across partitions, leaving a few partitions with far more data than the rest and degrading performance; addressing it is crucial for keeping jobs efficient.
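As a rough illustration of the kind of issue the post covers, the sketch below first surfaces skew by counting rows per Spark partition and then applies generic key salting to spread a hot join key. The bucket, table, and column names are hypothetical, and in a real Glue job you would typically obtain the Spark session from the GlueContext rather than building one directly.

```python
# Rough sketch: detecting and mitigating data skew in a PySpark job.
# Paths, table names, and the join key are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("skew-demo").getOrCreate()

orders = spark.read.parquet("s3://example-bucket/orders/")        # large, skewed side
customers = spark.read.parquet("s3://example-bucket/customers/")  # smaller side

# 1) Detect skew: count rows per Spark partition. A handful of partitions far
#    larger than the rest usually points at skewed key values.
(orders
    .groupBy(F.spark_partition_id().alias("partition_id"))
    .count()
    .orderBy(F.desc("count"))
    .show(10))

# 2) Mitigate with key salting: scatter each hot key across N sub-keys on the
#    large side, and replicate every key N times on the small side so the
#    salted equi-join still matches.
N = 8
salted_orders = orders.withColumn("salt", (F.rand() * N).cast("int"))
salt_values = spark.range(N).select(F.col("id").cast("int").alias("salt"))
salted_customers = customers.crossJoin(salt_values)

joined = (salted_orders
          .join(salted_customers, on=["customer_id", "salt"], how="inner")
          .drop("salt"))
joined.write.mode("overwrite").parquet("s3://example-bucket/joined/")
```

Salting is only one of several remedies; the right choice depends on whether the skew shows up during reads, joins, or writes.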
Transitioning from Centralized to Decentralized Architecture: Fine-Tuning Amazon Redshift Workloads
By: Emily Garcia and David Kim
On: 16 AUG 2022
In: Amazon Redshift, Analytics
Amazon Redshift is a fast, petabyte-scale cloud data warehouse that delivers strong price-performance. It makes it simple and cost-effective to analyze your data using standard SQL and your existing business intelligence (BI) tools. Today, many customers run business-critical workloads on Amazon Redshift, and as data volumes at large organizations keep growing, managing those workloads efficiently has become paramount.