Announcing New Features for Amazon OpenSearch Ingestion
We are thrilled to introduce the latest enhancements to Amazon OpenSearch Ingestion, a fully managed, serverless pipeline that ingests, filters, transforms, enriches, and routes data to an Amazon OpenSearch Service domain or Amazon OpenSearch Serverless collection. OpenSearch Ingestion supports a wide range of data sources and includes a rich set of built-in processors.
Leveraging AWS Data Exchange for Apache Hudi Datasets
By Sarah Wilson, Jake Lewis, and Sophia Green
On 22 MAY 2024
In Amazon Athena, Analytics, AWS Data Exchange, Best Practices, Technical How-to
Originally developed by Uber in 2016, Apache Hudi was designed to create a transactional data lake capable of rapidly and reliably processing updates to support the extensive growth of Uber’s ride-sharing platform. Today, Apache Hudi is widely utilized across the industry for constructing massive data lakes. This post explores how you can effectively use AWS Data Exchange to share your Apache Hudi datasets with ease.
Improving Search Capabilities at AVB with Amazon OpenSearch Service
By Michael Brown
On 21 MAY 2024
In Amazon OpenSearch Service, Customer Solutions
AVB Marketing provides tailored digital solutions for its members across various products. LINQ, AVB’s proprietary product information management system, enables appliance, consumer electronics, and furniture retailers to efficiently manage their product catalogs. In this article, we discuss how AVB successfully reduced its average search time from 3 seconds to 300 milliseconds in LINQ through the implementation of Amazon OpenSearch Service, which efficiently processes updates to 14.5 million records daily.
Exploring Apache Iceberg on AWS with Our New Technical Guide
By Chris Evans, Jenna Lee, and Mark Rodriguez
On 20 MAY 2024
In Amazon Athena, Amazon EMR, Amazon Redshift, AWS Glue
We are excited to unveil the Apache Iceberg on AWS technical guide. Whether you are just starting with Apache Iceberg on AWS or you’re running production workloads, this guide provides comprehensive instructions on everything from foundational concepts to advanced optimizations for building your transactional data lake with Apache Iceberg on AWS.
Zero-ETL Integration of Amazon DocumentDB with Amazon OpenSearch Service
By Rachel Smith, Aaron Johnson, and Lily Davis
On 16 MAY 2024
In Amazon DocumentDB, Amazon OpenSearch Service, Announcements
We are pleased to announce the general availability of the zero-ETL integration between Amazon DocumentDB (with MongoDB compatibility) and Amazon OpenSearch Service. With this integration, you can complement Amazon DocumentDB’s native text and vector search capabilities with advanced search analytics in Amazon OpenSearch Service, such as fuzzy search, synonym search, cross-collection search, and multilingual search. Because the integration is zero-ETL, you do not need to build and maintain a separate extract, transform, and load pipeline to move the data.
Effortlessly Remove Kafka Brokers from Amazon MSK Provisioned Clusters
By Sam Patel, Tanya Gupta, and Mike Chen
On 16 MAY 2024
In Amazon Managed Streaming for Apache Kafka (Amazon MSK), Analytics, Announcements
Today, we’re introducing the broker removal feature for Amazon Managed Streaming for Apache Kafka (Amazon MSK) provisioned clusters. You can now remove multiple brokers from your clusters without affecting availability or data durability and without disrupting your data streaming processes, which lets you right-size your cluster’s storage and compute capacity.
Introducing Amazon MWAA Support for Airflow REST API and Auto-Scaling
By Laura Kim, Kevin Brown, and Jason Wright
On 16 MAY 2024
In Amazon Managed Workflows for Apache Airflow (Amazon MWAA), Announcements
Apache Airflow is a widely used platform for orchestrating complex data pipelines and workflows. Amazon Managed Workflows for Apache Airflow (Amazon MWAA) streamlines the setup and management of secure, highly available Airflow environments in the cloud. We’re thrilled to present two new features for Amazon MWAA: support for the Airflow REST API and auto-scaling.
Breaking New Ground in Geospatial Analytics with Amazon Redshift and CARTO
By Andrew Martin, Laura White, and James Black
On 16 MAY 2024
In Amazon Redshift, Analytics, Customer Solutions
This post delves into how Amazon Redshift’s spatial index functionalities, such as the Hexagonal hierarchical geospatial indexing system (H3), can be utilized to represent spatial data, enabling swift spatial lookups at scale. The evolution of geospatial data presents exciting opportunities in data-driven insights, and we are here to guide you through this journey.
Maximizing Performance and Scalability with Amazon Redshift Serverless Workgroups
By Eric Young, Priya Singh, and Tom Harris
On 09 MAY 2024
In Advanced (300), Amazon Redshift, Technical How-to, Thought Leadership
As demand for data analytics continues to surge, scalability and concurrency become essential for enterprises. Your analytics architecture should be designed to handle this growth effectively.