In 2016, AWS launched the EKK stack (Amazon OpenSearch Service, Amazon Kinesis, and Kibana, an open-source plugin from Elastic) as a modern alternative to the ELK stack (Elasticsearch, the open-source Logstash tool, and Kibana) for ingesting and visualizing Apache logs. In the EKK stack, data transformation is handled primarily by the Amazon Kinesis Firehose agent. This article shows how to optimize the EKK solution by moving that transformation into Amazon Kinesis Firehose itself, using AWS Lambda.
In the traditional ELK stack, a Logstash cluster is responsible for parsing Apache logs, and managing and scaling that cluster can be a heavy burden for users. The optimized EKK solution removes the cluster by combining Amazon Kinesis Firehose, AWS Lambda, and Amazon OpenSearch Service.
Solution Overview
Let’s delve into the components and architecture of the optimized EKK solution.
Amazon Kinesis Firehose
Amazon Kinesis Firehose is the easiest way to load streaming data into AWS. In this solution, Firehose captures and automatically loads streaming log data into Amazon OpenSearch Service while backing it up in Amazon S3.
AWS Lambda
AWS Lambda lets you run code without managing servers. It executes your code in response to triggers and scales automatically: invocations run in parallel, so capacity follows the workload. Within the EKK solution, Amazon Kinesis Firehose invokes the Lambda function to transform incoming data, and Firehose then delivers the transformed data to the managed Amazon OpenSearch Service cluster.
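As a minimal sketch of the contract between Firehose and a transformation function: Firehose passes a batch of base64-encoded records, and the function returns one entry per record carrying the same `recordId`, a `result` status (`Ok`, `Dropped`, or `ProcessingFailed`), and the re-encoded `data`. The `transform` helper and its `message` field are hypothetical placeholders for illustration, not part of the original solution.

```python
import base64
import json


def transform(payload: str) -> str:
    """Hypothetical per-record transformation: wrap the raw line in a JSON object."""
    return json.dumps({"message": payload.rstrip("\n")})


def lambda_handler(event, context):
    """Transform each Firehose record and report a per-record status."""
    output = []
    for record in event["records"]:
        try:
            # Firehose delivers each record body base64-encoded.
            payload = base64.b64decode(record["data"]).decode("utf-8")
            encoded = base64.b64encode(transform(payload).encode("utf-8")).decode("utf-8")
            output.append({"recordId": record["recordId"], "result": "Ok", "data": encoded})
        except Exception:
            # Hand the untouched record back and flag it as failed.
            output.append(
                {
                    "recordId": record["recordId"],
                    "result": "ProcessingFailed",
                    "data": record["data"],
                }
            )
    return {"records": output}
```

Every incoming `recordId` must appear in the response, which is why the failure branch still returns an entry instead of skipping the record.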
Amazon OpenSearch Service
Amazon OpenSearch Service is a widely used managed search and analytics service that enables real-time application monitoring and log analysis. In this architecture, Apache logs are stored and indexed in Amazon OpenSearch Service. As a managed service, it reduces administrative overhead, including patch management and monitoring. Its built-in integration with Kibana simplifies setup further.
Amazon Kinesis Data Generator
The solution employs the Amazon Kinesis Data Generator (KDG) to simulate Apache access logs. The KDG simplifies the process of generating Apache logs for demonstrating the processing pipeline and scalability.
Architecture
The architecture of the optimized EKK stack is illustrated in the accompanying diagram.
Configuring the Optimized EKK Stack
Here are the steps to set up the optimized EKK solution:
- Create the AWS Lambda Function for Data Transformation
Firehose provides several Lambda blueprints for data transformation, including:
- General Firehose Processing: For custom transformation logic.
- Apache Log to JSON: Converts Apache log lines to JSON objects.
- Apache Log to CSV: Converts Apache log lines to CSV format.
- Syslog to JSON: Converts Syslog lines to JSON objects.
- Syslog to CSV: Converts Syslog lines to CSV format.
In the AWS Lambda console, create a new function using the kinesis-firehose-apachelog-to-json blueprint and set the timeout to one minute. Attach full access policies for Amazon OpenSearch Service and Amazon CloudWatch Logs to enable logging and monitoring.
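To make the conversion step concrete, here is a rough sketch of how an Apache log line could be parsed into a JSON object. It is not the blueprint's actual code: the regular expression, the field names (`host`, `verb`, `response`, and so on), and the `apache_line_to_json` function are assumptions chosen for illustration.

```python
import json
import re
from typing import Optional

# Assumed pattern for the Apache common/combined log format.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<authuser>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) (?P<protocol>\S+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?$'
)


def apache_line_to_json(line: str) -> Optional[str]:
    """Return the log line as a JSON string, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Store numeric fields as numbers so they can be aggregated after indexing.
    fields["response"] = int(fields["response"])
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return json.dumps(fields)


if __name__ == "__main__":
    sample = (
        '203.0.113.7 - - [10/Oct/2016:13:55:36 -0700] '
        '"GET /index.html HTTP/1.1" 200 104 "-" "ELB-HealthChecker/1.0"'
    )
    print(apache_line_to_json(sample))
```

Returning `None` for unmatched lines lets the calling handler decide whether to drop the record or mark it as failed.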
- Set Up the OpenSearch Cluster
Create the Amazon OpenSearch Service domain in the console with the following configuration:
- Domain Name: LogESCluster
- Elasticsearch Version: 1
- Instance Count: 2
- Instance type: medium.elasticsearch
- Enable dedicated master: true
- Enable zone awareness: true
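If you prefer to script the domain instead of using the console, the settings above map roughly onto a request like the one below. This is a sketch under assumptions: the parameter names follow the boto3 `es` client's `create_elasticsearch_domain` call, `build_domain_request` is a hypothetical helper, and the version and instance type are left as arguments for you to supply. The helper lowercases the name because the service accepts only lowercase domain names.

```python
def build_domain_request(domain_name, es_version, instance_type, instance_count=2):
    """Build keyword arguments for a domain-creation call (hypothetical helper)."""
    return {
        # Domain names must be lowercase, so normalize the console-style name.
        "DomainName": domain_name.lower(),
        "ElasticsearchVersion": es_version,
        "ElasticsearchClusterConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "DedicatedMasterEnabled": True,
            "ZoneAwarenessEnabled": True,
            # Dedicated master type and count are not listed in the article,
            # so they are left out of this sketch.
        },
    }


if __name__ == "__main__":
    # Placeholders: substitute the version and instance type you selected.
    request = build_domain_request("LogESCluster", "<version>", "<instance-type>")
    print(request)
    # With boto3 installed and credentials configured, this could be sent as:
    #   boto3.client("es").create_elasticsearch_domain(**request)
```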
- Set Up the Firehose Delivery Stream
In the Firehose console, create a new delivery stream with Amazon OpenSearch Service as the destination. In the Configuration section, enable data transformation and select the Lambda function created from the blueprint.
- Create an Amazon Cognito User and Sign in to the KDG
You must create an Amazon Cognito user in your AWS account with permissions to access Amazon Kinesis. A Lambda function and a CloudFormation template can streamline this process.
- Set Up the KDG Record Template for Apache Access Logs
The KDG can generate records based on a template you provide. The template for Apache logs is as follows:
{{internet.ip}} - - [{{date.now("DD/MMM/YYYY:HH:mm:ss ZZ")}}] "GET /index.html HTTP/1.1" 200 104 "-" "ELB-HealthChecker/1.0"
In the KDG, set Records per Second to 100. To initiate data streaming, click “Send Data.”
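For testing the transformation locally without the KDG, a small generator can produce lines in the same shape as the template above. This is only an approximation of what the KDG emits: `fake_apache_line` is a hypothetical helper, the IP addresses are random octets, and the timestamp uses Python's `strftime` equivalent of the template's date format.

```python
import random
from datetime import datetime, timezone


def fake_apache_line(rng, now=None):
    """Return one synthetic access-log line shaped like the KDG template."""
    now = now or datetime.now(timezone.utc)
    ip = ".".join(str(rng.randint(1, 254)) for _ in range(4))
    # Mirrors DD/MMM/YYYY:HH:mm:ss ZZ from the template.
    timestamp = now.strftime("%d/%b/%Y:%H:%M:%S %z")
    return (
        f'{ip} - - [{timestamp}] '
        '"GET /index.html HTTP/1.1" 200 104 "-" "ELB-HealthChecker/1.0"'
    )


if __name__ == "__main__":
    rng = random.Random()
    # 100 lines is about one second of traffic at the rate set in the KDG.
    for _ in range(100):
        print(fake_apache_line(rng))
```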
This entire process illustrates how to effectively utilize serverless architecture to handle Apache log ingestion and visualization with ease.